lundslaktare@[EMAIL PROTECTED] said:
> Maybe this is the wrong group, if so I would like to be pointed
> to a better group.
If you were writing such a program yourself in the C language, and there
were some aspect of C that was puzzling you, this would be the right
group. It seems, however, that you want to find an existing program that
already meets your requirements, rather than write it yourself.
There's nothing wrong with that, but - as you suspected - this group isn't
intended to meet that need. There are many groups in the comp.sources
hierarchy, however, and it may well be that one of those can supply your
need.
If you /were/ planning to write such a program yourself, you would want to
start by tackling the most difficult problems. These are:
1) sorting out the logic behind punctuation. You want to treat

    ques
    -tion

as one hit for question and one hit for the hyphen, which is clear enough
and not too difficult. But what about "didn't"? Does the ' character count
as a separate hit? And what about "fo'c'sle"? (That's a single word,
according to the dictionary.) These decisions aren't difficult to make,
but they do have to be made. Once you've made the decisions, you would
need to write a parser that can enact them.
2) storage. You appear to want your output to be sorted, so you'd need to
think about a container that can regurgitate data in sorted order after
processing. A binary search tree is the obvious choice, but a hash table
might be faster if you didn't actually need the sorting. You also need to
think about whether you want to be able to handle arbitrarily long words.
If so, you'll need to handle the memory requirements yourself, using
malloc.
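
To make those two points a little more concrete, here is a minimal sketch
rather than a finished program. Every name in it (struct node, next_token,
tree_add, tree_walk, count_text) is my own invention for illustration. It
assumes one particular punctuation policy: an apostrophe with a letter on
each side is part of the word, so "didn't" and "fo'c'sle" each count as one
word, and any other non-space character is a hit of its own. It reads from
a string already in memory, folds everything to lower case, and makes no
attempt to rejoin a word that has been split across a line break, which
you would still have to add.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One node per distinct token; the key is allocated separately so that
   tokens of any length can be stored. */
struct node {
    char *word;
    unsigned long count;
    struct node *left, *right;
};

/* Assumed policy: letters form words, and an apostrophe belongs to a
   word only when it has a letter on each side ("didn't", "fo'c'sle"). */
static int in_word(const char *s, size_t i)
{
    unsigned char c = (unsigned char)s[i];
    return isalpha(c) || (c == '\'' && i > 0
                          && isalpha((unsigned char)s[i - 1])
                          && isalpha((unsigned char)s[i + 1]));
}

/* Return the next token as a newly allocated lower-case string, or NULL
   at end of input or if malloc fails. The caller frees the result. */
char *next_token(const char *s, size_t *pos)
{
    size_t start, len, i;
    char *tok;
    assert(s != NULL && pos != NULL);
    while (isspace((unsigned char)s[*pos]))
        ++*pos;
    if (s[*pos] == '\0')
        return NULL;
    start = (*pos)++;           /* every token has at least one character */
    if (in_word(s, start))
        while (in_word(s, *pos))
            ++*pos;
    len = *pos - start;
    tok = malloc(len + 1);
    if (tok == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        tok[i] = (char)tolower((unsigned char)s[start + i]);
    tok[len] = '\0';
    return tok;
}

/* Count one occurrence of word, adding a node if it is new.
   Returns 0 on success, -1 if memory runs out. */
int tree_add(struct node **root, const char *word)
{
    char *copy;
    while (*root != NULL) {
        int cmp = strcmp(word, (*root)->word);
        if (cmp == 0) {
            (*root)->count++;
            return 0;
        }
        root = cmp < 0 ? &(*root)->left : &(*root)->right;
    }
    copy = malloc(strlen(word) + 1);
    *root = malloc(sizeof **root);
    if (copy == NULL || *root == NULL) {
        free(copy);
        free(*root);
        *root = NULL;
        return -1;
    }
    (*root)->word = strcpy(copy, word);
    (*root)->count = 1;
    (*root)->left = (*root)->right = NULL;
    return 0;
}

/* Visit every node in sorted (strcmp) order. */
void tree_walk(const struct node *n,
               void (*visit)(const char *, unsigned long, void *), void *arg)
{
    if (n == NULL) return;
    tree_walk(n->left, visit, arg);
    visit(n->word, n->count, arg);
    tree_walk(n->right, visit, arg);
}

/* Tokenise text and count every token in the tree. */
int count_text(struct node **root, const char *text)
{
    size_t pos = 0;
    char *tok;
    int rc = 0;
    while (rc == 0 && (tok = next_token(text, &pos)) != NULL) {
        rc = tree_add(root, tok);
        free(tok);
    }
    return rc;
}
```

A caller would start with a null root, pass each chunk of text to
count_text, and then hand tree_walk a callback that prints the word and
its count. The tree is not balanced, so input that is already sorted
degrades it to a linked list, and the sketch never frees the nodes. A real
program would want to deal with both.
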
Although it sounds like a simple enough program, you have introduced
enough complications to make it quite an interesting programming exercise
for someone with a year or two of C experience.
If you change your mind and decide to write it yourself in C, we can
certainly help you with it. Otherwise, good luck in comp.sources.* (and
you're going to need it, partly because I believe those groups aren't very
popular with source providers, and partly because your requirements are
sufficiently different from a normal concordance program to make it quite
unlikely that someone can meet your needs without actually writing a
program just for you).
HTH. HAND.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@[EMAIL PROTECTED]
users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

