![]() |
|
#1
|
|||
|
|||
Opinions re best way to approach this?I'm doing a project that is to implement a dictionary text file, each word in alphabetical order on one line. When the user enters text, the program should compare each word to the dictionary and if the word is not encountered, show the user the two nearest words (before and after if possible) and give the user the choice of adding it to the file. I can see two basic ways of implementing this (note - I'm not up to classes, links, nodes, trees, queues, all that advanced stuff yet.).
One way would be to have the program first compare the new word to the text file only for equality. If the word was not found, the program would then go to the beginning of the list to compare for <, to find where in the list to insert the new word. The other way would be to do only one comparison pass, checking each word first for equality and then for <. Assuming that there's an equal chance of the word falling anywhere in the alphabet, and using as an example a text file consisting of only 10 words, the first way would do a comparison pass for == of 10 words, and then do a comparison of 5 words on average for <. The last pass would also involve swapping the last word checked into a temp variable (to keep it in memory for display). So a total of 15 comparisons and 5 exchanges if the word was not found, or an average of 5 comparisons if the word was found. The second way would involve an average of 5 comparisons for ==, 5 comparisons for <, and 5 exchanges whether the word was found or not. So (finally), my question is - how to decide which approach is better? I'm leaning towards the first, as my gut tells me that most words in use are fairly common, and that while the program may be slower initially, as it adds more words, the speed would be faster. I'm sure there must be a ton of research on this subject which quite honestly isn't worth my investigating for this minor project. Just curious as to your opinions on how to approach a problem like this. |
|
#2
|
||||
|
||||
Re: Opinions re best way to approach this?Are you allowed to read the entire dictionary file into memory before testing words?
__________________
Age is unimportant -- except in cheese |
|
#3
|
|||
|
|||
Re: Opinions re best way to approach this?Quote:
Quote:
|
|
#4
|
||||
|
||||
Re: Opinions re best way to approach this?Then I'd read the entire dict file into a string array.
Then use a binary search to find the word. If the word is not found, leave index into the dict array on the word following the word being tested. It will now be easy to rewrite the file and add the word to the list. __________________
Age is unimportant -- except in cheese |
|
#5
|
|||
|
|||
Re: Opinions re best way to approach this?Quote:
To read the file into an array, I have to know the number of words, yes? So I would have to either test the file for number of lines and use that to set the array size, or write the number of lines at the beginning of the text file? |
|
#6
|
|||
|
|||
Re: Opinions re best way to approach this?Quote:
|
|
#7
|
||||
|
||||
Re: Opinions re best way to approach this?Or have an idea what the maximum number of words you need should be. Maybe 10% or 20% more than in the dictionary file. Guesstimate the size of the file, round up, and add more empty space for adding new words. Make the string array large to be safe.
__________________
Age is unimportant -- except in cheese |
|
#8
|
|||
|
|||
Re: Opinions re best way to approach this?OK, feast your eyes on this gorgeous piece of codeage. It works, with one odd misbehaviour. Walt, as I said earlier, I'm not up to trees or binary searches or all that good stuff, so this is a kludgy beginner's way of implementing it.
Basically, I have both a main dictionary file and a temp dictionary file. I read from the main file and compare my new word with the words in the file. If I find it, I get the next word. If I don't, I open the temp file for output and start comparing the new word with the main file again and find the two closest words. If the user wants to add the word, I start writing from the main file to the temp file, inserting the new word in its correct position. After the temp file has been written, I close it for output, reopen it for input, close the main file for input, reopen it for output, transfer the temp file to the main file, close the temp file, close the main file for output and reopen it for input. The one odd piece of misbehaviour is that if and only if I'm adding a new word that's higher than the last word in the dictionary, the error checking at the very end reports that the main file has not reopened for input. Yet I can continue to add words to the dictionary and everything else seems fine. Any ideas on why? Any comments (besides "learn about binary searches and rewrite this crap!!!)? CPP / C++ / C Code:
Quote:
|
Recent GIDBlog
Observations of Iraq by crystalattice
| Thread Tools | Search this Thread |
| Rate This Thread | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Opinions on "Idiots Guide to a career as a computer programmer"? | earachefl | Miscellaneous Programming Forum | 0 | 12-Mar-2006 20:59 |
| New Site-Looking for your opinions | blelisa | Websites Reviewed Forum | 6 | 12-Jun-2005 10:12 |
| Seeking ideas and opinions for a new site | Div | Web Design Forum | 1 | 13-Jan-2005 12:43 |
| Any opinions on the Brother HL-1435 laser printer? | Div | Computer Hardware Forum | 0 | 24-Feb-2004 14:23 |
Network Sites: GIDNetwork · GIDWebHosts · GIDSearch · Learning Journal by J de Silva, The