GIDForums  

Go Back   GIDForums > Computer Programming Forums > C++ Forum
User Name
Password
Register FAQ Members List Calendar Search Today's Posts Mark Forums Read

 
 
Thread Tools Search this Thread Rate Thread
  #1  
Old 23-Apr-2006, 12:57
earachefl earachefl is offline
Member
 
Join Date: Feb 2006
Posts: 178
earachefl is on a distinguished road

Opinions re best way to approach this?


I'm doing a project that is to implement a dictionary text file, each word in alphabetical order on one line. When the user enters text, the program should compare each word to the dictionary and if the word is not encountered, show the user the two nearest words (before and after if possible) and give the user the choice of adding it to the file. I can see two basic ways of implementing this (note - I'm not up to classes, links, nodes, trees, queues, all that advanced stuff yet.).

One way would be to have the program first compare the new word to the text file only for equality. If the word was not found, the program would then go to the beginning of the list to compare for <, to find where in the list to insert the new word.

The other way would be to do only one comparison pass, checking each word first for equality and then for <.

Assuming that there's an equal chance of the word falling anywhere in the alphabet, and using as an example a text file consisting of only 10 words, the first way would do a comparison pass for == of 10 words, and then do a comparison of 5 words on average for <. The last pass would also involve swapping the last word checked into a temp variable (to keep it in memory for display). So a total of 15 comparisons and 5 exchanges if the word was not found, or an average of 5 comparisons if the word was found.

The second way would involve an average of 5 comparisons for ==, 5 comparisons for <, and 5 exchanges whether the word was found or not.

So (finally), my question is - how to decide which approach is better? I'm leaning towards the first, as my gut tells me that most words in use are fairly common, and that while the program may be slower initially, as it adds more words, the speed would be faster. I'm sure there must be a ton of research on this subject which quite honestly isn't worth my investigating for this minor project. Just curious as to your opinions on how to approach a problem like this.
  #2  
Old 23-Apr-2006, 13:00
WaltP's Avatar
WaltP WaltP is offline
Outstanding Member
 
Join Date: Feb 2004
Location: Midwest US
Posts: 3,243
WaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to all

Re: Opinions re best way to approach this?


Are you allowed to read the entire dictionary file into memory before testing words?
__________________

Age is unimportant -- except in cheese
  #3  
Old 23-Apr-2006, 13:05
earachefl earachefl is offline
Member
 
Join Date: Feb 2006
Posts: 178
earachefl is on a distinguished road

Re: Opinions re best way to approach this?


Quote:
Originally Posted by WaltP
Are you allowed to read the entire dictionary file into memory before testing words?
Here's the entire assignment.

Quote:
Write a program to manage a dictionary. You dictionary should be stored on a text file and consist of an alphabetized list of words, one per line. When a user enters a word, scan the dict. looking for the word. If the word is in the dictionary say so (<not sure why you'd ever want to do that except for error chdeking>). If not, display the dict. word immediately preceding and immediately following. Then ask whether the user wants to add this new word to the dict. If yes, do so and go back to request the next word.

To insert a word into a file in alpha order, copy the file to a new, temp file and move words one at a time from this temp file back to the orig. file, inserting the new word when you reach its correct position alphabetically.
  #4  
Old 23-Apr-2006, 13:18
WaltP's Avatar
WaltP WaltP is offline
Outstanding Member
 
Join Date: Feb 2004
Location: Midwest US
Posts: 3,243
WaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to all

Re: Opinions re best way to approach this?


Then I'd read the entire dict file into a string array.
Then use a binary search to find the word.
If the word is not found, leave index into the dict array on the word following the word being tested.
It will now be easy to rewrite the file and add the word to the list.
__________________

Age is unimportant -- except in cheese
  #5  
Old 23-Apr-2006, 13:22
earachefl earachefl is offline
Member
 
Join Date: Feb 2006
Posts: 178
earachefl is on a distinguished road

Re: Opinions re best way to approach this?


Quote:
Originally Posted by WaltP
Then I'd read the entire dict file into a string array.
Then use a binary search to find the word.
If the word is not found, leave index into the dict array on the word following the word being tested.
It will now be easy to rewrite the file and add the word to the list.

To read the file into an array, I have to know the number of words, yes? So I would have to either test the file for number of lines and use that to set the array size, or write the number of lines at the beginning of the text file?
  #6  
Old 23-Apr-2006, 13:30
earachefl earachefl is offline
Member
 
Join Date: Feb 2006
Posts: 178
earachefl is on a distinguished road

Re: Opinions re best way to approach this?


Quote:
Originally Posted by WaltP
Then I'd read the entire dict file into a string array.
Then use a binary search to find the word.
If the word is not found, leave index into the dict array on the word following the word being tested.
It will now be easy to rewrite the file and add the word to the list.
Walt, as I said in my original post, I'm not up to trees (and therefore binary searches) - that's covered in chapter 13(Pointers and dynamic structures) of my textbook, and I'm finishing chapter 8(Streams and files). So I have a ways to go....
  #7  
Old 23-Apr-2006, 13:30
WaltP's Avatar
WaltP WaltP is offline
Outstanding Member
 
Join Date: Feb 2004
Location: Midwest US
Posts: 3,243
WaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to allWaltP is a name known to all

Re: Opinions re best way to approach this?


Or have an idea what the maximum number of words you need should be. Maybe 10% or 20% more than in the dictionary file. Guesstimate the size of the file, round up, and add more empty space for adding new words. Make the string array large to be safe.
__________________

Age is unimportant -- except in cheese
  #8  
Old 23-Apr-2006, 21:58
earachefl earachefl is offline
Member
 
Join Date: Feb 2006
Posts: 178
earachefl is on a distinguished road

Re: Opinions re best way to approach this?


OK, feast your eyes on this gorgeous piece of codeage. It works, with one odd misbehaviour. Walt, as I said earlier, I'm not up to trees or binary searches or all that good stuff, so this is a kludgy beginner's way of implementing it.
Basically, I have both a main dictionary file and a temp dictionary file. I read from the main file and compare my new word with the words in the file. If I find it, I get the next word. If I don't, I open the temp file for output and start comparing the new word with the main file again and find the two closest words. If the user wants to add the word, I start writing from the main file to the temp file, inserting the new word in its correct position. After the temp file has been written, I close it for output, reopen it for input, close the main file for input, reopen it for output, transfer the temp file to the main file, close the temp file, close the main file for output and reopen it for input.

The one odd piece of misbehaviour is that if and only if I'm adding a new word that's higher than the last word in the dictionary, the error checking at the very end reports that the main file has not reopened for input. Yet I can continue to add words to the dictionary and everything else seems fine.
Any ideas on why? Any comments (besides "learn about binary searches and rewrite this crap!!!)?

CPP / C++ / C Code:
//A program to manage a dictionary
#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
using namespace std;

#define inFile "/Users/me/Documents/diction.txt"          //Main dictionary file
#define outFile "/Users/me/Documents/diction.txt"         //for both reading and writing
#define tempInFile "/Users/me/Documents/tempDict.txt"     //Temporary dictionary file
#define tempOutFile "/Users/me/Documents/tempDict.txt"    //for both reading and writing

//Function prototypes
void checkEqual (ifstream&, const string);
void replaceWord (ifstream&, const string);

int main () {

    string nextWord;                                //input - new word
    ifstream ins;                                   //create ifstream object ins
    ins.open(inFile);                               //associate ins with inFile (main dictionary)
    if (ins.fail()) {                               //error check
        cerr << "Error opening input file." << endl;
        return EXIT_FAILURE;
    }
    cout << "Enter text.  If the dictionary finds a word, you will be informed."
         << endl << "If the dictionary does not find a word, the nearest words will"
         << endl << "be displayed and you will be asked if you want to add the word"
         << endl << "to the dictionary. Enter 'Q' to quit." << endl;
    cin >> nextWord;                                //Get new word
    while (nextWord != "Q") {                       //while user doesn't enter 'Q'
        checkEqual (ins, nextWord);                 //call function checkEqual
        cout << "Enter next word" << endl;          //on return of function,
        cin >> nextWord;                            //enter next word
    }                                               //end while loop
    //Assertion: user has entered 'Q'
    ins.close();                                    //close inFile

    return 0;
}

//Check new word for match in dictionary file
//Pre: new word has been entered
void checkEqual (ifstream& ins, const string nextWord)
{
    string savedWord;                               //input - words in dictionary
    bool wordFound = false;                         //Is there a match?
    ins.clear();                                    //Reset eof flag and failbit
    ins.seekg(0);                                   //Reset pointer to start of ins
    while (ins >> savedWord) {                      //While there are words in ins, get the next word
        if (nextWord == savedWord) {                //If there is a match
            wordFound = true;                       //set wordFound to true
            cout << "Word " << nextWord << " found in dictionary." << endl;
            break;                                  //break out of loop - word has been found
        }
    }                                               //end of while loop - no more words in dictionary
    if (wordFound == false) {                       //If word is not found
        replaceWord(ins, nextWord);                 //call replaceWord function
    }
}   //End of function checkEqual, return control to main

//Find nearest words, ask user whether to add word, replace word if answer is yes
//Pre: new word has not been found in dictionary
void replaceWord (ifstream& ins, const string newWord)
{
    string preWord = "aaaaa", postWord;             //Temporary variables to hold closest words to new word
    ins.clear();                                    //Reset eof flag and failbit in ins
    ins.seekg(0);                                   //Reset pointer to start of ins
    
    while (ins >> postWord) {                       //While words are in dictionary stream, get the next
        if (newWord < postWord) {                   //If new word smaller than first word in dictionary
            break;                                  //break out of loop - position has been found
        }
        else {                                      //If new word not smaller than first word
            preWord = postWord;                     //put first word in preWord
        }                                           //get another word and compare new word to it
    }                                               //end of while
    //Assertion: no more words in dictionary stream to be compared
    //Ask user if he wants to add new word to dictionary
    //if new word is either before or after last word in dictionary
    if (preWord == "aaaaa" || preWord == postWord) {
        cout << "Word not found.  Nearest word is " 
             << postWord << ". Add " << newWord << " to dictionary?" << endl;
    }
    else {                                          //Otherwise
        cout << "Word not found. Nearest words are "
             << preWord << " and " << postWord << ". Add "
             << newWord << " to dictionary?" << endl;
    }
    
    char agree;                                     //Input
    cin >> agree;                                   //Get character 
    agree = toupper(agree);                         // 'y' or 'Y'
    
    if (agree == 'Y') {                             //If answer is yes
        ofstream tempDict;                          //create output file stream object tempDict
        tempDict.open(tempOutFile);                 //associate tempDict with tempOutFile
        
        if (tempDict.fail()) {                      //error check
            cerr << "Error opening temporary Dictionary file" << endl;
        }
                    
        string tempString;                          //Temporary storage of string
        ins.clear();                                //Reset eof flag and failbit of insstream
        ins.seekg(0);                               //Reset pointer to start of ins
        ins >> postWord;                            //Get first word of dictionary, assign to postWord
        do
        {
            if (newWord < postWord) {               //If the new word is less than first word
                tempDict << newWord;                //Insert new word into temp dictionary file
                tempDict << '\r';                   //Insert newline character
                tempDict << postWord;               //Insert previous first word into temp dictionary
                getline (ins, tempString, '\0');    //Get the rest of the dictionary file
                tempDict << tempString;             //Insert the rest of the file into the temp dictionary
                break;                              //Break out of loop - entire file has been copied
            }                                       
            else {                                  //If new word is not less than first word compared,
                //preWord = postWord;               //put first word in preWord
                tempDict << postWord;               //Insert first word in temp dictionary
                tempDict << '\r';                   //Insert newline character in temp dictionary
            }                                       //Get next word in dictionary, assign to postWord
        } while (ins >> postWord);                  //end do-while loop
        //special case -
        if (newWord > postWord) {                   //if the new word is greater than the last word in dictionary
            tempDict << newWord;                    //add new word to end of dictionary
            tempDict << '\r';                       //add new line char (not sure if necessary)
            cout << newWord << " is new last word in dictionary." << endl;
        }
        //Assertion: all files have been copied to temp dictionary file,
        //new word has been inserted at its proper place alphabetically
        tempDict.close();                           //Close the temporary file for output
        ifstream tempDictionary(tempInFile);        //Reopen temp file for input
        
        if (tempDictionary.fail()) {                //Error check
            cerr << "Error opening temporary Dictionary file" << endl;
        }
        
        ins.close();                                //Close main dictionary file for input
        ofstream outs(outFile);                     //Reopen main file for output
        
        if (outs.fail()) {                          //Error check
            cerr << "Error opening output file." << endl;
        }
                    
        //copy temp dictionary to main dictionary
        getline (tempDictionary, tempString, '\0'); //Get all contents of temp dictionary file, assign to temp string
        outs << tempString;                         //Insert contents of string into main dictionary file
        
        //Assertion: contents of temp dictionary have been copied to main dictionary file
        tempDictionary.close();                     //Close temp dictionary file for input
        outs.close();                               //Close main dictionary file for output
        ins.open(inFile);                           //reopen main dictionary file for input
        
        if (ins.fail()) {                           //error check
            cerr << "<Error re-opening input file>" << endl;
        }
        else {
            cout << "<input file re-opened>" << endl;
        }
    }                                               //end if 
    //Assertion: either new word has been inserted into main dictionary file
    //or user entered any character other than 'y' or 'Y'
}   //End of function replaceWord - return control to function checkEqual
Sample output:

Quote:
[Session started at 2006-04-23 23:52:09 -0400.]
Enter text. If the dictionary finds a word, you will be informed.
If the dictionary does not find a word, the nearest words will
be displayed and you will be asked if you want to add the word
to the dictionary. Enter 'Q' to quit.
jig
Word not found. Nearest words are failure and zanzibar. Add jig to dictionary?
y
<input file re-opened>
Enter next word
zzzzz
Word not found. Nearest word is zanzibar. Add zzzzz to dictionary?
y
zzzzz is new last word in dictionary.
<Error re-opening input file>
Enter next word
zowie
Word not found. Nearest words are zanzibar and zzzzz. Add zowie to dictionary?
y
<input file re-opened>
Enter next word
Q

C8PP9c has exited with status 0.
 
 

Recent GIDBlogObservations of Iraq by crystalattice

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Opinions on "Idiots Guide to a career as a computer programmer"? earachefl Miscellaneous Programming Forum 0 12-Mar-2006 20:59
New Site-Looking for your opinions blelisa Websites Reviewed Forum 6 12-Jun-2005 10:12
Seeking ideas and opinions for a new site Div Web Design Forum 1 13-Jan-2005 12:43
Any opinions on the Brother HL-1435 laser printer? Div Computer Hardware Forum 0 24-Feb-2004 14:23

Network Sites: GIDNetwork · GIDWebHosts · GIDSearch · Learning Journal by J de Silva, The

All times are GMT -6. The time now is 17:05.


vBulletin, Copyright © 2000 - 2008, Jelsoft Enterprises Ltd.