GIDForums  

  #1  
20-Jun-2009, 17:30
The_Carlos
New Member
Join Date: Jun 2009
Posts: 2

Program for doing multiple find and replaces on Microsoft's Word


Hi there

I want to find or make a program that can do multiple find-and-replace operations on a Microsoft Word document.

The documents will be written in Latin and I want to partially translate them into English. They are not narrative but a catalogue of expenditures, generally in the form "x gives y to z", with x and z being people/towns and y being a number.

Because the same names/towns appear frequently and the document is very formulaic and not narrative, word order and syntax should not be an issue.

So how do I create a program that works somewhere along the lines of

1. Find all Ricardus' and Replace with Richard
2. Find all Londiniums and Replace with London
3. etc etc

with potentially hundreds of these searches. Or is there a program that already does this?

I would also like to do this for more documents after this first one, which is why a program would be better than just using the find and replace function in Word.

Many Thanks for any and all replies
  #2  
21-Jun-2009, 03:03
crystalattice
Aspiring author
Join Date: Apr 2004
Location: Japan (again)
Posts: 1,627

Re: Program for doing multiple find and replaces on Microsoft's Word


I'm sure several programs exist but I can't think of any offhand. Most people simply use the "Find/Replace All" feature of Word.

However, I understand how much easier it would be to have a program that could do it for you. The problem comes with how it works: is it interactive (requiring the user to enter the search term) or programmatic (using a built-in dictionary of terms)?

Obviously, Word uses an interactive method. If you want to control the words that are being searched for, this is probably your easiest bet.

If you need/want to create a dictionary, then making a program might be easier. In general terms, you would create the dictionary (either as a list or database) and use the programming language's search function to scan the file for the desired words, automatically replacing each found value. If you want more control, you can use regular expressions to "fine tune" the search, depending on whether different spellings may occur.
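As a rough illustration of the "fine tune" idea, here is a minimal Python sketch. The pattern name `RICARDUS`, the helper `anglicise`, and the particular Latin endings are assumptions chosen for the example, not something taken from your documents.

```python
import re

# Illustrative assumption: the same name may appear with several endings
# (Ricardus, Ricardum, Ricardi, Ricardo) and in any letter case.
RICARDUS = re.compile(r"\bRicard(?:us|um|i|o)\b", re.IGNORECASE)


def anglicise(text):
    """Replace every spelling matched by RICARDUS with 'Richard'."""
    return RICARDUS.sub("Richard", text)
```

The `\b` markers keep the pattern from matching inside a longer word, so a single rule can cover several spellings without touching unrelated text.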

If you want to program it yourself, we can get down to nitty-gritty features of the actual code.
__________________
Start Programming with Python-A beginner's guide to programming and the Python language.
-------------
Common Sense v2.0-Striving to make the world a little bit smarter.
  #3  
21-Jun-2009, 06:20
The_Carlos
New Member
Join Date: Jun 2009
Posts: 2

Re: Program for doing multiple find and replaces on Microsoft's Word


Thanks crystalattice

I think from what you describe I am looking for a dictionary-type program.

I want to create a list of Latin words/sentences which should be found and replaced by their English equivalents.

But once a word/sentence has been put into the program, you shouldn't need to put it in again the next time you use the program; it should remember it.

Here is a simplistic example:

I create a program that does the following find and replaces

1. Find Ricardus and Replace with Richard
2. Find dat and replace with gives
3. Find xx and replace with 20
4. Find In. tres lib. and replace with to the treasury and is quit.

I then use the program to translate the following sentence

Ricardus dat xx marks In. tres lib.

After the program has been used it should read

Richard gives 20 marks to the treasury and is quit

Now I want to translate the following sentence

Ricardus de hexham dat xx marks In. tres lib.

So all I want to do is add to the program

5. Find de hexham and replace with of Hexham.

I then click on the translate button on the program and it gives me

Richard of Hexham gives 20 marks to the treasury and is quit.

This is an extremely simplistic example, as I would never be translating just one sentence. The idea is to have a 60+ page Word document, so that function 1 (find Ricardus and replace with Richard) makes this change to every Ricardus in the entire 60+ pages. Once I have put in a large number of functions (mainly place names, people's names and numbers), a large part of the document should be translated. I would then like to use the program on many other 60+ page Word documents, with the program retaining all the functions that I put into it.

I would gladly get into the nitty-gritty of the programming, but bear in mind I have never really done any programming before, so you will need to really dumb down any technical language used.

Thanks for replying.
  #4  
25-Jun-2009, 04:11
crystalattice
Aspiring author
Join Date: Apr 2004
Location: Japan (again)
Posts: 1,627

Re: Program for doing multiple find and replaces on Microsoft's Word


Before you start programming, you need to figure out which programming language you want to use. My personal preference is Python, but you can use nearly any language you want.

You'll want to get comfortable with the language's nuances before you try writing this program; just work through some simple exercises, like textbook problems, to figure out how everything works. Make sure you have a good handle on text processing, since that will be the main purpose of your program.

In a general sense, you'll make a database of words/phrases, with the Latin version and its corresponding English version. In Python, this would be a dictionary (a hash in other languages). The program would run through the file, looking for each word/phrase and automatically replacing it with the dictionary's equivalent value.
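A minimal sketch of that idea, using the sample rules from earlier in the thread. The names `RULES` and `translate` are made up for illustration, and matching whole words in a single pass is just one possible design, not the only way to do it.

```python
import re

# Latin phrase -> English replacement (sample rules from this thread).
RULES = {
    "Ricardus": "Richard",
    "dat": "gives",
    "xx": "20",
    "In. tres lib.": "to the treasury and is quit.",
    "de hexham": "of Hexham",
}


def translate(text, rules):
    """Replace every whole-word occurrence of a rule key in one pass."""
    # Try longer phrases first so a phrase wins over a shorter word inside it.
    keys = sorted(rules, key=len, reverse=True)
    alternatives = "|".join(re.escape(key) for key in keys)
    # The lookarounds stop a key such as "dat" matching inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: rules[match.group(0)], text)


print(translate("Ricardus de hexham dat xx marks In. tres lib.", RULES))
```

Because the text is scanned only once, the output of one rule is never fed back into another rule, which avoids surprises as the list grows to hundreds of entries.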

Making a dictionary this way is good, because it's an inherent part of Python and has been optimized over the years. It's also easy to add, remove, and otherwise modify values; you could even make it interactive if you didn't want to keep digging into the source code every time.
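One way to avoid editing the source each time is to keep the rules in a separate file that the program loads on start-up. The sketch below assumes a JSON file; the helpers `load_rules`, `save_rules`, and `add_rule` are hypothetical names, and any file name such as `rules.json` is only an example.

```python
import json
from pathlib import Path


def load_rules(path):
    """Return the saved rules, or an empty dict if no file exists yet."""
    rules_file = Path(path)
    if not rules_file.exists():
        return {}
    return json.loads(rules_file.read_text(encoding="utf-8"))


def save_rules(path, rules):
    """Write the rules back to disk as readable, sorted JSON."""
    text = json.dumps(rules, indent=2, ensure_ascii=False, sort_keys=True)
    Path(path).write_text(text, encoding="utf-8")


def add_rule(path, latin, english):
    """Add or update one rule and save the file straight away."""
    rules = load_rules(path)
    rules[latin] = english
    save_rules(path, rules)
    return rules
```

Calling `add_rule("rules.json", "de hexham", "of Hexham")` once would be enough for every later run to pick up the new rule, and the file can also be edited by hand in a text editor.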

When working with the actual document file, it will be easiest if it's a raw text file rather than a Word or other "formatted" document. This is because you won't have to worry about messing with the word processor's programming interfaces to get to the document; you can work directly with the file.

You also want to devise a method so the entire document isn't read into memory at once. Python can read a file one line at a time, processing as it goes. If you have enough RAM, you can read the whole file into memory, but it's safer not to unless the file is sufficiently small.
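A sketch of the line-at-a-time approach, assuming the document has already been saved from Word as plain text. The function name `translate_file` and its parameters are hypothetical, and plain `str.replace` is used only to keep the example short.

```python
def translate_file(src_path, dst_path, rules, encoding="utf-8"):
    """Copy src_path to dst_path, applying every rule to each line."""
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        # Iterating over the file object yields one line at a time,
        # so only the current line is held in memory.
        for line in src:
            for latin, english in rules.items():
                line = line.replace(latin, english)
            dst.write(line)
```

A call might look like `translate_file("roll.txt", "roll_english.txt", {"Ricardus": "Richard"})`, where both file names are placeholders. Two caveats with this simple version: a phrase that is split across two lines will not be found, and `str.replace` also matches inside longer words.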

That's most of the big things I can think of right now; I'm sure there are some things I'm not thinking of. Right now though, you just need to pick a language and start practicing until you have a good enough grasp to start on your program.
All times are GMT -6.