GIDForums  

  #1  
Old 01-Dec-2003, 15:44
captnsaj
New Member
Join Date: Dec 2003
Posts: 1

Speed up compare algorithm


I am trying to compare lists from two files so I can extract the strings that are in the second list but NOT in the first list. I have it working, but it's extremely slow, because I compare each string in the first list with each string in the second. To make matters worse, I also want to eliminate partial matches (e.g. 27 out of 30 characters matching). If it were only a matter of exact matches, I would use binary search and have it done within seconds. But it's not.
This is what I have so far:
CPP / C++ / C Code:
int mismatches = 0;
int found = 0;
inf >> seq; // next sequence from file1

for (int i = 0; i < list_n; i++)
{
    if (!list_pos[i]) // list_pos[i] == 0: already matched, skip it
        continue;

    // list is the array of strings from file2;
    // max_matches is the number of mismatches still tolerated
    mismatches = 0;
    for (int j = 0; j < SEQ_SIZE; j++)
    {
        if (list[i][j] != seq[j] && ++mismatches > max_matches)
            break; // too many mismatches: not a match
    }

    if (mismatches <= max_matches)
    {
        list_pos[i] = 0; // exact or partial match found
        found++;
    }
}

return found;

The program goes on to output to a third file the strings from file2 that still have 1 in list_pos (0 means that an exact or partial match was found).
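For reference, the brute-force filter described above can be written as a small self-contained pair of functions. This is only a sketch: the names `within_mismatches` and `filter_unmatched` are made up for illustration, and both lists are assumed to be already loaded into memory.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if a and b (assumed equal length) differ in at most
// max_mismatches positions. Stops as soon as the limit is exceeded.
bool within_mismatches(const std::string& a, const std::string& b,
                       std::size_t max_mismatches)
{
    std::size_t mismatches = 0;
    for (std::size_t j = 0; j < a.size(); ++j)
    {
        if (a[j] != b[j] && ++mismatches > max_mismatches)
            return false;
    }
    return true;
}

// Returns the strings of list2 that have no exact or partial match in list1.
// Every pair is compared, so the cost grows with list1.size() * list2.size().
std::vector<std::string> filter_unmatched(const std::vector<std::string>& list1,
                                          const std::vector<std::string>& list2,
                                          std::size_t max_mismatches)
{
    std::vector<std::string> result;
    for (const std::string& candidate : list2)
    {
        bool matched = false;
        for (const std::string& seq : list1)
        {
            if (seq.size() == candidate.size() &&
                within_mismatches(seq, candidate, max_mismatches))
            {
                matched = true;
                break;
            }
        }
        if (!matched)
            result.push_back(candidate);
    }
    return result;
}
```

For the 27-out-of-30 case, `max_mismatches` would be 3.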
Does anybody have any ideas for an algorithm I can use to speed this up? I am planning on using this algorithm with DNA sequences, which means lists in the millions.
Thanks
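One common way to avoid comparing every pair is a pigeonhole filter: if two strings of equal length differ in at most k positions, and both are cut into k + 1 blocks at the same offsets, then at least one block must be identical in both. Indexing the blocks of the first list in a hash table limits the full comparison to strings that share a block. The sketch below is an illustration only, not something proposed in this thread; `BlockIndex` and its members are hypothetical names, and it assumes every string has the same length `len` with `len >= k + 1`.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: indexes list1 by (block number, block text) so that
// only strings sharing at least one identical block are compared in full.
class BlockIndex
{
public:
    // Assumes strings have length len and len >= k + 1; others are skipped.
    BlockIndex(const std::vector<std::string>& list1, std::size_t len, std::size_t k)
        : list1_(list1), len_(len), k_(k), blocks_(k + 1)
    {
        for (std::size_t i = 0; i < list1_.size(); ++i)
        {
            if (list1_[i].size() != len_)
                continue;
            for (std::size_t b = 0; b <= k_; ++b)
                blocks_[b][block(list1_[i], b)].push_back(i);
        }
    }

    // True if some string in list1 differs from s in at most k positions.
    bool has_near_match(const std::string& s) const
    {
        if (s.size() != len_)
            return false;
        for (std::size_t b = 0; b <= k_; ++b)
        {
            auto it = blocks_[b].find(block(s, b));
            if (it == blocks_[b].end())
                continue;
            // Verify each candidate that shares block b with a full comparison.
            for (std::size_t i : it->second)
                if (mismatches(list1_[i], s) <= k_)
                    return true;
        }
        return false;
    }

private:
    // Block b covers positions [b*len/(k+1), (b+1)*len/(k+1)).
    std::string block(const std::string& s, std::size_t b) const
    {
        std::size_t begin = b * len_ / (k_ + 1);
        std::size_t end = (b + 1) * len_ / (k_ + 1);
        return s.substr(begin, end - begin);
    }

    // Counts the positions at which a and b differ (equal lengths assumed).
    static std::size_t mismatches(const std::string& a, const std::string& b)
    {
        std::size_t n = 0;
        for (std::size_t j = 0; j < a.size(); ++j)
            if (a[j] != b[j])
                ++n;
        return n;
    }

    const std::vector<std::string>& list1_; // must outlive the index
    std::size_t len_;
    std::size_t k_;
    std::vector<std::unordered_map<std::string, std::vector<std::size_t>>> blocks_;
};
```

With 30-character strings and up to 3 mismatches, this gives four blocks of 7 or 8 characters. A string from the second list would be written to the output file only when `has_near_match` returns false for it. How much this helps depends on how many strings share a block, so it is worth timing on real data.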
  #2  
Old 27-Jan-2004, 21:21
dsmith
Senior Member
Join Date: Jan 2004
Location: Utah, USA
Posts: 1,351
Did you ever find a solution to speed up your routine? I wouldn't mind taking a stab at it, although I can't promise anything. In order to look at it though, it would be helpful to have a small file just to compare times.

Let me know.
  #3  
Old 04-Feb-2004, 15:19
tay
Junior Member
Join Date: Jan 2004
Posts: 77
I had to do an assignment that is similar to your program. It reads from text file1, compares the entries against text file2, and if they are the same, saves them to another text file3.

The assignment is called a search engine. The only difference from your program is that I read from text files.

I can't upload a file here, so I can't show you. If you want it, I can email it to you. Leave your email here and I will send it.

Or message me:
taychinghwa@hotmail.com

or Yahoo Messenger ID: taychinghwa
  #4  
Old 04-Feb-2004, 15:44
JdS
Senior Member
Join Date: Aug 2001
Location: KUL, Malaysia
Posts: 3,371
Quote:
Originally Posted by tay
...cause here canot upload file so i cannot show u...

Hello tay,

What exactly seems to be the problem when you try to upload a file? You should be able to do this (I just checked the permissions). What kind of file were you trying to upload? To be certain your upload will work, try giving the file a .txt extension.
  #5  
Old 04-Feb-2004, 15:50
tay
Junior Member
Join Date: Jan 2004
Posts: 77
You can all try it out.
Attached Files
File Type: zip file.zip (27.2 KB, 120 views)
 
 
