GIDForums  

#1
19-Aug-2007, 00:39
harris2107
New Member
Join Date: Aug 2007
Posts: 6

extraction using AWK or SED


Hi all,
I have a file (tnsnames.ora in Oracle) from which I have to extract the values of two fields. Suppose the following is the content of my file, called "inputfile.txt":

(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 1.1 ) ( PORT = 1521 ))
(ADDRESS=(PROTOCOL=TCP)(HOST=2.1)(PORT=1521))
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 3.1 ) ( PORT = 1521 ))

I want to extract the value of HOST from this file and assign it to an array. For example, from the above file, I need to get the HOST values and store them in the array like this:

ip[1]=1.1
ip[2]=2.1
ip[3]=3.1

That is, I want to extract just the values of HOST from all the lines in the given file. Note that the format is the same throughout, but there are spaces in the first and third rows and none in the second. Any help would be greatly appreciated.

Thank you,
Harris.
#2
19-Aug-2007, 09:24
davekw7x
Outstanding Member
Join Date: Feb 2004
Location: Left Coast, USA
Posts: 5,310

Re: extraction using AWK or SED


Quote:
Originally Posted by harris2107

I want to extract the value of HOST from this file and assign it to an array.

The reason that I like problems like this is that there are about a million ways (maybe more) of doing it, but if you can't figure out the first way, well...you're stuck.

On the other hand, if someone points you in the direction of a way of doing it, you will (probably) immediately say, "OK, but I think a "better" way would be..."

All of the ways that come to (my) mind involve regular expressions. I really like regular expressions, especially used in shell command lines. (See footnote.)

The title of this thread indicates your predilection for awk or sed, so I am sure that you are aware of regular expressions.

Awk is good at regular expressions and so is sed. I like awk for certain applications, like when I want to select arguments that are predictably set off by spaces. In this case they are not, so you could just use awk's regular expression matching and substitution on whole lines of the file.
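For comparison, here is a minimal sketch of the awk approach just described: match on whole lines and substitute away everything around the value. It is only one of many ways to write it. The temporary file and the variable names `input` and `awk_hosts` are assumptions made for illustration; the sample lines are the ones from the first post.

```shell
#!/usr/bin/env bash
# Sample data copied from the first post, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 1.1 ) ( PORT = 1521 ))
(ADDRESS=(PROTOCOL=TCP)(HOST=2.1)(PORT=1521))
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 3.1 ) ( PORT = 1521 ))
EOF

# For each line containing HOST:
#   1. delete everything through "HOST =" (with or without spaces),
#   2. delete from the next closing parenthesis to the end of the line,
#   3. print what is left, which is the HOST value.
awk_hosts=$(awk '/HOST/ { sub(/.*HOST *= */, ""); sub(/ *\).*/, ""); print }' "$input")
echo "$awk_hosts"

rm -f "$input"
```

Because the `/HOST/` pattern guards the action, lines without a HOST field are skipped rather than printed unchanged.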

Or, you could just use sed, as I will show. (My examples are for bash, which is the default shell for my Linux distribution.)

I did it in two steps (maybe you would like to try to combine them, but I'll keep it the way that I thought of it):

1. Eliminate everything before the value of the HOST field.

Put the following in a script file (or just enter the sed command line at a bash shell command prompt):

Code:
echo 'First step:'
echo ' Use sed to eliminate everything before the values of HOST ='
echo 'sed -e "s/.*HOST *= *//" inputfile.txt'
sed -e "s/.*HOST *= *//" inputfile.txt

With "inputfile.txt" containing the three lines of your example, here is the result:
Code:
First step:
 Use sed to eliminate everything before the values of HOST =
sed -e "s/.*HOST *= *//" inputfile.txt
1.1 ) ( PORT = 1521 ))
2.1)(PORT=1521))
3.1 ) ( PORT = 1521 ))

2. Now, eliminate everything after the numeric field:
Code:
echo 'Second step:'
echo ' Use sed with a second "-e" to eliminate everything after the numerical value'
echo 'sed -e "s/.*HOST *= *//" -e "s/ *).*$//" inputfile.txt'
sed -e "s/.*HOST *= *//" -e "s/ *).*$//" inputfile.txt

Output:
Code:
Second step:
 Use sed with a second "-e" to eliminate everything after the numerical value
sed -e "s/.*HOST *= *//" -e "s/ *).*$//" inputfile.txt
1.1
2.1
3.1
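The two substitutions can also be folded into a single sed expression. The sketch below is one possible way to do it, not necessarily the best: it captures the HOST value in a group and replaces the whole line with that group. The temporary file and the variable names `input` and `hosts` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sample data copied from the first post, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 1.1 ) ( PORT = 1521 ))
(ADDRESS=(PROTOCOL=TCP)(HOST=2.1)(PORT=1521))
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 3.1 ) ( PORT = 1521 ))
EOF

# \([^ )]*\) captures the run of characters after "HOST =" up to the next
# space or closing parenthesis; \1 puts only that capture back.
# -n together with the trailing p prints only lines where the substitution
# succeeded, so lines without a HOST field produce no output.
hosts=$(sed -n 's/.*HOST *= *\([^ )]*\).*/\1/p' "$input")
echo "$hosts"

rm -f "$input"
```

Single quotes are used around the sed script here so that the shell passes the backslashes and the `*` through untouched.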

Now if you want to create some textual output as you indicated in your post, you could do something like:
Code:
echo 'Third step:'
echo ' Make a loop that extracts them one at a time'
count=1
for i in $(sed -e "s/.*HOST *= *//" -e "s/ *).*$//" inputfile.txt)
do
    echo "ip[$count] = $i"
    count=$((count + 1))
done

Output:
Code:
Third step:
 Make a loop that extracts them one at a time
ip[1] = 1.1
ip[2] = 2.1
ip[3] = 3.1
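The loop above only prints text that looks like array assignments. If the values need to end up in an actual bash array, something along these lines might do; it is a sketch under the assumption that bash (not plain sh) is the shell. The array name `ip` and the starting index 1 follow the original request, while the temporary file and the names `input`, `count`, and `host` are illustrative.

```shell
#!/usr/bin/env bash
# Sample data copied from the first post, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 1.1 ) ( PORT = 1521 ))
(ADDRESS=(PROTOCOL=TCP)(HOST=2.1)(PORT=1521))
(ADDRESS = ( PROTOCOL = TCP ) ( HOST = 3.1 ) ( PORT = 1521 ))
EOF

ip=()      # bash indexed array that will hold the HOST values
count=1    # the request numbers the entries from 1, not 0

# Read the sed output one line at a time and store each value.
# The process substitution < <(...) keeps the loop in the current shell,
# so the array is still populated after the loop ends.
while IFS= read -r host; do
    ip[count]=$host
    count=$((count + 1))
done < <(sed -e 's/.*HOST *= *//' -e 's/ *).*$//' "$input")

# Show what was stored.
for i in "${!ip[@]}"; do
    echo "ip[$i]=${ip[$i]}"
done

rm -f "$input"
```

Piping sed into the `while` loop instead would run the loop in a subshell in bash's default configuration, and the array would be empty afterwards.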

Regards,

Dave

Footnote: Everyone has heard of the "thought experiment" where one would put an infinite number of monkeys in a room with typewriters (it's a very old experiment). The idea was that there is a non-zero probability that at least one would end up typing the complete works of Shakespeare.

In fact, an experiment was carried out recently with only 50 monkeys. They were put into a room with keyboards connected to computers running Linux. Of course there wasn't much hope of seeing the complete works, but maybe one would come up with, perhaps, a single line of some Shakespeare play or sonnet.

Upon checking after about 10 minutes: no Shakespeare was evident, but every single one had come up with several valid Linux shell commands.
---davekw7x
#3
19-Aug-2007, 19:50
harris2107
New Member
Join Date: Aug 2007
Posts: 6

Re: extraction using AWK or SED


Thank you so much sir, you have been of great help.

You are the saviour of a newbie job.

Thanks a zillion!!!!!!!!!!

Harris.
 
 

All times are GMT -6.