Old 05-08-2008, 09:04 PM   #1 (permalink)
Registered User
 
Join Date: Oct 2007
Posts: 45
OS: Windows Vista Business


awk script help

Hello. I'm not sure if this is the right place to ask my question, but I want to write an awk script that scans an HTML file and outputs all the links inside it (e.g. .html, .htm, .jpg, .doc, .pdf, etc.) along with how many times each one occurs in the file. If anyone knows how, please help me!
__________________
"Football is not just a game; it is a weapon of the Revolution."
kyris is offline  
Old 05-09-2008, 09:24 PM   #2 (permalink)
Registered User
 
Join Date: Oct 2007
Location: Littleton, Colorado USA
Posts: 343
OS: xp 64 sp2 Fedora Core 8 (vmware xp core 8 x32) Minix


Re: awk script help

It will be in the form of:

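awk '# Count the lines that mention each extension. The dots are escaped so they match literally.
/\.html/ {HTML=HTML+1}
/\.htm/ {HTM=HTM+1}   # note: this also matches lines containing .html
/\.doc/ {DOC=DOC+1}
END {
printf("HTML=%d; HTM=%d; DOC=%d\n", HTML, HTM, DOC);}'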


It might be better to use Perl. Perl has a lot of libraries that understand HTML syntax. They can parse the document and return various pieces of it.

The code snippet above came from a shell script called "dvddup.sh". Google for it; it has a lot of awk in it. Also go to the directory "/usr/bin" and run "grep awk *". A lot of the files distributed with my Fedora Core 8 have awk scripts in them.

Hope this helps.
lensman3 is offline  
Old 05-09-2008, 09:45 PM   #3 (permalink)
Registered User
 
Join Date: Oct 2007
Location: Littleton, Colorado USA
Posts: 343
OS: xp 64 sp2 Fedora Core 8 (vmware xp core 8 x32) Minix


Re: awk script help

Another place to look for awk examples is the /etc directory: go there in a text command window and type "grep -R awk *". (The -R tells grep to descend recursively through the underlying directories.) There are lots of startup and config scripts that use awk (and everything else). This command also prints a lot of garbage.
lensman3 is offline  
Old 05-09-2008, 09:48 PM   #4 (permalink)
Registered User
 
Join Date: Oct 2007
Posts: 45
OS: Windows Vista Business


Re: awk script help

I want to do specifically what I described above. It's part of an assignment for my university and I don't know much about awk, which is why I'm asking, and I don't have much time!

I think your solution would count every occurrence of /.html/, and in HTML code there are links with .html but also plain text with .html. So it would have to be something like matching /href=/ or /src=/, then setting RS="<", then splitting each record into fields and "cleaning" the link. But I don't know exactly how to do it!

Is it clearer now?
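
For reference, the approach described above (split the input on "<", look only at href= and src= attributes, strip the quotes, count each link) could be sketched roughly like this. It is only a sketch under some assumptions: attribute values are double-quoted, attribute names are lowercase, and the function name count_links and the sample file names are made up for illustration. It is not a real HTML parser.

```shell
# count_links: print "<count> <link>" for every href/src value in the given HTML.
# Assumes double-quoted, lowercase attributes (hypothetical helper, not a real parser).
count_links() {
    awk '
    BEGIN { RS = "<" }                  # one tag (plus its trailing text) per record
    {
        rest = $0
        # collect every href="..." or src="..." value in this record
        while (match(rest, /(href|src)="[^"]*"/)) {
            attr = substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
            sub(/^[^"]*"/, "", attr)    # strip the leading href=" or src="
            sub(/"$/, "", attr)         # strip the closing quote
            count[attr]++
        }
    }
    END { for (link in count) print count[link], link }
    ' "$@" | sort
}

# Example with a made-up page (the file names are placeholders)
printf '%s\n' '<a href="a.html">see a.html</a>' '<img src="pic.jpg">' '<a href="a.html">again</a>' | count_links
```

With the sample input this prints "1 pic.jpg" and "2 a.html". The plain text "see a.html" is not counted, because only quoted href/src values are matched. To run it on a file, pass the file name as an argument to count_links.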
kyris is offline  
Old 06-09-2008, 11:45 PM   #5 (permalink)
Registered User
 
Join Date: Jun 2008
Posts: 4
OS: linux


Re: awk script help

Can anyone please help me with this awk script?

rsh sim1 vmstat -m | awk 'NR == 12 {print "Sim1","\t", $6}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt && rup -l | egrep sim1 | awk '{print "\t","\t", $8}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt

The above output will be:
Sim1 2981
1.08

I wish to get the output on one line, as follows:
Sim1 2981 1.08


Also, how do I print the field immediately after the regex?
awk '/regexp/{getline;print}' <- this does not work for me, as it prints the next line and I need to print the field, not the line.


Please Help

Best Regards,
Aung Phyoe

Last edited by Saosin1984 : 06-09-2008 at 11:46 PM.
Saosin1984 is offline  
Old 06-10-2008, 10:40 AM   #6 (permalink)
Registered User
 
Join Date: Oct 2007
Location: Littleton, Colorado USA
Posts: 343
OS: xp 64 sp2 Fedora Core 8 (vmware xp core 8 x32) Minix


Re: awk script help

I think (and I'm not very good at this) that you are sending two commands to machine sim1:

1)rsh sim1 vmstat -m | awk 'NR == 12 {print "Sim1","\t", $6}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt
and
2) && rup -l | egrep sim1 | awk '{print "\t","\t", $8}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt

I don't know what the rup command is. It is not in my man pages.

Why don't you try something like
rsh sim1 vmstat -m | awk 'NR == 12 {print "Sim1","\t", $6,$8}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt

That way the vmstat output gets piped through awk, and for the 12th line of input (NR is the line number; the number of fields would be NF) the 6th and the 8th fields get printed.

A suggestion: substitute ssh for rsh. The security is better, and you don't have to play with those pesky .rhosts files. ssh lets you set up transparent logins using keys, so you don't have to type a password for the remote machine every time. (And you can log in remotely as root.)
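
On the second question (printing the field right after a regex match rather than the next line), one possible approach is to loop over the fields and print the one that follows the first match. This is a minimal sketch; the function name after_match and the sample line are made up for illustration.

```shell
# after_match REGEX: for each input line, print the field that follows
# the first field matching REGEX (hypothetical helper name).
after_match() {
    awk -v re="$1" '{
        for (i = 1; i < NF; i++)       # stop at NF-1 so $(i+1) always exists
            if ($i ~ re) { print $(i + 1); break }
    }'
}

# Example with a made-up line
echo "host sim1 load 1.08 users 3" | after_match '^load$'
```

This prints 1.08 for the sample line. Note that awk processes backslash escapes in -v assignments, so a regex containing backslashes may need them doubled.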
lensman3 is offline  
Old 06-10-2008, 06:24 PM   #7 (permalink)
Registered User
 
Join Date: Jun 2008
Posts: 4
OS: linux


Re: awk script help

Thank you very much for the reply.

rup is a command for showing the up time for servers and also the load average of the servers.

"you are sending two commands to machine sim1" <-- No, sim1 is the server, and I am just printing the word "Sim1".

What I am trying to do is get the load average of sim1 (from the rup command) and the memory usage of sim1 (from the vmstat command). I use awk to pick the right field from each of those two commands and print it into a file, but I am having some trouble.


rsh sim1 vmstat -m | awk 'NR == 12 {print "Sim1","\t", $6}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt && rup -l | egrep sim1 | awk '{print "\t","\t", $8}' >> /net/home/linux_users/Phyoe/scripts/man_u.txt

This is the thing that i wrote in a file and the result is:
Sim1 2764
2.6

The result that I wish to get is on one single line:
Sim1 2764 2.6

I hope this is easier to understand now. :)
Saosin1984 is offline  
Old 06-10-2008, 08:41 PM   #9 (permalink)
Registered User
 
Join Date: Jun 2008
Posts: 4
OS: linux


Re: awk script help

I managed to solve it!

rsh sim1 vmstat -m | awk -v ORS='' 'NR == 12 {print "Sim1","\t", $6}' >> filename && rup -l | egrep sim1 | awk '{print "\t", $8}' >> filename

The above command will give me Sim1 2856 2.5

Thank you so much!
Saosin1984 is offline  
Old 06-10-2008, 09:31 PM   #10 (permalink)
Registered User
 
Join Date: Oct 2007
Location: Littleton, Colorado USA
Posts: 343
OS: xp 64 sp2 Fedora Core 8 (vmware xp core 8 x32) Minix


Re: awk script help

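You changed ORS (the output record separator) from its default newline to an empty string. Those are two single quotes, not a double quote, so the first awk no longer ends its output with a newline and the second awk's output lands on the same line.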

Clever.
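
To see the effect without rsh or rup, here is a small local sketch. The two input lines are made-up stand-ins for the vmstat and rup output (only the field positions matter), and a temporary file stands in for the output file.

```shell
# Stand-in input lines; the field values are made up for illustration.
mem_line='f1 f2 f3 f4 f5 2856'
load_line='sim1 f2 f3 f4 f5 f6 f7 2.5'

out=$(mktemp)
# ORS="" : print emits no newline after the record
printf '%s\n' "$mem_line"  | awk -v ORS='' '{print "Sim1", "\t", $6}' >> "$out"
# default ORS: this print ends the line
printf '%s\n' "$load_line" | awk '{print "\t", $8}' >> "$out"

result=$(cat "$out")
line_count=$(wc -l < "$out" | tr -d ' ')
rm -f "$out"
printf '%s\n' "$result"
echo "lines written: $line_count"
```

Both values end up on a single line in the file. An alternative that leaves ORS alone is to use printf in the first awk, for example awk 'NR == 12 {printf "Sim1\t%s", $6}', since printf only adds a newline when the format string contains one.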
lensman3 is offline  
All times are GMT -7. The time now is 07:21 AM.



Copyright 2001 - 2008, Tech Support Forum
