GIDForums  

  #1  
13-Sep-2007, 02:42
jestges

Read HTML File


Hi, is there any way to read an HTML file using a C program?


Thank you,
  #2  
13-Sep-2007, 04:49
Uten

Re: Read HTML File


Yes. HTML files are plain text files, and most C tutorials include a section on reading such files.

But my guess is that you should rephrase your question a bit to get the answer you want.
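
As a rough illustration of the "HTML files are text files" point, here is a minimal sketch that reads a file line by line with the standard fopen/fgets calls and echoes it to the screen. The function names copy_lines and print_text_file are made up for this example, and the 256-byte buffer size is an arbitrary choice.

```c
#include <stdio.h>
#include <string.h>

/* Copy the contents of `in` to `out`, reading with fgets.
 * Returns the number of complete lines copied. A final line
 * without a trailing newline is copied but not counted. */
long copy_lines(FILE *in, FILE *out)
{
    char buf[256];   /* arbitrary chunk size for this sketch */
    long lines = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t n = strlen(buf);
        fputs(buf, out);
        /* fgets may return only part of a long line, so count a
         * line only when the chunk ends with a newline. */
        if (n > 0 && buf[n - 1] == '\n')
            lines++;
    }
    return lines;
}

/* Open `path` as text and print it to stdout.
 * Returns the line count, or -1 if the file cannot be opened. */
long print_text_file(const char *path)
{
    long lines;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    lines = copy_lines(fp, stdout);
    fclose(fp);
    return lines;
}
```

Calling print_text_file("test.html") from main would dump the markup unchanged. Nothing here knows about HTML; picking the tags out of the text is a separate step.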
  #3  
13-Sep-2007, 06:16
jestges

Re: Read HTML File


Thank you for your quick reply. Can you give me some simple source code that does this?

Here is my HTML file, "test.html":

HTML Code:
<html>
<head>
<title>Test docment</title>
</head>
<body>
<div style="width: 200px; height: 200px; border: 1px solid red;">
</div>
</body>
</html>

I want to read all the tags using a C program.

Thank you
Last edited by admin : 13-Sep-2007 at 11:30. Reason: Please insert your markup between [html] & [/html] tags
  #4  
15-Sep-2007, 09:51
shalombi

Re: Read HTML File


Well, with HTML it is rather simple, since tags are clearly delimited from the rest of the text by < and >. Depending on what you want to do, you can count the pairs to find out how many tags are present, or build a list of which tags appear, and so on.

Open the file and process it character by character, then compare and sort as needed.
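
The pair-counting idea could look roughly like the sketch below. It assumes the file contents are already in memory as a NUL-terminated string, and the function name count_tags is made up for this example. It treats every '<' followed later by a '>' as one tag, so opening and closing tags are counted separately, and a '<' or '>' inside a comment or a quoted attribute value would throw the count off.

```c
#include <stddef.h>

/* Count '<' ... '>' pairs in a NUL-terminated buffer.
 * A '>' with no preceding unmatched '<' is ignored. */
size_t count_tags(const char *text)
{
    size_t count = 0;
    int in_tag = 0;   /* non-zero between '<' and the matching '>' */
    const char *p;

    for (p = text; *p != '\0'; p++) {
        if (*p == '<') {
            in_tag = 1;
        } else if (*p == '>' && in_tag) {
            count++;
            in_tag = 0;
        }
    }
    return count;
}
```

For the test.html posted above this returns 10: five opening tags (html, head, title, body, div) and their five closing tags.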

Max
  #5  
16-Sep-2007, 20:15
davis

Re: Read HTML File


Quote:
Originally Posted by jestges
Thank you for your quick reply. Can you give me some simple source code that does this?

Here is my HTML file, "test.html":

HTML Code:
<html>
<head>
<title>Test docment</title>
</head>
<body>
<div style="width: 200px; height: 200px; border: 1px solid red;">
</div>
</body>
</html>

I want to read all the tags using a C program.

Thank you

If you read through the entire file, you will have read all of the tags in the file with the C program!

What matters is WHAT you want to do with the tags while or after reading them. Let's say that you want to print out all of the tags:

CPP / C++ / C Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

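/* Read the whole file into a newly allocated, NUL-terminated buffer.
   On success *p_data points to the buffer (the caller frees it) and
   *bytes_read holds the number of bytes read; on failure *p_data is 0. */
void read_html_file( const char* filename, char** p_data, int* bytes_read )
{
    long len = 0;
    size_t n = 0;
    /* Binary mode, so the size from ftell matches what fread returns. */
    FILE* p_infile = fopen( filename, "rb" );

    /* Release any buffer left over from a previous call. */
    free( *p_data );
    *p_data = 0;
    *bytes_read = 0;

    if( p_infile )
    {
        /* Find the file size by seeking to the end. */
        if( fseek( p_infile, 0L, SEEK_END ) == 0
            && ( len = ftell( p_infile ) ) >= 0 )
        {
            rewind( p_infile );
            *p_data = (char*)malloc( (size_t)len + 1 );
            if( *p_data )
            {
                n = fread( *p_data, 1, (size_t)len, p_infile );
                (*p_data)[n] = 0;   /* terminate after the bytes actually read */
                *bytes_read = (int)n;
            }
        }
        fclose( p_infile );
    }
}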

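/* Print every tag in the buffer, one per line. Text outside the
   '<' ... '>' delimiters is skipped; newlines and tabs are dropped. */
void print_tags( const char* p_data )
{
    const char* p;
    int in_tag = 0;   /* non-zero while we are between '<' and '>' */
    for( p = p_data; *p != 0; p++ )
    {
        switch( *p )
        {
            case '<':
                in_tag = 1;
                putchar( '<' );
                break;
            case '\n':
            case '\r':
            case '\t':
                break;   /* ignore line breaks and tabs */
            case '>':
                putchar( '>' );
                putchar( '\n' );
                in_tag = 0;
                break;
            default:
                if( in_tag )
                {
                    putchar( *p );
                }
                break;
        }
    }
}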

void usage( void )
{
    printf( "Usage: read_html <filename>\n\n" );
}

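int main( int argc, char* argv[] )
{
    char* p_data = 0;
    int   bytes_read = 0;

    if( argc < 2 )
    {
        usage();
        return EXIT_FAILURE;
    }

    read_html_file( argv[1], &p_data, &bytes_read );
    if( p_data == 0 )
    {
        /* The file could not be opened or the buffer could not be allocated. */
        fprintf( stderr, "read_html: cannot read %s\n", argv[1] );
        return EXIT_FAILURE;
    }
    if( bytes_read > 0 )
    {
        print_tags( p_data );
    }
    free( p_data );
    return 0;
}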

Output:

Code:
$ ./read_html
Usage: read_html <filename>

$ ./read_html /tmp/test.html
<html>
<head>
<title>
</title>
</head>
<body>
<div style="width: 200px; height: 200px; border: 1px solid red;">
</div>
</body>
</html>

...might get you started.
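
One possible next step, if a list of which tags appear is wanted rather than the raw markup, is to keep only the tag names. The sketch below is an assumption about what that could look like, not part of the program above: the names opening_tag_name and print_tag_names are invented for this example, names are lower-cased and truncated to fit the buffer, and closing tags, comments and doctype declarations are skipped. It is not a real HTML parser, so a '<' inside a script or a quoted attribute value can still be mistaken for a tag.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Copy the name of the tag starting at `lt` (which must point at '<')
 * into `name`, lower-cased. Returns 1 for an opening tag, 0 for
 * anything else (closing tags, comments, doctype, a bare '<'). */
int opening_tag_name(const char *lt, char *name, size_t size)
{
    size_t n = 0;
    const char *p = lt + 1;

    if (size == 0 || !isalpha((unsigned char)*p))
        return 0;
    /* Tag names are letters and digits, e.g. "div" or "h1". */
    while (isalnum((unsigned char)*p) && n + 1 < size)
        name[n++] = (char)tolower((unsigned char)*p++);
    name[n] = '\0';
    return 1;
}

/* Print the name of every opening tag in `html`, one per line.
 * Returns the number of names printed. */
size_t print_tag_names(const char *html, FILE *out)
{
    char name[32];   /* longer names are truncated in this sketch */
    size_t count = 0;
    const char *p;

    for (p = html; *p != '\0'; p++) {
        if (*p == '<' && opening_tag_name(p, name, sizeof name)) {
            fprintf(out, "%s\n", name);
            count++;
        }
    }
    return count;
}
```

Calling print_tag_names( p_data, stdout ) in place of print_tags( p_data ) would, for the test.html above, print html, head, title, body and div on separate lines. Sorting or de-duplicating that list is left out here.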


:davis:
 
 
