html parser

  • Gustavo G. Rondina

    html parser

    Hi all

    I am using libcurl to grab an html file from a remote http site. How
    can I parse this file in order to produce a "formatted" output? Is
    there any lib around that performs this action?

    Thanks
    --
    Gustavo G. Rondina

  • Cedric LEMAIRE

    #2
    Re: html parser

    gustgr@brlivre.org (Gustavo G. Rondina) wrote in message news:<87y8k9bab1.fsf@fingolfin.arda.org>...
    > I am using libcurl to grab an html file from a remote http site. How
    > can I parse this file in order to produce a "formatted" output? Is
    > there any lib around that performs this action?

    You can use CodeWorker, a universal parsing tool and a versatile
    source code generator, freeware available at
    "http://www.codeworker.org".

    You describe how to parse the HTML page in an extended-BNF script
    that extracts only the data you are interested in. Then you write a
    template-based script that generates the output file (plain text in
    your case) from the extracted data.

    The approach is highly declarative and well suited to extracting
    data from HTML pages.
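
    For comparison, here is a rough idea of what a hand-written extraction
    step might look like in plain C once libcurl has put the page into a
    NUL-terminated buffer. This is only a sketch and not CodeWorker:
    strip_tags is a hypothetical helper name, it uses no library beyond
    the C standard one, and it assumes well-formed tags with no '<' or '>'
    inside attribute values, comments, or scripts. It does not decode
    entities such as &amp;.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of libcurl or CodeWorker).
 * Copies the text found outside <...> tags from `html` into `out`,
 * collapsing every run of whitespace into a single space.
 * Each tag is treated as a word boundary.
 * Returns the number of characters written, excluding the final NUL.
 * Output is silently truncated if `out` is too small. */
size_t strip_tags(const char *html, char *out, size_t out_size)
{
    size_t n = 0;
    int in_tag = 0;        /* nonzero while between '<' and '>' */
    int pending_space = 0; /* a separator is owed before the next char */

    if (out_size == 0)
        return 0;

    for (const char *p = html; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;

        if (in_tag) {
            if (c == '>')
                in_tag = 0;
            continue;
        }
        if (c == '<') {
            in_tag = 1;
            pending_space = 1;
            continue;
        }
        if (isspace(c)) {
            pending_space = 1;
            continue;
        }
        /* Emit one separating space, but never a leading one. */
        if (pending_space && n > 0) {
            if (n + 1 >= out_size)
                break;
            out[n++] = ' ';
        }
        pending_space = 0;
        if (n + 1 >= out_size)
            break;
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

    Calling strip_tags on "<h1>Hello</h1> <p>big   world</p>" with a large
    enough buffer leaves "Hello big world" in it. Because every tag counts
    as a word boundary, inline markup in the middle of a word gets split,
    which is one of the reasons a grammar-driven tool tends to be the
    better choice for real pages.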

    If you don't want to call the CodeWorker interpreter as an external
    tool, it is also available as a C++ library. I don't know how easy it
    is to link a C++ library into a C program, though.
