liburl cant load webpage with Javascript

  • Uwe Mayer

    liburl cant load webpage with Javascript

    Hi,

    I want to use liburl to scan a webpage which is only accessible from within
    my LAN environment. While mozilla manages to load the target URL properly,
    neither wget nor liburl nor liburl2 does.
    I had a closer look at the html source and discovered a lot of Javascript,
    including Cookies.

    My suspicion is that the Javascript code needs to be executed for the page
    to work properly. Also I don't know how liburl deals with Cookies, but
    since they are handled by the Javascript in the source code they are
    probably not considered at all.

    In any case I get an IOError: connection refused, Error Code 111.

    Does anyone know a way out of this?

    Thanks for any hints,
    Ciao
    Uwe
  • Lorenzo Gatti

    #2
    Re: liburl cant load webpage with Javascript

    Uwe Mayer <merkosh@hadiko.de> wrote in message news:<c80oqk$t7$1@news2.rz.uni-karlsruhe.de>...
    > Hi,
    >
    > I want to use liburl to scan a webpage which is only accessible from within
    > my LAN environment. While mozilla manages to load the target URL properly,
    > neither wget nor liburl nor liburl2 does.
    > I had a closer look at the html source and discovered a lot of Javascript,
    > including Cookies.
    >
    > My suspicion is that the Javascript code needs to be executed for the page
    > to work properly. Also I don't know how liburl deals with Cookies, but
    > since they are handled by the Javascript in the source code they are
    > probably not considered at all.
    >
    > In any case I get an IOError: connection refused, Error Code 111.
    >
    > Does anyone know a way out of this?
    >
    > Thanks for any hints,
    > Ciao
    > Uwe

    Mozilla is a web browser, and it implements cookies, DOM for HTML
    pages, and a Javascript interpreter with objects representing browser
    automation.
    It's unlikely and inappropriate for low level HTTP implementations
    like wget and liburl to have that kind of support for advanced web
    features; maybe you can support cookies and Javascript in your
    application.
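Handling cookies in the application itself can be sketched as follows. This assumes that "liburl" refers to Python's urllib family and uses the modern standard library modules `urllib.request` and `http.cookiejar`; the helper name `make_cookie_opener` is made up for illustration. It only covers cookies sent in HTTP response headers. Cookies that the page sets from Javascript are not seen, because nothing here executes scripts.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Build an opener that stores cookies from responses and sends them back later."""
    # The jar keeps cookies in memory for the lifetime of the process.
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor reads Set-Cookie headers into the jar and adds
    # matching Cookie headers to subsequent requests made with this opener.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

A possible use would be `opener, jar = make_cookie_opener()` followed by `opener.open(url)` for each page of the site, where `url` is whatever address the browser loads; after the first response, iterating over `jar` shows which cookies the server set.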

    In the specific case of "IOError: connection refused, Error Code 111",
    however, the failure seems to happen at a lower protocol level: wrong
    host names or port numbers, unavailable servers and maybe proxy
    authentication requirements are the usual causes of refused
    connections.
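One way to confirm that the failure sits below HTTP is to attempt a bare TCP connection to the same host and port, with no HTTP, cookies or Javascript involved. The sketch below is only an illustration in Python; the function name `probe` and the returned strings are invented here. On Linux, error code 111 is ECONNREFUSED, which is what the "refused" branch corresponds to.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try a plain TCP connect and report the outcome as a short string."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout (filtered port, host down, ...).
        return "timeout"
    except OSError as exc:
        # Anything else, for example a host name that does not resolve.
        return "error: %s" % exc
    sock.close()
    return "open"
```

If `probe` reports "refused" for the host and port taken from the URL, the page content is not the problem; it is worth checking whether the browser reaches the site through a proxy or on a different port.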

    Lorenzo Gatti


    • John J. Lee

      #3
      Re: liburl cant load webpage with Javascript

      gatti@dsdata.it (Lorenzo Gatti) writes:
      > Uwe Mayer <merkosh@hadiko.de> wrote in message news:<c80oqk$t7$1@news2.rz.uni-karlsruhe.de>...
      [...]
      > > I had a closer look at the html source and discovered a lot of Javascript,
      > > including Cookies.
      [...]
      > Mozilla is a web browser, and it implements cookies, DOM for HTML
      > pages, and a Javascript interpreter with objects representing browser
      > automation.
      > It's unlikely and inappropriate for low level HTTP implementations
      > like wget and liburl to have that kind of support for advanced web
      [...]

      JavaScript support is rare, but many libraries and tools support
      cookies (including wget and my library, ClientCookie -- essentially a
      drop-in replacement for urllib2). For JS, see my FAQ here (under
      "Embedded script is messing up my web-scraping. What do I do?"):



      > In the specific case of "IOError: connection refused, Error Code 111",
      > however, the failure seems to happen at a lower protocol level: wrong
      [...]

      Right.


      John
