urllib2.HTTPError: HTTP Error 204: NoContent

  • silk.odyssey

    urllib2.HTTPError: HTTP Error 204: NoContent

    I am getting the following error trying to download an html page using
    urllib2.

    urllib2.HTTPError: HTTP Error 204: NoContent

    The url is of this type:



    I can open it in my browser without problems. Any ideas on a solution?
  • Philip Semanchuk

    #2
    Re: urllib2.HTTPError: HTTP Error 204: NoContent


    On Oct 19, 2008, at 6:13 AM, silk.odyssey wrote:
    > I am getting the following error trying to download an html page using
    > urllib2.
    >
    > urllib2.HTTPError: HTTP Error 204: NoContent
    >
    > The url is of this type:
    >
    > I can open it in my browser without problems. Any ideas on a solution?

    Are you changing the user-agent? Some sites sniff user agents and
    return different results to browsers than to suspected bots.

    I'd try it from here if you post a self-contained sample that
    demonstrates the problem. Should only take a couple of lines.




    • Mark Sapiro

      #3
      Re: urllib2.HTTPError: HTTP Error 204: NoContent

      On Oct 19, 9:49 am, Philip Semanchuk <phi...@semanchuk.com> wrote:
      > On Oct 19, 2008, at 6:13 AM, silk.odyssey wrote:
      >> I am getting the following error trying to download an html page using
      >> urllib2.
      >>
      >> urllib2.HTTPError: HTTP Error 204: NoContent
      >>
      >> The url is of this type:
      >>
      >> I can open it in my browser without problems. Any ideas on a solution?
      >
      > Are you changing the user-agent? Some sites sniff user agents and
      > return different results to browsers than to suspected bots.

      I tried it.
      >>> import urllib2
      >>> url = 'http://www.amazon.com/gp/offer-listing/B000KJX3A0%3FSubscriptionId%3D183VXJS74KNQ89D0NRR2%26tag%3Dws%26linkCode%3Dxm2%26camp%3D2025%26creative%3D386001%26creativeASIN%3DB000KJX3A0'
      >>> op = urllib2.urlopen(url)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/usr/lib/python2.5/urllib2.py", line 121, in urlopen
          return _opener.open(url, data)
        File "/usr/lib/python2.5/urllib2.py", line 380, in open
          response = meth(req, response)
        File "/usr/lib/python2.5/urllib2.py", line 491, in http_response
          'http', request, response, code, msg, hdrs)
        File "/usr/lib/python2.5/urllib2.py", line 418, in error
          return self._call_chain(*args)
        File "/usr/lib/python2.5/urllib2.py", line 353, in _call_chain
          result = func(*args)
        File "/usr/lib/python2.5/urllib2.py", line 499, in http_error_default
          raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
      urllib2.HTTPError: HTTP Error 204: NoContent
      >>> headers = {}
      >>> headers['User-Agent'] = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3'
      >>> ro = urllib2.Request(url, None, headers)
      >>> op = urllib2.urlopen(ro)
      >>> page = op.read()
      >>> page
      (lots of HTML)
      (lots of HTML)

      So the answer is as Philip suggests - amazon.com doesn't like
      'Python-urllib/2.5' as a User-Agent. You have to give it something that
      looks like a browser.
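
For anyone reading this on a newer Python, here is a rough sketch of the same workaround using urllib.request, the Python 3 successor to urllib2. The helper names build_request and fetch, the BROWSER_UA constant, and the 10-second timeout are made up for illustration and are not from the session above. As far as I know, Python 3 hands back 2xx responses such as 204 as normal responses instead of raising HTTPError, so the error handling below only matters for non-2xx codes.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent string taken from the session above.
BROWSER_UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) "
              "Gecko/2008092417 Firefox/3.0.3")


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that sends a browser-like User-Agent header.

    Hypothetical helper: it replaces the default 'Python-urllib/x.y' value.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=10):
    """Fetch url and return (status_code, body_bytes).

    HTTP error responses are returned rather than raised, so the caller
    can inspect the status code the server actually sent.
    """
    req = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()
```

Calling fetch(url) with the URL from the session should then send the Firefox string instead of the default one. Whether a given site accepts it today is a separate question, and its terms of use may restrict automated access.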

      --
      (for email use this address please - you can figure it out)

      Mark Sapiro mark at msapiro net
      San Francisco Bay Area, California

      Any clod can have the facts; having opinions is an art. -
      C. McCabe, The Fearless Spectator
