Re: urllib accept-language doesn't have any effect

  • Martin Bachwerk

    Re: urllib accept-language doesn't have any effect

    Hey Philip,

    thanks for the snippet, but I have tried that code already. It does
    indeed give me a Swedish version.. of www.google.de :) That's the beauty
    of Google: they have all languages available for all domains.

    However if I try it with www.gizmodo.com (a tech blog in several
    languages) I still get the German version.

    Both sites obviously redirect the client to the country-based version
    according to the IP first, and Google presents that page in the desired
    language AFTER that.. most other multihost sites won't have a Swedish
    version of the .de site, so this doesn't quite help :(
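
One way to check the redirect described above is to stop the client from following it and look at the first response directly. The following is only a sketch in Python 3, whose urllib.request is the successor of urllib2; NoRedirect and first_hop are illustrative names, not anything used in this thread.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow 3xx responses."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow the redirect, so the
        # 3xx response surfaces as an HTTPError that can be inspected.
        return None


def first_hop(url, language="sv"):
    """Return (status, Location header) of the first response for url."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, headers={"Accept-Language": language})
    try:
        with opener.open(request, timeout=10) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # With redirects disabled, a 302 ends up here.
        return err.code, err.headers.get("Location")
```

Calling first_hop("http://www.gizmodo.com/") from an affected network should show whether the server answers with a 3xx status and where it points, before any language negotiation could take place on the target site.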

    Thanks anyway,

    Martin
    >
    On Oct 16, 2008, at 6:50 AM, Martin Bachwerk wrote:
    >
    >Hmm, thanks for the ideas,
    >>
    >I've checked the requests in Firefox one more time after deleting all
    >the cookies and both google.com and gizmodo.com do indeed forward me
    >to the German site without caring about the browser settings.
    >>
    >wget shows me that the server does a 302 redirect straight away.. soo..
    >
    I'm not sure what you mean by this. In my experiment with wget, Google
    respects the Accept-Language header. In other words, this returns a
    Swedish page even though I'm executing it from a U.S. IP address:
    >
    wget "--header=Accept-Language: sv" http://www.google.com/
    >
    >
    I see the same behavior from urllib2, although my code is slightly
    different from yours. Here's my code. If I use "sv" in the header I
    get Swedish, "pl" gives me Polish, etc. I get the same result when I
    add your Mozilla user-agent string.
    >
    ----------------------------------------
    import urllib2

    headers = {"Accept-Language": "sv"}

    req = urllib2.Request("http://www.google.com/", None, headers)
    f = urllib2.urlopen(req)
    content = f.read()
    f.close()

    print content
    ----------------------------------------
    >
    >
    Do you get different results with this same code in Germany?
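
For readers on Python 3, where urllib2 no longer exists, roughly the same experiment could look like the sketch below. It assumes urllib.request as the replacement module; build_request and fetch are hypothetical helper names, and the charset fallback to UTF-8 is a guess rather than something the thread specifies.

```python
import urllib.request


def build_request(url, language):
    """Build a GET request that advertises a single preferred language."""
    return urllib.request.Request(url, headers={"Accept-Language": language})


def fetch(url, language, timeout=10):
    """Fetch url and return the response body decoded as text."""
    request = build_request(url, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

fetch("http://www.google.com/", "sv") would then be the counterpart of the urllib2 code above; whether the page comes back in Swedish still depends on the server and on any redirect it issues first.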
    >
    Cheers
    Philip
    >
    >
    >
    >>
    >>>
    >>On Oct 15, 2008, at 9:50 AM, Martin Bachwerk wrote:
    >>>
    >>>Hello,
    >>>>
    >>>I'm trying to load a couple of pages using the urllib2 module. The
    >>>problem is that I live in Germany and some sites seem to look at
    >>>the IP of the client and forward it to a localized page. Here's
    >>>an example of the code: I want to access the main English page of
    >>>google.com, but I get the German one instead. (Those of you who live
    >>>in the US will probably get correct results; try emulating with 'fr'
    >>>in the accepted languages or something.)
    >>>>
    >>>opener = urllib2.build_opener()
    >>>opener.addheaders = [('Host', 'www.google.com'),
    >>>    ('Accept-Language', 'en-gb,en;q=0.5'),
    >>>    ('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; '
    >>>     'rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1')]
    >>>webfile = opener.open(url)
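
For reference, the Accept-Language value in the snippet above carries quality weights: en-gb has an implicit weight of 1 and en has 0.5. A rough sketch of how a server could rank such a header follows, in Python 3. parse_accept_language is a hypothetical helper, not part of urllib, and a real server is free to rank differently or to ignore the header altogether.

```python
def parse_accept_language(header):
    """Return (language, quality) pairs sorted by descending quality."""
    ranked = []
    for item in header.split(","):
        parts = item.strip().split(";")
        language = parts[0].strip()
        if not language:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as not acceptable
        ranked.append((language, quality))
    # sorted() is stable, so entries with equal weight keep header order.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

A site that redirects by IP address before looking at this ranking would produce exactly the behavior described in the question, regardless of how the header is written.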
    >>>
    >>Martin,
    >>It looks to me like what you're sending is correct. Debugging
    >>suggestions --
    >>>
    >>- Set up a Web server on 127.0.0.1 and see what that server receives
    >>when your Python code connects to it. Maybe you're not sending quite
    >>what you think.
    >>- Try emulating your Python code with wget or a similar command line
    >>tool that lets you set headers.
    >>- Sniff the conversation you're having with google using Wireshark.
    >>Maybe you're getting redirected by the remote server.
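
The first suggestion above can be tried without installing a separate web server. The following is a minimal Python 3 sketch using only the standard library; HeaderRecorder and headers_seen_by_server are illustrative names, and urllib.request stands in for the urllib2 module used in this thread.

```python
import http.server
import threading
import urllib.request


class HeaderRecorder(http.server.BaseHTTPRequestHandler):
    """Handler that stores the headers of every GET request it receives."""

    received = []  # shared across requests; fine for a debugging aid

    def do_GET(self):
        # Lower-case the names so lookups do not depend on client casing.
        seen = {name.lower(): value for name, value in self.headers.items()}
        HeaderRecorder.received.append(seen)
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def headers_seen_by_server(client_headers):
    """Send one request to a throwaway local server; return what it saw."""
    server = http.server.HTTPServer(("127.0.0.1", 0), HeaderRecorder)
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        request = urllib.request.Request(url, headers=client_headers)
        # An empty ProxyHandler keeps the local request away from any proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            response.read()
    finally:
        server.shutdown()
        server.server_close()
    return HeaderRecorder.received[-1]
```

Passing the same headers as in the original question, for example headers_seen_by_server({"Accept-Language": "en-gb,en;q=0.5"}), returns a dictionary of what actually went over the wire, including headers the library adds on its own.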
    >>>
    >>Good luck
    >>Philip
    >>>
    >>
    >
    >
  • Lawrence D'Oliveiro

    #2
    Re: urllib accept-language doesn't have any effect

    In message <mailman.2553.1224166073.3487.python-list@python.org>, Martin
    Bachwerk wrote:
    > It does indeed give me a Swedish version.. of www.google.de :) That's the
    > beauty about Google that they have all languages for all domains
    > available.
    >
    > However if I try it with www.gizmodo.com (a tech blog in several
    > languages) I still get the German version.
    Sounds like a bug in the gizmodo.com site.
