Hey Philip,
thanks for the snippet, but I have tried that code already. It does
indeed give me a Swedish version... of www.google.de :) That's the beauty
of Google: they have all languages available for all domains.
However, if I try it with www.gizmodo.com (a tech blog in several
languages) I still get the German version.
Both sites obviously redirect the client to the country-based version
according to the IP first, and Google presents that page in the desired
language only AFTER that. Most other multihost sites won't have a Swedish
version of the .de site, so this doesn't quite help :(
Thanks anyway,
Martin
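
For anyone reading along: one way to look at the redirect described above from Python, rather than from wget, is to stop the library from following it. The sketch below uses Python 3's urllib.request (the successor of urllib2). The names NoRedirect and first_hop are made up for this illustration, and it only shows how to inspect the first response; it does not make a site ignore the client's IP.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx response itself stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to build a follow-up request;
        # the 3xx response is then raised as an HTTPError.
        return None


def first_hop(url, language):
    """Return (status, Location header) of the first response only."""
    opener = urllib.request.build_opener(NoRedirect)
    req = urllib.request.Request(url, headers={"Accept-Language": language})
    try:
        with opener.open(req, timeout=10) as resp:
            # No redirect happened: Location is normally absent (None).
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Redirects (and other error statuses) end up here.
        return err.code, err.headers.get("Location")
```

If a site answers with 302 and a Location pointing at a country-specific host even though Accept-Language asks for something else, that would be consistent with the server choosing the country before it looks at the language header.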
>
On Oct 16, 2008, at 6:50 AM, Martin Bachwerk wrote:
>
>Hmm, thanks for the ideas,
>>
>I've checked the requests in Firefox one more time after deleting all
>the cookies and both google.com and gizmodo.com do indeed forward me
>to the German site without caring about the browser settings.
>>
>wget shows me that the server does a 302 redirect straight away.. soo..
>>
I'm not sure what you mean by this. In my experiment with wget, Google
respects the Accept-Language header. In other words, this returns a
Swedish page even though I'm executing it from a U.S. IP address:
>
wget "--header=Accept-Language: sv" http://www.google.com/
>
>
I see the same behavior from urllib2, although my code is slightly
different from yours. Here's my code. If I use "sv" in the header I
get Swedish, "pl" gives me Polish, etc. I get the same result when I
add your Mozilla user-agent string.
>
----------------------------------------
import urllib2

# Ask the server for Swedish; use "pl" for Polish, etc.
headers = {"Accept-Language": "sv"}

# Passing None as the data argument makes this a GET request.
req = urllib2.Request("http://www.google.com/", None, headers)
f = urllib2.urlopen(req)
content = f.read()
f.close()

print content
----------------------------------------
>
>
Do you get different results with this same code in Germany?
>
Cheers
Philip
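
For readers on Python 3, where urllib2 no longer exists, roughly the same request can be made with urllib.request. This is only a sketch: build_request and fetch are names invented here, not part of any library. fetch also returns the final URL, which might help answer the question above, because it differs from the requested URL when the server redirected the client.

```python
import urllib.request


def build_request(url, language):
    """Build a GET request that asks for `language` via Accept-Language."""
    return urllib.request.Request(url, headers={"Accept-Language": language})


def fetch(url, language, timeout=10):
    """Fetch `url` and return (final_url, body_bytes).

    final_url is the URL of the response actually received; it is not
    the same as `url` if one or more redirects were followed.
    """
    req = build_request(url, language)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.geturl(), resp.read()
```

Calling fetch("http://www.google.com/", "sv") and comparing the first element of the result with the URL that was requested should show whether a redirect took place.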
>
>
>
>>
>>
>>>
>>On Oct 15, 2008, at 9:50 AM, Martin Bachwerk wrote:
>>>
>>>Hello,
>>>>
>>>I'm trying to load a couple of pages using the urllib2 module. The
>>>problem is that I live in Germany and some sites seem to look at
>>>the IP of the client and forward it to a localized page. Here's
>>>an example of the code I use to access the main English page of
>>>google.com; I get the German page instead. (Those of you who live in
>>>the US will probably get correct results; try emulating with 'fr'
>>>in the accepted languages or something.)
>>>>
>>>opener = urllib2.build_opener()
>>>opener.addheaders = [('Host', 'www.google.com'),
>>>    ('Accept-Language', 'en-gb,en;q=0.5'),
>>>    ('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; '
>>>        'rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1')]
>>>webfile = opener.open(url)
>>>
>>Martin,
>>It looks to me like what you're sending is correct. Debugging
>>suggestions --
>>>
>>- Set up a Web server on 127.0.0.1 and see what that server receives
>>when your Python code connects to it. Maybe you're not sending quite
>>what you think.
>>- Try emulating your Python code with wget or a similar command line
>>tool that lets you set headers.
>>- Sniff the conversation you're having with Google using Wireshark.
>>Maybe you're getting redirected by the remote server.
>>>
>>Good luck
>>Philip
>>>
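
The first debugging suggestion quoted above (a web server on 127.0.0.1 that shows what the Python code really sends) can be sketched with the standard library alone. This is a Python 3 illustration, not code from the thread; HeaderRecorder and headers_seen_by_server are hypothetical names, and the server is a throwaway that lives only for one request.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderRecorder(BaseHTTPRequestHandler):
    """Answer every GET with 200 and remember the request headers."""

    received = []  # one dict per request, shared by all handler instances

    def do_GET(self):
        # Lower-case the names so lookups do not depend on capitalization.
        seen = {name.lower(): value for name, value in self.headers.items()}
        HeaderRecorder.received.append(seen)
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def headers_seen_by_server(request_headers):
    """Send one GET to a local server and return the headers it received."""
    server = HTTPServer(("127.0.0.1", 0), HeaderRecorder)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        req = urllib.request.Request(url, headers=request_headers)
        urllib.request.urlopen(req, timeout=5).close()
    finally:
        server.shutdown()
        server.server_close()
    return HeaderRecorder.received[-1]
```

Passing in the same headers as the real request, for example {"Accept-Language": "en-gb,en;q=0.5"}, and printing the returned dict shows what a server would receive, including the headers the library adds on its own such as Host and User-Agent.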