thanks Stefan,
both lxml and threading work perfectly.
One small problem, "with_tail" was not recognized as a valid keyword.
cheers,
Stef
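
A note on the "with_tail" problem: if the installed lxml version does not accept that keyword, the text of an element can be collected without its tail by joining itertext(). The sketch below uses the standard library's xml.etree.ElementTree, whose element API lxml mirrors, so it runs without lxml installed. The helper name text_without_tail and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET


def text_without_tail(element):
    # itertext() walks the element's own text and the text/tail of its
    # descendants, but never the tail that follows the element itself.
    return "".join(element.itertext())


# Hypothetical sample markup: the div is followed by tail text.
root = ET.fromstring(
    '<body><div style="padding:0.6em;">Hallo <b>wereld</b></div>'
    ' trailing text</body>'
)
div = root.find("div")

print(text_without_tail(div))
print(repr(div.tail))
```

With lxml.html elements, text_content() is another way to get the text of an element and its children.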
Stefan Behnel wrote:
Stef Mientki <stef.mientki <at> gmail.com> writes:
>
>
You should give lxml.html a try.
>
>
It can parse directly from HTTP URLs (no need to go through urlopen), and it
frees the GIL while parsing, so it will become efficient to create a little
Thread that doesn't do more than parsing the web site, as in (untested):
>
def read_bablefish(text, lang, result):
    url = BABLEFISH_URL + '?' + urlencode({'trtext': text, 'lp': lang})
    page = lxml.html.parse(url)
    for div in page.iter('div'):
        style = div.get('style')
        if style is not None and 'padding:0.6em;' in style:
            result.append(
                lxml.html.tostring(div, method="text", with_tail=False))
>
result = []
thread = threading.Thread(target=read_bablefish,
                          args=("...", "en_nl", result))
thread.start()
while thread.isAlive():
    # ... do other stuff
if result:
    print result[0]
>
Stefan
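
For anyone trying this on a current Python 3, here is a minimal runnable sketch of the same worker-thread pattern. It is not the code from the message above: fetch_translation is a hypothetical stand-in that only sleeps and returns a string, so no network or lxml is needed, and read_translation is an invented name. Note that isAlive() is spelled is_alive() in Python 3 (the old spelling was removed in 3.9).

```python
import threading
import time


def fetch_translation(text, lang):
    # Hypothetical stand-in for the real HTTP request and HTML parsing.
    time.sleep(0.05)
    return "[%s] %s" % (lang, text)


def read_translation(text, lang, result):
    # Runs in the worker thread; hands the result back through the list.
    result.append(fetch_translation(text, lang))


result = []
thread = threading.Thread(target=read_translation,
                          args=("hello", "en_nl", result))
thread.start()

while thread.is_alive():
    # ... do other stuff here, then wait briefly instead of spinning
    thread.join(0.01)

if result:
    print(result[0])
```

Passing a list in and appending to it is a simple way to get a value back from the thread; a queue.Queue would do the same job if several results are expected.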
>
>
--
>
>
>Although it works functionally,
>it can take lots of time waiting for the translation.
>>
>What I basically do is, after selecting a new string to be translated:
>>
> kwds = { 'trtext' : line_to_be_translated, 'lp' :'en_nl'}
> soup = BeautifulSoup (urlopen(url, urlencode ( kwds ) ) )
> translation= soup.find ( 'div', style='padding: 0.6em;' ).string
> self.Editor_Babel.SetLabel ( translation )
>>