How do I extract data from an HTML file? - web scraping

  • masterinex
    New Member
    • Dec 2009
    • 1

    How do I extract data from an HTML file? - web scraping

    Hi guys, I'm a little unfamiliar with Python. I hope you can take a look at this:

    I'm trying to extract the number 7.2 from the HTML string below using Python:
    Code:
    '''<a href="/ratings_explained">weighted average</a> vote of <a href="/List?ratings=7">7.2</a> / 10</p><p>'''
    I thought this code would do it, but why doesn't it work?
    Code:
    averageget = re.compile('<a href="/List?ratings=7">(.*?)</a>')
    average = averageget.findall(htmlr)
    Could it be that there are some special structures in the HTML file that I missed?
    Last edited by bvdet; Dec 26 '09, 04:07 PM. Reason: Add code tags
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    Please use code tags when posting code.

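    The question mark (?) is a special character in regular expression patterns: it is a quantifier that makes the preceding item optional. In your pattern, "t?" therefore matches an optional "t" rather than a literal "?", so the pattern never matches the HTML. To match the literal character, precede it with a backslash.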

    Code:
    # Raw string (r'...') so the backslash reaches the regex engine unchanged.
    averageget = re.compile(r'<a href="/List\?ratings=7">(.*?)</a>')
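
For anyone who wants to check this, here is a minimal sketch that runs the unescaped and the escaped pattern against the HTML string from the question. The names `broken`, `fixed` and `auto` are made up for illustration, and the `re.escape()` variant is an optional alternative that was not part of the original answer.

```python
import re

# Sample HTML copied from the question.
htmlr = '''<a href="/ratings_explained">weighted average</a> vote of <a href="/List?ratings=7">7.2</a> / 10</p><p>'''

# Unescaped: "t?" means "an optional t", so the literal "?" in the HTML is never matched.
broken = re.compile('<a href="/List?ratings=7">(.*?)</a>')

# Escaped: "\?" in a raw string matches a literal question mark.
fixed = re.compile(r'<a href="/List\?ratings=7">(.*?)</a>')

# Alternative: let re.escape() escape the literal part of the pattern.
auto = re.compile(re.escape('<a href="/List?ratings=7">') + r'(.*?)</a>')

broken_result = broken.findall(htmlr)
fixed_result = fixed.findall(htmlr)
auto_result = auto.findall(htmlr)

print(broken_result)
print(fixed_result)
print(auto_result)
```

With the escaped pattern, findall() returns a list of strings, so the rating can be converted with float(fixed_result[0]) if a number is needed.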
