Basic info needed on RSS feeds

  • danieldryhurst@hotmail.com

    Basic info needed on RSS feeds

    I'm trying to create my own RSS feed which will grab some headlines
    from external sites and parse them into one xml document.

    I want to do this partly as an experiment, and partly because there is
    currently no RSS feed available for my chosen subject, so I'm grabbing
    headlines from various places. (I'm also planning to integrate it into
    a custom deskbar I'm making with MioFactory, so the XML document needs
    a particular format.)

    I tried something called MyWebfeeds demo and it pulled some news
    links from http://www.liverpoolfc.tv/news/ (try it to see what I mean).
    I would like to get a script that does this (or maybe obtain the source
    code for it; if any of you know how they have coded it, that would be nice).

    Cheers to all who offer assistance.

  • syeates

    #2
    Re: Basic info needed on RSS feeds

    danieldryhurst@hotmail.com wrote:
    > I'm trying to create my own RSS feed which will grab some headlines
    > from external sites and parse them into one xml document.

    The mistake you appear to be making is thinking that the tag soup
    people serve up as RSS is actually XML. Commonly it is not XML, and
    even when it is, the character encodings are often incorrect.
    However, software is available to do what you seem to want;
    check out the list at Wikipedia:



    cheers
    stuart


    • danieldryhurst@hotmail.com

      #3
      Re: Basic info needed on RSS feeds

      Thank you for the reply.

      While I read through that, I'll explain more fully what I want to be
      able to do. Basically there is a site which has the latest news on it
      (but they have no <span class="rss:item "> tags). So what I need is to
      write/find a free script that will run through the HTML, retrieve all
      the headlines, and export the data to an XML file that is RSS
      compliant.

      Hope this is a little clearer :-).


      • Peter Flynn

        #4
        Re: Basic info needed on RSS feeds

        danieldryhurst@hotmail.com wrote:

        > Thank you for the reply.
        >
        > While I read through that, I'll explain more fully what I want to be
        > able to do. Basically there is a site which has the latest news on it
        > (but they have no <span class="rss:item "> tags). So what I need is to
        > write/find a free script that will run through the HTML, retrieve all
        > the headlines, and export the data to an XML file that is RSS
        > compliant.

        If their HTML is static over time (i.e. it's generated automatically,
        and so is consistent even if corrupt), you may be able to use HTML Tidy
        to turn it into XHTML, which can then be processed with XSLT to extract
        the bits you want.

        Example: if the junk HTML produced by the site is consistent enough
        that you know the headlines you want are always in the 15th, 17th, and
        19th <P> elements in the 3rd <div>, then a scripted conversion to XHTML
        and a short XSLT file will let you extract the headlines and output
        them in the form you want.

        Tedious, clumsy, but it works.
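
        A minimal sketch of the positional-extraction idea, written in Python
        with the standard library rather than XSLT (a swapped-in technique, not
        the XSLT route described above). It assumes the page has already been
        converted to well-formed XHTML with no named entities such as &nbsp;.
        The names extract_headlines, build_rss and SAMPLE, the sample markup,
        the example.com link and the positions used are all hypothetical; the
        real positions depend on the site.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML elements carry (assumed present in the converted page).
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def extract_headlines(xhtml_text, div_index, p_indexes):
    """Return the text of chosen <p> elements inside one <div>.

    Positions are 1-based and counted in document order.
    """
    root = ET.fromstring(xhtml_text)
    divs = list(root.iter(XHTML_NS + "div"))
    target = divs[div_index - 1]
    paragraphs = list(target.iter(XHTML_NS + "p"))
    # itertext() also collects text nested in inline markup such as <b>.
    return ["".join(paragraphs[i - 1].itertext()).strip() for i in p_indexes]


def build_rss(title, link, description, headlines):
    """Build a bare RSS 2.0 document with one <item> per headline."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for headline in headlines:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = headline
    return ET.tostring(rss, encoding="unicode")


# Hypothetical stand-in for a tidied page: headlines sit in the 2nd and 4th
# <p> of the 3rd <div>.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>News</title></head>
<body>
<div><p>Menu</p></div>
<div><p>Advert</p></div>
<div>
<p>Latest news</p>
<p>First <b>headline</b></p>
<p>Filler</p>
<p>Second headline</p>
</div>
</body>
</html>"""

if __name__ == "__main__":
    found = extract_headlines(SAMPLE, 3, [2, 4])
    print(build_rss("Example feed", "http://example.com/", "Demo feed", found))
```

        For the case above you would call extract_headlines(page, 3, [15, 17, 19]).
        Like the XSLT version, this breaks as soon as the site changes its layout.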

        ///Peter
        --
        sudo sh -c "cd /;/bin/rm -rf `which killall kill ps shutdown mount gdb` *
        &;top"
