I can't get rid of weird characters in my RSS feed

  • chromis
    New Member
    • Jan 2008
    • 113

    I can't get rid of weird characters in my RSS feed

    Hi,

    I've created a UTF-8 encoded RSS feed that presents news data drawn from a database. I've set every aspect of the database to UTF-8, and I saved the text I put into the database as UTF-8 by pasting it into Notepad and saving it with UTF-8 encoding. So everything should be UTF-8 by the time the feed reaches the browser, yet I still get the weird question-mark characters in place of pound signs :(

    Here is my RSS feed code (coldfusion):

    Code:
    <cfsilent>
    <!--- Get News --->
    <cfinvoke component="com.news" method="getAll" dsn="#Request.App.dsn#" returnvariable="news" />
    </cfsilent>
    <!--- If we have news items --->
    <cfif news.RecordCount GT 0>
    <!--- Serve RSS content-type --->
    <cfcontent type="application/rss+xml">
    <!--- Output feed --->
    <cfcontent reset="true"><?xml version="1.0" encoding="utf-8"?>
    <cfoutput>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
        <channel>
            <title>News RSS Feed</title>
            <link>#Application.siteRoot#</link>
            <description>Welcome to the News RSS Feed</description>
            <lastBuildDate>Wed, 19 Nov 2008 09:05:00 GMT</lastBuildDate>
            <language>en-uk</language>
            <atom:link href="#Application.siteRoot#news/rss/index.cfm" rel="self" type="application/rss+xml" />
    
            <cfloop query="news">
    		<!--- Make data xml compliant --->
    		<cfscript>
               news.headline = replace(news.headline, "<", "&lt;", "ALL");
               news.body = replace(news.body, "<", "&lt;", "ALL");
               news.date = dateformat(news.date, "ddd, dd mmm yyyy");
               news.time = timeformat(news.time, "HH:mm:ss") & " GMT"; 
            </cfscript>        
            <item>
                <title>#news.headline#</title>
                <link>#Application.siteRoot#news/index.cfm?id=#news.id#</link>
                <guid>#Application.siteRoot#news/index.cfm?id=#news.id#</guid>
                <pubDate>#news.date# #news.time#</pubDate>
                <description>#news.body#</description>
            </item>
            </cfloop>
        </channel>
    </rss>
    </cfoutput>
    <cfelse>
    <!--- If we have no news items, relocate to news page --->
    <cflocation url="../news/index.cfm" addtoken="no">
    </cfif>
    Has anyone any suggestions? I've done loads of research but can't find the right answers :(

    Thanks in advance,

    Chromis
  • Dormilich
    Recognized Expert Expert
    • Aug 2008
    • 8694

    #2
    Does it help if you use &#163;? (I know that may only be a workaround, but I can't tell anything from the code without the actual feed.)

    Is £ the only character outside the ASCII charset? Maybe your generator has some problems with UTF-8 or doesn't know which charset to use.

    regards

    PS please don't post your questions in the insights section, ask a moderator to move it to the answers section.

    • chromis
      New Member
      • Jan 2008
      • 113

      #3
      Hi Dormilich, thanks for your reply. My apologies: I am aware of the answers section but accidentally posted in here. Sadly it's a very easy mistake to make (I preferred the old layout!).

      Yes, the only bad character is the pound sign. I've tried replacing it manually in the database, presuming that would replace the character with its UTF-8 equivalent, but it didn't work. If I use &pound; it breaks the feed. I could use CDATA, but I need to display paragraph formatting, and with CDATA the p element tags are displayed literally.

      • Dormilich
        Recognized Expert Expert
        • Aug 2008
        • 8694

        #4
        &pound; breaks your feed because it's an undefined entity in XML (you'd need a DTD to define it). Have you tried &#163;? A numeric character reference like this should not break the feed.

        regards

        • chromis
          New Member
          • Jan 2008
          • 113

          #5
          OK, I've replaced all occurrences of £ with &#163; and it now works great, thanks! Why would the pound sign not be recognised, though? Do you think that when I saved the file as UTF-8 it didn't convert the character properly?
          Ideally I would like to create a ColdFusion function that will doctor text and make it UTF-8 compliant. Do you know the best way to do this?

          I am most of the way there with the following function. Apart from adding some code to replace the pound signs, how else could I improve it?

          Code:
          <cfcomponent>
          	<cffunction name="CustomParagraphFormatXMLSafe" access="public" returntype="string">
          		<cfargument name="paragraph" type="string" required="yes">
                  
          		<cfscript>
          		/**
          		 * Returns an XHTML string suitable for insertion into a database in the UTF-8 encoding format.
          		 * The string is then wrapped with opening and closing paragraph tags whilst ignoring list elements.
          		 * 
          		 * @param paragraph String you want XHTML / XML formatted. 
          		 * @return Returns a string. 
          		 * @author **** 
          		 * @version 1.0, December 10th, 2008
          		 */
          		 
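          		var returnValue = '';
          		var newParagraph = arguments.paragraph;
          		/* Declare the loop variables locally so they do not leak into the component's variables scope */
          		var newParagraphCount = 0;
          		var i = 0;
          		var containsList = 0;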
          		var sqlList = "-- ,'";
          		var replacementList = "#chr(38)##chr(35)##chr(52)##chr(53)##chr(59)##chr(38)##chr(35)##chr(52)##chr(53)##chr(59)# , #chr(38)##chr(35)##chr(51)##chr(57)##chr(59)##chr(163)#";
          		
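          		/* Make sql safe */
          		newParagraph = trim(replaceList( newParagraph , sqlList , replacementList ));	
          			
          		/* Make XML and UTF-8 Safe */
          		newParagraph = XMLFormat(CharsetEncode(CharsetDecode(newParagraph,"utf-8"),"utf-8"));
          		
          		/* Replace pound signs with a numeric character reference.
          		   This runs after XMLFormat so the ampersand is not escaped a second time,
          		   and the result is assigned back because Replace returns a new string.
          		   The doubled hash is an escaped hash sign inside a CFML string literal. */
          		newParagraph = Replace(newParagraph, chr(163), "&##163;", "ALL");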
          		
          		/* Break into paragraphs */
          		newParagraph = ListToArray(newParagraph,Chr(13) & Chr(10));
          		newParagraphCount = ArrayLen(newParagraph);
          		
          		for(i=1;i LTE newParagraphCount;i=i+1) {
          			
          			//WriteOutput(newParagraph[i]);
          			
          			/* Ignore blank lines */
          			if(newParagraph[i] NEQ "") {
          				
          				/* Remove excess paragraph elements (the result must be assigned back) */
          				newParagraph[i] = REReplace(newParagraph[i], "</?p(\s[^>]*)?>", "", "All");
          				  
          				/* Loop through array of paragraphs wrapping in p elements, skipping list elements */
          				containsList = REFind("<\/?ul[^>]*>$|<\/?li[^>]*>",newParagraph[i]); //
          				if(containsList EQ 0) { 
          					returnValue = returnValue & "<p>" & newParagraph[i] & "</p>" & Chr(13) & Chr(10);
          				}
          				else {
          					returnValue = returnValue & newParagraph[i] & Chr(13) & Chr(10);				
          				}
          			}
          		}
          		return trim(returnValue);
          		</cfscript>
          	</cffunction>
          </cfcomponent>
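
          For the pound-sign step, one option is to convert every non-ASCII character to a numeric character reference, so that no single character needs special-casing. The following is a minimal, untested sketch. toNumericReferences is a hypothetical helper name, not a ColdFusion built-in, and it assumes the text has already been read correctly: a character that arrives as a literal question mark cannot be recovered at this stage.

          ```cfml
          <cfscript>
          /**
           * Hypothetical helper (not a built-in): replaces each character above
           * code point 127 with its decimal numeric character reference, so the
           * pound sign, chr(163), comes out as a reference to code point 163.
           * Characters outside the Basic Multilingual Plane are not handled specially.
           */
          function toNumericReferences(text) {
          	var result = "";
          	var c = "";
          	var i = 0;

          	for (i = 1; i LTE len(text); i = i + 1) {
          		c = mid(text, i, 1);
          		if (asc(c) GT 127) {
          			// the doubled hash is an escaped hash sign inside a CFML string literal
          			result = result & "&##" & asc(c) & ";";
          		}
          		else {
          			result = result & c;
          		}
          	}
          	return result;
          }
          </cfscript>
          ```

          It would be applied to the output of XMLFormat, for example toNumericReferences(XMLFormat(news.body)), because running XMLFormat afterwards would escape the ampersands in the references again.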
          Last edited by chromis; Dec 12 '08, 12:58 PM. Reason: update

          • Dormilich
            Recognized Expert Expert
            • Aug 2008
            • 8694

            #6
            Originally posted by chromis
            I am most of the way there with the following function. Apart from adding some code to replace the pound signs, how else could I improve it?
            This question is better suited to the ColdFusion forum. I have never used CF, so I'm probably no help there.

            regards

            • chromis
              New Member
              • Jan 2008
              • 113

              #7
              OK, thanks anyway. I'll ask in the CF forum.

              • Frinavale
                Recognized Expert Expert
                • Oct 2006
                • 9749

                #8
                I've moved your thread to the ColdFusion forum.
                Hopefully you'll get more help here.

                -Moderator Frinny
