Writing special characters out to Excel in HTML

This topic is closed.
  • Matthew Shaw

    Writing special characters out to Excel in HTML

    We have a web-based reporting application written in J2EE
    that writes out to Excel using
    response.setContentType("application/vnd.ms-excel; ")….

    The problem is that where we have any special characters
    in our report result set, e.g. umlauts and accents (ASCII
    values 128 to 165), this data is corrupted and does not
    appear correctly.

    The standard font family used throughout our Web Reports
    is Arial. I have seen this handled by using Verdana; however,
    we are reluctant to change the fonts in all our reports.

    I believe this relates to Excel performing a Unicode
    translation. Unfortunately, we require Excel functionality
    to enable users to perform operations on the finished
    reports.

    Is there a font family similar in appearance to Arial that
    will handle the Unicode character set?
    Or is there a mechanism to tell Excel not to perform this
    conversion?

    thanks.

  • Jon Skeet [C# MVP]

    #2
    Re: Writing special characters out to Excel in HTML

    Matthew Shaw <matthewhshaw@h otmail.com> wrote:
    > We have a web-based reporting application written in J2EE
    > that writes out to Excel using
    > response.setContentType("application/vnd.ms-excel; ")….
    >
    > The problem is that where we have any special characters
    > in our report result set, e.g. umlauts and accents (ASCII
    > values 128 to 165), this data is corrupted and does not
    > appear correctly.

    There *are* no ASCII values 128-165. ASCII is 7-bit.
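
To make the 7-bit point concrete, here is a minimal Java sketch. The class name `AsciiRange` and the helper `fits` are made up for illustration. It asks the JDK whether ü can be encoded in US-ASCII and in ISO-8859-1, then prints the byte value that ISO-8859-1 assigns to it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AsciiRange {
    // True if every character of the text can be represented in the charset.
    static boolean fits(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String umlaut = "\u00FC"; // u with diaeresis, U+00FC

        System.out.println(fits(umlaut, StandardCharsets.US_ASCII));   // false
        System.out.println(fits(umlaut, StandardCharsets.ISO_8859_1)); // true

        // In ISO-8859-1 the character is the single byte 0xFC.
        System.out.println(umlaut.getBytes(StandardCharsets.ISO_8859_1)[0] & 0xFF); // 252
    }
}
```

So a value such as 252 only means "ü" once a specific 8-bit encoding has been agreed on; ASCII itself says nothing about it.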

    Now, when you say the data is "corrupted", what exactly happens?
    Perhaps it's writing it out in UTF-8 or something similar? How are you
    writing out the data in the first place, exactly? (i.e. what file
    format, etc.)
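
As a rough illustration of the UTF-8 guess, the sketch below shows what happens when text is written as UTF-8 but read back as ISO-8859-1. The names `MojibakeDemo` and `utf8ReadAsLatin1` are hypothetical, and this is only one possible cause of the corruption, not a diagnosis of the application in question.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode with one charset, decode with another: a typical mismatch.
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00FC is two bytes in UTF-8 (0xC3 0xBC). Read as ISO-8859-1,
        // they turn into two separate characters, U+00C3 and U+00BC.
        String garbled = utf8ReadAsLatin1("\u00FC");
        System.out.println(garbled.length());                 // 2
        System.out.println(garbled.equals("\u00C3\u00BC"));   // true
    }
}
```

If the report shows two odd characters wherever one accented character was expected, a mismatch of this kind is worth checking first.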

    > The standard font family used throughout our Web Reports
    > is Arial. I have seen this handled by using Verdana; however,
    > we are reluctant to change the fonts in all our reports.
    >
    > I believe this relates to Excel performing a Unicode
    > translation. Unfortunately, we require Excel functionality
    > to enable users to perform operations on the finished
    > reports.
    >
    > Is there a font family similar in appearance to Arial that
    > will handle the Unicode character set?

    I'd be very surprised if it were the font which was at fault here -
    although I could certainly be wrong.

    > Or is there a mechanism to tell Excel not to perform this
    > conversion?

    How exactly are you exporting from Excel? Or are you only *importing*
    into Excel? If you can specify somewhere which character encoding to
    use, and make sure you use the same one everywhere, you should be okay.
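
One way to keep the encoding the same everywhere is to derive both the header value and the writer from a single constant. The sketch below uses only the JDK, with an in-memory stream standing in for the servlet response; `ConsistentEncoding`, `REPORT_CHARSET`, `contentType` and `body` are illustrative names, and UTF-8 is an assumed choice. In a real servlet, the charset passed to `setContentType` (or `setCharacterEncoding`) only affects `response.getWriter()` if it is set before the writer is obtained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsistentEncoding {
    // Single source of truth for the report encoding (assumed choice).
    static final Charset REPORT_CHARSET = StandardCharsets.UTF_8;

    // The header value announces the same charset the writer uses.
    static String contentType() {
        return "application/vnd.ms-excel; charset=" + REPORT_CHARSET.name();
    }

    // Stand-in for the response body: the text is encoded explicitly,
    // never with the platform default encoding.
    static byte[] body(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, REPORT_CHARSET)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(contentType());                // application/vnd.ms-excel; charset=UTF-8
        System.out.println(body("M\u00FCller").length);   // 7 (the umlaut takes two bytes in UTF-8)
    }
}
```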

    --
    Jon Skeet - <skeet@pobox.co m>

    If replying to the group, please do not mail me too


    • Matthew Shaw

      #3
      Re: Writing special characters out to Excel in HTML



      We are only importing into Excel. You can explicitly provide a character
      encoding...

      e.g. application/vnd.ms-excel;charset=ISO-8859-1

      which I believe is the default; others include charset=windows-1251.

      They do appear to be producing slightly different results; however, none
      of the ones I have tried can handle umlauts...

      thanks.

      *** Sent via Developersdex http://www.developersdex.com ***
      Don't just participate in USENET...get rewarded for it!


      • Jon Skeet [C# MVP]

        #4
        Re: Writing special characters out to Excel in HTML

        Matthew Shaw <matthewhshaw@h otmail.com> wrote:
        > We are only importing into Excel. You can explicitly provide a character
        > encoding...
        >
        > e.g. application/vnd.ms-excel;charset=ISO-8859-1

        Right - but if you explicitly provide the charset there, do you also
        make sure your J2EE app is actually *using* that character set?

        > which I believe is the default; others include charset=windows-1251.

        Ah - if you're using 1251 that may well give different results to
        ISO-8859-1. If you can get both sides to use UTF-8 I believe that's the
        most likely to work for everything in a simple fashion.
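
A minimal sketch of "UTF-8 on both sides" for an HTML payload follows. It declares UTF-8 inside the document and encodes the bytes with the same charset. `Utf8HtmlReport`, `html` and `encode` are hypothetical names, and whether a given Excel version honours the `<meta>` declaration is an assumption that should be tested rather than taken from this example.

```java
import java.nio.charset.StandardCharsets;

class Utf8HtmlReport {
    // Builds a one-cell HTML table whose <meta> tag declares UTF-8.
    // The cell text is assumed to be HTML-escaped already.
    static String html(String cellText) {
        return "<html><head>"
             + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"
             + "</head><body><table><tr><td>" + cellText + "</td></tr></table>"
             + "</body></html>";
    }

    // Encodes the document with the charset it declares.
    static byte[] encode(String html) {
        return html.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = encode(html("Gr\u00FC\u00DFe"));
        // Decoding with the declared charset gives the original text back.
        String decoded = new String(payload, StandardCharsets.UTF_8);
        System.out.println(decoded.contains("Gr\u00FC\u00DFe")); // true
    }
}
```

The HTTP header, the declaration inside the document and the bytes actually written then all name the same encoding.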

        > They do appear to be producing slightly different results; however, none
        > of the ones I have tried can handle umlauts...

        Hmm... well, I hope the above is helpful...

        --
        Jon Skeet - <skeet@pobox.co m>

        If replying to the group, please do not mail me too


        • Matthew Shaw

          #5
          Re: Writing special characters out to Excel in HTML



          I have tried the following:
          "application/vnd.ms-excel;charset=windows-1251", and also 1250 and 1252

          I believe the default is charset=ISO-8859-1

          They do look as though they are altering the imported characters,
          although they appear either as . , or ? , or just those weird square
          things that you get when you open a file using an editor that doesn't
          support the file format.
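
One possible source of the `?` characters, sketched in Java: windows-1251 is a Cyrillic code page with no ü, so the JDK substitutes `?` when asked to encode one, whereas windows-1252 keeps it. The names `UnmappableDemo` and `roundTrip` are made up for this sketch, and it only models the encoding step, not what Excel does on import.

```java
import java.nio.charset.Charset;

class UnmappableDemo {
    // Encodes the text in the named charset, then decodes it with the same one.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String umlaut = "\u00FC"; // u with diaeresis

        // Not representable in the Cyrillic code page: replaced by '?'.
        System.out.println(roundTrip(umlaut, "windows-1251"));                 // ?

        // Representable in the Western European code page: survives intact.
        System.out.println(roundTrip(umlaut, "windows-1252").equals(umlaut));  // true
    }
}
```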



          • Jon Skeet [C# MVP]

            #6
            Re: Writing special characters out to Excel in HTML

            Matthew Shaw <matthewhshaw@h otmail.com> wrote:
            > I have tried the following:
            > "application/vnd.ms-excel;charset=windows-1251", and also 1250 and 1252
            >
            > I believe the default is charset=ISO-8859-1
            >
            > They do look as though they are altering the imported characters,
            > although they appear either as . , or ? , or just those weird square
            > things that you get when you open a file using an editor that doesn't
            > support the file format.

            Hmm. Thing is, if it's really writing an Excel spreadsheet then it's a
            binary file to start with, which is part of what confuses me - unless
            it's actually just writing CSV data and using the content-type to
            direct it to Excel...
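
A quick way to tell the two cases apart is to look at the first bytes of what the application actually sends. Binary `.xls` workbooks are stored in an OLE2 compound file, which starts with a fixed eight-byte signature; text payloads such as CSV or HTML do not. The sketch below is illustrative only, and `PayloadSniffer` and `looksLikeBinaryXls` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class PayloadSniffer {
    // First eight bytes of an OLE2 compound file, the container of binary .xls.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the payload starts with the OLE2 signature.
    static boolean looksLikeBinaryXls(byte[] payload) {
        if (payload.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (payload[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] html = "<table><tr><td>1</td></tr></table>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeBinaryXls(html)); // false
    }
}
```

If the check comes back false, the payload is text, and the character encoding of that text is what Excel has to guess or be told.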

            --
            Jon Skeet - <skeet@pobox.co m>

            If replying to the group, please do not mail me too
