Character set support

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • oraustin@hotmail.com

    Character set support

    I'm looking at an MS SQL server database and it stores city names
    across the world. All a
    arabic , chinese etc names are converted to the latin alphabet.

    I feel it is probably critical we support accents as a means to
    distinguish between cities that would be converted to the same Latin
    alphabet representation otherwise and as a matter of cultural
    politeness.


    Now ...... how do we do this?
    Use varchar? which character code page do we use?
    Use nvarchar? I heard a rumour that SQL server pads unused characters
    with blanks and hence will use space unnecessarily.


    The data will be presented on sheets and emailed around the world - it
    will also appear on a website.


    Any advice greatly appreciated - thanks
    Oliver

  • Erland Sommarskog

    #2
    Re: Character set support

    (oraustin@hotma il.com) writes:
    I'm looking at an MS SQL server database and it stores city names
    across the world. All a
    arabic , chinese etc names are converted to the latin alphabet.
    >
    I feel it is probably critical we support accents as a means to
    distinguish between cities that would be converted to the same Latin
    alphabet representation otherwise and as a matter of cultural
    politeness.
    >
    >
    Now ...... how do we do this?
    Use varchar? which character code page do we use?
    Use nvarchar?
    You need to use nvarchar. With varchar you can only support one code
    page, and that is not enough. For instance, if you use CP-1252, Windows
    Latin-1, you will not be able to enter the capital of Roumania
    correctly, nor the largest city of Turkey. (I'm using a newsreader that
    is not Unicode capable, so embarrassing enough, I cannot enter these
    name correctly myself.)
    I heard a rumour that SQL server pads unused characters
    with blanks and hence will use space unnecessarily.
    That is incorrect. If you use varchar/nvarchar, SQL Server will not
    pad. However, if you save data with trailing spaces, SQL Server will
    by default to chop that off. The char/nchar data types on the other
    hand are fixed-length, but you would not use them for city length.

    On the other hand, in nvarchar each character takes up two bytes, in
    difference to varchar with one byte per character. So nvarchar takes
    twice the space.

    But given your requirements, you don't have much choice.


    --
    Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

    Books Online for SQL Server 2005 at

    Books Online for SQL Server 2000 at

    Comment

    Working...