Extracting Unicode characters from RTF

**geegeegeegee** · Mar 4 '08, 04:50 AM

I should have mentioned, testing was carried out with Input Language set to Romanian.
Greg

**NeoPa** · Mar 5 '08, 12:13 PM

Greg,

I commend you on the care taken to specify the question as well as the trouble you've obviously already gone to to find a solution yourself.

I'm afraid I can't help you directly with this issue, but I will flag it for some of the other Access experts to come and have a look-see in case any of them can help. It is more a problem encountered while using Access than an Access problem per se though, so if we find no joy in here it may be worth posting a link to this thread in the Windows forum too.

Let's see what flagging to the other Access experts can do for us first though.

**Scott Price** · Mar 5 '08, 03:41 PM

The ChrW() function returns the character associated with a given Unicode code point, which can be supplied as a hex value.

The syntax is ChrW(&H15F), which correctly displays the s with cedilla below in a simple text box that I set up in my test database. ChrW(&HBA) displays the degree-like symbol that you mention. You describe them the other way round, which makes me wonder whether that is a typo?

I'm not personally familiar with Lebans' RTF2, but after a little research into the character sets and code pages involved, it looks to me as though you actually have a code page problem, not an fcharset problem. For example, the code page for Latin 2 is 1250 (see here), and it maps the byte BA to the Unicode character 015F. Code page 1252 (see here), however, maps the same byte BA to the Unicode character 00BA, the masculine ordinal indicator. That character looks much like the degree sign but is not the same one; the degree sign is B0.

My suggestion is that you are receiving the text encoded with code page 1250 and interpreting it based on the 1252 encoding.
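To make the mismatch concrete, here is a minimal sketch in Python (used only for illustration; it is not VBA and has nothing to do with Lebans' RTF2). It decodes the same single byte, BA, under each of the two code pages:

```python
# The same byte, 0xBA, decoded under two Windows code pages.
raw = bytes([0xBA])

as_1250 = raw.decode("cp1250")  # Latin 2 (Central European)
as_1252 = raw.decode("cp1252")  # Western European

# Print the resulting Unicode code points in hex.
print(f"cp1250: U+{ord(as_1250):04X}")  # s with cedilla
print(f"cp1252: U+{ord(as_1252):04X}")  # masculine ordinal indicator
```

The byte never changes; only the table used to interpret it does.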

Again, I'm not familiar with Lebans' RTF2, but somehow you will need to find the setting or code that controls the code page and correct this encoding/decoding discrepancy. Sorry I can't give you any specific help on doing that :-(
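If the code page cannot be changed at the source, one possible workaround is to repair the text after the fact. The sketch below is again Python for illustration only, and `redecode` is a hypothetical helper name, not something RTF2 or Access provides. It assumes the text really was 1250 data read as 1252: it converts the string back to the original bytes and decodes them with the intended code page.

```python
def redecode(text: str, wrong: str = "cp1252", right: str = "cp1250") -> str:
    """Undo a wrong single-byte decode by round-tripping through bytes.

    Hypothetical helper: assumes `text` was produced by decoding
    `right`-encoded bytes with the `wrong` code page.
    """
    return text.encode(wrong).decode(right)


# "\u00bai" is how the Romanian word "\u015fi" appears when read as 1252.
fixed = redecode("\u00bai")
print(fixed)
```

This raises an error if the string contains characters that the wrong code page cannot represent or bytes that the right one leaves undefined, so it is only a starting point.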

Regards,
Scott

**Scott Price** · Mar 5 '08, 03:46 PM

A few links that contain helpful and not so helpful information that I came across in my research:

MS developer discussion

Character sets and Code pages

Wikipedia Character Encoding

Wikipedia Code Pages

Wikipedia Romanian Alphabet

Kind regards,
Scott

**geegeegeegee** · Mar 6 '08, 02:39 AM

Thanks for your suggestion Scott. I think the 1252 code page will point us in the right direction. Will let you know how we go.

**Scott Price** · Mar 6 '08, 03:21 AM

Let me know how it goes! Good luck.

Regards,
Scott
