Problem with wctomb...

  • allez
    New Member
    • May 2007
    • 2

    Problem with wctomb...

    Hi,
    I'm trying to convert a wide character string into a UTF-8 multibyte string using wctomb, and I'm running into a problem when I try to convert characters that take more than one byte (i.e., non-ASCII characters). Below is simple code that reproduces the problem:

    Code:
    #include <cstdio>    // printf
    #include <cstdlib>   // wctomb
    #include <iostream>
    using namespace std;

    int main()
    {
      char  buffer[40];
      for(size_t i=0; i<40; ++i)
        buffer[i]=0;
    //  wchar_t wch = L'ア'; (UTF-8 65393)
      wchar_t wch = L'ñ';
      wchar_t wide = 241;
      // Print the numeric value of the wide character.
      cout << static_cast<unsigned long>(wch) << endl;
      int length;
      if (wch == wide)
        cout << "Yes!" << endl;

      // Returns -1 if wch cannot be represented in the current locale.
      length = wctomb( buffer, wch );
      printf( "The number of bytes that comprise the multibyte "
                 "character is %i\n", length );
      printf( "And the converted string is \"%s\"\n", buffer );

      return 0;
    }
    If I substitute a simple character like L'e', it works as expected. However, with either of these more complicated characters (the first is Japanese katakana, the second is Spanish), wctomb returns -1, meaning it couldn't convert the character. Why?
    It fails whether I use the number or the actual letter, which makes sense because they are equivalent.

    I'm running this on SUSE Linux Enterprise Desktop 10 and compiling with g++ 4.1.0.

    Thanks,
    Andrew
  • allez
    New Member
    • May 2007
    • 2

    #2
    I found the solution. In my shell I have the LANG environment variable set to en_US.UTF-8 so I thought everything would be okay. However, inside the program I added the following lines
    Code:
    #include <clocale>
      const char* local = setlocale(LC_CTYPE, NULL);
      cout << "Current LC_CTYPE is " << local << endl;
    and found out that the program's LC_CTYPE was "C". So adding the line
    Code:
    	setlocale(LC_ALL, "en_US.UTF-8");
    made everything work as it should.
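For anyone hitting the same thing: a C or C++ program always starts in the "C" locale, whatever LANG says in the shell, until it calls setlocale itself. Below is a minimal sketch of the fix as two small helpers. The names to_multibyte and use_environment_locale are made up for illustration and are not from the code above. It assumes a platform where wchar_t holds Unicode code points (as on Linux with glibc) and uses setlocale(LC_ALL, "") to pick up the locale from the environment rather than hard-coding "en_US.UTF-8".

```cpp
#include <climits>   // MB_LEN_MAX
#include <clocale>   // std::setlocale
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::wctomb
#include <string>

// Hypothetical helper (not from the thread): converts one wide character to
// its multibyte form in the current LC_CTYPE locale.
// Returns an empty string if the character cannot be represented there.
std::string to_multibyte(wchar_t wc)
{
    char buffer[MB_LEN_MAX];
    std::wctomb(nullptr, 0);                  // reset any shift state
    int length = std::wctomb(buffer, wc);     // -1 means "not convertible"
    if (length < 0)
        return std::string();
    return std::string(buffer, static_cast<std::size_t>(length));
}

// Hypothetical helper: adopt the locale named by the environment (LANG, LC_*).
// Returns false if that locale is not available on the system.
bool use_environment_locale()
{
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

Calling use_environment_locale() once at the top of main and then to_multibyte(L'ñ') should give the two-byte UTF-8 sequence when LANG is en_US.UTF-8. Without the setlocale call the program stays in the "C" locale, and on glibc the conversion of any non-ASCII character fails.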


    • ilikepython
      Recognized Expert Contributor
      • Feb 2007
      • 844

      #3
      Glad you solved your problem.
