MultiByteToWideChar and GCC?

  • Swandog46
    New Member
    • Jul 2008
    • 4

    MultiByteToWideChar and GCC?

    I hope this is the appropriate forum for this question. I apologize
    if it is not.

    Consider the following code snippet:

    DWORD length;
    PCHAR buf = new CHAR[64];
    strcpy( buf, "HelloWorld " );
    printf( "*(&length - 1) = 0x%x\n", *(&length - 1) );
    length = MultiByteToWideChar( CP_ACP, 0, buf, -1, NULL, 0 );
    printf( "*(&length - 1) = 0x%x\n", *(&length - 1) );

    When compiled under GCC 3.4.2 (mingw) with the optimizer on, output is
    produced such as:

    *(&length - 1) = 0x22fc40
    *(&length - 1) = 0x0

    In other words, MultiByteToWideChar is somehow overwriting another
    local variable on the stack. Specifically, it is the variable at the
    memory address four bytes before the "length" DWORD. (In my
    particular case, it happens to be the 'this' pointer for the class I'm
    implementing, which causes the entire class to implode.) It turns out
    that whatever value I pass as the sixth argument to
    MultiByteToWideChar (the WCHAR character count of the output buffer)
    gets written to the DWORD at address (&length - 1).

    When compiled under GCC 3.4.2 with the optimizer off, the entire
    MultiByteToWideChar call fails with the last error set to "invalid
    parameter".

    Now consider the following code, which is exactly the same as above
    except the character buffer is allocated on the stack:

    DWORD length;
    CHAR buf[64];
    strcpy( buf, "HelloWorld " );
    printf( "*(&length - 1) = 0x%x\n", *(&length - 1) );
    length = MultiByteToWideChar( CP_ACP, 0, buf, -1, NULL, 0 );
    printf( "*(&length - 1) = 0x%x\n", *(&length - 1) );

    In this case GCC 3.4.2 has no problem with the code, either optimized
    or not.

    Why should the behavior be different just based upon whether the
    buffer is allocated on the heap or the stack?

    I've disassembled the binary in each case and can't see anything wrong
    with what GCC is doing. It seems to me like the bug would have to be
    within MultiByteToWideChar itself (unless I am doing something obvious
    and stupid and just can't see it).

    The frustrating thing is that I cannot make this happen if I recompile
    the above code from scratch in a new program. However, in the class I
    was originally writing, where this first arose, I can make it happen
    every time.

    I must be doing something wrong, but I can't for the life of me see
    what. Any ideas would be greatly appreciated. Thank you very much!
  • Banfa
    Recognized Expert Expert
    • Feb 2006
    • 9067

    #2
    Originally posted by Swandog46
    The frustrating thing is that I cannot make this happen if I recompile
    the above code from scratch in a new program. However, in the class I
    was originally writing, where this first arose, I can make it happen
    every time.
    I have to say that, to me, this suggests that although the symptom occurs in this section of code, the actual error is elsewhere.

    • Swandog46
      New Member
      • Jul 2008
      • 4

      #3
      Thank you for the reply.

      Yes, I would expect so too, but if I dump the entire stack frame at each instruction, I find that the problem occurs exactly at the call to MultiByteToWideChar. And if I write the exact code snippet I posted, which makes no reference to any previous code at all, the problem still occurs. The stack is intact before the call and corrupted after it. I would agree with you if the call passed even a single local variable not defined in this code snippet, but it doesn't.
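
For anyone wanting to repeat the before/after check without relying on *(&length - 1), which reads outside the object, here is a minimal sketch that brackets the length with guard words inside one struct. The names GuardedLength, make_guarded, guards_intact and report, and the 0xDEADBEEF pattern, are hypothetical and not from the original code.

```cpp
#include <cstdio>

// Hypothetical guard pattern (an assumption; any recognizable value works).
static const unsigned long GUARD = 0xDEADBEEFUL;

// The length and its two neighbours live in one struct, so inspecting the
// words next to `length` is well defined, unlike *(&length - 1) on a local.
struct GuardedLength {
    unsigned long before;  // guard word at the lower address
    unsigned long length;  // would receive the API's return value
    unsigned long after;   // guard word at the higher address
};

// Builds a block with both guard words set to the pattern.
GuardedLength make_guarded()
{
    GuardedLength g = { GUARD, 0, GUARD };
    return g;
}

// Returns true while both guard words still hold the pattern.
bool guards_intact(const GuardedLength& g)
{
    return g.before == GUARD && g.after == GUARD;
}

// Prints the guard words so a before/after pair of dumps can be compared.
void report(const char* label, const GuardedLength& g)
{
    std::printf("%s: before=0x%lx after=0x%lx intact=%d\n",
                label, g.before, g.after, guards_intact(g) ? 1 : 0);
}
```

The idea would be to call report() on the block, assign the result of the MultiByteToWideChar call to its length member, and call report() again. Moving the variable into a struct changes the frame layout, so the overwrite may land somewhere else or not show up at all.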

      • weaknessforcats
        Recognized Expert Expert
        • Mar 2007
        • 9214

        #4
        Do not expect calls like this to work as long as you have hard-coded types like char in your programs.

        Use the TCHAR mappings.

        Then things should work regardless of whether you are using Unicode or not. I expect in this case you are compiling with Unicode ON (the default) and using char data types, which are too small.

        • Swandog46
          New Member
          • Jul 2008
          • 4

          #5
          I am reading a file encoded in UTF-8, and I need to convert it to UTF-16. This is the standard API for that purpose, and it has nothing to do with TCHAR. Either the API works or it doesn't --- but it has to be used in this case.
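
For reference, a minimal sketch of the usual two-call pattern for this API when the source is UTF-8: the first call passes a NULL output buffer with a size of 0 to get the required length, and the second call does the conversion. It assumes the CP_UTF8 code page (the snippet in the first post passes CP_ACP, which selects the system ANSI code page instead). The helper name utf8_to_utf16 is hypothetical and not from the original code.

```cpp
#ifdef _WIN32  // Windows-only sketch; the guard keeps the file compilable elsewhere
#include <windows.h>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the thread): converts a UTF-8 byte string
// to a UTF-16 wide string using MultiByteToWideChar.
std::wstring utf8_to_utf16(const std::string& utf8)
{
    if (utf8.empty())
        return std::wstring();

    const int srcLen = static_cast<int>(utf8.size());

    // First call: NULL output buffer and size 0 ask only for the required
    // length in WCHARs. With an explicit source length, no terminator is counted.
    const int needed = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, NULL, 0);
    if (needed == 0)
        throw std::runtime_error("MultiByteToWideChar failed (size query)");

    std::wstring utf16(static_cast<std::wstring::size_type>(needed), L'\0');

    // Second call: converts into the buffer that was just sized.
    const int written = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen,
                                            &utf16[0], needed);
    if (written != needed)
        throw std::runtime_error("MultiByteToWideChar failed (conversion)");

    return utf16;
}
#endif
```

Passing the source length explicitly, rather than -1, avoids depending on a terminator in the file data. On failure the function returns 0, and GetLastError() gives the reason.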

          • Swandog46
            New Member
            • Jul 2008
            • 4

            #6
            Anyone, any ideas? Thanks!
