How to map python's unicode stuff to a wchar_t based api?

  • Ames Andreas (MPA/DF)

    How to map python's unicode stuff to a wchar_t based api?

    Hi all,

    besides PyUnicode_(From)|(As)WideChar I haven't found specific support
    for wchar_t in the python api. Is there a default codec that produces
    wchar_t* (in a platform-neutral way) or something else in
    PyArg_ParseTuple's format string that could help me? What encoding is
    used for python's Py_UNICODE thing?

    My specific problem is that I wrap an api where one function can have
    an ansi as well as a unicode variant. I don't want to decide which
    variant to use at compile time but rather at runtime. Therefore I
    have a default argument useUnicode and if possible I want to get rid
    of the

    if (PyObject_IsTrue(useUnicode)) {
        doTheUnicodeStuff();
        ...
    }
    else {
        doTheAnsiThingWhichLooksAlmostIdenticalToTheAboveButJustAlmost();
        ...
    }

    annoyance in almost any function.


    TIA,

    andreas


  • Neil Hodgson

    #2
    Re: How to map python's unicode stuff to a wchar_t based api?

    Ames Andreas:
    > Therefore I have a default argument useUnicode and
    > if possible I want to get rid of the
    >
    > if (PyObject_IsTrue(useUnicode)) {
    >     doTheUnicodeStuff();
    >     ...
    > }
    > else {
    >     doTheAnsiThingWhichLooksAlmostIdenticalToTheAboveButJustAlmost();
    >     ...
    > }
    >
    > annoyance in almost any function.


    To support Unicode file names on Win32, the convention described in PEP
    277 is to call the wide API when the argument was Unicode, otherwise call
    the ANSI API. From src/Modules/posixmodule.c this looks like

    #ifdef Py_WIN_WIDE_FILENAMES
        if (unicode_file_names()) {
            PyUnicodeObject *po;
            if (PyArg_ParseTuple(args, "Ui:access", &po, &mode)) {
                Py_BEGIN_ALLOW_THREADS
                /* PyUnicode_AS_UNICODE OK without thread lock as
                   it is a simple dereference. */
                res = _waccess(PyUnicode_AS_UNICODE(po), mode);
                Py_END_ALLOW_THREADS
                return(PyBool_FromLong(res == 0));
            }
            /* Drop the argument parsing error as narrow strings
               are also valid. */
            PyErr_Clear();
        }
    #endif

    This code then falls through to the ANSI API.

    Py_WIN_WIDE_FILENAMES is only defined (in src/PC/pyconfig.h) when Python
    is using 2 byte wide Unicode characters (Py_UNICODE_SIZE == 2), thus
    ensuring that the result on Win32 of PyUnicode_AS_UNICODE is equivalent to
    wchar_t*.

    Whether PY_UNICODE_TYPE is wchar_t depends on platform and other
    definitions.
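
    To make the size point concrete, here is a rough sketch in plain C of what
    a wrapper might do when the two types do not line up. The type unit16 and
    both function names are made up for illustration; unit16 merely stands in
    for a 2 byte Py_UNICODE and is not the real typedef. It assumes unsigned
    short is 2 bytes, which holds on common platforms but is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical stand-in for a 2 byte Py_UNICODE; not the real typedef.
   Assumes unsigned short is 2 bytes wide on this platform. */
typedef unsigned short unit16;

/* Returns 1 when a unit16 buffer could be handed to a wchar_t API
   directly (as on Win32), 0 when a copy is needed first. */
static int units_match_wchar(void)
{
    return sizeof(unit16) == sizeof(wchar_t);
}

/* Copies n code units plus a terminator into a newly allocated wchar_t
   buffer. The caller frees the result. Returns NULL if allocation fails.
   Surrogate pairs are copied unit by unit, not combined. */
static wchar_t *units_to_wchar(const unit16 *src, size_t n)
{
    wchar_t *dst;
    size_t i;

    assert(src != NULL);
    dst = malloc((n + 1) * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        dst[i] = (wchar_t)src[i];
    dst[n] = L'\0';
    return dst;
}
```

    When units_match_wchar() returns 1 the copy can be skipped, which is
    the situation the Py_UNICODE_SIZE == 2 condition guarantees on Win32.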

    The extra code required for PEP 277 adds to size and obscures intent.
    Various code reduction techniques can alleviate this, such as posix_1str
    in posixmodule or some preprocessor cleverness.
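
    As one possible reading of "preprocessor cleverness", a macro can stamp
    out the narrow and the wide wrapper from a single body. This is only a
    sketch and not how posixmodule does it: api_name_length_a,
    api_name_length_w and is_short_name_* are invented names standing in for
    a real API's ANSI and wide variants and for the wrapper logic around them.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for the ANSI and wide variants of one API call. */
static size_t api_name_length_a(const char *name)    { return strlen(name); }
static size_t api_name_length_w(const wchar_t *name) { return wcslen(name); }

/* Expands to one wrapper per character type, so the shared logic
   (argument check, API call, result handling) is written only once. */
#define DEFINE_IS_SHORT_NAME(suffix, char_type, api_fn)         \
    static int is_short_name_##suffix(const char_type *name)    \
    {                                                           \
        assert(name != NULL);                                   \
        return api_fn(name) < 8;                                \
    }

DEFINE_IS_SHORT_NAME(a, char, api_name_length_a)
DEFINE_IS_SHORT_NAME(w, wchar_t, api_name_length_w)
```

    The caller still has to pick is_short_name_a or is_short_name_w at
    runtime, for instance based on the type of the argument it received, but
    the near-identical bodies no longer have to be maintained by hand.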

    Neil


