Replace every n instances of a string

  • Tom Cross

    Replace every n instances of a string

    Hello-

    I have a function that returns to me a text representation of Unicode
    data, which looks like this:

    \u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018\u001d\u007e\u006b\u004e\u007d\u006a\u006e\u0068\u0042\u0026\u003c\u004f\u0059\u0056\u002b\u001a\u0077\u0065\u006a\u000a\u0021\u005f\u0025\u003f\u0025\u0024\u007e\u0020\u0011\u0060\u002c\u0037\u0067\u007a\u0074\u0074\u0003\u0003\u000f\u0039\u0018\u0059\u0038\u0029\u0001\u0073\u0034\u0009\u0069\u005e\u0003\u006e\u000d\u004c\u001d\u00
    f\u006e\u001b\u006e\u0063\u000b\u0014\u0071\u007c\u004e\u006a\u0011\u004a\u001f\u0063\u0016\u003d\u0020\u0065\u003e\u0043\u0012\u0047\u0026\u0062\u0004\u0025\u003b\u0005\u004c\u002e\u005a\u0070\u0048

    I would like to add carriage returns to this for usability. But I
    don't want to add a return after each "\u" I encounter in the text
    (regexp comes to mind if I did). I want to add a return after each 12
    "\\u"s I encounter in the string.

    Any ideas? Do I not want to search for "\\u" but instead just insert
    a \n after each 72 characters (equivalent to 12 \uXXXX codes)? Would
    this provide better performance? If so, what would be the easiest way
    to do that?

    Thanks much!
  • Terry Reedy

    #2
    Re: Replace every n instances of a string


    "Tom Cross" <thomasacross@hotmail.com> wrote in message
    news:62de87da.0308151325.41ea4622@posting.google.com...
    > I have a function that returns to me a text representation of Unicode
    > data, which looks like this:
    ...
    > I would like to add carriage returns...after each 12
    > "\\u"s I encounter in the string.
    >
    > Any ideas? Do I not want to search for "\\u" but instead just insert
    > a \n after each 72 characters (equivalent to 12 \uXXXX codes)? Would
    > this provide better performance? If so, what would be the easiest way
    > to do that?

    Split string into list of 6*n (72) char chunks and join with \n:

    # unirep = textrep(unidata)  # i.e., call your func and store the result
    # For illustration:
    unirep = r'\u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018'

    # 6 chars per \uXXXX code; 4 codes per line instead of 12 so that the
    # short sample unirep still produces multiple lines
    blocklen = 6*4
    unilist = []
    for i in range(0, len(unirep), blocklen):
        unilist.append(unirep[i:i+blocklen])

    unilines = '\n'.join(unilist)
    >>> print(unilines)
    \u0013\u0021\u003c\u003f
    \u0044\u001f\u006a\u005a
    \u0050\u0015\u0018
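
    On the question of searching for "\\u" versus cutting at a fixed width: both can be sketched as small functions. The names chunk_fixed, chunk_regex, codes_per_line and code_len below are made up for this illustration and appear nowhere in the original posts. The regexp variant counts actual \uXXXX codes instead of characters, so it assumes every code has exactly four hex digits. Slicing needs no pattern matching and is probably the simpler choice, but whether it is measurably faster is something to time on your own data.

```python
import re

def chunk_fixed(text, codes_per_line=12, code_len=6):
    # Cut the text every codes_per_line * code_len characters and
    # join the pieces with newlines (no pattern matching involved).
    blocklen = codes_per_line * code_len
    return '\n'.join(text[i:i + blocklen]
                     for i in range(0, len(text), blocklen))

def chunk_regex(text, codes_per_line=12):
    # Match a run of exactly codes_per_line escapes (backslash, 'u',
    # four hex digits) and append a newline after it. The lookahead
    # (?=.) skips the final run so no trailing newline is added.
    pattern = r'((?:\\u[0-9a-fA-F]{4}){%d})(?=.)' % codes_per_line
    return re.sub(pattern, r'\1\n', text, flags=re.DOTALL)

sample = r'\u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018'
print(chunk_fixed(sample, 4))
print(chunk_regex(sample, 4))
```

    For this sample both calls print the same three lines as the loop above. They would differ only if the text contained something other than six-character codes.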

    Consider whether you want to change r'\u' to something else, like
    spaces, for easier viewing. If so, do so on unirep before chopping it
    into blocks, and adjust blocklen if the replacement is not two chars.
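
    A minimal sketch of that adjustment, assuming the prefix being replaced is the two-character r'\u' and the replacement is a single space, so each code shrinks from 6 to 5 characters. The names spaced and spaced_lines are made up for this illustration.

```python
unirep = r'\u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018'

# Replace the two-char r'\u' prefix with one space BEFORE chunking,
# so every code is now 5 chars (space + 4 hex digits) instead of 6.
spaced = unirep.replace(r'\u', ' ')

# Block length must follow the new code width: 5 chars per code,
# 4 codes per line for this short sample (use 5*12 for 12 per line).
blocklen = 5 * 4
spaced_lines = '\n'.join(spaced[i:i + blocklen]
                         for i in range(0, len(spaced), blocklen))
print(spaced_lines)
```

    Each printed line then starts with a space, since the space stands in for the leading r'\u' of the first code on the line.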

    Terry J. Reedy



