Using repr() with escape sequences

  • nummertolv

    Using repr() with escape sequences

    Hi,

    My application is receiving strings, representing windows paths, from
    an external source. When using these paths, by for instance printing
    them using str() (print path), the backslashes are naturally
    interpreted as escape characters.

    >>> print "d:\thedir"
    d:      hedir

    The solution is to use repr() instead of str():

    >>> print repr("d:\thedir")
    'd:\thedir'

    What I have not been able to figure out is how to handle escape
    sequences like \a, \b, \f, \v and \{any number} inside the paths. Using
    repr() on these escape sequences either prints the hex value of the
    character (if it is "unprintable", I guess) or some other character (as
    in the last example below).

    >>> print repr("d:\thedir\10")
    'd:\thedir\x08'

    >>> print repr("d:\thedir\foo")
    'd:\thedir\x0coo'

    >>> print repr("d:\thedir\100")
    'd:\thedir@'

    Could someone clear this up for me and let me know how I can find the
    "real" path that I am trying to receive?

    /Henrik

  • Steven D'Aprano

    #2
    Re: Using repr() with escape sequences

    On Thu, 23 Feb 2006 07:32:36 -0800, nummertolv wrote:

    > Hi,
    >
    > My application is receiving strings, representing windows paths, from
    > an external source. When using these paths, by for instance printing
    > them using str() (print path), the backslashes are naturally
    > interpreted as escape characters.
    >
    > >>> print "d:\thedir"
    > d:      hedir

    No. What is happening here is not what you think is happening.

    > The solution is to use repr() instead of str():

    The solution to what? What is the problem? The way the strings are
    DISPLAYED is surely not the issue, is it?

    > >>> print repr("d:\thedir")
    > 'd:\thedir'


    You have created a string object: "d:\thedir"

    That string object is NOT a Windows path. It contains a tab character,
    just like the print statement shows -- didn't you wonder about the large
    blank space in the string?

    Python uses backslashes for character escapes. \t means a tab character.
    When you enter "d:\thedir" you are embedding a tab between the colon and
    the h.
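
A minimal sketch of how the escapes used in this thread are parsed. It is written for Python 3, so print is a function here, unlike the Python 2 sessions quoted above; the dictionary name `examples` is only illustrative.

```python
# Each escape below becomes a single character when the literal is parsed.
# The raw-string keys keep the spelling as typed, for display only.
examples = {
    r"\t": "\t",      # horizontal tab
    r"\f": "\f",      # form feed
    r"\10": "\10",    # octal 10 = decimal 8, backspace
    r"\100": "\100",  # octal 100 = decimal 64, the character "@"
}

for spelling, char in examples.items():
    # Show the typed spelling, the resulting length, code point and repr().
    print(spelling, len(char), ord(char), repr(char))
```

Every value has length 1, which is why repr() can only show the resulting character and not the spelling that produced it.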

    The solutions to this problem are:

    (1) Escape the backslash: "d:\\thedir"

    (2) Use raw strings that don't use char escapes: r"d:\thedir"

    (3) Use forward slashes, and let Windows automatically handle them:
    "d:/thedir"

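The three options can be compared side by side. This is a small Python 3 sketch, not code from the thread, and the variable names are made up for illustration.

```python
# Three ways to spell the same nine-character Windows path in source code.
escaped = "d:\\thedir"   # (1) backslash escaped by doubling it
raw = r"d:\thedir"       # (2) raw string: backslash escapes are not processed
forward = "d:/thedir"    # (3) forward slash instead of a backslash

# The unescaped literal is a different, shorter string that contains a tab.
with_tab = "d:\thedir"

print(escaped == raw)            # True
print(len(raw), len(with_tab))   # 9 8
print(repr(with_tab))            # 'd:\thedir'
```

Options (1) and (2) produce identical string objects; option (3) is a different string that Windows generally accepts as the same path.
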
    However, if you are receiving strings from an external source, as you say,
    and reading them from a file, this should not be an issue. If you read a
    file containing d:\thedir and print the string you have just read, the
    print statement uses str(), not repr(), and you will see that the string
    is just what you expect:

    d:\thedir

    You can also check for yourself that the string is correct by looking at
    its length: nine characters.
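
A hedged Python 3 sketch of that check, using io.StringIO as a stand-in for the external file (an assumption; the thread does not say what the real source is):

```python
import io

# Stand-in for an external file. The raw literal puts a real backslash
# followed by the letter "t" into the data, as a file on disk would.
source = io.StringIO(r"d:\thedir" + "\n")

# Reading does no escape processing; only the line ending is stripped here.
path = source.readline().rstrip("\n")

print(path)        # d:\thedir
print(len(path))   # 9
print(repr(path))  # 'd:\\thedir'
```

The doubled backslash in the repr() output is only how repr() displays one backslash; the string itself still has nine characters.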


    --
    Steven.


    • nummertolv

      #3
      Re: Using repr() with escape sequences

      I think I might have misused the terms "escape character" and/or
      "escape sequence" or been unclear in some other way because I seem to
      have confused you. In any case you don't seem to be addressing my
      problem.

      I know that the \t in the example path is interpreted as the tab
      character (that was part of the point of the example) and what the
      strings are representing is irrelevant. And yes, the way the strings
      are displayed is part of the issue.

      So let me try to be clearer by boiling the problem down to this:

      - Consider a string variable containing backslashes.
      - One or more of the backslashes are followed by one of the letters
      a,b,f,v or a number.

      myString = "bar\foo\12foobar"

      How do I print this string so that the output is as below?

      bar\foo\12foobar

      typing 'print myString' prints the following:

      bar oo
      foobar

      and typing print repr(myString) prints this:

      'bar\x0coo\nfoobar'


      Hope this makes it clearer. I guess there is a simple solution to this
      but I have not been able to find it. Thanks.

      /H


      • Sybren Stuvel

        #4
        Re: Using repr() with escape sequences

        nummertolv enlightened us with:
        > myString = "bar\foo\12foobar"

        Are the interpretations of the escape characters on purpose?

        > How do I print this string so that the output is as below?
        >
        > bar\foo\12foobar

        Why do you want to?

        > typing 'print myString' prints the following:
        >
        > bar oo
        > foobar

        Which is correct.

        Sybren
        --
        The problem with the world is stupidity. Not saying there should be a
        capital punishment for stupidity, but why don't we just take the
        safety labels off of everything and let the problem solve itself?
        Frank Zappa


        • Daniel Dittmar

          #5
          Re: Using repr() with escape sequences

          nummertolv wrote:
          > - Consider a string variable containing backslashes.
          > - One or more of the backslashes are followed by one of the letters
          > a,b,f,v or a number.
          >
          > myString = "bar\foo\12foobar"
          >
          > How do I print this string so that the output is as below?
          >
          > bar\foo\12foobar
          >
          > typing 'print myString' prints the following:
          >
          > bar oo
          > foobar
          >
          > and typing print repr(myString) prints this:
          >
          > 'bar\x0coo\nfoobar'

          The interpretation of escape sequences happens when the Python compiler
          reads the string "bar\foo\12foobar". You'll see that when you do
          something like

          >>> map (ord, "bar\foo\12foobar")
          [98, 97, 114, 12, 111, 111, 10, 102, 111, 111, 98, 97, 114]

          This displays the ASCII values of all the characters.

          If you want to use a string literal containing backslashes, use r'' strings:

          >>> myString = r'bar\foo\12foobar'
          >>> map (ord, myString)
          [98, 97, 114, 92, 102, 111, 111, 92, 49, 50, 102, 111, 111, 98, 97, 114]
          >>> print myString
          bar\foo\12foobar
          >>> print repr (myString)
          'bar\\foo\\12foobar'

          If you get the strings from an external source as suggested by your
          original post, then you really have no problem at all. No interpretation
          of escape sequences takes place when you read a string from a file.

          Daniel


          • nummertolv

            #6
            Re: Using repr() with escape sequences

            myString = "bar\foo\12foobar"
            print repr(myString)

            My "problem" was that I wanted to know if there is a way of printing
            "unraw" strings like myString so that the escape characters are written
            like a backslash and a letter or number. My understanding was that
            repr() did this and it does in most cases (\n and \t for instance). In
            the cases of \a,\b,\f and \v however, it prints hexadecimal numbers.
            But I guess I'll just have to live with that and as you point out, it
            doesn't have to be a problem anyway.
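
If the letter forms are wanted anyway, one possible approach is a small translation table. The sketch below is Python 3, and both `SHORT_ESCAPES` and `short_escape` are hypothetical names, not part of the standard library; everything not in the table falls back to what repr() would show.

```python
# Characters that have a short backslash-letter spelling in Python source.
SHORT_ESCAPES = {
    "\\": "\\\\",
    "\a": "\\a",
    "\b": "\\b",
    "\f": "\\f",
    "\v": "\\v",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def short_escape(text):
    """Hypothetical helper: like repr() without quotes, preferring \\a, \\b, \\f, \\v."""
    parts = []
    for ch in text:
        if ch in SHORT_ESCAPES:
            parts.append(SHORT_ESCAPES[ch])
        else:
            # Fall back to repr() for this one character, minus its quotes.
            parts.append(repr(ch)[1:-1])
    return "".join(parts)


print(short_escape("bar\foo\12foobar"))  # bar\foo\nfoobar
```

Note that \12 comes back as \n: the string only holds the newline character, so the spelling originally typed in the source cannot be recovered.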

