Interpreting \ escape sequences in strings

  • Paul Watson

    Interpreting \ escape sequences in strings

    How can I get the escapes from a command line parameter interpreted?

    The user provides a string on the command line. The string might contain
    traditional escapes such as \t, \n, etc. It might also contain escaped
    octal or hex such as \001 or \x09.

    The escapes are coming into sys.argv[] without shell interpretation. Do I
    need to use the compile module to make this work? Any suggestions?

    ===
    $ cat ./try_arglen2.py
    #! /usr/bin/env python
    import sys, StringIO
    print sys.argv[1]

    print sys.argv[1] % ()

    sf = StringIO.StringIO()
    print >> sf, sys.argv[1],
    c = sf.getvalue()
    sf.close()
    print "now" + c + "is" + c + "the"

    # This works because the interpreter is processing the escapes.

    sf = StringIO.StringIO("\001")
    c = sf.getvalue()
    sf.close()
    print "now" + c + "is" + c + "the"

    ===
    $ ./try_arglen2.py '\001'
    \001
    \001
    now\001is\001the
    now?is?the


  • Peter Otten

    #2
    Re: Interpreting \ escape sequences in strings

    Paul Watson wrote:
    > How can I get the escapes from a command line parameter interpreted?
    >
    > The user provides a string on the command line. The string might contain
    > traditional escapes such as \t, \n, etc. It might also contain escaped
    > octal or hex such as \001 or \x09.
    >
    > The escapes are coming into sys.argv[] without shell interpretation. Do I
    > need to use the compile module to make this work? Any suggestions?
    >
    > ===
    > $ cat ./try_arglen2.py
    > #! /usr/bin/env python
    > import sys, StringIO
    > print sys.argv[1]
    >
    > print sys.argv[1] % ()
    >
    > sf = StringIO.StringIO()
    > print >> sf, sys.argv[1],
    > c = sf.getvalue()
    > sf.close()
    > print "now" + c + "is" + c + "the"
    >
    > # This works because the interpreter is processing the escapes.
    >
    > sf = StringIO.StringIO("\001")
    > c = sf.getvalue()
    > sf.close()
    > print "now" + c + "is" + c + "the"
    >
    > ===
    > $ ./try_arglen2.py '\001'
    > \001
    > \001
    > now\001is\001the
    > now?is?the

    If I'm understanding you correctly:

    <args.py>
    import sys
    print sys.argv[1].decode("string_escape")
    </args.py>

    $ python args.py "winter\nof\012our\x0Adiscontent"
    winter
    of
    our
    discontent

    Peter


    • Paul Watson

      #3
      Re: Interpreting \ escape sequences in strings


      "Peter Otten" <__peter__@web.de> wrote in message
      news:c302qi$3lj$04$1@news.t-online.com...
      > Paul Watson wrote:
      >
      > > How can I get the escapes from a command line parameter interpreted?
      > >
      > > The user provides a string on the command line. The string might contain
      > > traditional escapes such as \t, \n, etc. It might also contain escaped
      > > octal or hex such as \001 or \x09.
      > >
      > > The escapes are coming into sys.argv[] without shell interpretation. Do I
      > > need to use the compile module to make this work? Any suggestions?
      > >
      > > ===
      > > $ cat ./try_arglen2.py
      > > #! /usr/bin/env python
      > > import sys, StringIO
      > > print sys.argv[1]
      > >
      > > print sys.argv[1] % ()
      > >
      > > sf = StringIO.StringIO()
      > > print >> sf, sys.argv[1],
      > > c = sf.getvalue()
      > > sf.close()
      > > print "now" + c + "is" + c + "the"
      > >
      > > # This works because the interpreter is processing the escapes.
      > >
      > > sf = StringIO.StringIO("\001")
      > > c = sf.getvalue()
      > > sf.close()
      > > print "now" + c + "is" + c + "the"
      > >
      > > ===
      > > $ ./try_arglen2.py '\001'
      > > \001
      > > \001
      > > now\001is\001the
      > > now?is?the
      >
      > If I'm understanding you correctly:
      >
      > <args.py>
      > import sys
      > print sys.argv[1].decode("string_escape")
      > </args.py>
      >
      > $ python args.py "winter\nof\012our\x0Adiscontent"
      > winter
      > of
      > our
      > discontent
      >
      > Peter

      I did not explain it clearly. I want the user to specify a string
      that I will put between words in the output. The user specified string can
      have escape sequences. For example, the user wants to put a binary 1 (\001)
      between each output word.

      import sys
      words = ['now', 'is', 'the', 'time']
      print '\001'.join(words) #this works
      print sys.argv[1].join(words) #this fails

      $ ./putbetween.py '\001'
      now?is?the?time
      now\001is\001the\001time



      • Peter Otten

        #4
        Re: Interpreting \ escape sequences in strings

        Paul Watson wrote:
        > I did not explain it clearly. I want the user to specify a string

        It seems it was clear enough; you just didn't recognize the answer :-)

        > that I will put between words in the output. The user specified string can
        > have escape sequences. For example, the user wants to put a binary 1
        > (\001) between each output word.
        >
        > import sys
        > words = ['now', 'is', 'the', 'time']
        > print '\001'.join(words) #this works
        > print sys.argv[1].join(words) #this fails

        Change the above line to

        print sys.argv[1].decode("string_escape").join(words)

        s.decode("string_escape") returns a new string with all C-style escape
        sequences converted into the corresponding characters. This is an abuse -
        ahem, example of a general mechanism. Look up the codecs module if you want
        to learn more about it.

        Peter
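
For readers on Python 3, where str has no decode method and the string_escape codec no longer exists, a rough equivalent of the approach above is sketched here. It assumes the unicode_escape codec is an acceptable substitute; the helper names unescape and join_with, and the raw string standing in for sys.argv[1], are illustrative only.

```python
def unescape(text):
    """Interpret C-style backslash escapes in text (Python 3 sketch)."""
    # unicode_escape decodes bytes, so encode first. backslashreplace turns
    # characters outside Latin-1 into \uXXXX escapes, which the decoder then
    # converts back, so non-ASCII input survives the round trip.
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")


def join_with(raw_separator, words):
    """Join words using a separator whose escapes are still uninterpreted."""
    return unescape(raw_separator).join(words)


words = ["now", "is", "the", "time"]
# The raw string plays the role of sys.argv[1] after: ./putbetween.py '\001'
print(repr(join_with(r"\001", words)))
```

In a real script the raw string would be replaced by sys.argv[1].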


        • Paul Watson

          #5
          Re: Interpreting \ escape sequences in strings


          "Peter Otten" <__peter__@web.de> wrote in message
          news:c311kc$l1h$07$1@news.t-online.com...
          > Paul Watson wrote:
          >
          > > I did not explain it clearly. I want the user to specify a string
          >
          > It seems it was clear enough; you just didn't recognize the answer :-)
          >
          > > that I will put between words in the output. The user specified string can
          > > have escape sequences. For example, the user wants to put a binary 1
          > > (\001) between each output word.
          > >
          > > import sys
          > > words = ['now', 'is', 'the', 'time']
          > > print '\001'.join(words) #this works
          > > print sys.argv[1].join(words) #this fails
          >
          > Change the above line to
          >
          > print sys.argv[1].decode("string_escape").join(words)
          >
          > s.decode("string_escape") returns a new string with all C-style escape
          > sequences converted into the corresponding characters. This is an abuse -
          > ahem, example of a general mechanism. Look up the codecs module if you want
          > to learn more about it.
          >
          > Peter

          Thank you. I appreciate your help. Yes, I missed it. I will look at the
          decode documentation. I had assumed decode was only for converting between
          character encodings (code pages). This does work under Python 2.3, and
          decode was available in 2.2.

          However, I am in a Python 2.1 environment. Do you know of any techniques
          that would work under Python 2.1?



          • Peter Otten

            #6
            Re: Interpreting \ escape sequences in strings

            Paul Watson wrote:
            > However, I am in a Python 2.1 environment. Do you know of any techniques
            > that would work under Python 2.1?

            eval('"' + s + '"')

            This of course requires that any " characters occurring in s are preceded
            by a backslash:

            >>> def unescape(s):
            ...     return eval('"' + s + '"')
            ...
            >>> unescape("\\x0a")
            '\n'
            >>> unescape("\\x0a'")
            "\n'"
            >>> unescape("\\x0a\"")
            Traceback (most recent call last):
              File "<stdin>", line 1, in ?
              File "<stdin>", line 2, in unescape
              File "<string>", line 1
                "\x0a""
                      ^
            SyntaxError: invalid token
            >>> unescape('\\x0a\\"')
            '\n"'

            Peter
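
The quoting caveat above goes away if the escapes are translated by hand instead of being handed to eval. What follows is only a sketch of that idea, written for Python 3 rather than 2.1. The names _SIMPLE, _ESCAPE and unescape are made up for illustration, and only a subset of Python's escapes is covered (single-character escapes, \xHH and octal). It needs nothing beyond the re module, so the approach may carry over to older interpreters, but that has not been verified here.

```python
import re

# Single-character escapes recognised by this sketch (a subset of Python's).
_SIMPLE = {
    "n": "\n", "t": "\t", "r": "\r", "a": "\a", "b": "\b",
    "f": "\f", "v": "\v", "\\": "\\", "'": "'", '"': '"',
}

# A backslash followed by xHH (two hex digits), one to three octal digits,
# or any single character.
_ESCAPE = re.compile(r"\\(x[0-9a-fA-F]{2}|[0-7]{1,3}|.)", re.DOTALL)


def unescape(s):
    """Return s with the supported backslash escapes interpreted."""
    def replace(match):
        body = match.group(1)
        if body[0] == "x" and len(body) == 3:
            return chr(int(body[1:], 16))   # \xHH
        if body[0] in "01234567":
            return chr(int(body, 8))        # \O, \OO or \OOO
        # Unknown escapes are left untouched, backslash included.
        return _SIMPLE.get(body, match.group(0))
    return _ESCAPE.sub(replace, s)


print(repr(unescape(r'\x0a"')))
```

Because the input is never evaluated as code, an unescaped " is simply passed through.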


            • Peter Otten

              #7
              Re: Interpreting \ escape sequences in strings

              Peter Otten wrote:
              > Paul Watson wrote:
              >
              >> However, I am in a Python 2.1 environment. Do you know of any techniques
              >> that would work under Python 2.1?
              >
              > eval('"' + s + '"')

              I should have warned you that this is a security hole, as it allows the user
              to execute arbitrary code. E.g.:

              <args.py>
              import sys

              def somefunc():
                  print "somefunc called"
                  return ""

              def unescape(s):
                  return eval('"' + s + '"')

              print unescape(sys.argv[1])
              </args.py>

              $ python args.py '"+somefunc()+"'
              somefunc called

              Peter
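
On current Python 3 versions, ast.literal_eval is one way to keep the string-literal trick while refusing anything that is not a plain literal. A minimal sketch, assuming Python 3; the wrapper keeps the name unescape from the example above, and the error messages are invented for illustration.

```python
import ast


def unescape(s):
    """Interpret escapes by parsing s as a string literal, without eval()."""
    try:
        # literal_eval accepts only literals, so function calls are rejected.
        value = ast.literal_eval('"' + s + '"')
    except (SyntaxError, ValueError) as exc:
        raise ValueError("not a valid string body: %r" % (s,)) from exc
    # Input such as '" , "' would otherwise yield a tuple of two strings.
    if not isinstance(value, str):
        raise ValueError("not a single string: %r" % (s,))
    return value


print(repr(unescape(r"\x0a")))
```

With this version the '"+somefunc()+"' argument raises ValueError instead of calling somefunc. An unescaped " in the input is still not treated as literal text, so the backslash requirement from the earlier post remains.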
