sed to python: replace Q

  • Raymond

    sed to python: replace Q

    For some reason I'm unable to grok Python's string.replace() function.
    Just trying to parse a simple IP address, wrapped in square brackets,
    from Postfix logs. In sed this is straightforward given:

    line = "date process text [ip] more text"

    sed -e 's/^.*\[//' -e 's/].*$//'

    yet the following Python code does nothing:

    line = line.replace('^.*\[', '', 1)
    line = line.replace('].*$', '')

    Is there a decent description of string.replace() somewhere?

    Raymond
  • Lutz Horn

    #2
    Re: sed to python: replace Q

    Hi,

    2008/4/30 Raymond <not-for-mail@sonic.net>:
    > For some reason I'm unable to grok Python's string.replace() function.

    replace() does not work with regular expressions.

    > Is there a decent description of string.replace() somewhere?

    Use re.sub().

    >>> import re
    >>> line = "date process text [ip] more text"
    >>> re.sub(r'].*$', '', re.sub(r'^.*\[', '', line, 1))
    'ip'

    Lutz

    • happyriding

      #3
      Re: sed to python: replace Q

      On Apr 29, 11:27 pm, Raymond <not-for-m...@sonic.net> wrote:
      > For some reason I'm unable to grok Python's string.replace() function.

      line = "abc"
      line = line.replace("a", "x")
      print(line)

      --output:--
      xbc

      line = "abc"
      line = line.replace("[apq]", "x")
      print(line)

      --output:--
      abc


      Does the 5 character substring "[apq]" exist anywhere in the original
      string?
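
To make the contrast concrete, here is a minimal sketch (the variable names are only for illustration) that runs the same "[apq]" text through str.replace() and through re.sub(). The first treats it as five literal characters, the second as a character class:

```python
import re

line = "abc"

# str.replace() searches for the literal text "[apq]", which is not in "abc",
# so the string comes back unchanged.
literal_result = line.replace("[apq]", "x")

# re.sub() interprets "[apq]" as a character class matching a, p, or q.
regex_result = re.sub("[apq]", "x", line)

print("[apq]" in line)  # False
print(literal_result)   # abc
print(regex_result)     # xbc
```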

      • Robert Bossy

        #4
        Re: sed to python: replace Q

        Raymond wrote:
        > For some reason I'm unable to grok Python's string.replace() function.
        > Just trying to parse a simple IP address, wrapped in square brackets,
        > from Postfix logs. In sed this is straightforward given:
        >
        > line = "date process text [ip] more text"
        >
        > sed -e 's/^.*\[//' -e 's/].*$//'

        alternatively:
        sed -e 's/.*\[\(.*\)].*/\1/'

        > yet the following Python code does nothing:
        >
        > line = line.replace('^.*\[', '', 1)
        > line = line.replace('].*$', '')
        >
        > Is there a decent description of string.replace() somewhere?

        In python shell:
        help(str.replace)

        Online:


        But what you are probably looking for is re.sub():
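
As a rough sketch of how that might look for the line from the question, the single-expression sed command with a capture group translates fairly directly. The variable name `ip` is just an assumption for illustration:

```python
import re

line = "date process text [ip] more text"

# One substitution with a capture group, mirroring
#   sed -e 's/.*\[\(.*\)].*/\1/'
# Python's re uses plain parentheses for groups and \1 in the replacement.
ip = re.sub(r'.*\[(.*)\].*', r'\1', line)

print(ip)  # ip
```

Note that if the line contains no brackets at all, the pattern does not match and re.sub() returns the line unchanged.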



        RB

        • Kam-Hung Soh

          #5
          Re: sed to python: replace Q

          On Wed, 30 Apr 2008 15:27:36 +1000, Raymond <not-for-mail@sonic.net> wrote:
          > For some reason I'm unable to grok Python's string.replace() function.
          > Just trying to parse a simple IP address, wrapped in square brackets,
          > from Postfix logs. In sed this is straightforward given:
          >
          > line = "date process text [ip] more text"
          >
          > sed -e 's/^.*\[//' -e 's/].*$//'
          >
          > yet the following Python code does nothing:
          >
          > line = line.replace('^.*\[', '', 1)
          > line = line.replace('].*$', '')

          str.replace() doesn't support regular expressions.

          Try:

          import re
          p = re.compile(r"^.*\[")
          q = re.compile(r"].*$")
          q.sub('', p.sub('', line))

          > Is there a decent description of string.replace() somewhere?
          >
          > Raymond

          Section 3.6.1 String Functions

          --
          Kam-Hung Soh, Software Salariman (http://kamhungsoh.com/blog)

          • Kam-Hung Soh

            #6
            Re: sed to python: replace Q

            On Wed, 30 Apr 2008 17:12:15 +1000, Kam-Hung Soh <kamhung.soh@gmail.com>
            wrote:
            > On Wed, 30 Apr 2008 15:27:36 +1000, Raymond <not-for-mail@sonic.net>
            > wrote:
            >
            >> For some reason I'm unable to grok Python's string.replace() function.
            >> Just trying to parse a simple IP address, wrapped in square brackets,
            >> from Postfix logs. In sed this is straightforward given:
            >>
            >> line = "date process text [ip] more text"
            >>
            >> sed -e 's/^.*\[//' -e 's/].*$//'
            >>
            >> yet the following Python code does nothing:
            >>
            >> line = line.replace('^.*\[', '', 1)
            >> line = line.replace('].*$', '')
            >
            > str.replace() doesn't support regular expressions.
            >
            > Try:
            >
            > import re
            > p = re.compile(r"^.*\[")
            > q = re.compile(r"].*$")
            > q.sub('', p.sub('', line))

            Another approach is to use the split() function in "re" module.

            import re
            re.split(r"[\[\]]", line)[1]

            See http://docs.python.org/lib/node46.html
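
To see why index 1 is the right piece, here is a small sketch that prints the whole list re.split() returns for the sample line. The re.search() variant at the end is an alternative not mentioned above, included as an assumption about what some readers may prefer, since it yields None instead of raising IndexError when the line has no brackets:

```python
import re

line = "date process text [ip] more text"

# Splitting on either bracket gives: text before "[", text between the
# brackets, and text after "]".
parts = re.split(r"[\[\]]", line)
print(parts)     # ['date process text ', 'ip', ' more text']
print(parts[1])  # ip

# Alternative sketch: search for the bracketed part directly.
match = re.search(r"\[([^\]]*)\]", line)
ip = match.group(1) if match else None
print(ip)        # ip
```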

            --
            Kam-Hung Soh, Software Salariman (http://kamhungsoh.com/blog)

            • Raymond

              #7
              Re: sed to python: replace Q

              > Another approach is to use the split() function in "re" module.

              Ah ha, thar's the disconnect. Thanks for all the pointers, my def is
              now working. Still don't understand the logic behind this design though.
              I mean why would any programming language have separate search or find
              functions, one for regex and another for non-regex based pattern
              matching?

              Aren't sed, awk, grep, and perl the reference implementations of search
              and replace? They don't have non-regex functions, why does Python?
              Wouldn't it be a lot simpler to use a flag, like grep's '-f', to change
              the meaning of a search string to be literal?

              My other gripe is with the kludgy object-oriented regex functions.
              Couldn't these be better implemented in-line? Why should I, as a coder,
              have to 're.compile()' when all the reference languages do this at compile
              time, from a much more straightforward and easy to read in-line function...

              Raymond

              • Marco Mariani

                #8
                Re: sed to python: replace Q

                Raymond wrote:
                > Aren't sed, awk, grep, and perl the reference implementations of search
                > and replace?

                I don't know about "reference implementations", but I daresay they are a
                mess w.r.t. usability.

                • Mel

                  #9
                  Re: sed to python: replace Q

                  Raymond wrote:
                  > My other gripe is with the kludgy object-oriented regex functions.
                  > Couldn't these be better implemented in-line? Why should I, as a coder,
                  > have to 're.compile()' when all the reference languages do this at compile
                  > time, from a much more straightforward and easy to read in-line
                  > function...

                  Because compile time doesn't do

                  pattern = raw_input ("Pattern, please: ")
                  saved_pattern = re.compile (pattern)
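
A minimal sketch of the same idea without the interactive prompt: the pattern arrives as an ordinary string at run time, so it can only be compiled (and can only fail to compile) at run time. The helper name `compile_user_pattern` is hypothetical, not part of the re module:

```python
import re

def compile_user_pattern(text):
    """Compile a pattern known only at run time; return None if it is invalid."""
    try:
        return re.compile(text)
    except re.error:
        # Invalid regex syntax is reported here, not at "compile time"
        # of the program.
        return None

saved_pattern = compile_user_pattern(r"\[(.*?)\]")
bad_pattern = compile_user_pattern("[unclosed")

print(saved_pattern.search("text [ip] more").group(1))  # ip
print(bad_pattern)                                      # None
```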

                  Mel.

                  • Diez B. Roggisch

                    #10
                    Re: sed to python: replace Q

                    > Ah ha, thar's the disconnect. Thanks for all the pointers, my def is
                    > now working. Still don't understand the logic behind this design though.
                    > I mean why would any programming language have separate search or find
                    > functions, one for regex and another for non-regex based pattern
                    > matching?
                    >
                    > Aren't sed, awk, grep, and perl the reference implementations of search
                    > and replace? They don't have non-regex functions, why does Python?
                    > Wouldn't it be a lot simpler to use a flag, like grep's '-F', to change
                    > the meaning of a search string to be literal?

                    And thereby possibly break other modules' code that relies on its
                    strings being exactly that, and not patterns.

                    > My other gripe is with the kludgy object-oriented regex functions.
                    > Couldn't these be better implemented in-line? Why should I, as a coder,
                    > have to 're.compile()' when all the reference languages do this at compile
                    > time, from a much more straightforward and easy to read in-line
                    > function...

                    You can do that already; there is no need to compile explicitly, because
                    the patterns are cached, although the cache might be limited in size.
                    Code like

                    m = re.match(pattern, s)

                    is not considerably slower than

                    rex = re.compile(pattern)
                    m = rex.match(s)
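
A small sketch showing that the two spellings give the same result. The pattern and input string here are made-up examples, and no timings are claimed; the only point is that the module-level function can be used in-line without an explicit compile step:

```python
import re

pattern = r"\[(\d+)\]"  # hypothetical example pattern
s = "[42] rest of line"

# In-line form: the re module compiles the pattern behind the scenes
# and keeps recently used patterns in an internal cache.
m1 = re.match(pattern, s)

# Explicit form: compile once, then call methods on the pattern object.
rex = re.compile(pattern)
m2 = rex.match(s)

print(m1.group(1))  # 42
print(m2.group(1))  # 42
```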

                    Diez

                    • Dan Stromberg

                      #11
                      Re: sed to python: replace Q

                      On Tue, 06 May 2008 14:55:07 +0000, Raymond wrote:
                      >> Another approach is to use the split() function in "re" module.
                      >
                      > Ah ha, thar's the disconnect. Thanks for all the pointers, my def is
                      > now working. Still don't understand the logic behind this design
                      > though. I mean why would any programming language have separate search
                      > or find functions, one for regex and another for non-regex based
                      > pattern matching?
                      >
                      > Aren't sed, awk, grep, and perl the reference implementations of search
                      > and replace? They don't have non-regex functions, why does Python?
                      > Wouldn't it be a lot simpler to use a flag, like grep's '-F', to change
                      > the meaning of a search string to be literal?
                      >
                      > My other gripe is with the kludgy object-oriented regex functions.
                      > Couldn't these be better implemented in-line? Why should I, as a coder,
                      > have to 're.compile()' when all the reference languages do this at
                      > compile time, from a much more straightforward and easy to read in-line
                      > function...
                      >
                      > Raymond

                      Hm. Are regexes first-class citizens in these languages, like they are
                      in python?

                      And from a language design perspective, isn't it much cleaner to put
                      regexes into just another portion of the runtime rather than dumping
                      them into the language definition proper?

                      It does actually make sense - to have a string method do a string thing,
                      and to have a regex method do a regex thing. And while command line
                      options are pretty nice when done well, there's nothing in particular
                      stopping one from using arguments with defaults in python.
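
For instance, a flag-style interface could be sketched with a default argument. The function name `replace` and its `literal` parameter are hypothetical, made up for illustration, and not anything in the standard library:

```python
import re

def replace(text, pattern, repl, literal=False):
    """Replace pattern in text; treat pattern as a regex unless literal=True."""
    if literal:
        # Plain string method: no character in pattern or repl is special.
        return text.replace(pattern, repl)
    # Regex method: pattern is a regular expression, and repl may contain
    # backreferences such as \1.
    return re.sub(pattern, repl, text)

print(replace("a.b", ".", "x"))                # xxx  ("." matches any character)
print(replace("a.b", ".", "x", literal=True))  # axb  (only the literal dot)
```

The wrapper is only a few lines precisely because each underlying method does one thing.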

                      I'm good with sed and grep, though I never got into awk much - perhaps a
                      small mistake. When it came to perl, I skipped it and went directly to
                      python, and have never regretted the decision. Python's got a much more
                      coherent design than perl, most certainly, and more than sed as well.
                      awk's not that bad though. And grep's nice and focused - I quite like
                      grep's design.

