manipulating string question

  • python101
    New Member
    • Sep 2007
    • 90

    manipulating string question

    [code=python]
    #assume that I have a string

    stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'

    # I would like to make it
    stringA = '1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS'
    #wherever there is 'AS' it should become 'i$AS', where i is its occurrence in the string
    #how can I do that?
    [/code]
  • bartonc
    Recognized Expert Expert
    • Sep 2006
    • 6478

    #2
    [CODE=python]
    >>> stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
    >>> i = 1
    >>> resList = []
    >>> for anStr in stringA.split():
    ...     thisItem = ""
    ...     mark = 0  # where we left off in this word
    ...     n = anStr.count("AS")
    ...     for j in range(n):
    ...         k = anStr.find("AS", mark)  # search from mark, not from the start
    ...         thisItem += anStr[mark:k] + "%d$AS" % i
    ...         i += 1
    ...         mark = k + 2
    ...     thisItem += anStr[mark:]  # rest of the word (all of it if n == 0)
    ...     resList.append(thisItem)
    ...
    >>> resList
    ['1$ASD', 'DSA', 'D2$ASFSADSA', 'FSAFSADSAF', '3$AS']
    >>> " ".join(resList)
    '1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS'
    [/CODE]


    • bartonc
      Recognized Expert Expert
      • Sep 2006
      • 6478

      #3
      Always more ways than one to skin a cat:[CODE=python]
      >>> import re
      >>> stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
      >>> marks = re.finditer("AS", stringA)
      >>> res = ""
      >>> mark = 0
      >>> i = 1
      >>> for m in marks:
      ...     res += stringA[mark:m.start()] + "%d$AS" % i
      ...     mark = m.end()
      ...     i += 1
      ...
      >>> res += stringA[mark:]  # keep any text after the last match
      >>> res
      '1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS'
      [/CODE]
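The finditer loop above can be wrapped in a small function so that it works for any target substring. This is only a sketch in Python 3: the name number_occurrences and its parameters are made up for illustration, and re.escape is used on the assumption that the target should be matched literally rather than as a pattern.

```python
import re

def number_occurrences(text, target):
    """Prefix each occurrence of target in text with its 1-based count and '$'."""
    pieces = []
    mark = 0  # end of the previous match
    for i, m in enumerate(re.finditer(re.escape(target), text), 1):
        pieces.append(text[mark:m.start()])   # text between matches
        pieces.append("%d$%s" % (i, target))  # numbered replacement
        mark = m.end()
    pieces.append(text[mark:])                # text after the last match
    return "".join(pieces)

print(number_occurrences('ASD DSA DASFSADSA FSAFSADSAF AS', 'AS'))
```

Collecting the pieces in a list and joining once avoids rebuilding the result string on every match.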


      • rhitam30111985
        New Member
        • Aug 2007
        • 112

        #4
        here is another way:
        [CODE=python]
        stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
        a=list(stringA)
        b=[]

        count=1
        for i in range(len(a)):
            if a[i]=='A' and a[i+1]=='S':
                b.append(str(count))
                b.append('$')
                b.append(a[i])
                count +=1
            else:
                b.append(a[i])
        stringA=''.join(b)
        print(stringA)
        [/CODE]


        • bvdet
          Recognized Expert Specialist
          • Oct 2006
          • 2851

          #5
          Nice solutions, guys. For the exercise, here's another:[code=Python]def indexList(s, item, i=0):
              """Return the start index of every occurrence of item in s."""
              i_list = []
              while True:
                  try:
                      i = s.index(item, i)
                      i_list.append(i)
                      i += 1
                  except ValueError:  # no more occurrences
                      break
              return i_list

          def str_replace_multi(s, tar, sub):
              """
              The target substring is to be replaced by an index
              number representing its occurrence + '$' + the target:
              'AS' is replaced by 'i$AS'.
              'sub' must be a string that can be evaluated.
              """
              s1 = s
              indices = indexList(s, tar)
              for i, item in enumerate(indices):
                  # shift the index by however much s1 has grown so far
                  item += len(s1)-len(s)
                  s1 = ''.join([s1[:max(0,item)], eval(sub), s1[item+len(tar):]])
              return s1[/code]

          >>> stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
          >>> str_replace_multi(stringA, 'AS', "'%d$AS' % (i+1)")
          '1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS'
          >>>

          I keep finding uses for function indexList() :)
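Because str_replace_multi() runs 'sub' through eval(), it should only ever be given trusted strings. One possible variation, sketched below in Python 3, takes a function instead. The names index_list, replace_numbered and make_sub are hypothetical and not part of the post above. This version uses str.find rather than index() with try/except, and it steps past each match, so overlapping occurrences are not counted twice.

```python
def index_list(s, item):
    """Return the start index of each non-overlapping occurrence of item in s."""
    if not item:
        raise ValueError("item must not be empty")
    found = []
    i = s.find(item)
    while i != -1:
        found.append(i)
        i = s.find(item, i + len(item))  # continue after this match
    return found

def replace_numbered(s, tar, make_sub):
    """Replace the n-th occurrence of tar (n counted from 1) with make_sub(n)."""
    out = []
    mark = 0  # end of the previous occurrence
    for n, pos in enumerate(index_list(s, tar), 1):
        out.append(s[mark:pos])
        out.append(make_sub(n))
        mark = pos + len(tar)
    out.append(s[mark:])  # text after the last occurrence
    return "".join(out)

stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
print(replace_numbered(stringA, 'AS', lambda n: '%d$AS' % n))
```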


          • rogerlew
            New Member
            • Jun 2007
            • 15

            #6
            And another... with re.
            It had to be done!

            Code:
            import re

            stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'

            def repl(match):
                globals()['count'] += 1
                # number first, then '$', then the matched text
                return '%i$%s' % (globals()['count'], match.group(1))

            globals()['count'] = 0
            print(re.sub('(AS)', repl, stringA))

            >>> 1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS


            • python101
              New Member
              • Sep 2007
              • 90

              #7
              Thanks, everyone, especially for rhitam30111985's code.


              • bartonc
                Recognized Expert Expert
                • Sep 2006
                • 6478

                #8
                Originally posted by python101
                Thanks, everyone, especially for rhitam30111985's code.
                Which will throw an error if the very last character is an "A".
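The failure is easy to reproduce. Whenever a[i] is 'A', Python goes on to evaluate a[i+1], and on the last pass through the loop that reads one position past the end of the list. A minimal Python 3 sketch, with the loop from post #4 wrapped in a hypothetical function number_as so it can be called on different inputs:

```python
def number_as(text):
    """The loop from post #4, unchanged apart from being wrapped in a function."""
    a = list(text)
    b = []
    count = 1
    for i in range(len(a)):
        # a[i+1] does not exist when i is the last index
        if a[i] == 'A' and a[i+1] == 'S':
            b.append(str(count))
            b.append('$')
            b.append(a[i])
            count += 1
        else:
            b.append(a[i])
    return ''.join(b)

print(number_as('ASD'))  # fine: the string does not end in 'A'

try:
    number_as('DSA')     # ends in 'A', so a[i+1] is out of range
except IndexError as err:
    print('IndexError:', err)
```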


                • rhitam30111985
                  New Member
                  • Aug 2007
                  • 112

                  #9
                  Hey Barton, you are right... I didn't notice that. OK, how about this?

                  [CODE=python]
                  stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
                  a=list(stringA)
                  b=[]

                  count=1
                  if a[-1]=='A':
                      a.append(' ')# :-P
                  for i in range(len(a)):
                      if a[i]=='A' and a[i+1]=='S':
                          b.append(str(count))
                          b.append('$')
                          b.append(a[i])
                          count +=1
                      else:
                          b.append(a[i])
                  stringA=''.join(b)#while printing, the extra space won't be noticed
                  print(stringA)
                  [/CODE]


                  • bartonc
                    Recognized Expert Expert
                    • Sep 2006
                    • 6478

                    #10
                    Originally posted by rhitam30111985
                    Hey Barton, you are right... I didn't notice that. OK, how about this?
                    Yep. That would do it. But wouldn't the result contain the extra character?

                    I still like my Regular Expression method best.
                    Last edited by bartonc; Oct 12 '07, 05:50 AM.


                    • rhitam30111985
                      New Member
                      • Aug 2007
                      • 112

                      #11
                      Yep, it will contain the extra character, which is just a space, so it's not really a problem. We can put a check at the end for the last character: if it is a space, we can use list.pop() to get rid of it (list.remove(' ') would delete the first space in the list, not the last one). As for regular expressions, I personally stay away from them... they are just beyond my comprehension. :-(
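One way to act on that idea is sketched below in Python 3. Here the padding is added unconditionally and always removed again before joining, so the result never carries an extra space and a genuine trailing space in the input is left alone. The function name number_as_padded is made up for this example. An alternative that needs no padding at all is to compare the slice text[i:i+2] with 'AS', since slicing past the end of a string does not raise.

```python
def number_as_padded(text):
    """Number each 'AS' in text, using a temporary sentinel at the end."""
    a = list(text)
    a.append(' ')  # sentinel, so a[i+1] always exists when a[i] is 'A'
    b = []
    count = 1
    for i in range(len(a)):
        if a[i] == 'A' and a[i+1] == 'S':
            b.extend([str(count), '$', a[i]])
            count += 1
        else:
            b.append(a[i])
    b.pop()        # drop the sentinel again
    return ''.join(b)

print(number_as_padded('ASD DSA DASFSADSA FSAFSADSAF AS'))
print(repr(number_as_padded('ASD DSA')))  # ends in 'A', no error and no extra space
```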


                      • bartonc
                        Recognized Expert Expert
                        • Sep 2006
                        • 6478

                        #12
                        Originally posted by rhitam30111985
                        Yep, it will contain the extra character, which is just a space, so it's not really a problem... As for regular expressions, I personally stay away from them. They are just beyond my comprehension.
                        This (slightly simplified) one seems pretty basic:[CODE=python]
                        import re

                        stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
                        matches = re.finditer("AS", stringA)
                        res = ""
                        mark = 0 # marks where we left off in the string
                        for i, match in enumerate(matches):
                            # enumerate() counts from 0, so add 1; or ] + str(i + 1) + "$AS"
                            res += stringA[mark:match.start()] + "%d$AS" % (i + 1)
                            mark = match.end()
                        res += stringA[mark:] # keep any text after the last match

                        print(res)
                        # prints: 1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS
                        [/CODE]
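For readers who would rather avoid re altogether, the same idea of stitching together the pieces between matches can be sketched with str.split, which returns exactly the text before, between and after the occurrences of the separator. This is a Python 3 sketch, the function name number_with_split is hypothetical, and it assumes a non-empty target (str.split raises ValueError for an empty separator).

```python
def number_with_split(text, target):
    """Prefix each occurrence of target with its 1-based count and '$'."""
    parts = text.split(target)  # text before, between and after the matches
    out = [parts[0]]
    for i, part in enumerate(parts[1:], 1):
        out.append("%d$%s" % (i, target))  # put the target back, numbered
        out.append(part)
    return "".join(out)

print(number_with_split('ASD DSA DASFSADSA FSAFSADSAF AS', 'AS'))
```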


                        • bartonc
                          Recognized Expert Expert
                          • Sep 2006
                          • 6478

                          #13
                          Originally posted by rogerlew
                          And another... with re.
                          It had to be done!

                          Code:
                          import re

                          stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'

                          def repl(match):
                              globals()['count'] += 1
                              # number first, then '$', then the matched text
                              return '%i$%s' % (globals()['count'], match.group(1))

                          globals()['count'] = 0
                          print(re.sub('(AS)', repl, stringA))

                          >>> 1$ASD DSA D2$ASFSADSA FSAFSADSAF 3$AS
                          You could let Python find the variable outside the function's scope with a global statement, instead of going through the dictionary that globals() returns (just thinking out loud, here)... I think I really like your way. Just for completeness, I'll post it to see how it looks:
                          Code:
                          import re
                          def repl(match):
                              global count  # count must be set to 0 at module level first
                              count += 1
                              return '%i$%s' % (count, match.group(1))
                          Yep. I like your way better.
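A third option, not raised in the thread, keeps the counter out of module scope entirely: build the callback inside a function and let it draw its numbers from itertools.count. This is a Python 3 sketch and the helper name make_numberer is made up for it.

```python
import itertools
import re

def make_numberer(start=1):
    """Return a re.sub callback that prefixes each match with a running number."""
    counter = itertools.count(start)  # private to this callback
    def repl(match):
        return '%d$%s' % (next(counter), match.group(0))
    return repl

stringA = 'ASD DSA DASFSADSA FSAFSADSAF AS'
print(re.sub('AS', make_numberer(), stringA))
```

Each call to make_numberer() starts a fresh count, so the callback can be reused for several strings without resetting a global by hand.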
