Mutable strings

  • Gordon Airport

    Mutable strings

    Has anyone suggested introducing a mutable string type (yes, of course)
    and distinguishing them from standard strings by the quote type - single
    or double? As far as I know ' and " are currently interchangeable in all
    circumstances (as long as they're paired) so there's no overloading to
    muddy the language. Of course there could be some interesting problems
    with current code that doesn't make a distinction, but it would be dead
    easy to fix with a search-and-replace. And which would be the default
    return type for functions returning strings...
    It looks like there are ways of handling this by digging around in the
    modules for more basic types, but it would be much nicer to have it
    available at 'user level'.

  • Peter Hansen

    #2
    Re: Mutable strings

    Gordon Airport wrote:
    >
    > Has anyone suggested introducing a mutable string type (yes, of course)
    > and distinguishing them from standard strings by the quote type - single
    > or double? As far as I know ' and " are currently interchangeable in all
    > circumstances (as long as they're paired) so there's no overloading to
    > muddy the language. Of course there could be some interesting problems
    > with current code that doesn't make a distinction,
    > but it would be dead easy to fix with a search-and-replace.
    ^^^^^^^^^^^^^^^ ^^^^^^^^^^
    No, it definitely would not. You would also have to account for
    embedded quotation marks that are not escaped already, and I'm
    certain there are other complications.

    It might be worth your writing a PEP, however, if only so that the
    idea could be killed and buried for good. ;-)

    -Peter


    • Andy Jewell

      #3
      Re: Mutable strings

      On Saturday 20 Sep 2003 6:40 pm, Gordon Airport wrote:
      > Has anyone suggested introducing a mutable string type (yes, of course)
      > and distinguishing them from standard strings by the quote type - single
      > or double? As far as I know ' and " are currently interchangeable in all
      > circumstances (as long as they're paired) so there's no overloading to
      > muddy the language. Of course there could be some interesting problems
      > with current code that doesn't make a distinction, but it would be dead
      > easy to fix with a search-and-replace. And which would be the default
      > return type for functions returning strings...
      > It looks like there are ways of handling this by digging around in the
      > modules for more basic types, but it would be much nicer to have it
      > available at 'user level'.


      Mutable strings are one thing that I missed, initially, when I first started
      using Python. After a while, as the "Pythonic" way of doing things sank in,
      I realised that Python doesn't *need* mutable strings.

      Python strings (and integers and floats) are all immutable for a very good
      reason: dictionaries can't reliably use mutable objects as keys. At first,
      this seemed rather like "the tail wagging the dog"... however, once I fully
      understood the % (percent) string operator, and the ability to efficiently
      convert strings into lists and back, my anxiety went away. These cover most
      usage of strings that might convince you you need mutability.
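
A minimal sketch of the points in the paragraph above, written for present-day Python 3 rather than the 2003 interpreter the thread was using. The variable names are illustrative only. It shows that a mutable list is rejected as a dictionary key, and the two idioms mentioned: the % operator, and the list-and-back conversion.

```python
# Immutable values are hashable, so they can be dictionary keys.
d = {"spam": 1, (1, 2): "tuple key"}

# A list is mutable and therefore unhashable: using it as a key fails.
try:
    d[["spam"]] = 2
    list_key_ok = True
except TypeError:
    list_key_ok = False

# Building a new string from pieces with the % operator.
name, count = "sword", 9
label = "%d-bladed %s" % (count, name)

# Converting to a list, editing in place, and converting back.
chars = list("nine bladed sword")
chars[0] = "N"
capitalised = "".join(chars)
```

The list round trip copies the data twice, which is the cost later replies in this thread object to for very large strings.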

      As for the suggestion that the kind of quote used should determine whether or
      not a string is mutable, I sort of /half/ agree. On one hand, making (say)
      the apostrophe mean mutable and the double quote mean immutable would break
      thousands of existing applications - for end users, "a simple search and
      replace" is simply not feasible! Furthermore, the meaning of the following
      snippet would be subtly (and possibly dangerously) changed:

      ----8<-----
      s1="this is an 'immutable' string"
      s2='this is a "mutable" string'

      s3=s1.replace("'",'"')+" and "+s2.replace('"',"'") # replace quotes with
      # apostrophes and vice-versa

      d1={s3:(s1,s2)}
      ----8<-----

      Q1) What type will s3 be?
      Q2) What happens to s2? As it's mutable, shouldn't it do the replacement
      "in-line"?
      Q3) Will the assignment of d1 succeed? If it fails, wouldn't that be
      confusing?

      On the other hand, Python already has this type distinction for raw and
      unicode strings (r"..." and u"...", respectively). If it were to be adopted,
      I would be ok with an m"..." type of string, which could be barred from being
      a dictionary key. This would open up a can of worms wrt the other immutable
      types, too: would we end up with:

      ----8<-----
      a=1234567m # a mutable integer
      b=1234567.89m # a mutable float
      c=123456789012345678901234567890Lm # a mutable long integer
      d=123+456jm # a mutable complex number
      e=m(1,2,3,4,5,6,"a","b","c") # a mutable tuple !!! :-p
      ----8<-----

      This could start a flame-war/heated debate on the scale of the ternary
      operator PEP!

      Maybe there's a project for you (and a good introduction to a practical
      application for new-style Python classes to boot)!

      I'm sure people have written this type of thing in the past - and in some
      situations, it's bound to be useful, but I think it should be kept as a
      separate module, so that you have to *declare* your usage of this /strange/
      behaviour to the reader; "explicit is better than implicit".

      Remember, Mohammed had to go *to* the mountain, not the other way round!

      hth,
      -andyj





      • Gordon Airport

        #4
        Re: Mutable strings

        Peter Hansen wrote:

        snip good points that I suspect could be handled with a fairly simple
        regex

        > It might be worth your writing a PEP, however, if only so that the
        > idea could be killed and buried for good. ;-)
        >

        Yeah, I didn't see one already but I kind of expected this response.
        Still, it didn't put a stake in the heart of the ternary operator
        issue.



        • Gordon Airport

          #5
          Re: Mutable strings - symmetry with list types

          Andy Jewell wrote:

          >
          > Mutable strings are one thing that I missed, initially, when I first started
          > using Python. After a while, as the "Pythonic" way of doing things sank in,
          > I realised that Python doesn't *need* mutable strings.
          >

          Well...it doesn't /need/ the simple expressions that were given to a lot
          of things.

          > Python strings (and integers and floats) are all immutable for a very good
          > reason: dictionaries can't reliably use mutable objects as keys.

          And I'm not suggesting doing away with immutable strings.

          > At first,
          > this seemed rather like "the tail wagging the dog"... however, once I fully
          > understood the % (percent) string operator, and the ability to efficiently
          > convert strings into lists and back, my anxiety went away. These cover most
          > usage of strings that might convince you you need mutability.
          >

          Yeah, you /can/ do everything, it's a question of clarity. You see how
          often ' '.join( blah ) is the answer to people's questions here, it's
          not obvious and it looks like a hack, IMO. Plus you can't do
          somestring = '%s %s %s' % [ 'nine', 'bladed', 'sword' ]
          The extra steps in list(somestring ) ... ''.join( somestring ) are what
          could be removed I guess.

          > As for the suggestion that the kind of quote used should determine whether or
          > not a string is mutable, I sort of /half/ agree. On one hand, making (say)
          > the apostrophe mean mutable and the double quote mean immutable would break
          > thousands of existing applications - for end users, "a simple search and
          > replace" is simply not feasible!

          I'm less sure about that now, but the important point is that you would
          know that all old string delimiters would be changed to the immutable
          one. I'll try to come up with a regex.

          > Furthermore, the meaning of the following
          > snippet would be subtly (and possibly dangerously) changed:
          >
          > ----8<-----
          > s1="this is an 'immutable' string"
          > s2='this is a "mutable" string'
          >
          > s3=s1.replace("'",'"')+" and "+s2.replace('"',"'") # replace quotes with
          > # apostrophes and vice-versa
          >
          >
          > d1={s3:(s1,s2)}
          > ----8<-----
          >
          > Q1) What type will s3 be?
          > Q2) What happens to s2? As it's mutable, shouldn't it do the replacement
          > "in-line"?
          > Q3) Will the assignment of d1 succeed? If it fails, wouldn't that be
          > confusing?
          >

          I think these problems can be avoided if you just escape both symbols
          within both types of string. This complicates the code conversion, of
          course.

          > On the other hand, Python already has this type distinction for raw and
          > unicode strings (r"..." and u"...", respectively). If it were to be adopted,
          > I would be ok with an m"..." type of string, which could be barred from being
          > a dictionary key. This would open up a can of worms wrt the other immutable
          > types, too: would we end up with:
          >
          > ----8<-----
          > a=1234567m # a mutable integer
          > b=1234567.89m # a mutable float
          > c=123456789012345678901234567890Lm # a mutable long integer
          > d=123+456jm # a mutable complex number
          > e=m(1,2,3,4,5,6,"a","b","c") # a mutable tuple !!! :-p
          > ----8<-----

          I don't understand what a mutable numeric type would be. I just want a
          string type that I can directly treat as an array of characters; numeric
          types aren't indexable.

          >
          > This could start a flame-war/heated debate on the scale of the ternary
          > operator PEP!

          Viva la ?:! ;-)

          >
          > Maybe there's a project for you (and a good introduction to a practical
          > application for new-style Python classes to boot)!
          >
          > I'm sure people have written this type of thing in the past - and in some
          > situations, it's bound to be useful, but I think it should be kept as a
          > separate module, so that you have to *declare* your usage of this /strange/
          > behaviour to the reader; "explicit is better than implicit".

          Think of it as a symmetry with the mutable and immutable list types we
          already have. It is kind of strange, but we learn their applications and
          deal with it. What's the balance of what shows up in code? I suspect
          that in gross terms there's more (mutable) list use than (immutable)
          tuple; mutable strings would have their place likewise.


          • Dennis Lee Bieber

            #6
            Re: Mutable strings - symmetry with list types

            Gordon Airport fed this fish to the penguins on Sunday 21 September
            2003 02:10 pm:


            >
            > Yeah, you /can/ do everything, it's a question of clarity. You see how
            > often ' '.join( blah ) is the answer to people's questions here, it's

            Prior to the creation of string methods, you'd have done

            import string

            ... string.join(blah, ' ')


            > not obvious and it looks like a hack, IMO. Plus you can't do
            > somestring = '%s %s %s' % [ 'nine', 'bladed', 'sword' ]

            If you know both sides have equal numbers of terms (the %s matches the
            number of entries in the list) you /can/ do a minor modification to
            that line:

            somestring = "%s %s %s" % tuple(["nine", "bladed", "sword"])
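
A short sketch of why the list form fails and the tuple form works, assuming present-day Python 3 (the behaviour of the % operator here is the same as in the thread's era). The names `words`, `list_worked` and `joined` are illustrative.

```python
words = ["nine", "bladed", "sword"]

# A list on the right-hand side counts as ONE value, so three %s
# placeholders run out of arguments and a TypeError is raised.
try:
    "%s %s %s" % words
    list_worked = True
except TypeError:
    list_worked = False

# Converting to a tuple supplies one value per placeholder.
somestring = "%s %s %s" % tuple(words)

# When the pieces are only separated by a fixed string, join gives the same result.
joined = " ".join(words)
```

The tuple conversion only helps when the number of placeholders matches the number of items, as noted above; join has no such constraint.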

            Of course, you could also create a dictionary and store those as
            attributes (though to my mind, you have a sword with one modifier
            "nine-bladed"; as is it could be interpreted to mean nine
            bladed-sword(s) -- though all swords are bladed...).

            >>> weapon = {"type":"Sword", "attribute":"bladed", "modifier":"nine"}
            >>> weapon
            {'attribute': 'bladed', 'modifier': 'nine', 'type': 'Sword'}
            >>> somestring = "%(modifier)s %(attribute)s %(type)s" % weapon
            >>> somestring
            'nine bladed Sword'



            --
            > ============================================================== <
            > wlfraed@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG <
            > wulfraed@dm.net | Bestiaria Support Staff <
            > ============================================================== <
            > Bestiaria Home Page: http://www.beastie.dm.net/ <
            > Home Page: http://www.dm.net/~wulfraed/ <


            • Peter Hansen

              #7
              Re: Mutable strings

              Gordon Airport wrote:
              >
              > Peter Hansen wrote:
              >
              > snip good points that I suspect could be handled with a fairly simple
              > regex

              I'd argue the point, but I guess until you try it, we'll never know. <wink>

              > > It might be worth your writing a PEP, however, if only so that the
              > > idea could be killed and buried for good. ;-)
              >
              > Yeah, I didn't see one already but I kind of expected this response.
              > Still, it didn't put a stake in the heart of the ternary operator
              > issue.

              Apparently it served its purpose quite well. The main problem before
              the PEP and vote was that there was no PEP to point to when somebody
              asked about it, so you could say "asked and answered... will not happen".

              Now there is, and the few times the issue has come up since, someone
              has fairly quickly pointed to the PEP each time, avoiding lengthier
              discussion.

              -Peter


              • Hans-Joachim Widmaier

                #8
                Re: Mutable strings

                Andy Jewell <andy@wild-flower.co.uk> wrote in message news:<mailman.1064086468.9586.python-list@python.org>...

                > Mutable strings are one thing that I missed, initially, when I first started
                > using Python. After a while, as the "Pythonic" way of doing things sank in,
                > I realised that Python doesn't *need* mutable strings.

                Mutable strings come to *my* mind whenever I have to play with huge
                binary data. Working with tens of megabytes is inherently somewhat
                slow.

                > Python strings (and integers and floats) are all immutable for a very good
                > reason: dictionaries can't reliably use mutable objects as keys.

                All understood. But then, I don't want to use my 32-MB binary blob as
                a key.

                > however, once I fully
                > understood the % (percent) string operator, and the ability to efficiently
                > convert strings into lists and back, my anxiety went away. These cover
                > most usage of strings that might convince you you need mutability.

                Converting said blob to a list is something that I
                certainly would not call 'efficient' - if not for the conversion
                itself, then for the memory consumption as a list.

                I don't think strings are immutable because they ought to be that way
                (e.g. some CS guru teaches that "mutable strings are the root of all
                evil"). They're immutable because that allows them to be used as
                dictionary keys. And it was found that this doesn't affect the
                usefulness of the language too much.

                Still, I can see a use for mutable strings. Or better, mutable binary
                data, made up of bytes. (where 'byte' is the smallest individually
                addressable memory unit blabla, ... you get the meaning. Just to not
                invite nit-pickers on that term.)

                > "explicit is better than implicit".

                Yes, definitely: Let there be another type.

                Byte-twiddlingly yours,
                Hans-J.


                • Rob Tillotson

                  #9
                  Re: Mutable strings

                  hjwidmaier@web.de (Hans-Joachim Widmaier) writes:
                  > Still, I can see a use for mutable strings. Or better, mutable binary
                  > data, made up of bytes. (where 'byte' is the smallest individually
                  > addressable memory unit blabla, ... you get the meaning. Just to not
                  > invite nit-pickers on that term.)
                  >
                  >> "explicit is better than implicit".
                  >
                  > Yes, definitely: Let there be another type.

                  There already is one: array. Mutable blocks of bytes (or shorts,
                  longs, floats, etc.), usable in many places where you might otherwise
                  use a string (struct.unpack, writing to a file, etc.). It is not
                  quite a mutable string, but it does fit the bill for manipulating raw
                  bytes. For example, off the top of my head:

                  >>> import array
                  >>> a = array.array('B', 'abcdefg')
                  >>> a
                  array('B', [97, 98, 99, 100, 101, 102, 103])
                  >>> a[2:4] = array.array('B', '12345')
                  >>> a
                  array('B', [97, 98, 49, 50, 51, 52, 53, 101, 102, 103])
                  >>> a.tostring()
                  'ab12345efg'
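
The session above is Python 2 of the time. As a hedged sketch, the same steps in present-day Python 3 look like this; the main differences are that the initialiser must be a bytes object and that `tostring()` is spelled `tobytes()`. The name `result` is illustrative.

```python
import array

# In Python 3 an array of unsigned bytes is initialised from bytes, not str.
a = array.array("B", b"abcdefg")

# Slice assignment may change the length: two items are replaced by five.
a[2:4] = array.array("B", b"12345")

# tobytes() is the Python 3 counterpart of the tostring() call above.
result = a.tobytes()
```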

                  For times when you really need a mutable string, there is always
                  UserString.MutableString (not quite sure what version this first
                  appeared in) -- it isn't terribly efficient since it uses a regular
                  string internally to hold the data, but it gets the job done and if
                  you really need something faster it would be a fairly simple exercise
                  to rewrite it using an array instead.

                  --Rob

                  --
                  Rob Tillotson N9MTB <rob@pyrite.org>


                  • Alex Martelli

                    #10
                    Re: Mutable strings

                    Hans-Joachim Widmaier wrote:
                    ...
                    >> I realised that Python doesn't *need* mutable strings.
                    >
                    > Mutable strings come to *my* mind whenever I have to play with huge
                    > binary data. Working with tens of megabytes is inherently somewhat
                    > slow.

                    But mutable strings are not the best place to keep "huge binary
                    data". Lists of smaller blocks, arrays of bytes, and lists of
                    arrays can be much more appropriate data structures.


                    >> Python strings (and integers and floats) are all immutable for a very good
                    >> reason: dictionaries can't reliably use mutable objects as keys.
                    >
                    > All understood. But then, I don't want to use my 32-MB binary blob as
                    > a key.

                    Since you don't in fact need to use it in any of the ways typically
                    applicable only to strings, it doesn't need to be a string.


                    >> convert strings into lists and back, my anxiety went away. These cover
                    >> most usage of strings that might convince you you need mutability.
                    >
                    > Converting said blob 'efficiently' to a list is something that I
                    > certainly would not call 'efficiently' - if not for the conversion
                    > itself, then for the memory consumption as list.

                    A typical case might be one where the blob is, e.g., in fact made
                    up of 65K sectors of 512 bytes each. In this case, the extra memory
                    consumption due to keeping the blob in memory as a list of 65K small
                    strings rather than one big string is, I would guess, about 1%. So,
                    who cares? And similarly if the "substrings" are of different sizes,
                    just as long as you only have a few tens of thousands of such
                    substrings. It's quite unusual that the "intrinsic structure" of
                    the blob is in fact one big undifferentiated 32MB thingy -- when it
                    is, you're unlikely to need it in memory, or if you do you're
                    unlikely to be able to apply any processing mutation to it sensibly;
                    and for those unusual and unlikely cases, arrays of bytes are often
                    just fine (after all, C has nothing BUT arrays of bytes [or of other
                    fixed entities], yet it's quite suitable for some such processing).
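
A rough sketch of the list-of-sectors layout described above, in present-day Python 3 with bytes objects standing in for the thread's strings. The constants and names are illustrative, the sector count is kept small, and the overhead is measured with sys.getsizeof rather than asserted: the figure depends on the interpreter and platform and need not match the estimate in the post.

```python
import sys

SECTOR_SIZE = 512
SECTOR_COUNT = 1024  # kept small here; the post's example uses 65K sectors

# Stand-in for binary data that would normally be read from a file.
blob = bytes(SECTOR_SIZE * SECTOR_COUNT)

# Split the blob into a list of fixed-size immutable sectors.
sectors = [blob[i:i + SECTOR_SIZE] for i in range(0, len(blob), SECTOR_SIZE)]

# "Mutating" one sector replaces one small object; the others are untouched.
sectors[3] = b"\xff" * SECTOR_SIZE

# Extra memory of the list form relative to the raw payload, as reported
# by sys.getsizeof on whatever interpreter runs this.
payload = SECTOR_SIZE * SECTOR_COUNT
as_list = sys.getsizeof(sectors) + sum(sys.getsizeof(s) for s in sectors)
overhead_fraction = (as_list - payload) / payload

# Reassemble only when a contiguous copy is really needed.
rebuilt = b"".join(sectors)
```

The design point is that an update costs one sector-sized allocation instead of a copy of the whole blob.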


                    > I don't think strings are immutable because they ought to be that way
                    > (e.g. some CS guru teaches that "mutable strings are the root of all
                    > evil"). They're immutable because they allow them to be used as
                    > dictionary keys. And it was found that this doesn't affect the
                    > usefulness of the language too much.

                    Wrong. Consider Java, even back from the very first version: it had
                    no dictionaries on which strings might be keys, yet it still decided
                    to make its strings immutable. This should make it obvious that the
                    interest of using strings as dict keys cannot possibly be the sole
                    motivation for the decision to make strings immutable in a language.
                    Rather, the deeper motivation is connected to wanting strings to be
                    ATOMIC, ELEMENTARY types, just like numbers; and to lots of useful
                    practical returns of that choice. All you lose is the "ability" to
                    "confuse" (type-pun) between strings and arrays of bytes in many
                    situations, but that's an ability best lost in many cases. It's not
                    an issue of "evil" -- a close-to-the-hardware low-level language
                    like C has excellent reasons to choose a different, close-to-HW
                    semantics -- but in a higher-level language I think Python's and
                    Java's choice to have strings immutable works better than (e.g.)
                    Perl's and Ruby's to have them mutable.


                    > Still, I can see a use for mutable strings. Or better, mutable binary
                    > data, made up of bytes. (where 'byte' is the smallest individually
                    > addressable memory unit blabla, ... you get the meaning. Just to not
                    > invite nit-pickers on that term.)

                    Just "import array" and you have your "mutable binary data made up
                    of bytes". So, what's the problem? Type-punning between THAT type,
                    and strings, is just not all that useful.


                    >> "explicit is better than implicit".
                    >
                    > Yes, definitely: Let there be another type.

                    But, there IS one! So, what's wrong with it...?!


                    Alex


                    • Jeff Epler

                      #11
                      Re: Mutable strings

                      On Mon, Sep 22, 2003 at 12:31:58PM +0000, Alex Martelli wrote:
                      > But, there IS one! So, what's wrong with it...?!

                      People seem to love to have literals for things. Otherwise, they feel
                      that a type is second-class.

                      Jeff


                      • logistix at cathoderaymission.net

                        #12
                        Re: Mutable strings

                        hjwidmaier@web.de (Hans-Joachim Widmaier) wrote in message news:<6e990e29.0309220251.51fa648d@posting.google.com>...
                        > Andy Jewell <andy@wild-flower.co.uk> wrote in message news:<mailman.1064086468.9586.python-list@python.org>...
                        >
                        > > Mutable strings are one thing that I missed, initially, when I first started
                        > > using Python. After a while, as the "Pythonic" way of doing things sank in,
                        > > I realised that Python doesn't *need* mutable strings.
                        >
                        > Mutable strings come to *my* mind whenever I have to play with huge
                        > binary data. Working with tens of megabytes is inherently somewhat
                        > slow.
                        >

                        import array
                        x = array.array('c')

                        Pretty much creates a mutable string for these cases, although the
                        interface is a little different.
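
A note for readers on current Python: the 'c' typecode used above no longer exists in Python 3. A minimal sketch of the same idea with the built-in bytearray type, which postdates this thread and is therefore not something the posters had available. The names `buf`, `text` and `hashable` are illustrative.

```python
# bytearray is a built-in mutable sequence of bytes in Python 3.
buf = bytearray(b"abcdefg")

buf[0] = ord("A")       # item assignment takes an integer in range(256)
buf[2:4] = b"12345"     # slice assignment can grow or shrink the buffer
buf.extend(b"!")        # in-place append

# Being mutable, a bytearray is unhashable and cannot be a dictionary key.
try:
    hash(buf)
    hashable = True
except TypeError:
    hashable = False

# Convert to an immutable str once the editing is done.
text = buf.decode("ascii")
```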


                        • Alex Martelli

                          #13
                          Re: Mutable strings

                          On Monday 22 September 2003 02:50 pm, Jeff Epler wrote:
                          > On Mon, Sep 22, 2003 at 12:31:58PM +0000, Alex Martelli wrote:
                          > > But, there IS one! So, what's wrong with it...?!
                          >
                          > People seem to love to have literals for things. Otherwise, they feel
                          > that a type is second-class.

                          Sure. I have no problem deeming "mutable strings" (array of bytes)
                          to be "second-class" in some vague sense, since their use is so rare
                          and the need for literals of that type even rarer; lacking literals for,
                          e.g., sets.Set "troubles" me far more;-).

                          I do keep daydreaming of some "user-defined semiliteral syntax"
                          such as, e.g. <identifier>{<balanced-parentheses tokens>} to
                          result in a call to (e.g.) <identifier>.__literal__ with a list (or other
                          sequence) of tokens as the argument, returning whatever that
                          call returns. But perhaps it isn't that good an idea after all (it
                          does imply the __literal__ classmethod or staticmethod doing
                          some sort of runtime compilation and execution of those tokens,
                          and opens the doors to the risk of some seriously nonPythonic
                          syntax for such "literals-initializers").


                          Alex



                          • John Roth

                            #14
                            Re: Mutable strings

                            Look at PEPs 296 and 298.

                            John Roth

                            "Gordon Airport" <uce@ftc.gov> wrote in message
                            news:qhadna08dpEsmvOiXTWJig@comcast.com...
                            > Peter Hansen wrote:
                            >
                            > snip good points that I suspect could be handled with a fairly simple
                            > regex
                            >
                            > > It might be worth your writing a PEP, however, if only so that the
                            > > idea could be killed and buried for good. ;-)
                            > >
                            >
                            > Yeah, I didn't see one already but I kind of expected this response.
                            > Still, it didn't put a stake in the heart of the ternary operator
                            > issue.
                            >
                            >



                            • Gordon Airport

                              #15
                              Re: Mutable strings

                              Peter Hansen wrote:
                              > Gordon Airport wrote:
                              >
                              >>Peter Hansen wrote:
                              >>
                              >>snip good points that I suspect could be handled with a fairly simple
                              >>regex
                              >
                              >
                              > I'd argue the point, but I guess until you try it, we'll never know. <wink>
                              >

                              Okay, I've tried it and I'll chalk it up to my inexperience with regular
                              expressions and sed, but I don't have anything to show. The general
                              strategy, though, is to make several passes; first escape all inner
                              strings, then convert all outer string delimiters (now the only ones not
                              escaped) to the immutable symbol. I feel like I'll wake up at 2 a.m.
                              with the answer, but I'll post now anyway.
                              (Assuming you /can/ say "every instance of A between B's" in regex...you
                              could always do it with a python script)
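
One possible shape for the "python script" route, offered as a sketch under assumptions rather than anything posted in the thread: it uses the standard tokenize module (in present-day Python 3) to find string literals instead of a regex, and rewrites plain single-quoted literals with double quotes, escaping embedded double quotes. The helper names `to_double_quotes` and `convert_source` are hypothetical, and prefixed, triple-quoted, and multi-line literals are deliberately left alone.

```python
import io
import tokenize


def to_double_quotes(literal):
    """Rewrite a plain 'single-quoted' literal as "double-quoted"."""
    # Leave prefixed (r'', b'', ...), triple-quoted and double-quoted literals alone.
    if not literal.startswith("'") or literal.startswith("'''"):
        return literal
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # Keep existing escape sequences (including \' and \") intact.
            out.append(body[i:i + 2])
            i += 2
        elif ch == '"':
            # A bare double quote now needs escaping.
            out.append('\\"')
            i += 1
        else:
            out.append(ch)
            i += 1
    return '"' + "".join(out) + '"'


def convert_source(source):
    """Return source with single-line, single-quoted literals converted."""
    lines = source.splitlines(keepends=True)
    edits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING and tok.start[0] == tok.end[0]:
            new = to_double_quotes(tok.string)
            if new != tok.string:
                edits.append((tok.start[0] - 1, tok.start[1], tok.end[1], new))
    # Apply edits from the end so earlier column offsets stay valid.
    for row, start, end, new in reversed(edits):
        lines[row] = lines[row][:start] + new + lines[row][end:]
    return "".join(lines)


example = "s = 'say \"hi\"' + \"x\" + 'it\\'s'\n"
converted = convert_source(example)
```

Letting the tokenizer locate the literals sidesteps the embedded-quote problem raised earlier in the thread, since quote characters inside comments or inside other strings are never mistaken for delimiters.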

                              <snip>
                              >
                              >
                              > Apparently it served its purpose quite well. The main problem before
                              > the PEP and vote was that there was no PEP to point to when somebody
                              > asked about it, so you could say "asked and answered... will not happen".
                              >
                              > Now there is, and the few times the issue has come up since, someone
                              > has fairly quickly pointed to the PEP each time, avoiding lengthier
                              > discussion.
                              >
                              > -Peter

                              Fair enough. Maybe I will submit a PEP for this, I've never looked into
                              what's involved.
