"in"consistency?

  • David C. Ullrich

    "in"consistency?

    Luckily I tried it before saying no, that's
    not how "in" works:
    >>> 'ab' in 'abc'
    True
    >>> [1,2] in [1,2,3]
    False

    Is there a reason for the inconsistency? I would
    have thought "in" would check for elements of a
    sequence, regardless of what sort of sequence it was...

    --
    David C. Ullrich
  • Nick Dumas

    #2
    Re: "in"consistency?

    [1,2] in [1,2,3] checks to see if the list [1,2] is an item in [1,2,3].
    Because the list [1,2,3] only contains the integers 1, 2 and 3, the
    expression returns False. Try "[1,2] in [[1,2],[2,3]]".
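A minimal sketch of the distinction described above, assuming a current Python 3 interpreter (the list behaviour is the same as in the Python 2 of the thread). `contains_sublist` is a hypothetical helper written for illustration, not a built-in; it shows what a substring-style test would look like for lists.

```python
# List membership: `x in seq` is True when some item of seq equals x.
flat = [1, 2, 3]
nested = [[1, 2], [2, 3]]

print([1, 2] in flat)    # False: the items are the ints 1, 2 and 3
print([1, 2] in nested)  # True: the first item equals [1, 2]
print(2 in flat)         # True: 2 is an item


def contains_sublist(haystack, needle):
    """Hypothetical helper: True if needle occurs as a contiguous run in haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


print(contains_sublist(flat, [1, 2]))  # True
print(contains_sublist(flat, [1, 3]))  # False: 1 and 3 are not adjacent
```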

    David C. Ullrich wrote:
    > Luckily I tried it before saying no, that's
    > not how "in" works:
    >
    > >>> 'ab' in 'abc'
    > True
    > >>> [1,2] in [1,2,3]
    > False
    >
    > Is there a reason for the inconsistency? I would
    > have thought "in" would check for elements of a
    > sequence, regardless of what sort of sequence it was...

    • Ben Finney

      #3
      Re: "in"consistency?

      "David C. Ullrich" <dullrich@sprynet.com> writes:
      > >>> 'ab' in 'abc'
      > True
      > >>> [1,2] in [1,2,3]
      > False

      <URL:http://www.python.org/doc/ref/comparisons.html>

      > Is there a reason for the inconsistency?

      Probably. The special behaviour of string types was changed in Python
      2.3, according to that document.

      --
      \ “I put contact lenses in my dog's eyes. They had little |
      `\ pictures of cats on them. Then I took one out and he ran around |
      _o__) in circles.” —Steven Wright |
      Ben Finney

      • Gary Herron

        #4
        Re: "in"consistency?

        Nick Dumas wrote:
        > [1,2] in [1,2,3] checks to see if the list [1,2] is an item in [1,2,3].
        > Because the list [1,2,3] only contains the integers 1,2,3, the code
        > returns a False. Try "[1,2] in [[1,2],[2,3]]"

        The inconsistency goes deeper than that. For instance, the type of a
        value returned by the indexing operation:

        Indexing a string returns a string (of length 1 of course),
        while indexing a list does not (necessarily) return a list.

        Conclusion: They are different types supporting different operations.
        Given all the obvious differences (mutability, sorting and other
        methods, types of individual elements), I'd say there are more
        differences than similarities, even though, as sequences, they both
        support a small subset of similar operations.
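To make the indexing and mutability differences concrete, here is a small sketch, assuming a current Python 3 interpreter. The variable names are illustrative only.

```python
s = "abc"
lst = [1, 2, 3]

# Indexing a string yields another string; indexing a list yields an item.
print(type(s[0]) is type(s))      # True: 'a' is itself a str
print(type(lst[0]) is type(lst))  # False: 1 is an int, not a list

# A length-1 string indexes to itself, so this can be repeated indefinitely.
print(s[0][0][0] == "a")          # True

# Lists are mutable, strings are not.
lst[0] = 99
try:
    s[0] = "z"
    string_assignment_ok = True
except TypeError:
    string_assignment_ok = False
print(string_assignment_ok)       # False: item assignment on a str raises TypeError
```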

        Gary Herron



        • Mel

          #5
          Re: "in"consistency?

          Ben Finney wrote:
          > "David C. Ullrich" <dullrich@sprynet.com> writes:
          >
          >> >>> 'ab' in 'abc'
          >> True
          >> >>> [1,2] in [1,2,3]
          >> False
          >
          > <URL:http://www.python.org/doc/ref/comparisons.html>
          >
          >> Is there a reason for the inconsistency?
          >
          > Probably. The special behaviour of string types was changed in Python
          > 2.3, according to that document.

          As it stands, you'd get

          [1,2] in [1,2,3] == False

          [1,2] in [1, [1,2], 3] == True


          This could be a good thing.

          Mel.

          • Terry Reedy

            #6
            Re: "in"consistency?

            David C. Ullrich wrote:
            > >>> 'ab' in 'abc'
            > True

            'a' in 'abc' works according to the standard meaning of o in collection.

            'ab' in 'abc' could not work by that standard meaning because strings,
            as virtual sequences, only contain characters (length 1 strings). Among
            built-in collections, this limitation is unique to strings (and bytes,
            in 3.0). So in 2.3, 'in' was given a useful extension of meaning that
            is also unique to strings (and bytes).

            > >>> [1,2] in [1,2,3]
            > False

            [1,2] can be a member of tuples, lists, dicts and other general
            collections. [1,2] in collection therefore has that meaning, that it is
            a single element of collection. Extending the meaning would conflict
            with this basic meaning.

            > Is there a reason for the inconsistency? I would
            > have thought "in" would check for elements of a
            > sequence, regardless of what sort of sequence it was...

            It is not an inconsistency but an extension corresponding to the
            limitation of what a string element can be.
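The argument above can be checked directly. This is a short sketch, assuming a current Python 3 interpreter: converting the string to a list of characters restores plain element semantics, which shows that the substring test is an extra meaning specific to `str`.

```python
text = "abc"

# Every element of a string is itself a length-1 string ...
print(all(len(ch) == 1 for ch in text))  # True

# ... so a longer string can never be an element, and the substring
# meaning cannot clash with the element meaning.
print("a" in text)         # True: element test and substring test agree
print("ab" in text)        # True: substring semantics
print("ac" in text)        # False: 'a' and 'c' are not adjacent
print("ab" in list(text))  # False: a list of characters uses element semantics
print("" in text)          # True: the empty string is a substring of any string
```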

            Terry J. Reedy

            • castironpi

              #7
              Re: "in"consistency?

              On Jul 7, 5:02 pm, Gary Herron <gher...@islandtraining.com> wrote:
              > Nick Dumas wrote:
              >> [1,2] in [1,2,3] checks to see if the list [1,2] is an item in [1,2,3].
              >> Because the list [1,2,3] only contains the integers 1,2,3, the code
              >> returns a False. Try "[1,2] in [[1,2],[2,3]]"
              >
              > The inconsistency goes deeper than that. For instance, the type of a
              > value returned by the indexing operation:
              >
              >    Indexing a string returns a string (of length 1 of course),
              >    while indexing a list does not (necessarily) return a list.
              >
              > Conclusion: They are different types supporting different operations.
              > Given all the obvious differences (mutability, sorting and other
              > methods, types of individual elements), I'd say there are more
              > differences than similarities, even though, as sequences, they both
              > support a small subset of similar operations.
              >
              > Gary Herron
              >
              >> David C. Ullrich wrote:
              >>> Luckily I tried it before saying no, that's
              >>> not how "in" works:
              >>>
              >>> >>> 'ab' in 'abc'
              >>> True
              >>> >>> [1,2] in [1,2,3]
              >>> False
              >>>
              >>> Is there a reason for the inconsistency? I would
              >>> have thought "in" would check for elements of a
              >>> sequence, regardless of what sort of sequence it was...

              Strings are not containers.

              Another container type:

              Python 3.0b1 on win32
              >>> {0} in {0,1}
              False

              Another string-like, non-container type:
              >>> bytes( [ 0, 1 ] ) in bytes( [ 0, 1, 2 ] )
              True
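A slightly fuller sketch of the two cases above, assuming a current Python 3 interpreter rather than the 3.0b1 used in the post. It also shows one way bytes differ from str: indexing bytes gives an int, yet `in` accepts both an int and a bytes object on the left.

```python
data = bytes([0, 1, 2])

print(bytes([0, 1]) in data)  # True: contiguous subsequence
print(bytes([0, 2]) in data)  # False: 0 and 2 are not adjacent
print(1 in data)              # True: an int is tested as a single element
print(data[0])                # 0: indexing bytes yields an int, not bytes

# Sets use plain element semantics.
print({0} in {0, 1})          # False: the set {0} is not an element
print(0 in {0, 1})            # True
print({0} <= {0, 1})          # True: the subset test is a separate operator
```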

              • David C. Ullrich

                #8
                Re: "in"consistency?

                In article <g4ud3d$bro$1@aioe.org>, Mel <mwilson@the-wire.com> wrote:
                > Ben Finney wrote:
                >> "David C. Ullrich" <dullrich@sprynet.com> writes:
                >>> >>> 'ab' in 'abc'
                >>> True
                >>> >>> [1,2] in [1,2,3]
                >>> False
                >>
                >> <URL:http://www.python.org/doc/ref/comparisons.html>
                >>
                >>> Is there a reason for the inconsistency?
                >>
                >> Probably. The special behaviour of string types was changed in Python
                >> 2.3, according to that document.
                >
                > As it stands, you'd get
                >
                > [1,2] in [1,2,3] == False
                >
                > [1,2] in [1, [1,2], 3] == True
                >
                > This could be a good thing.

                Oh, of course that's a good thing - changing "in" for lists
                to give True there would be awful. I was wondering why it
                _does_ work that way for strings.

                Maybe the answer is "because it can" - for strings the sort
                of possible problem you point out can't come up.

                > Mel.
                --
                David C. Ullrich

                • David C. Ullrich

                  #9
                  Re: "in"consistency?

                  In article <iu6dnaRaPaWbFu_VnZ2dnUVZ_qLinZ2d@earthlink.com>,
                  Nick Dumas <drakonik@gmail.com> wrote:
                  > [1,2] in [1,2,3] checks to see if the list [1,2] is an item in [1,2,3].
                  > Because the list [1,2,3] only contains the integers 1,2,3, the code
                  > returns a False. Try "[1,2] in [[1,2],[2,3]]"

                  Thanks. I understand how it works for lists and why - I was
                  wondering why it's not the same for strings.

                  > David C. Ullrich wrote:
                  >> Luckily I tried it before saying no, that's
                  >> not how "in" works:
                  >>
                  >> >>> 'ab' in 'abc'
                  >> True
                  >> >>> [1,2] in [1,2,3]
                  >> False
                  >>
                  >> Is there a reason for the inconsistency? I would
                  >> have thought "in" would check for elements of a
                  >> sequence, regardless of what sort of sequence it was...

                  --
                  David C. Ullrich

                  • David C. Ullrich

                    #10
                    Re: "in"consistency?

                    In article <mailman.137.1215480143.20628.python-list@python.org>,
                    Terry Reedy <tjreedy@udel.edu> wrote:
                    > David C. Ullrich wrote:
                    >> >>> 'ab' in 'abc'
                    >> True
                    >
                    > 'a' in 'abc' works according to the standard meaning of o in collection.
                    >
                    > 'ab' in 'abc' could not work by that standard meaning because strings,
                    > as virtual sequences, only contain characters (length 1 strings). Among
                    > built-in collections, this limitation is unique to strings (and bytes,
                    > in 3.0). So in 2.3, 'in' was given a useful extension of meaning that
                    > is also unique to strings (and bytes).

                    Ah, I didn't realize that this was new. Thanks - at least this
                    means I was right about the way it worked formerly.

                    >> >>> [1,2] in [1,2,3]
                    >> False
                    >
                    > [1,2] can be a member of tuples, lists, dicts and other general
                    > collections. [1,2] in collection therefore has that meaning, that it is
                    > a single element of collection. Extending the meaning would conflict
                    > with this basic meaning.

                    Well of course.

                    >> Is there a reason for the inconsistency? I would
                    >> have thought "in" would check for elements of a
                    >> sequence, regardless of what sort of sequence it was...
                    >
                    > It is not an inconsistency but an extension corresponding to the
                    > limitation of what a string element can be.

                    It's an inconsistency. That doesn't mean it's a bad thing or that
                    I want my money back. It may well be a reasonable inconsistency -
                    strings _can_ work that way while it's clear lists had better not.
                    But it's an inconsistency.

                    > Terry J. Reedy
                    --
                    David C. Ullrich

                    • Terry Reedy

                      #11
                      Re: "in"consistency?

                      David C. Ullrich wrote:
                      > In article <mailman.137.1215480143.20628.python-list@python.org>,
                      > Terry Reedy <tjreedy@udel.edu> wrote:
                      >>> Is there a reason for the inconsistency? I would
                      >>> have thought "in" would check for elements of a
                      >>> sequence, regardless of what sort of sequence it was...
                      >> It is not an inconsistency but an extension corresponding to the
                      >> limitation of what a string element can be.
                      >
                      > It's an inconsistency. That doesn't mean it's a bad thing or that
                      > I want my money back. It may well be a reasonable inconsistency -
                      > strings _can_ work that way while it's clear lists had better not.
                      > But it's an inconsistency.

                      To decisively argue 'inconsistency' as factual or not, versus us having
                      divergent opinions, you would have to supply a technical definition ;-)
                      The math definition of 'leading to a contradiction' in the sense of
                      being able to prove False is True, does not seem to apply here.

                      However,
                      a) In common English, 'in' and 'contains', applied to strings of
                      characters (text), are understood as applying to substrings that appear
                      in the text. This is also true of many other programming languages.
                      'Dictionary' contains 'diction'. This is even the basis of various word
                      games.
                      b) Python otherwise allows operators to vary in meaning for different
                      classes.
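Point b) can be illustrated with the `__contains__` special method, which is how a class gives `in` its own meaning. A minimal sketch, assuming a current Python 3 interpreter; `Interval` is a hypothetical class invented for this example and is not part of the standard library.

```python
class Interval:
    """Hypothetical class: a closed numeric interval [lo, hi]."""

    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def __contains__(self, x):
        # `x in interval` calls this method; here it means "lies within
        # the bounds" rather than "equals one of the stored elements".
        return self.lo <= x <= self.hi


unit = Interval(0, 1)
print(0.5 in unit)                # True
print(2 in unit)                  # False

# Point a): the string meaning matches everyday usage.
print("diction" in "dictionary")  # True
```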

                      In any case, back to your original question: the extension of meaning,
                      'inconsistent' or not, was deliberated and adopted on the basis that the
                      usefulness of the extension would outweigh the confusion wrought by the
                      class-specific nature of the extension. (In other words, threads such
                      as this *were* anticipated ;-)

                      Terry Jan Reedy

                      • Terry Reedy

                        #12
                        Re: "in"consistency?

                        castironpi wrote:
                        > Strings are not containers.

                        Library Reference/Built-in Types/Sequence Types says
                        "Strings contain Unicode characters."
                        Perhaps you have a different notion of contain/container.

                        I prefer 'collection' to 'container' since 'container' tends to imply an
                        exclusiveness that is not true. Byte/character sequences *are*
                        different from tuples, lists, sets, dicts, etc, in the following sense:
                        members of the latter collection classes must exist first before being
                        added to the collection (non-exclusively). Members of the former do
                        not. (In CPython, at least, they do not). So I consider them to be
                        (reiterable) *virtual* sequence collections that can produce
                        subsequences on demand.

                        So I partially agree with you in that byte/char sequences are a
                        different sub-category.

                        > Another container type:
                        >
                        > Python 3.0b1 on win32
                        > >>> {0} in {0,1}
                        > False

                        And similarly, (0,) not in (0,1), [0] not in [0,1], {0:None} not in
                        {0:None,1:None}. These are all general manifest collection types that
                        can contain any Python object, and which could contain a sub-collection
                        even if they do not.
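The general-collection cases can be tried as follows; this is a sketch assuming a current Python 3 interpreter. One caveat it makes visible: taken literally, the dict-in-dict test raises TypeError rather than returning False, because `in` on a dict looks up keys and a dict is not hashable, so the dict case is shown through a list instead.

```python
# A sub-collection is not an element unless it was actually put in.
print((0,) in (0, 1))     # False
print([0] in [0, 1])      # False
print({0} in {0, 1})      # False

# The same collections *can* hold a sub-collection as an element.
print((0,) in ((0,), 1))  # True
print([0] in [[0], 1])    # True
print(frozenset({0}) in {frozenset({0}), 1})  # True

# Dict membership tests keys, and keys must be hashable.
try:
    {0: None} in {0: None, 1: None}
    dict_in_dict = "no error"
except TypeError:
    dict_in_dict = "TypeError"
print(dict_in_dict)       # TypeError

# A dict can still be an element of a list.
print({0: None} in [{0: None}, 1])  # True
```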

                        Terry Jan Reedy

                        • Mel

                          #13
                          Re: "in"consistency?

                          David C. Ullrich wrote:
                          > Oh, of course that's a good thing - changing "in" for lists
                          > to give True there would be awful. I was wondering why it
                          > _does_ work that way for strings.
                          >
                          > Maybe the answer is "because it can" - for strings the sort
                          > of possible problem you point out can't come up.

                          I think that's it. When people realized how handy

                          small_string in big_string

                          would be, they expanded the string behaviour. Python is a pragmatic
                          language.

                          Cheers, Mel.

                          • David C. Ullrich

                            #14
                            Re: "in"consistency?

                            In article <mailman.159.1215543188.20628.python-list@python.org>,
                            Terry Reedy <tjreedy@udel.edu> wrote:
                            > David C. Ullrich wrote:
                            >> In article <mailman.137.1215480143.20628.python-list@python.org>,
                            >> Terry Reedy <tjreedy@udel.edu> wrote:
                            >>>> Is there a reason for the inconsistency? I would
                            >>>> have thought "in" would check for elements of a
                            >>>> sequence, regardless of what sort of sequence it was...
                            >>> It is not an inconsistency but an extension corresponding to the
                            >>> limitation of what a string element can be.
                            >>
                            >> It's an inconsistency. That doesn't mean it's a bad thing or that
                            >> I want my money back. It may well be a reasonable inconsistency -
                            >> strings _can_ work that way while it's clear lists had better not.
                            >> But it's an inconsistency.
                            >
                            > To decisively argue 'inconsistency' as factual or not, versus us having
                            > divergent opinions, you would have to supply a technical definition ;-)
                            > The math definition of 'leading to a contradiction' in the sense of
                            > being able to prove False is True, does not seem to apply here.
                            >
                            > However,
                            > a) In common English, 'in' and 'contains', applied to strings of
                            > characters (text), is understood as applying to substrings that appear
                            > in the text. This is also true of many other programming languages.
                            > 'Dictionary' contains 'diction'. This is even the basis of various word
                            > games.
                            > b) Python otherwise allows operators to vary in meaning for different
                            > classes.
                            >
                            > In any case, back to your original question: the extension of meaning,
                            > 'inconsistent' or not, was deliberated and adopted on the basis that the
                            > usefulness of the extension would outweigh the confusion wrought by the
                            > class-specific nature of the extension. (In other words, threads such
                            > as this *were* anticipated ;-)

                            I wasn't saying that the fact that the behavior of "in" for
                            strings is inconsistent with the behavior for lists was a bad
                            thing - I was just asking about the reason for it.

                            (I also wasn't claiming that it was inconsistent with the
                            common English usage of "in"...)

                            People have pointed out that "in" for strings _can_ work that
                            way, while (of course) "in" for lists had better not. That's
                            fine.

                            > Terry Jan Reedy
                            --
                            David C. Ullrich

                            • castironpi

                              #15
                              Re: "in"consistency?

                              On Jul 8, 2:25 pm, Terry Reedy <tjre...@udel.edu> wrote:
                              > castironpi wrote:
                              >> Strings are not containers.
                              >
                              > Library Reference/Built-in Types/Sequence Types says
                              > "Strings contain Unicode characters."
                              > Perhaps you have a different notion of contain/container.
                              >
                              > I prefer 'collection' to 'container' since 'container' tends to imply an
                              > exclusiveness that is not true. Byte/character sequences *are*
                              > different from tuples, lists, sets, dicts, etc, in the following sense:
                              > members of the latter collection classes must exist first before being
                              > added to the collection (non-exclusively). Members of the former do
                              > not. (In CPython, at least, they do not). So I consider them to be
                              > (reiterable) *virtual* sequence collections that can produce
                              > subsequences on demand.
                              >
                              > So I partially agree with you in that byte/char sequences are a
                              > different sub-category.
                              >
                              >> Another container type:
                              >>
                              >> Python 3.0b1 on win32
                              >> >>> {0} in {0,1}
                              >> False
                              >
                              > And similarly, (0,) not in (0,1), [0] not in [0,1], {0:None} not in
                              > {0:None,1:None}. These are all general manifest collection types that
                              > can contain any Python object, and which could contain a sub-collection
                              > even if they do not.
                              >
                              > Terry Jan Reedy

                              Under that definition, "a" in "abc" is clearly well-defined. I
                              construe "abc" to "contain Unicode characters", specifically, "a",
                              "b", and "c". But "ab" is not a Unicode character.

                              "Contain" is still a good word for what strings "do", to the extent
                              that they "do" anything at all. The fact that they contain a uniform
                              data-type permits the extension of "in" to subset/substring testing.

                              Compare to an imaginary "set of ints" data type:
                              >>> a= setofints( [ 0, 1, 2 ] )
                              Then, the semantics of
                              >>> b= setofints( [ 0, 1 ] )
                              >>> b in a
                              True

                              are consistent and predictable. Correct me if I'm wrong.
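One way the imaginary type could be sketched, assuming a current Python 3 interpreter. `SetOfInts` is hypothetical, as in the post: no such built-in exists, and the subset-on-`in` behaviour is a design choice of this sketch. Because every element is known to be an int, a `SetOfInts` on the left of `in` can never be mistaken for an element, so the subset reading is unambiguous.

```python
class SetOfInts:
    """Hypothetical type: a set restricted to ints, where `in` accepts
    either a single int (element test) or another SetOfInts (subset test)."""

    def __init__(self, values=()):
        items = frozenset(values)
        if not all(isinstance(v, int) for v in items):
            raise TypeError("SetOfInts accepts only ints")
        self._items = items

    def __contains__(self, other):
        if isinstance(other, SetOfInts):
            # Subset test, analogous to substring testing on str.
            return other._items <= self._items
        # Ordinary element test.
        return other in self._items


a = SetOfInts([0, 1, 2])
b = SetOfInts([0, 1])
print(b in a)                  # True: every element of b is in a
print(1 in a)                  # True: element test still works
print(SetOfInts([0, 3]) in a)  # False: 3 is not in a
```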
