Test if list contains another list

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • mathieu

    Test if list contains another list

    Hi there,

    I am trying to write something very simple to test if a list
    contains another one:

    a = [1,2,3]

    b = [3,2,1,4]

    but 'a in b' returns False. How do I check that a is indeed contained
    in b ?

    thanks
  • Christian Heimes

    #2
    Re: Test if list contains another list

    mathieu wrote:
    Hi there,
    >
    I am trying to write something very simple to test if a list
    contains another one:
    >
    a = [1,2,3]
    >
    b = [3,2,1,4]
    >
    but 'a in b' returns False. How do I check that a is indeed contained
    in b ?
    Use sets:
    >>a = [1,2,3]
    >>b = [3,2,1,4]
    >>set(a).issubs et(set(b))
    True


    Christian

    Comment

    • James Mills

      #3
      Re: Test if list contains another list

      Hi,
      >>a = [1,2,3]
      >>b = [3,2,1,4]
      >>a = set(a)
      >>b = set(b)
      >>a.intersectio n(b)
      set([1, 2, 3])

      Is this what you want ?

      cheers
      James

      On 9/8/08, mathieu <mathieu.malate rre@gmail.comwr ote:
      Hi there,
      >
      I am trying to write something very simple to test if a list
      contains another one:
      >
      a = [1,2,3]
      >
      b = [3,2,1,4]
      >
      but 'a in b' returns False. How do I check that a is indeed contained
      in b ?
      >
      thanks
      >
      --

      >

      --
      --
      -- "Problems are solved by method"

      Comment

      • mathieu

        #4
        Re: Test if list contains another list

        On Sep 8, 9:32 am, Bruno Desthuilliers
        <bdesth.quelque ch...@free.quel quepart.frwrote :
        mathieu a écrit :
        >
        Hi there,
        >
        I am trying to write something very simple to test if a list
        contains another one:
        >
        a = [1,2,3]
        >
        b = [3,2,1,4]
        >
        but 'a in b' returns False.
        >
        Indeed. Lists are not sets, and the fact that all elements of list a
        happens to also be part of list b doesn't make the list a itself an
        element of list b.
        >
        >>a = [1, 2, 3]
        >>b = [3,2,1,4]
        >>c = [b, a]
        >>a in c
        True
        >>b in c
        True
        >>c
        [[3, 2, 1, 4], [1, 2, 3]]
        >>>
        >
        How do I check that a is indeed contained
        in b ?
        >
        But that's what you did - you *did* checked if a was contained in b, and
        this is not the case. What you want is to check if *all elements* of a
        are contained in b, which is quite another problem. Now the answer to
        your question : use sets.
        >
        >>set(a).issubs et(set(b))
        True
        >>>
        thanks all !

        Comment

        • Derek Martin

          #5
          Re: Test if list contains another list

          On Thu, Sep 18, 2008 at 03:24:16AM -0700, gauravatnet@gma il.com wrote:
          I looked inside this thread for my query which brought me the
          following google search result
          "Test if list contains another list - comp.lang.pytho n | Google
          Groups"
          >
          But then I was disappointed to see the question asked was not exactly
          right.
          [...]
          def findAllMatching List(mainList, subList):
          resultIndex = []
          globalIndex = 0
          for i in range(len(mainL ist)):
          if i < globalIndex:
          continue
          globalIndex = i
          increment = 0
          for j in range(len(subLi st)):
          if mainList[globalIndex] == subList[j]:
          globalIndex += 1
          increment += 1
          if j == (len(subList)-1):
          resultIndex.app end(globalIndex-increment)
          else:
          break
          >
          return resultIndex
          I didn't time them to compare, but how about this instead:
          >>def find_needle_in_ haystack(needle , haystack):
          .... r = []
          .... L = len(needle)
          .... for i in range(len(hayst ack)):
          .... if haystack[i:i+L] == needle:
          .... r.append(i)
          .... return r
          >># this fails because "3" is not 3...
          >>find_needle_i n_haystack([1,2,3], ["a","b",1,2,"3" ,"9"])
          []
          >>find_needle_i n_haystack([1,2,3], [1,2,3])
          [0]
          >>find_needle_i n_haystack([1,2,3], ["a","b",1,2,3," 9"])
          [2]
          >>find_needle_i n_haystack([1,2,3], ["a","b",1,2,3," 9","q",1,2,3])
          [2, 7]

          --
          Derek D. Martin

          GPG Key ID: 0x81CFE75D


          -----BEGIN PGP SIGNATURE-----
          Version: GnuPG v1.2.1 (GNU/Linux)

          iD8DBQFI3TcUdjd lQoHP510RAlc0AJ 9CaqXxWSN6FeWsg ApaNNf973medQCe KJUZ
          AAZ4X3LFezSlIXR p3H3W748=
          =3OrB
          -----END PGP SIGNATURE-----

          Comment

          • bearophileHUGS@lycos.com

            #6
            Re: Test if list contains another list

            I suggest Python programmers to fill the holes in the Python std lib
            with some debugged & tuned implementations , their "bag of tricks", so
            they don't have to re-invent and debug things all the time. This works
            well with Psyco:


            def issubseq(sub, items):
            """issubseq(sub , items): return true if the sequence 'sub' is a
            contiguous
            subsequence of the 'items' sequence.
            >>issubseq()
            Traceback (most recent call last):
            ...
            TypeError: issubseq() takes exactly 2 arguments (0 given)
            >>issubseq("abc ")
            Traceback (most recent call last):
            ...
            TypeError: issubseq() takes exactly 2 arguments (1 given)
            >>issubseq(1, [1, 2, 3])
            Traceback (most recent call last):
            ...
            TypeError: 'int' object is not iterable
            >>isi = lambda s,i: int(issubseq(s, i))
            >>isi([], [])
            1
            >>isi("a", "")
            0
            >>isi("", "a")
            1
            >>isi("", "aaa")
            1
            >>isi("a", "a")
            1
            >>isi("ab", "bab")
            1
            >>isi("ab", "bab")
            1
            >>isi("ba", "bbb")
            0
            >>isi("bab", "ab")
            0
            >>isi(("a", "b"), ("a","a","b" ))
            1
            >>isi(("a", "b"), ("a","a","c" ))
            0
            >>isi([1,2,1], [3,5, 1,2,4, 1,2,1, 6])
            1
            >>isi([1,2,1], [3,5, 1,2,4, 1,2,3, 6])
            0
            >>l = [1] * 50 + [1,2,3] + [4] * 50
            >>isi([1,2,3], l)
            1
            >>l = [1] * 50 + [1,2,4] + [5] * 50
            >>isi([1,2,3], l)
            0
            """
            if not hasattr(sub, "__getitem_ _"):
            sub = list(sub)

            len_sub = len(sub)
            if len_sub == 0:
            return True

            try:
            if not len(items) or len(items) < len_sub:
            return False
            except TypeError:
            pass

            if sub == items:
            return True

            table = [0] * (len_sub + 1)

            # building prefix-function
            m = 0
            for i in xrange(1, len_sub):
            while m 0 and sub[m] != sub[i]:
            m = table[m - 1]
            if sub[m] == sub[i]:
            m += 1
            table[i] = m

            # searching
            m, i = 0, 0
            for x in items:
            while m 0 and sub[m] != x:
            m = table[m - 1]
            if sub[m] == x:
            m += 1
            if m == len_sub:
            return True
            i += 1

            return False


            Usage:
            >>from util import issubseq # uses Psyco
            >>from timing import timeit
            >>a = range(1000000) + range(1000) + range(100000)
            >>sub = range(1000)
            >>len(a)
            1101000
            >>timeit(issubs eq, sub, a)
            (True, 0.0)
            >>a = range(999) * 10000 + range(1000) + range(10000)
            >>sub = range(1000)
            >>timeit(issubs eq, sub, a)
            (True, 0.2000000000000 0001)

            Bye,
            bearophile

            Comment

            • bearophileHUGS@lycos.com

              #7
              Re: Test if list contains another list

              bearophile:
              # searching
              m, i = 0, 0
              ....
              i += 1
              The name 'i' can be removed, sorry.

              Bye,
              bearophile

              Comment

              • Derek Martin

                #8
                Re: Test if list contains another list

                On Fri, Sep 26, 2008 at 01:39:16PM -0700, bearophileHUGS@ lycos.com wrote:
                # building prefix-function
                m = 0
                for i in xrange(1, len_sub):
                while m 0 and sub[m] != sub[i]:
                m = table[m - 1]
                if sub[m] == sub[i]:
                m += 1
                table[i] = m
                >
                # searching
                m, i = 0, 0
                for x in items:
                while m 0 and sub[m] != x:
                m = table[m - 1]
                if sub[m] == x:
                m += 1
                if m == len_sub:
                return True
                i += 1
                >
                return False
                Quite a lot faster than mine... even without using psyco. Which is
                interesting, especially because if I change my search loop to work
                like yours, but leave out all your other optimizations, mine runs
                roughly as fast as yours (i.e. execution time is negligible to a user
                running the program in both cases, even with the large example data
                you gave). This leads me to point out two caveats with your version:

                1. Guavat posted a version which returns a list of all the indexes
                where a match is found... which is what I mimiced. Yours returns only
                true or false indicating whether or not it found a match. The main
                difference in performance seems to come due to your iteration over the
                list items, versus my comparing a sublist-sized slice of the whole to
                the sublist. But this alone is an invalid optimization, because it
                doesn't produce the same results... If you only want to know if a
                match exists, it's great; but if you want to know *where*, or *how
                many times*, you lose. That could be fixed though, by counting the
                elements as you loop through them... I didn't attempt to determine
                the performance hit for doing that, but I assume it's negligible.
                I also imagine that that was your original purpose for the unused
                variable i...

                2. Your secondary optimizations add a great deal of complexity to your
                code, making the algorithm much harder to understand. However they
                don't appear to buy you much, given that the cases they optimize would
                probably be rare, and the difference in execution time gained by the
                optimization is not noticable to the user. Unless you're doing lots
                and lots of these in your application, or maybe if you know in advance
                that your data will contain many instances of the cases you optimized,
                I think you're better off leaving the optimizations out, for the sake
                of code clarity.

                At the very least, if you're going to write complicated optimizations,
                you ought to have explained what you were doing in comments... :)

                --
                Derek D. Martin

                GPG Key ID: 0x81CFE75D


                -----BEGIN PGP SIGNATURE-----
                Version: GnuPG v1.2.1 (GNU/Linux)

                iD8DBQFI4HJ2djd lQoHP510RAhrKAJ 9HhSX5lXuQrX565 dv6MF0T/jMlXgCfaQWe
                kJBxLoxYmzRu6/3C7XXHobU=
                =CAjG
                -----END PGP SIGNATURE-----

                Comment

                • bearophileHUGS@lycos.com

                  #9
                  Re: Test if list contains another list

                  Derek Martin:
                  >Quite a lot faster than mine... even without using psyco.<
                  It's designed for Psyco.

                  >However they don't appear to buy you much, given that the cases they optimize would probably be rare, and the difference in execution time gained by the optimization is not noticable to the user.<
                  EXACT STRING MATCHING ALGORITHMS Animation in Java, Morris-Pratt algorithm


                  >Unless you're doing lots and lots of these in your application,<
                  I don't agree. That's library code, so it has to be efficient and
                  flexible, because it's designed to be used in many different
                  situations (if you don't agree, then you can try to suggest a
                  replacement of the C code of the string search of CPython written by
                  effbot with some slower code).

                  Bye,
                  bearophile

                  Comment

                  • Derek Martin

                    #10
                    Re: Test if list contains another list

                    On Mon, Sep 29, 2008 at 04:12:13AM -0700, bearophileHUGS@ lycos.com wrote:
                    Derek Martin:
                    Unless you're doing lots and lots of these in your application,<
                    >
                    I don't agree. That's library code, so it has to be efficient and
                    flexible, because it's designed to be used in many different
                    situations
                    That's fair, but lots of folks writing Python code will look at that
                    and say, "What the %$#@! is this doing?!?" As I already suggested,
                    code that implements non-obvious algorithms ought to explain what it's
                    doing in comments, so that the neophyte programmers charged with
                    maintaining the library aren't tempted to rewrite the code so that
                    it's easier to understand what it's doing. It can be as simple as:

                    # Use Morris-Pratt algorithm to search data

                    Then, anyone not familiar with the algorithm can easily look it up,
                    and see why it was written that way.

                    I think it's just as important to do that in code you post on the
                    list, since a) the person asking the question obviously doesn't know
                    what you're doing, or they wouldn't have needed to ask the question,
                    and b) there are lots of other folks reading the list who could
                    benefit from the same knowledge. :)

                    --
                    Derek D. Martin

                    GPG Key ID: 0x81CFE75D


                    -----BEGIN PGP SIGNATURE-----
                    Version: GnuPG v1.2.1 (GNU/Linux)

                    iD8DBQFI4Pqudjd lQoHP510RAi3cAJ sH344P41yMojDLq VL2z29qGodldgCf Q5u8
                    4N48WpmWHjjnBNw uDx2vbGE=
                    =OT7l
                    -----END PGP SIGNATURE-----

                    Comment

                    • bearophileHUGS@lycos.com

                      #11
                      Re: Test if list contains another list

                      Derek Martin:
                      code that implements non-obvious algorithms ought to explain what it's
                      doing in comments,
                      I am sorry, you are right, of course.

                      In my libs/code there are always docstrings and doctests/tests, and
                      most times comments too, like you say. When I post code here I often
                      strip away part of those things, mostly to reduce the length of my
                      posts -.- I guess that I have to leave more of that meta-
                      information...

                      Bye,
                      bearophile

                      Comment

                      • Ali

                        #12
                        Re: Test if list contains another list


                        Its funny, I just visited this problem last week.

                        <http://dulceetutile.blogspot.com/200...ooking-python-
                        statement_17.ht ml>

                        ../Ali

                        Comment

                        Working...