list of unique non-subset sets

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • les_ander@yahoo.com

    list of unique non-subset sets

    Hi,
    I have many set objects some of which can contain same group of object
    while others can be subset of the other. Given a list of sets,
    I need to get a list of unique sets such that non of the set is an
    subset of another or contain exactly the same members.

    Tried to do the following:
    s1=set(['a','b','c'])
    s2=set(['a','c'])
    s3=set(['a','d','e','f'])
    s4=set(['r','k','l'])
    s5=set(['r','k','l'])
    L=[s1,s2,s3,s4,s5]
    ----------------------- > cleaned-up list should contain s1, s3, s5

    Lu=[] # unique list of sets
    Lu.append( L[0] )
    for i in range(1,len(L)) :
    s=L[i]
    # check for equality
    if s in L:
    continue
    # check for subseting
    for j in len(Lu):
    s1=Lu[j]
    if len (s-s1)==0:
    continue
    elif len(s1-s)==0:
    Lu[i]=s1 # replace with the bigger


    But this does not work since I get
    Lu=[set(['a', 'c', 'b'])]

    What am I doing wrong?
    thanks

  • Steven Bethard

    #2
    Re: list of unique non-subset sets

    les_ander@yahoo .com wrote:[color=blue]
    > Hi,
    > I have many set objects some of which can contain same group of object
    > while others can be subset of the other. Given a list of sets,
    > I need to get a list of unique sets such that non of the set is an
    > subset of another or contain exactly the same members.
    >
    > Tried to do the following:
    > s1=set(['a','b','c'])
    > s2=set(['a','c'])
    > s3=set(['a','d','e','f'])
    > s4=set(['r','k','l'])
    > s5=set(['r','k','l'])
    > L=[s1,s2,s3,s4,s5]
    > ----------------------- > cleaned-up list should contain s1, s3, s5[/color]

    py> s1 = set(['a','b','c'])
    py> s2 = set(['a','c'])
    py> s3 = set(['a','d','e','f'])
    py> s4 = set(['r','k','l'])
    py> s5 = set(['r','k','l'])
    py> lst = [s1, s2, s3, s4, s5]
    py> uniqlst = []
    py> for lstset in lst:
    .... for uniqlstset in uniqlst:
    .... if lstset.issubset (uniqlstset):
    .... break
    .... else:
    .... uniqlst.append( lstset)
    ....
    py> uniqlst
    [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['k', 'r', 'l'])]

    Not horribly efficient, mind you. But I believe it's correct.

    I basically took your description "Given a list of sets, I need to get a
    list of unique sets such that non[e] of the set[s] is an subset of
    another" and translated that to code. So I iterate through each element
    in lst, and and only add that element to uniqlst if it's not a subset of
    any of the items already in uniqlst.

    STeVe

    Comment

    • Raymond Hettinger

      #3
      Re: list of unique non-subset sets

      [les_ander@yahoo .com][color=blue]
      > I have many set objects some of which can contain same group of object
      > while others can be subset of the other. Given a list of sets,
      > I need to get a list of unique sets such that non of the set is an
      > subset of another or contain exactly the same members.
      >
      > Tried to do the following:
      > s1=set(['a','b','c'])
      > s2=set(['a','c'])
      > s3=set(['a','d','e','f'])
      > s4=set(['r','k','l'])
      > s5=set(['r','k','l'])
      > L=[s1,s2,s3,s4,s5]
      > ----------------------- > cleaned-up list should contain s1, s3, s5[/color]

      This should do the trick:


      result = []
      for s1 in L:
      for s2 in result:
      if s1 <= s2:
      break
      else:
      result.append(s 1)

      print result


      Raymond Hettinger


      Comment

      • Kent Johnson

        #4
        Re: list of unique non-subset sets

        Raymond Hettinger wrote:[color=blue]
        > [les_ander@yahoo .com]
        >[color=green]
        >>I have many set objects some of which can contain same group of object
        >>while others can be subset of the other. Given a list of sets,
        >>I need to get a list of unique sets such that non of the set is an
        >>subset of another or contain exactly the same members.
        >>
        >>Tried to do the following:
        >>s1=set(['a','b','c'])
        >>s2=set(['a','c'])
        >>s3=set(['a','d','e','f'])
        >>s4=set(['r','k','l'])
        >>s5=set(['r','k','l'])
        >>L=[s1,s2,s3,s4,s5]
        >>----------------------- > cleaned-up list should contain s1, s3, s5[/color]
        >
        >
        > This should do the trick:
        >
        >
        > result = []
        > for s1 in L:
        > for s2 in result:
        > if s1 <= s2:
        > break
        > else:
        > result.append(s 1)
        >
        > print result[/color]

        If I understand the original post correctly, you also need to check for an existing set being a
        subset of the set you are adding. A better test case is
        s1=set(['a','b','c'])
        s2=set(['a','c'])
        s3=set(['a','d','e','f'])
        s4=set(['r','k','l'])
        s5=set(['r','k','l'])
        s6=set(['g', 'h'])
        s7=set(['h', 'i'])
        s8=set(['g', 'h', 'i'])

        L=[s1,s2,s3,s4,s5, s6,s7,s8]
        # ----------------------- > cleaned-up list should contain s1, s3, s5, s8

        which both Raymond and STeVe's proposals fail.

        Kent

        Comment

        • Steven Bethard

          #5
          Re: list of unique non-subset sets

          Kent Johnson wrote:[color=blue]
          > Raymond Hettinger wrote:
          >[color=green]
          >> [les_ander@yahoo .com]
          >>[color=darkred]
          >>> I have many set objects some of which can contain same group of object
          >>> while others can be subset of the other. Given a list of sets,
          >>> I need to get a list of unique sets such that non of the set is an
          >>> subset of another or contain exactly the same members.
          >>>
          >>> Tried to do the following:
          >>> s1=set(['a','b','c'])
          >>> s2=set(['a','c'])
          >>> s3=set(['a','d','e','f'])
          >>> s4=set(['r','k','l'])
          >>> s5=set(['r','k','l'])
          >>> L=[s1,s2,s3,s4,s5]
          >>> ----------------------- > cleaned-up list should contain s1, s3, s5[/color]
          >>
          >>
          >>
          >> This should do the trick:
          >>
          >>
          >> result = []
          >> for s1 in L:
          >> for s2 in result:
          >> if s1 <= s2:
          >> break
          >> else:
          >> result.append(s 1)
          >>
          >> print result[/color]
          >
          >
          > If I understand the original post correctly, you also need to check for
          > an existing set being a subset of the set you are adding. A better test
          > case is
          > s1=set(['a','b','c'])
          > s2=set(['a','c'])
          > s3=set(['a','d','e','f'])
          > s4=set(['r','k','l'])
          > s5=set(['r','k','l'])
          > s6=set(['g', 'h'])
          > s7=set(['h', 'i'])
          > s8=set(['g', 'h', 'i'])
          >
          > L=[s1,s2,s3,s4,s5, s6,s7,s8]
          > # ----------------------- > cleaned-up list should contain s1, s3, s5, s8
          >
          > which both Raymond and STeVe's proposals fail.[/color]

          Can you just do:

          py> def uniq(lst):
          .... result = []
          .... for s1 in sorted(lst, reverse=True):
          .... for s2 in result:
          .... if s1 <= s2:
          .... break
          .... else:
          .... result.append(s 1)
          .... return result
          ....
          py> lst = [set(['a','b','c']),
          .... set(['a','c']),
          .... set(['a','d','e','f']),
          .... set(['r','k','l']),
          .... set(['r','k','l']),
          .... set(['g', 'h']),
          .... set(['h', 'i']),
          .... set(['g', 'h', 'i'])]
          py> uniq(lst)
          [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['k', 'r', 'l']),
          set(['i', 'h', 'g'])]

          Haven't thought about it too hard though...

          STeVe

          Comment

          • Peter Otten

            #6
            Re: list of unique non-subset sets

            Steven Bethard wrote:
            [color=blue]
            > Can you just do:
            >
            > py> def uniq(lst):
            > ...     result = []
            > ...     for s1 in sorted(lst, reverse=True):
            > ...         for s2 in result:
            > ...             if s1 <= s2:
            > ...                 break
            > ...         else:
            > ...             result.append(s 1)
            > ...     return result
            > ...
            > py> lst = [set(['a','b','c']),
            > ...        set(['a','c']),
            > ...        set(['a','d','e','f']),
            > ...        set(['r','k','l']),
            > ...        set(['r','k','l']),
            > ...        set(['g', 'h']),
            > ...        set(['h', 'i']),
            > ...        set(['g', 'h', 'i'])]
            > py> uniq(lst)
            > [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['k', 'r', 'l']),
            > set(['i', 'h', 'g'])][/color]

            No, you can't:

            """Since sets only define partial ordering (subset relationships), the
            output of the list.sort() method is undefined for lists of sets.
            """

            Maybe you could just change the above code to

            ....
            for s1 in sorted(lst, key=len, reverse=True):
            ....
            [color=blue]
            > Haven't thought about it too hard though...[/color]

            Same here.

            Peter

            Comment

            • Raymond Hettinger

              #7
              Re: list of unique non-subset sets


              "Kent Johnson" <kent37@tds.net > wrote in message
              news:4239ead0$1 _3@newspeer2.td s.net...[color=blue]
              > Raymond Hettinger wrote:[color=green]
              > > [les_ander@yahoo .com]
              > >[color=darkred]
              > >>I have many set objects some of which can contain same group of object
              > >>while others can be subset of the other. Given a list of sets,
              > >>I need to get a list of unique sets such that non of the set is an
              > >>subset of another or contain exactly the same members.
              > >>
              > >>Tried to do the following:
              > >>s1=set(['a','b','c'])
              > >>s2=set(['a','c'])
              > >>s3=set(['a','d','e','f'])
              > >>s4=set(['r','k','l'])
              > >>s5=set(['r','k','l'])
              > >>L=[s1,s2,s3,s4,s5]
              > >>----------------------- > cleaned-up list should contain s1, s3, s5[/color]
              > >
              > >
              > > This should do the trick:
              > >
              > >
              > > result = []
              > > for s1 in L:
              > > for s2 in result:
              > > if s1 <= s2:
              > > break
              > > else:
              > > result.append(s 1)
              > >
              > > print result[/color]
              >
              > If I understand the original post correctly, you also need to check for an[/color]
              existing set being a[color=blue]
              > subset of the set you are adding. A better test case is
              > s1=set(['a','b','c'])
              > s2=set(['a','c'])
              > s3=set(['a','d','e','f'])
              > s4=set(['r','k','l'])
              > s5=set(['r','k','l'])
              > s6=set(['g', 'h'])
              > s7=set(['h', 'i'])
              > s8=set(['g', 'h', 'i'])
              >
              > L=[s1,s2,s3,s4,s5, s6,s7,s8]
              > # ----------------------- > cleaned-up list should contain s1, s3, s5, s8[/color]

              Try this:

              result = []
              seen = set()
              for s1 in L:
              for s2 in L:
              if s1 < s2:
              break
              else:
              if s1 not in seen:
              result.append(s 1)
              seen.add(frozen set(s1))
              print result


              Raymond Hettinger


              Comment

              • bearophileHUGS@lycos.com

                #8
                Re: list of unique non-subset sets

                s1 = set(['a','b','c'])
                s2 = set(['a','c'])
                s3 = set(['a','d','e','f'])
                s4 = set(['r','k','l'])
                s5 = set(['r','k','l'])
                ls = [s1,s2,s3,s4,s5]
                result1 = [s1, s3, s5]

                A result can contain s4 or s5 at random. This problem looks like the
                one of searching the correct hyperedges for a hypergraph. s1-5 are the
                hyperedges, and "a", "b",... are the nodes. Probably this problem is
                already solved somewhere.

                .. # You can start removing the duplicated sets (slow operation):
                .. ls2 = list(set(map(fr ozenset, ls)))
                .. # Then I've found a solution similar to the first Raymond Hettinger
                one:
                .. result = []
                .. for si in ls2:
                .. for sj in result:
                .. if si <= sj:
                .. break
                .. else:
                .. result.append(s i)
                .. print result

                This can be slow, but it probably can be made a bit faster version, but
                it's not easy. You can put the nodes in a undirected graph that
                represents the hypergraph, with other "extra nodes" that represent the
                hyperedges already in result).

                .. print list(enumerate( map(list,ls2)))
                .. # Output:
                [(0,['k','r','l']),(1,['a','c','b']),(2,['a','e','d','f']),(3,['a','c'])]
                .. # Probably you can work on ls too, avoiding the slow creation of ls2,
                but you have to cheek
                .. # more things later.
                .. import graph
                .. g = graph.Graph()
                .. for n1,s in enumerate(ls2):
                .. for n2 in s:
                .. g.addBiArc(n1, n2) # add edges and nodes as needed
                ..
                .. # then you can find the connected components of the graph
                .. print g # Output: 0-k,l,r 1-a,b,c 2-a,d,e,f 3-a,c
                .. cc = g.connectedComp onents()
                .. # now cc = [[0, 'k', 'r', 'l'], [1, 'a', 'c', 'b', 2, 3, 'e', 'd',
                'f']]
                .. # and you can filter the hyperedges. This can be improved a lot:
                .. ccf = [ [n for n in c if type(n)==int] for c in cc ]
                .. # now ccf = [[0], [1, 2, 3]]

                Eppstein unionFind data structure can also be used for a similar
                purpose. 0-th element of ls2 is set(['r','k','l']). It's totally
                disjoint from the others. Now you don't have to look every sj in
                result, but only the sj of result that are inside its connected ones.
                For example, you don't have to test set(['a','c']) against
                set(['r','k','l']). I don't know how much faster this can be compared
                to the simple O(len(ls)^2) program seen before... But it can be useful
                for situations with lots of sets.

                Bye,
                Bearophile

                Comment

                • bearophileHUGS@lycos.com

                  #9
                  Re: list of unique non-subset sets

                  Looking at all the hyperedges in the connected component is a big
                  waste... You can look at just the hyperedges that share one or more
                  nodes.
                  (Nodes are the original letters contained in the sets, and they must be
                  hashable).

                  If nodes aren't integers in [0, len(l)) then you can use this simpler
                  code:

                  .. l = [set(['a','b','c']), set(['a','c']), set(['a','d','e','1']),
                  .. set(['r','k','l']), set(['b','e','1']), set(['a','c']) ]
                  .. from graph import Graph
                  .. # http://www.fantascienza.net/animalia/graph.zip
                  .. # you can something similar with Boost
                  ..
                  .. debug = True
                  .. g = Graph()
                  .. for n1,s in enumerate(l):
                  .. g.addNode(n1, nodeData=1)
                  .. for n2 in s:
                  .. g.addBiArc(n1, n2)
                  .. if debug: print g # ****
                  ..
                  .. result = []
                  .. for i,s in enumerate(l):
                  .. ss = set()
                  .. for n1 in s:
                  .. ss.update(g.xou tNodes(n1))
                  .. ss.remove(i)
                  .. if debug: print i, ss # ****
                  ..
                  .. for sj in ss:
                  .. if s <= l[sj]:
                  .. break
                  .. else:
                  .. result.append(s )
                  .. print result



                  If nodes are hashable, but they can cointain integers in [0, len(l)),
                  then you can use this other code with nodeID translations (in a
                  different graph implementation, like Gato one, such translations can be
                  automatic):


                  .. l = [set(['a','b','c']), set(['a','c']), set(['a','d','e','1']),
                  .. set(['r','k','l']), set(['b','e','1']), set(['a','c']) ]
                  .. from graph import Graph
                  ..
                  .. debug = True
                  .. nodes = set()
                  .. for s in l:
                  .. nodes.update(s)
                  .. lenl = len(l)
                  .. nodes = dict( (i+lenl, n) for i,n in enumerate(nodes ) )
                  .. if debug: print "nodes:", nodes # ****
                  .. nodesid = dict( (n,i) for i,n in nodes.iteritems () )
                  .. l2 = [ set( nodesid[n] for n in s ) for s in l ]
                  .. if debug: print "l2:", l2 # ****
                  ..
                  .. g = Graph()
                  .. for n1,s in enumerate(l2):
                  .. g.addNode(n1)
                  .. for n2 in s:
                  .. g.addBiArc(n1, n2)
                  .. if debug: print g # ****
                  ..
                  .. result = []
                  .. for i,s in enumerate(l2):
                  .. ss = set()
                  .. for n1 in s:
                  .. ss.update(g.xou tNodes(n1))
                  .. ss.remove(i)
                  .. if debug: print "i, ss:", i, ss # ****
                  ..
                  .. for sj in ss:
                  .. if s <= l2[sj]:
                  .. break
                  .. else:
                  .. result.append(s )
                  .. if debug: print "result:", result # ****
                  .. result2 = [ set( nodes[n] for n in s ) for s in result ]
                  .. print result2

                  If the hyperedges define a sparse hypergraph, then this code can be
                  quite faster than the original quadratic one (if len(l) is big enough).

                  Bye,
                  Bearophile

                  Comment

                  • Bengt Richter

                    #10
                    Re: list of unique non-subset sets

                    On Thu, 17 Mar 2005 23:56:46 GMT, "Raymond Hettinger" <vze4rx4y@veriz on.net> wrote:
                    [color=blue]
                    >
                    >"Kent Johnson" <kent37@tds.net > wrote in message
                    >news:4239ead0$ 1_3@newspeer2.t ds.net...[color=green]
                    >> Raymond Hettinger wrote:[color=darkred]
                    >> > [les_ander@yahoo .com]
                    >> >
                    >> >>I have many set objects some of which can contain same group of object
                    >> >>while others can be subset of the other. Given a list of sets,
                    >> >>I need to get a list of unique sets such that non of the set is an
                    >> >>subset of another or contain exactly the same members.
                    >> >>
                    >> >>Tried to do the following:
                    >> >>s1=set(['a','b','c'])
                    >> >>s2=set(['a','c'])
                    >> >>s3=set(['a','d','e','f'])
                    >> >>s4=set(['r','k','l'])
                    >> >>s5=set(['r','k','l'])
                    >> >>L=[s1,s2,s3,s4,s5]
                    >> >>----------------------- > cleaned-up list should contain s1, s3, s5
                    >> >
                    >> >
                    >> > This should do the trick:
                    >> >
                    >> >
                    >> > result = []
                    >> > for s1 in L:
                    >> > for s2 in result:
                    >> > if s1 <= s2:
                    >> > break
                    >> > else:
                    >> > result.append(s 1)
                    >> >
                    >> > print result[/color]
                    >>
                    >> If I understand the original post correctly, you also need to check for an[/color]
                    >existing set being a[color=green]
                    >> subset of the set you are adding. A better test case is
                    >> s1=set(['a','b','c'])
                    >> s2=set(['a','c'])
                    >> s3=set(['a','d','e','f'])
                    >> s4=set(['r','k','l'])
                    >> s5=set(['r','k','l'])
                    >> s6=set(['g', 'h'])
                    >> s7=set(['h', 'i'])
                    >> s8=set(['g', 'h', 'i'])
                    >>
                    >> L=[s1,s2,s3,s4,s5, s6,s7,s8]
                    >> # ----------------------- > cleaned-up list should contain s1, s3, s5, s8[/color]
                    >
                    >Try this:
                    >
                    >result = []
                    >seen = set()
                    >for s1 in L:
                    > for s2 in L:
                    > if s1 < s2:
                    > break
                    > else:
                    > if s1 not in seen:
                    > result.append(s 1)
                    > seen.add(frozen set(s1))
                    >print result
                    >[/color]
                    Actually, ISTM there are more than one set of sets that can be drawn from the
                    original set that internally satisfy the criteria of no internal duplicates or subsets. E.g.,
                    here are some lists of sets selected from L above. I believe (unless I goofed) the requirement
                    """[color=blue][color=green][color=darkred]
                    >> >>I need to get a list of unique sets such that non of the set is an
                    >> >>subset of another or contain exactly the same members.[/color][/color][/color]
                    """
                    is satisfied internally within each list below.

                    [set(['h', 'g']), set(['i', 'h'])]
                    [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
                    [set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f'])]
                    []
                    [set(['a', 'c', 'b']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['h', 'g'])]
                    [set(['a', 'c', 'b'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
                    [set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
                    [set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
                    [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h', 'g'])]
                    [set(['a', 'c'])]
                    [set(['k', 'r', 'l']), set(['a', 'c']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
                    [set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['i', 'h'])]
                    [set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c'])]
                    [set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g'])]
                    [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['a', 'e', 'd', 'f']), set(['a', 'c'])]
                    [set(['k', 'r', 'l']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['i', 'h'])]
                    [set(['a', 'c', 'b']), set(['i', 'h', 'g'])]
                    [set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
                    [set(['a', 'c', 'b']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['i', 'h', 'g'])]
                    [set(['a', 'c', 'b']), set(['i', 'h'])]
                    [set(['a', 'c']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'c']), set(['i', 'h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f'])]
                    [set(['a', 'c']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['i', 'h', 'g'])]
                    [set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g'])]
                    [set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['a', 'c']), set(['i', 'h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'c']), set(['i', 'h'])]
                    [set(['k', 'r', 'l']), set(['a', 'c'])]
                    [set(['a', 'e', 'd', 'f']), set(['h', 'g'])]
                    [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['h', 'g']), set(['i', 'h'])]
                    [set(['a', 'e', 'd', 'f'])]
                    [set(['i', 'h', 'g'])]

                    Regards,
                    Bengt Richter

                    Comment

                    • les_ander@yahoo.com

                      #11
                      Re: list of unique non-subset sets

                      OK, so I need to be more precise.
                      Given a list of sets, output the largest list of sets (from this list,
                      order does not matter) such that:
                      1) there is no set that is a PROPER subset of another set in this list
                      2) no two sets have exactly the same members (100% overlap)

                      Seems like this problem is much harder than I thought...

                      Comment

                      • Bengt Richter

                        #12
                        Re: list of unique non-subset sets

                        On 18 Mar 2005 15:46:44 -0800, les_ander@yahoo .com wrote:
                        [color=blue]
                        >OK, so I need to be more precise.
                        >Given a list of sets, output the largest list of sets (from this list,
                        >order does not matter) such that:
                        >1) there is no set that is a PROPER subset of another set in this list
                        >2) no two sets have exactly the same members (100% overlap)
                        >
                        >Seems like this problem is much harder than I thought...
                        >[/color]
                        two from the above come out length 5:

                        5: [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']), set(['h', 'g']), set(['i', 'h'])]
                        5: [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']), set(['h', 'g']), set(['i', 'h'])]

                        How do you define "largest" ? ;-)
                        I guess you could sum(map(len, setlist)) as a measure, but what if that makes
                        a tie between two setlists (as I think it could, in general)?

                        Regards,
                        Bengt Richter

                        Comment

                        • Raymond Hettinger

                          #13
                          Re: list of unique non-subset sets

                          [les_ander@yahoo .com ][color=blue][color=green]
                          > >OK, so I need to be more precise.
                          > >Given a list of sets, output the largest list of sets (from this list,
                          > >order does not matter) such that:
                          > >1) there is no set that is a PROPER subset of another set in this list
                          > >2) no two sets have exactly the same members (100% overlap)[/color][/color]

                          [Bengt Richter][color=blue]
                          > two from the above come out length 5:
                          >
                          > 5: [set(['k', 'r', 'l']), set(['a', 'c', 'b']), set(['a', 'e', 'd', 'f']),[/color]
                          set(['h', 'g']), set(['i', 'h'])][color=blue]
                          > 5: [set(['k', 'r', 'l']), set(['a', 'e', 'd', 'f']), set(['a', 'c']),[/color]
                          set(['h', 'g']), set(['i', 'h'])][color=blue]
                          >
                          > How do you define "largest" ? ;-)
                          > I guess you could sum(map(len, setlist)) as a measure, but what if that makes
                          > a tie between two setlists (as I think it could, in general)?[/color]

                          With multiple outputs satisfying the constraints, I would suspect that there is
                          something wrong with the original spec (not as a stand-alone problem, but as
                          component of a real application).


                          Raymond Hettinger


                          Comment

                          • les_ander@yahoo.com

                            #14
                            Re: list of unique non-subset sets

                            Once again my specs were incomplete.
                            By largest I mean exactly what you pointed out as in sum(map(len,
                            setlist)).

                            I think this might work--sorting of the initial list should do the
                            trick.
                            1) sort the sets by size (in decending order)
                            2) put the first (largest) into a new list (Lu)
                            for s in Lnew[1:]:
                            keep=True
                            for i in range(len( Lun) ):
                            if len(s)==len( s & Lun[i] ):
                            keep=False
                            break
                            if keep==True:
                            Lun.append( s )

                            ----------------- here is the complete code
                            s1=set(['a','b','c'])
                            s2=set(['a','c'])
                            s3=set(['a','d','e','f'])
                            s4=set(['r','k','l'])
                            s5=set(['r','k','l'])
                            s6=set(['g', 'h'])
                            s7=set(['h', 'i'])
                            s8=set(['g', 'h', 'i'])

                            L=[s1,s2,s3,s4,s5, s6,s7,s8]
                            length=[len(s) for s in L]
                            L2= sorted(zip(leng th,range(len(L) )))
                            Lnew=[L[j] for (i,j) in L2]
                            Lnew.reverse()
                            Lun=[Lnew[0]] # list with the result
                            for s in Lnew[1:]:
                            keep=True
                            for i in range(len( Lun) ):
                            if len(s)==len( s & Lun[i] ):
                            keep=False
                            break
                            if keep==True:
                            Lun.append( s )

                            #----------------[color=blue][color=green][color=darkred]
                            >>> Lun[/color][/color][/color]
                            [set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g']), set(['k', 'r', 'l']),
                            set(['a', 'c', 'b'])]

                            Seems like I got it.

                            Comment

                            • Bengt Richter

                              #15
                              Re: list of unique non-subset sets

                              On 18 Mar 2005 19:41:55 -0800, les_ander@yahoo .com wrote:
                              [color=blue]
                              >Once again my specs were incomplete.
                              >By largest I mean exactly what you pointed out as in sum(map(len,
                              >setlist)).
                              >[/color]
                              But that will not necessarily yield a single setlist taken from the source set list,
                              so you still need a selection amongst equals. You can make it explicit, or
                              you can say you'll take whatever a sort puts at the sorted list extreme, which you
                              seem to have done ;-)
                              [color=blue]
                              >I think this might work--sorting of the initial list should do the
                              >trick.
                              >1) sort the sets by size (in decending order)
                              >2) put the first (largest) into a new list (Lu)
                              >for s in Lnew[1:]:
                              > keep=True
                              > for i in range(len( Lun) ):
                              > if len(s)==len( s & Lun[i] ):
                              > keep=False
                              > break
                              > if keep==True:
                              > Lun.append( s )
                              >
                              >----------------- here is the complete code
                              >s1=set(['a','b','c'])
                              >s2=set(['a','c'])
                              >s3=set(['a','d','e','f'])
                              >s4=set(['r','k','l'])
                              >s5=set(['r','k','l'])
                              >s6=set(['g', 'h'])
                              >s7=set(['h', 'i'])
                              >s8=set(['g', 'h', 'i'])
                              >
                              >L=[s1,s2,s3,s4,s5, s6,s7,s8]
                              >length=[len(s) for s in L]
                              >L2= sorted(zip(leng th,range(len(L) )))
                              >Lnew=[L[j] for (i,j) in L2]
                              >Lnew.reverse ()
                              >Lun=[Lnew[0]] # list with the result
                              >for s in Lnew[1:]:
                              > keep=True
                              > for i in range(len( Lun) ):
                              > if len(s)==len( s & Lun[i] ):
                              > keep=False
                              > break
                              > if keep==True:
                              > Lun.append( s )
                              >
                              >#----------------[color=green][color=darkred]
                              >>>> Lun[/color][/color]
                              >[set(['a', 'e', 'd', 'f']), set(['i', 'h', 'g']), set(['k', 'r', 'l']),
                              >set(['a', 'c', 'b'])]
                              >
                              >Seems like I got it.
                              >[/color]
                              Does it work on
                              L = [set('abc'), set('def'), set('abcdef')]
                              ?
                              I get:
                              0: []
                              6: [set(['a', 'c', 'b']), set(['e', 'd', 'f'])]
                              6: [set(['a', 'c', 'b', 'e', 'd', 'f'])]

                              Your code above:
                              [color=blue][color=green][color=darkred]
                              >>> L = [set('abc'), set('def'), set('abcdef')]
                              >>> length=[len(s) for s in L]
                              >>> L2= sorted(zip(leng th,range(len(L) )))
                              >>> Lnew=[L[j] for (i,j) in L2]
                              >>> Lnew.reverse()
                              >>> Lun=[Lnew[0]] # list with the result
                              >>> for s in Lnew[1:]:[/color][/color][/color]
                              ... keep=True
                              ... for i in range(len( Lun) ):
                              ... if len(s)==len( s & Lun[i] ):
                              ... keep=False
                              ... break
                              ... if keep==True:
                              ... Lun.append( s )
                              ...[color=blue][color=green][color=darkred]
                              >>> Lun[/color][/color][/color]
                              [set(['a', 'c', 'b', 'e', 'd', 'f'])]

                              But your successful set list is not unique in its success value (6) ;-)

                              Regards,
                              Bengt Richter

                              Comment

                              Working...