Weighted "random" selection from list of lists

  • Jesse Noller

    Weighted "random" selection from list of lists

    Hello -

    I'm probably missing something here, but I have a problem where I am
    populating a list of lists like this:

    list1 = [ 'a', 'b', 'c' ]
    list2 = [ 'dog', 'cat', 'panda' ]
    list3 = [ 'blue', 'red', 'green' ]

    main_list = [ list1, list2, list3 ]

    Once main_list is populated, I want to build a sequence from items
    within the lists, "randomly" with a defined percentage of the sequence
    coming from the various lists. For example, if I want a 6 item
    sequence, I might want:

    60% from list 1 (main_list[0])
    30% from list 2 (main_list[1])
    10% from list 3 (main_list[2])

    I know how to pull a random sequence (using random()) from the lists,
    but I'm not sure how to pick it with the desired percentages.

    Any help is appreciated, thanks

    -jesse
  • Ron Adam

    #2
    Re: Weighted "random" selection from list of lists

    Jesse Noller wrote:

    > 60% from list 1 (main_list[0])
    > 30% from list 2 (main_list[1])
    > 10% from list 3 (main_list[2])
    >
    > I know how to pull a random sequence (using random()) from the lists,
    > but I'm not sure how to pick it with the desired percentages.
    >
    > Any help is appreciated, thanks
    >
    > -jesse

    Just add up the total number of items in all the lists.

    total = len(list1) + len(list2) + len(list3)
    n1 = 0.60 * total  # number from list 1
    n2 = 0.30 * total  # number from list 2
    n3 = 0.10 * total  # number from list 3

    You'll need to decide how to handle the case where a list has too few items in it.
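
One possible way to handle the too-few-items case is sketched here. It sizes the quotas from the requested sequence length (6 in the original question) rather than from the combined list length, and it falls back to drawing with replacement when a list is shorter than its quota. The helper name `quota_sample`, the largest-remainder rounding, and the replacement fallback are all assumptions made for illustration; the thread does not specify any of them. It also assumes the fractions sum to 1 and that no list is empty.

```python
import random

def quota_sample(pools, fractions, size, rng=random):
    """Draw exactly `size` items with a fixed quota per pool (hypothetical helper)."""
    # Ideal, possibly fractional, number of items per pool.
    ideal = [f * size for f in fractions]
    quotas = [int(x) for x in ideal]
    # Give the leftover slots to the pools with the largest fractional parts.
    leftover = size - sum(quotas)
    by_remainder = sorted(range(len(pools)),
                          key=lambda i: ideal[i] - quotas[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        quotas[i] += 1

    result = []
    for pool, n in zip(pools, quotas):
        if n <= len(pool):
            # Enough items: sample without repeats.
            result.extend(rng.sample(pool, n))
        else:
            # Assumption: when the pool is too small, allow repeats.
            result.extend(rng.choice(pool) for _ in range(n))
    rng.shuffle(result)  # mix the pools together
    return result

main_list = [['a', 'b', 'c'],
             ['dog', 'cat', 'panda'],
             ['blue', 'red', 'green']]
print(quota_sample(main_list, [0.6, 0.3, 0.1], 6))
```

With the 60/30/10 split and a 6 item sequence, the ideal counts are 3.6, 1.8 and 0.6, so this rounding rule yields 3, 2 and 1 items. Other policies (raising an error, or redistributing the shortfall to the other lists) would be equally defensible.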

    Cheers,
    Ron


    • Peter Otten

      #3
      Re: Weighted "random" selection from list of lists

      Jesse Noller wrote:
      > I'm probably missing something here, but I have a problem where I am
      > populating a list of lists like this:
      >
      > list1 = [ 'a', 'b', 'c' ]
      > list2 = [ 'dog', 'cat', 'panda' ]
      > list3 = [ 'blue', 'red', 'green' ]
      >
      > main_list = [ list1, list2, list3 ]
      >
      > Once main_list is populated, I want to build a sequence from items
      > within the lists, "randomly" with a defined percentage of the sequence
      > coming from the various lists. For example, if I want a 6 item
      > sequence, I might want:
      >
      > 60% from list 1 (main_list[0])
      > 30% from list 2 (main_list[1])
      > 10% from list 3 (main_list[2])
      >
      > I know how to pull a random sequence (using random()) from the lists,
      > but I'm not sure how to pick it with the desired percentages.


      If the percentages can be normalized to small integers, just make a
      pool where each list is repeated according to its weight, e.g.
      list1 occurs 6 times, list2 3 times, and list3 once:

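      import random

      pools = [list1, list2, list3]
      weights = [6, 3, 1]
      sample_size = 10

      # Repeat each list w times so a uniform choice picks it with weight w
      weighted_pools = []
      for p, w in zip(pools, weights):
          weighted_pools.extend([p] * w)

      # Inner choice picks a list, outer choice picks an item from it
      sample = [random.choice(random.choice(weighted_pools))
                for _ in range(sample_size)]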
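
The normalization step mentioned above can be sketched as dividing every percentage by their greatest common divisor. The helper name `normalize_weights` is made up for this example, and it assumes the percentages are positive integers.

```python
from functools import reduce
from math import gcd

def normalize_weights(percentages):
    """Reduce integer percentages to the smallest equivalent integer weights."""
    divisor = reduce(gcd, percentages)  # gcd of the whole list
    return [p // divisor for p in percentages]

print(normalize_weights([60, 30, 10]))  # [6, 3, 1]
```

When the percentages share no common factor (33, 33, 34, say), the weights stay as they are and the repeated pool grows to 100 entries, which is the situation where the bisect() variant is likely the better fit.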


      Another option is to use bisect() to choose a pool:

      import bisect, random

      pools = [list1, list2, list3]
      sample_size = 10

      def isum(items, sigma=0.0):
          # Yield the running total after each item
          for item in items:
              sigma += item
              yield sigma

      cumulated_weights = list(isum([60, 30, 10], 0))  # [60, 90, 100]
      sigma = cumulated_weights[-1]

      sample = []
      for _ in range(sample_size):
          pool = pools[bisect.bisect(cumulated_weights, random.random() * sigma)]
          sample.append(random.choice(pool))

      (all code untested)
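
On Python 3.6 and later, the standard library's `random.choices` accepts a `weights` argument and performs the weighted pick itself, so the same idea can be written more compactly. This is a minimal sketch using the example lists from the original question; the helper name `weighted_sample` and its `rng` parameter are illustrative assumptions, not part of the thread.

```python
import random

list1 = ['a', 'b', 'c']
list2 = ['dog', 'cat', 'panda']
list3 = ['blue', 'red', 'green']

def weighted_sample(pools, weights, sample_size, rng=random):
    """Pick a pool by weight, then a uniformly random item from that pool."""
    chosen_pools = rng.choices(pools, weights=weights, k=sample_size)
    return [rng.choice(pool) for pool in chosen_pools]

sample = weighted_sample([list1, list2, list3], [60, 30, 10], 10)
print(sample)
```

As with the two variants above, the proportions hold only on average; any single 10 item sample may deviate from 60/30/10.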

      Peter


      • Scott David Daniels

        #4
        Re: Weighted "random" selection from list of lists

        Jesse Noller wrote:
        <paraphrased>
        > Once main_list is populated, I want to build a sequence from items
        > within the lists, "randomly" with a defined percentage of the sequence
        > coming from the various lists. For example:
        > 60% from list 1 (main_list[0]), 30% from list 2 (main_list[1]), 10% from list 3 (main_list[2])


        import bisect, random

        main_list = [['a', 'b', 'c'],
                     ['dog', 'cat', 'panda'],
                     ['blue', 'red', 'green']]
        weights = [60, 30, 10]

        # Running totals of the weights: [60, 90, 100]
        cumulative = []
        total = 0
        for value in weights:
            total += value
            cumulative.append(total)

        for i in range(20):
            # Map a uniform score in [0, total) to the list whose band contains it
            score = random.random() * total
            index = bisect.bisect(cumulative, score)
            print(random.choice(main_list[index]), end=' ')


        --
        -Scott David Daniels
        scott.daniels@acm.org


        • Steven D'Aprano

          #5
          Re: Weighted "random" selection from list of lists

          On Sat, 08 Oct 2005 12:48:26 -0400, Jesse Noller wrote:
          > Once main_list is populated, I want to build a sequence from items
          > within the lists, "randomly" with a defined percentage of the sequence
          > coming from the various lists. For example, if I want a 6 item
          > sequence, I might want:
          >
          > 60% from list 1 (main_list[0])
          > 30% from list 2 (main_list[1])
          > 10% from list 3 (main_list[2])

          If you are happy enough to match the percentages statistically rather than
          exactly, simply do something like this:

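          import random

          def pick_item(main_list):  # wrapper name is illustrative only
              pr = random.random()
              if pr < 0.6:        # 60% of the time
                  list_num = 0
              elif pr < 0.9:      # next 30%
                  list_num = 1
              else:               # remaining 10%
                  list_num = 2
              return random.choice(main_list[list_num])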

          or however you want to extract an item.

          On average, 60% of the items will come from list1, and so on, but
          for a small number of trials the actual proportions may differ
          significantly.
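
To give a rough feel for how large those differences can be: the observed fraction over n independent draws has standard deviation sqrt(p(1-p)/n), which for p = 0.6 is about 0.2 when n = 6 and about 0.006 when n = 6000. The sketch below simulates this; the function name `fraction_from_first` and the seed value are arbitrary choices made for the example.

```python
import random

def fraction_from_first(trials, rng):
    """Fraction of draws that land in the 60% band (pr < 0.6)."""
    hits = sum(1 for _ in range(trials) if rng.random() < 0.6)
    return hits / trials

rng = random.Random(2005)  # fixed seed so runs are repeatable; value is arbitrary
for trials in (6, 60, 6000, 60000):
    print(trials, round(fraction_from_first(trials, rng), 3))
```

The printed fractions should settle toward 0.6 as the number of trials grows, while the 6 draw case can easily be off by one or two items.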



          --
          Steven.
