Finding overlapping times...

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Breal

    Finding overlapping times...

    I have a list that looks like the following
    [(100000, 100010), (100005, 100007), (100009, 100015)]

    I would like to be able to determine which of these overlap each
    other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple
    2 overlaps with 1. Tuple 3 overlaps with tuple 1.

    In my scenario I would have hundreds, if not thousands of these
    ranges. Any nice pythonic way to do this?

    Thanks.
  • Shane Geiger

    #2
    Re: Finding overlapping times...

    You should give a more real data set (maybe 20 various pairs so that all
    scenarios can be seen) and the output you want.



    Breal wrote:
    I have a list that looks like the following
    [(100000, 100010), (100005, 100007), (100009, 100015)]
    >
    I would like to be able to determine which of these overlap each
    other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple
    2 overlaps with 1. Tuple 3 overlaps with tuple 1.
    >
    In my scenario I would have hundreds, if not thousands of these
    ranges. Any nice pythonic way to do this?
    >
    Thanks.
    >

    --
    Shane Geiger
    IT Director
    National Council on Economic Education
    sgeiger@ncee.ne t | 402-438-8958 | http://www.ncee.net

    Leading the Campaign for Economic and Financial Literacy

    Comment

    • Jonathan Gardner

      #3
      Re: Finding overlapping times...

      On Dec 13, 3:45 pm, Breal <sean.be...@cox .netwrote:
      I have a list that looks like the following
      [(100000, 100010), (100005, 100007), (100009, 100015)]
      >
      I would like to be able to determine which of these overlap each
      other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple
      2 overlaps with 1. Tuple 3 overlaps with tuple 1.
      >
      In my scenario I would have hundreds, if not thousands of these
      ranges. Any nice pythonic way to do this?
      >
      Sure.

      Start with a piece of paper and a pencil.

      Write down under what conditions two tuples of numbers will overlap.
      Be specific.

      (Hint: two numbers can be equal, less than, or greater than each
      other. That's 3 conditions. Then you can compare the first of the
      first tuple to the first of the second tuple, or the first to the
      second, or the second to the first, or the second to the second.
      That's 4 conditions. 3x4=12 so you have 12 possible conditions when
      comparing two tuples of two numbers. Describe what the result should
      be for each one. Being graphical will help you get it right.)

      Once you have that written down, translate it to python.

      In python, write a loop that goes through each item in the list and
      compares it to every other item. Remember which items compare
      favorably by storing them in a list.

      When the loop finishes, you should have a list of all the pairs that
      match your conditions.

      Since this sounds like a homework assignment, the rest is left as an
      exercise to the reader.

      Comment

      • John Machin

        #4
        Re: Finding overlapping times...

        On Dec 14, 10:45 am, Breal <sean.be...@cox .netwrote:
        I have a list that looks like the following
        [(100000, 100010), (100005, 100007), (100009, 100015)]
        >
        I would like to be able to determine which of these overlap each
        other. So, in this case, tuple 1 overlaps with tuples 2 and 3.
        Your definition of overlaps appears to be reflexive, so the following
        two results follow automatically.
        Tuple
        2 overlaps with 1. Tuple 3 overlaps with tuple 1.
        >
        In my scenario I would have hundreds, if not thousands of these
        ranges. Any nice pythonic way to do this?
        Probably not.
        Just ensure that (a) your time intervals (start, end) obey start <=
        end (b) your list of intervals is sorted, then get stuck in:

        # tested no more that what you see
        alist = [(100000, 100010), (100005, 100007), (100009, 100015),
        (100016, 100017), (100016, 100018)]
        n = len(alist)
        for i in xrange(n - 1):
        istart, iend = alist[i]
        for j in xrange(i + 1, n):
        jstart, jend = alist[j]
        if jstart iend:
        break
        print "Overlap:", i, alist[i], j, alist[j]

        After the sort, it looks like O(N**2) in worst case, but you could get
        a few zillion results done while you are hunting for a faster
        algorithm.

        Cheers,
        John

        Comment

        • Dave Hansen

          #5
          Re: Finding overlapping times...

          On Dec 13, 5:45 pm, Breal <sean.be...@cox .netwrote:
          I have a list that looks like the following
          [(100000, 100010), (100005, 100007), (100009, 100015)]
          >
          I would like to be able to determine which of these overlap each
          other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple
          2 overlaps with 1. Tuple 3 overlaps with tuple 1.
          >
          In my scenario I would have hundreds, if not thousands of these
          ranges. Any nice pythonic way to do this?
          >
          What are you going to do with the result? Do you need to know if
          there are any overlaps? The number of overlaps? Given any particular
          range, what ranges overlap?

          There's really no way, other than to compare each range. Note that
          the relationship is symmetric, so you can save half you work. A
          simple-minded approach might be

          ---

          def overlaps(t1, t2):
          return (t2[1] >= t1[0]) and (t2[0] <= t1[1])

          in_data = [(100000, 100010), (100005, 100007), (100009, 100015)]

          while (len(in_data) 1):
          target = in_data.pop(0)
          for test in in_data:
          if overlaps(target ,test):
          print "%s overlaps with %s" % (repr(target),r epr(test))

          ---

          If you want to save the information for later retrieval, you could
          build a dictionary with the ranges as keys:

          ---
          ovr_map = {}

          while len(in_data) 1:
          target = in_data.pop(0)
          for test in in_data:
          if overlaps(target ,test):
          if ovr_map.has_key (target): ovr_map[target].append(test)
          else: ovr_map[target] = [test]
          if ovr_map.has_key (test): ovr_map[test].append(target)
          else: ovr_map[test] = [target]

          for target in ovr_map.keys():
          for test in ovr_map[target]:
          print "%s overlaps with %s" % (repr(target),r epr(test))
          ---

          I don't know that there isn't a more Pythonic way, I'm not much of an
          expert. HTH,

          -=Dave

          Comment

          • John Machin

            #6
            Re: Finding overlapping times...

            On Dec 14, 12:05 pm, Dave Hansen <i...@hotmail.c omwrote:
            On Dec 13, 5:45 pm, Breal <sean.be...@cox .netwrote:
            >
            I have a list that looks like the following
            [(100000, 100010), (100005, 100007), (100009, 100015)]
            >
            I would like to be able to determine which of these overlap each
            other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple
            2 overlaps with 1. Tuple 3 overlaps with tuple 1.
            >
            In my scenario I would have hundreds, if not thousands of these
            ranges. Any nice pythonic way to do this?
            >
            What are you going to do with the result? Do you need to know if
            there are any overlaps? The number of overlaps? Given any particular
            range, what ranges overlap?
            >
            There's really no way, other than to compare each range. Note that
            the relationship is symmetric, so you can save half you work. A
            simple-minded approach might be
            >
            ---
            >
            def overlaps(t1, t2):
            return (t2[1] >= t1[0]) and (t2[0] <= t1[1])
            >
            in_data = [(100000, 100010), (100005, 100007), (100009, 100015)]
            >
            while (len(in_data) 1):
            target = in_data.pop(0)
            A tad excessive:

            pop(0) is O(N)
            pop() is O(1)

            Pythonic and likely to be faster, Take 1:

            while in_data:
            target = in_data.pop()

            Pythonic and likely to be faster, Take 2:

            while 1:
            try:
            target = in_data.pop()
            except IndexError:
            break
            for test in in_data:
            if overlaps(target ,test):
            print "%s overlaps with %s" % (repr(target),r epr(test))
            Cheers,
            John

            Comment

            • Terry Jones

              #7
              Re: Finding overlapping times...

              Hi Breal
              I have a list that looks like the following
              [(100000, 100010), (100005, 100007), (100009, 100015)]
              >
              I would like to be able to determine which of these overlap each
              other. So, in this case, tuple 1 overlaps with tuples 2 and 3. Tuple
              2 overlaps with 1. Tuple 3 overlaps with tuple 1.
              >
              In my scenario I would have hundreds, if not thousands of these
              ranges. Any nice pythonic way to do this?
              This may be of no help, but....

              In 1995 I wrote a C library to do some stuff like this. You're welcome to
              the code. It might give you some ideas, or you might want to wrap it for
              Python. If you did that, and had additional energy, you could release the
              result to the Python community.

              As someone said, how/what to do depends what you want to do with the
              resulting data structure once you've processed the inputs.

              In my case I wanted to be able to read in a large number of ranges and be
              able to answer questions like "is X in the range?". I'm pretty sure I made
              it possible to add ranges to ignore (i.e., that punch holes in an existing
              range). I'm also pretty sure I'd have made this run in n log n time, by
              first sorting all the lower bounds (and maybe sorting the upper ones too)
              and then walking that list.

              One use case is e.g., for a printer program command line:

              print --inc-pages 40-70 --exc-pages 51-52 --inc-pages 100-120

              which could use the library to step through the range of pages in a doc and
              easily check which ones should be printed. There are lots of other uses.
              Lookup time on those queries was probably log n where n is the resulting
              number of range fragments once the processing (merging/splitting) of the
              command line was done.

              I don't remember, but it took me several days to write the code, so it
              wasn't completely trivial.

              I can't offer help beyond the code I'm afraid - I have too much other stuff
              going on.

              Terry

              Comment

              • Neil Cerutti

                #8
                Re: Finding overlapping times...

                On 2007-12-14, John Machin <sjmachin@lexic on.netwrote:
                On Dec 14, 10:45 am, Breal <sean.be...@cox .netwrote:
                >I have a list that looks like the following
                >[(100000, 100010), (100005, 100007), (100009, 100015)]
                >>
                >I would like to be able to determine which of these overlap each
                >other. So, in this case, tuple 1 overlaps with tuples 2 and 3.
                >
                Your definition of overlaps appears to be reflexive, so the following
                two results follow automatically.
                >
                >Tuple
                >2 overlaps with 1. Tuple 3 overlaps with tuple 1.
                >>
                >In my scenario I would have hundreds, if not thousands of these
                >ranges. Any nice pythonic way to do this?
                >
                Probably not.
                Just ensure that (a) your time intervals (start, end) obey start <=
                end (b) your list of intervals is sorted, then get stuck in:
                >
                # tested no more that what you see
                alist = [(100000, 100010), (100005, 100007), (100009, 100015),
                (100016, 100017), (100016, 100018)]
                n = len(alist)
                for i in xrange(n - 1):
                istart, iend = alist[i]
                for j in xrange(i + 1, n):
                jstart, jend = alist[j]
                if jstart iend:
                break
                print "Overlap:", i, alist[i], j, alist[j]
                >
                After the sort, it looks like O(N**2) in worst case, but you
                could get a few zillion results done while you are hunting for
                a faster algorithm.
                Simply printing out the list of overlaps, even if you knew, a
                priori, that all elements overlapped, is an O(N**2) operation. So
                I don't think a better algorithm exists for the worst case.

                --
                Neil Cerutti
                You only get a once-in-a-lifetime opportunity so many times. --Ike Taylor

                Comment

                Working...