How to use a 5 or 6 bit integer in Python?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Glen Wheeler

    #16
    Re: How to use a 5 or 6 bit integer in Python?

    On 19 Dec 2003 06:02:23 GMT, bokr@oz.net (Bengt Richter) wrote:
    [color=blue]
    >If we can tease out a little more about your problem, maybe we can help better ;-)
    >E.g., how do your tuple-keys come into being and how many times are they re-used?
    >If there is no nesting, you could create a string key just by ''.join([chr(el) for el in tup])
    >which wouldn't be as compact as a compressed bit string, but would be smaller than the tuple.
    >If you were lucky, a change could gain you both time and space.
    >[/color]

    I'll be implementing that this weekend, as my employer is expecting
    results before Christmas. Damn deadlines.
    [color=blue]
    >I'm curious what your dicts and their int tuples represent in the real world ;-)[/color]

    Well, I'm a researcher for an Australian university (University of
    Wollongong) and my current task is to enumerate every possible 6 digit
    binary Gray code.
    Most recently people have done the 5 digit BGCs, something which I
    have also accomplished, but never the 6 digit codes.
    For a graphical representation, think of a cube in n-dimensions,
    where n represents the number of digits in the code. A binary Gray
    code around that cube represents a path starting from any arbitrary
    point and then visiting every vertex exactly once.
    The applications of such research are enourmous, and I'll not go
    over them here, but that is my aim. To find the number of ways a 6
    dimensional fly could walk a 6d cube visiting every vertex exactly
    once.
    I've actually been working on this problem for many months now and
    have gotten so close. The previous best estimate for calculation time
    (with a program written in C, I might add) was 2.8 * 10^15 years.
    I've got the thing down to about 10^4 years. If the calculation
    becomes tractable on a supercomputer, e.g. Barossa hosted at
    www.ac3.com.au, then I'll run it on that.

    I hope that's sated your curiosity for now :). If you'd like any
    more information, or if anyone else would like to know anything about
    this, then I'll be happy to correspond with them. I don't know how
    on-topic it is for c.l.py.

    --
    Glen

    Comment

    • Hartmut Goebel

      #17
      Re: How to use a 5 or 6 bit integer in Python?

      Hi,

      Glen Wheeler wrote:
      [color=blue]
      > Component ints will either be 0-63 or 0-31. In a single run of the
      > program, this will not change throughout.
      > The key tuples range in length from 1-64, depending on at what stage
      > the program is currently at. Every tuple will have the same length at
      > some point; the greatest possible distance in length from any two
      > tuples is 1.[/color]

      I'm afraid, I'll mix up some of your data-structure, but here are my
      thoughts:

      - pack the key-tuples into a string using pack; you may even pack 4
      intergers into one byte since the values range from 1-64 (which is
      equivalent to 0-63).

      - if you need easy access to eg. the last emlement of the tuple, keep it
      seperate. Eg: key = ('xdef', 12)

      - As far as I remeber, you wrote something about max. 64 entries. What
      about using an array/list here? Array may need a lot less memory
      (depending on the muber of elements of course).

      Only my 2 cents.

      --
      Regards
      Hartmut Goebel

      | Hartmut Goebel | We build the crazy compilers |
      | h.goebel@crazy-compilers.com | Compiler Manufacturer |

      Comment

      • Oren Tirosh

        #18
        Re: How to use a 5 or 6 bit integer in Python?

        On Fri, Dec 19, 2003 at 06:27:59AM +0000, Rainer Deyke wrote:[color=blue]
        > Encoding the key tuples as strings is easy (''.join([chr(x) for x in
        > the_tuple]). Encoding the ints in the data tuples is just as easy. I'm not
        > sure if those are enough to solve your problem.[/color]

        Using strings is a really, really good idea. Strings are compact and
        Python dictionaries are highly optimized for use with string keys with
        tricks like caching of the string hash and faster code paths for special
        cases like dicts that have only string keys.

        There is a much faster way for converting tuples of integers to strings.
        Instead of using the Python tuple type use arrays of bytes and then apply
        the array object's .tostring() method:
        [color=blue][color=green][color=darkred]
        >>> from array import array
        >>> the_array = array('b', range(40))
        >>> the_string = the_array.tostr ing()[/color][/color][/color]

        It's about 35 times faster than ''.join([chr(x) for x in the_array])
        [color=blue][color=green][color=darkred]
        >>> back_to_array = array('b', the_string)[/color][/color][/color]

        Oren

        Comment

        • Oren Tirosh

          #19
          Re: How to use a 5 or 6 bit integer in Python?

          On Fri, Dec 19, 2003 at 06:26:22PM +1100, Glen Wheeler wrote:
          .....[color=blue]
          > tuple of integers as it's data type. All integers are from 0-63.
          > For the second dict: Trivially small, with a maximum of 32
          > elements, starts at 0 and an average of around 7. Keys are kept as
          > integers from 0-63 and data ranges from -1-30, i[/color]

          It's your lucky day... All your integers are from -1 to 63 which fits
          nicely in Python's range of "privileged " integers from -1 to 100.
          These are handled much faster by using a static table of integer
          objects without the overhead of allocation and deallocatiom.

          Oren

          Comment

          • Anton Vredegoor

            #20
            Re: How to use a 5 or 6 bit integer in Python?

            Glen Wheeler <adsl5lcq@tpg.c om.au> wrote:
            [color=blue]
            > Well, I'm a researcher for an Australian university (University of
            >Wollongong) and my current task is to enumerate every possible 6 digit
            >binary Gray code.[/color]

            It is possible to write a function that turns every integer into its
            "reflected" form so that the normal enumeration of the integers
            corresponds to a gray code (successors that differ only in one bit
            position) sequence. This can be done without enumerating them all.

            Here's some code from a book by Kreher and Stinson (Combinatorial
            ALgorithms) which I translated into Python from the pseudo-code in the
            book. I made some adaptations and came up with this:

            import sets

            def unrank(n,r):
            T,b1 = [],0
            for i in range(n-1,-1,-1):
            b0 = r/(2**i)
            if b0 != b1:
            T.append(n-i-1)
            b1 = b0
            r -= b0*2**i
            return T

            def rank(n,T):
            S = sets.Set(T)
            r,b = 0,0
            for i in range(n-1,-1,-1):
            if n-i-1 in S:
            b = 1-b
            if b == 1:
            r += 2**i
            return r

            def reflect(n,r):
            S = sets.Set(unrank (n,r))
            L = ["01"[i in S] for i in range(n)]
            return int("".join(L), 2)

            def test():
            n = 2**6
            T = [None]
            for i in range(100):
            U = unrank(n,i)
            print i, reflect(n,i),U
            assert rank(n,U) == i
            assert len(sets.Set(T) ^ sets.Set(U)) == 1
            T = U

            if __name__=='__ma in__':
            test()
            [color=blue]
            > I hope that's sated your curiosity for now :). If you'd like any
            >more information, or if anyone else would like to know anything about
            >this, then I'll be happy to correspond with them. I don't know how
            >on-topic it is for c.l.py.[/color]

            If it's interesting and provides an excuse for exercising our
            programming skills in our favorite language, it's on topic. It's even
            on topic if it's outrageously off-topic enough, so take your pick :-)

            I guess I must have missed something because if the problem's solution
            can be looked up in a book it probably is not what you are looking
            for. So why is it not enough to have an indexing function and why do
            you need to have all values literally?

            Anton



            Comment

            • Bengt Richter

              #21
              Re: How to use a 5 or 6 bit integer in Python?

              On Sat, 20 Dec 2003 19:38:36 +1100, Glen Wheeler <adsl5lcq@tpg.c om.au> wrote:
              [color=blue]
              >On 19 Dec 2003 06:02:23 GMT, bokr@oz.net (Bengt Richter) wrote:
              >[color=green]
              >>If we can tease out a little more about your problem, maybe we can help better ;-)
              >>E.g., how do your tuple-keys come into being and how many times are they re-used?
              >>If there is no nesting, you could create a string key just by ''.join([chr(el) for el in tup])
              >>which wouldn't be as compact as a compressed bit string, but would be smaller than the tuple.
              >>If you were lucky, a change could gain you both time and space.
              >>[/color]
              >
              > I'll be implementing that this weekend, as my employer is expecting
              >results before Christmas. Damn deadlines.
              >[color=green]
              >>I'm curious what your dicts and their int tuples represent in the real world ;-)[/color]
              >
              > Well, I'm a researcher for an Australian university (University of
              >Wollongong) and my current task is to enumerate every possible 6 digit
              >binary Gray code.
              > Most recently people have done the 5 digit BGCs, something which I
              >have also accomplished, but never the 6 digit codes.
              > For a graphical representation, think of a cube in n-dimensions,
              >where n represents the number of digits in the code. A binary Gray
              >code around that cube represents a path starting from any arbitrary
              >point and then visiting every vertex exactly once.
              > The applications of such research are enourmous, and I'll not go
              >over them here, but that is my aim. To find the number of ways a 6
              >dimensional fly could walk a 6d cube visiting every vertex exactly
              >once.
              > I've actually been working on this problem for many months now and
              >have gotten so close. The previous best estimate for calculation time
              >(with a program written in C, I might add) was 2.8 * 10^15 years.
              >I've got the thing down to about 10^4 years. If the calculation
              >becomes tractable on a supercomputer, e.g. Barossa hosted at
              >www.ac3.com.au, then I'll run it on that.
              >
              > I hope that's sated your curiosity for now :). If you'd like any
              >more information, or if anyone else would like to know anything about
              >this, then I'll be happy to correspond with them. I don't know how
              >on-topic it is for c.l.py.
              >[/color]
              Just to see if I understand your problem statement, does the following (though it's slow)
              do what you are trying to do? If so, what would be the point of generating
              all those gray code sequences? IOW, how are you going to use them? Assuming my
              prog generated all the 6-bit grey code seqences to a monster file, all the 64-number
              sequences packed into 64-byte strings end to end, then you would either have to
              read that file sequentially, or seek to some point and read 64 bytes to get a
              given gray code sequence. If the latter, how about an algorithm that would
              do a virtual seek into the whole data as a virtual file, and then generate
              what you would read? I.e., there would be no point in storing all that data
              if you could easily generate an arbitrary segment of it. Of course, maybe that's
              not so easy ;-) But I'm wonder what the use is going to be, and the access pattern,
              if you assumed availability in a monster file, virtual or not.

              If you really do want to generate all that data, I suppose the loop and recursions
              could be fanned out to parallel processing. But first, do I understand the problem?

              ====< gray.py >============== =============== =============== =============
              # gray.py -- enumerate n-bit gray codes
              # 20031220 10:34:27 alpha version by Bengt Richter. No warranties ;-)
              #
              def enumgraycodes(n ):
              """
              Enumerate all possible complete n-bit gray code sequences
              returning a list of 2**n-length strings whose bytes each
              contain a complete gray code sequence in the ord values.
              """
              visit_mask = 2L**(2**n)-1
              nbits = [1<<i for i in xrange(n)]
              vbits = [1L<<i for i in xrange(2**n)]
              codes = []
              def visit(curr=0, visited=0L, code=''):
              """visit curr, and recursively available successors"""
              visited |= vbits[curr]
              code += chr(curr)
              if visited == visit_mask:
              codes.append(co de)
              return
              for nbit in nbits:
              succ = curr^nbit
              if vbits[succ]&visited: continue
              visit(succ, visited, code)
              for start in xrange(2**n):
              visit(start)
              return codes

              if __name__ == '__main__':
              import sys
              codes = enumgraycodes(i nt(sys.argv[1]))
              for code in codes:
              print ''.join(['%02x'%ord(c) for c in code])
              =============== =============== =============== =============== ============

              Example: (even 4 bits generates >3mb and takes a while, so I won't post it ;-)

              [10:41] C:\pywk\clp\gra y>python gray.py 2
              00010302
              00020301
              01000203
              01030200
              02030100
              02000103
              03020001
              03010002

              [10:41] C:\pywk\clp\gra y>python gray.py 3
              000103020607050 4
              000103020604050 7
              000103070504060 2
              000105040607030 2
              000105040602030 7
              000105070302060 4
              000203010504060 7
              000203010507060 4
              000203070604050 1
              000206070301050 4
              000206040507030 1
              000206040501030 7
              000405070602030 1
              000405010302060 7
              000405010307060 2
              000406070501030 2
              000406020301050 7
              000406020307050 1
              010002030706040 5
              010002030705040 6
              010002060405070 3
              010004050706020 3
              010004050703020 6
              010004060203070 5
              010302000405070 6
              010302000406070 5
              010302060705040 0
              010307060200040 5
              010307050406020 0
              010307050400020 6
              010504060703020 0
              010504000203070 6
              010504000206070 3
              010507060400020 3
              010507030200040 6
              010507030206040 0
              020301000405070 6
              [...]
              060200040501030 7
              070604050100020 3
              070604050103020 0
              070604000203010 5
              070602030100040 5
              070602030105040 0
              070602000405010 3
              070504060203010 0
              070504060200010 3
              070504000103020 6
              070501000406020 3
              070501030200040 6
              070501030206040 0
              070302000105040 6
              070302060405010 0
              070302060400010 5
              070301000206040 5
              070301050406020 0
              070301050400020 6

              Regards,
              Bengt Richter

              Comment

              • Glen Wheeler

                #22
                Re: How to use a 5 or 6 bit integer in Python?

                On 20 Dec 2003 18:35:31 GMT, bokr@oz.net (Bengt Richter) wrote:
                [color=blue][color=green]
                >>[/color]
                >Just to see if I understand your problem statement, does the following (though it's slow)
                >do what you are trying to do?[/color]

                Not exactly, although if you wc -l the result and then wrote down
                the number, then it would. I am only interested in the final result.
                [color=blue]
                >If so, what would be the point of generating
                >all those gray code sequences? IOW, how are you going to use them? Assuming my
                >prog generated all the 6-bit grey code seqences to a monster file, all the 64-number
                >sequences packed into 64-byte strings end to end, then you would either have to
                >read that file sequentially, or seek to some point and read 64 bytes to get a
                >given gray code sequence. If the latter, how about an algorithm that would
                >do a virtual seek into the whole data as a virtual file, and then generate
                >what you would read? I.e., there would be no point in storing all that data
                >if you could easily generate an arbitrary segment of it.[/color]

                Generating a single code of arbitrary digits is a relatively simple
                task; I am interested in generating the total *number*, not every
                code. Each of the codes I am generating also has (as one of the
                integers in it's data tuple) a `multiplicity' which is how many other
                Gray codes of that length are symmetrically equivalent.
                I am merely generating a very small subset which I can recombine (as
                a polynomial, say) and then work out how many there will be in total.
                [color=blue]
                >Of course, maybe that's
                >not so easy ;-) But I'm wonder what the use is going to be, and the access pattern,
                >if you assumed availability in a monster file, virtual or not.
                >[/color]

                Incidentally, I did try the method you have below. Almost identical
                I might add! It ran for around 3 weeks, and generated just over a
                million 5-digit Gray codes. For some interest, if G(n) represents the
                number of n-digit Gray codes then :

                G(1) = 2
                G(2) = 8 (as you show below)
                G(3) = 144
                G(4) = 91 392
                G(5) = 187 499 658 240 (I can calculate this in under 8 hours)
                G(6) = ?? (but it's around 10^24)

                So if you allocate each code a minimal number of bits, storing every
                5 or 6 digit BGC is a very different and much more impossible task
                than that which I am trying to achieve.
                [color=blue]
                >If you really do want to generate all that data, I suppose the loop and recursions
                >could be fanned out to parallel processing. But first, do I understand the problem?
                >
                >====< gray.py >============== =============== =============== =============[/color]

                If instead of printing the successful code, or storing it, you
                simply incremented an internal counter, then yes that is exactly the
                problem I need to solve. For 6 digits.
                My representations and algorithm come from alot of theory that me
                and other associates have proven, although I can post it here or to
                your e-mail if you would like. I haven't included any of these
                tuples-as-strings speedups yet, as I'm working on integrating some new
                theory into the program, but comments would always be appreciated.
                It would be ironic if you did take me up on this, as I recently
                deleted all the comments from my code because I felt they were
                cluttering the code itself :).

                --
                Glen

                Comment

                Working...