how to get the thighest bit position in big integers?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • M.-A. Lemburg

    #16
    Re: how to get the thighest bit position in big integers?

    On 2008-10-05 17:26, mmgarvey@gmx.de wrote:
    Hi,
    >
    I'm using python to develop some proof-of-concept code for a
    cryptographic application. My code makes extended use of python's
    native bignum capabilities.
    >
    In many cryptographic applications there is the need for a function
    'get_highest_bi t_num' that returns the position number of the highest
    set bit of a given integer. For example:
    >
    get_highest_bit _num( (1 << 159)) == 159
    get_highest_bit _num( (1 << 160) - 1) == 159
    get_highest_bit _num( (1 << 160)) == 160
    >
    I implemented this the following way:
    >
    def get_highest_bit _num(r):
    i = -1
    while r 0:
    r >>= 1
    i = i + 1
    return i
    >
    This works, but it is a very unsatisfying solution, because it is so
    slow.
    >
    My second try was using the math.log function:
    >
    import math
    r = (1 << 160) - 1
    print highest_bit_num (r) # prints out 159
    print math.floor(math .log(r, 2)) # prints out 160.0
    >
    We see that math.log can only serve as a heuristic for the highest bit
    position. For small r, for example r = (1 << 16) - 1, the result from
    math.log(, 2) is correct, for big r it isn't any more.
    >
    My question to the group: Does anyone know of a non-hackish way to
    determine the required bit position in python? I know that my two
    ideas
    can be combined to get something working. But is there a *better* way,
    that isn't that hackish?
    In Python 2.6, you can use the bin() builtin:
    >>len(bin(1<<15 9)) - 3
    159

    --
    Marc-Andre Lemburg
    eGenix.com

    Professional Python Services directly from the Source (#1, Oct 06 2008)
    >>Python/Zope Consulting and Support ... http://www.egenix.com/
    >>mxODBC.Zope.D atabase.Adapter ... http://zope.egenix.com/
    >>mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
    _______________ _______________ _______________ _______________ ____________

    :::: Try mxODBC.Zope.DA for Windows,Linux,S olaris,MacOSX for free ! ::::


    eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
    Registered at Amtsgericht Duesseldorf: HRB 46611

    Comment

    • Fredrik Johansson

      #17
      Re: how to get the thighest bit position in big integers?

      On Oct 5, 5:26 pm, mmgar...@gmx.de wrote:
      Hi,
      >
      My question to the group: Does anyone know of a non-hackish way to
      determine the required bit position in python? I know that my two
      ideas
      can be combined to get something working. But is there a *better* way,
      that isn't that hackish?
      No non-hackish way.

      I've myself needed a version of this function for a speed-critical
      application. For my purposes, it needs to be fast both for small and
      extremely large numbers; the following is the fastest version I've
      come up with. The idea to use bisect with a precomputed list when
      possible, and math.log (with a correction) for larger numbers. There's
      a tradeoff: a longer bisect table makes it fast for larger number, but
      the bisection also becomes slower for small numbers. For my
      application, 300 turned out to be a reasonable compromise.

      from bisect import bisect
      from math import log

      powers = [1<<_ for _ in range(300)]

      def bitcount(n):
      bc = bisect(powers, n)
      if bc != 300:
      return bc
      bc = int(log(n, 2)) - 4
      return bc + bctable[n>>bc]

      bctable = map(bitcount, range(1024))

      Calling this function is still slow, so when possible, I track the
      approximate number of bits in a number across operations, and do (bc
      +bctable[n>>bc]) inline (where bc is a low estimate of the number of
      bits) to obtain the exact bit count.

      Fredrik

      Comment

      • Aaron \Castironpi\ Brady

        #18
        Re: how to get the thighest bit position in big integers?

        On Oct 6, 3:37 am, Mark Dickinson <dicki...@gmail .comwrote:
        On Oct 5, 11:40 pm, Terry Reedy <tjre...@udel.e duwrote:
        >
        Your point, that taking floor(log2(x)) is redundant, is a good catch.
        However, you should have added 'untested' ;-).  When value has more
        significant bits than the fp mantissa can hold, this expression can be 1
        off (but no more, I assume).   The following correction should be
        sufficient:
        >
        res = math.frexp(valu e)[1] - EXP_OF_ONE
        test = 1<<res
        if test r:     res -= 1
        elif 2*test < r: res += 1
        >
        For value = 2**n -1, n 53, it is always 1 too high on my Intel
        machine, so the first correction is sometimes needed.  I do not know if
        the second is or not.
        >
        See alsohttp://bugs.python.org/issue3439
        where there's a proposal to expose the _PyLong_NumBits method.  This
        would give an O(1) solution.
        >
        Mark
        That generates an odd error with ctypes.
        >>from ctypes import *
        >>_PyLong_NumBi ts= PYFUNCTYPE( c_int, py_object )( pythonapi._PyLo ng_NumBits )
        >>_PyLong_NumBi ts( 1<<50 )
        Traceback (most recent call last):
        File "_ctypes/callbacks.c", line 295, in 'calling callback function'
        ctypes.Argument Error: argument 1: <type 'exceptions.Ove rflowError'>:
        long int to
        o long to convert
        2227205
        >>>
        Seems 'ctypes' tries to call PyLong_AsUnsign edLong for you. Anyone
        know a way around it?

        Comment

        • Thomas Heller

          #19
          Re: how to get the thighest bit position in big integers?

          Mark Dickinson schrieb:
          On Oct 5, 11:40 pm, Terry Reedy <tjre...@udel.e duwrote:
          >Your point, that taking floor(log2(x)) is redundant, is a good catch.
          >However, you should have added 'untested' ;-). When value has more
          >significant bits than the fp mantissa can hold, this expression can be 1
          >off (but no more, I assume). The following correction should be
          >sufficient:
          >>
          >res = math.frexp(valu e)[1] - EXP_OF_ONE
          >test = 1<<res
          >if test r: res -= 1
          >elif 2*test < r: res += 1
          >>
          >For value = 2**n -1, n 53, it is always 1 too high on my Intel
          >machine, so the first correction is sometimes needed. I do not know if
          >the second is or not.
          >
          See also http://bugs.python.org/issue3439
          where there's a proposal to expose the _PyLong_NumBits method. This
          would give an O(1) solution.
          >
          Mark
          Here is ctypes code that works (tested with Python 2.5):

          from ctypes import *

          _PyLong_NumBits = pythonapi._PyLo ng_NumBits
          _PyLong_NumBits .argtypes = py_object,
          _PyLong_NumBits .restype = c_int

          for i in range(64):
          x = 1 << i
          print i, _PyLong_NumBits (long(x))


          Thomas

          Comment

          • mmgarvey@gmx.de

            #20
            Re: how to get the thighest bit position in big integers?

            See also http://bugs.python.org/issue3439
            where there's a proposal to expose the _PyLong_NumBits method.  This
            would give an O(1) solution.
            In every case I can think of, I've wanted (0).numbits() to be 0.
            Here is surely the wrong place to respond to

            but I want to make clear that I think that (0).numbits()==-1
            is the natural solution. At least for all square-and-multiply-like
            algorithms needed in cryptography (modular exponentiation, point
            multiplication
            on elliptic curves, generation of lucas numbers, etc) this property
            of .numbits()
            would be needed.

            Comment

            • Mark Dickinson

              #21
              Re: how to get the thighest bit position in big integers?

              On Oct 7, 5:19 pm, mmgar...@gmx.de wrote:
              but I want to make clear that I think that (0).numbits()==-1
              is the natural solution. At least for all square-and-multiply-like
              algorithms needed in [...]
              Can you clarify this? Why is -1 the natural solution? I
              can see a case for 0, for -infinity (whatever that means),
              or for raising an exception, but I can't see why -1 would
              be at all useful or natural.

              Here's a straightforward (mildly inefficient)
              implementation of the left-to-right square-and-multiply algorithm
              for powering. It works just fine with nbits(0) == 0.

              def nbits(n):
              return 0 if n == 0 else len(bin(abs(n)) )-2

              def LtoRpow(a, n):
              acc = 1
              for i in reversed(xrange (nbits(n))):
              acc *= acc
              if n & (1<<i):
              acc *= a
              return acc

              Comment

              • MRAB

                #22
                Re: how to get the thighest bit position in big integers?

                On Oct 8, 9:56 am, Mark Dickinson <dicki...@gmail .comwrote:
                On Oct 7, 5:19 pm, mmgar...@gmx.de wrote:
                >
                but I want to make clear that I think that (0).numbits()==-1
                is the natural solution. At least for all square-and-multiply-like
                algorithms needed in [...]
                >
                Can you clarify this?  Why is -1 the natural solution? I
                can see a case for 0, for -infinity (whatever that means),
                or for raising an exception, but I can't see why -1 would
                be at all useful or natural.
                >
                [snip]
                Compare it with, say, str.find, which returns the position if found
                (>=0) or -1 if not found.

                int.numbits should return the position of the highest set bit (>=0) or
                -1 if none found.

                On the other hand, if you compare it wirh str.index then int.numbits
                should raise ValueError if none found.

                Actually, the name "numbits" suggests that it returns the number of
                bits, so (1).numbits() should return 1 and (0).numbits() should return
                0. If it's called "highbit" then it suggests that it returns the
                position of the highest (set) bit, so (1).highbit() should return 0
                and (0).highbit() should return -1 (or raise an exception).

                Comment

                • Terry Reedy

                  #23
                  Re: how to get the thighest bit position in big integers?

                  MRAB wrote:
                  On Oct 8, 9:56 am, Mark Dickinson <dicki...@gmail .comwrote:
                  >On Oct 7, 5:19 pm, mmgar...@gmx.de wrote:
                  >>
                  >>but I want to make clear that I think that (0).numbits()==-1
                  >>is the natural solution. At least for all square-and-multiply-like
                  >>algorithms needed in [...]
                  >Can you clarify this? Why is -1 the natural solution? I
                  >can see a case for 0, for -infinity (whatever that means),
                  >or for raising an exception, but I can't see why -1 would
                  >be at all useful or natural.
                  >>
                  [snip]
                  Compare it with, say, str.find, which returns the position if found
                  (>=0) or -1 if not found.
                  str.find is an historical anomaly that should not be copied. It
                  was(is?) a wrapper for C's string find function. C routinely uses -1 to
                  mean None for functions statically typed to return ints. The Python
                  version logically should return None and usually does for other functions.

                  Comment

                  • greg

                    #24
                    Re: how to get the thighest bit position in big integers?

                    Terry Reedy wrote:
                    str.find is an historical anomaly that should not be copied. It
                    was(is?) a wrapper for C's string find function. C routinely uses -1 to
                    mean None for functions statically typed to return ints. The Python
                    version logically should return None and usually does for other functions.
                    Although Guido has defended it on the grounds that it can
                    be inconvenient having a function that returns different
                    types under different circumstances. Also it discourages
                    making the mistake of treating the return value as a
                    boolean.

                    --
                    Greg

                    Comment

                    • Terry Reedy

                      #25
                      Re: how to get the thighest bit position in big integers?

                      greg wrote:
                      Terry Reedy wrote:
                      >
                      >str.find is an historical anomaly that should not be copied. It
                      >was(is?) a wrapper for C's string find function. C routinely uses -1
                      >to mean None for functions statically typed to return ints. The
                      >Python version logically should return None and usually does for other
                      >functions.
                      I consider str.find to be an anomaly in two senses.

                      First, it is the only function/method I can think of that directly
                      duplicates another function/method by replace raising an exception with
                      returning an error indicator. Some developers wanted to drop it from
                      3.0, and I too wish it had been. We get along fine without list.find
                      and a whole slew of other such duplicates that one could think of.

                      Second, it is the only function/method I can think of that follows the
                      Unix/C convention of using '-1' as an error/exception/no-can-do
                      indicator. When it is so used, it is not really an int, but an error
                      indication represented as an int because there is no other choice. In
                      Python, there is another choice -- None -- which is used routinely,
                      including its use as the default return when there is nothing else to
                      return.
                      Although Guido has defended it on the grounds that it can
                      be inconvenient having a function that returns different
                      types under different circumstances. Also it discourages
                      making the mistake of treating the return value as a
                      boolean.
                      I consider this backwards. Since -1 is a legal index in Python (unlike
                      some other languages) as well as a legal int, returning -1 allows if not
                      encourages the mistake of treating the fake return value as if it were a
                      real value. Returning None would completely prevent any use of the
                      error indicator other than to test it, which is the only thing one
                      should be able to do with it. It should not be usable as an index or
                      number in further calculations. I consider it a principle that error
                      indicators that are returned rather than raised should be an
                      illegal-to-use value if not a different type.

                      As for convenience, "if s.find(t) is None:" is as easily to write (and
                      clearer to read) as "if s.find(t) == -1:". As far as I can see,
                      different return types are only an issue, if at all, if values of both
                      types are to be used in further calculations.

                      Terry Jan Reedy


                      Comment

                      • Aaron \Castironpi\ Brady

                        #26
                        Re: how to get the thighest bit position in big integers?

                        On Oct 8, 7:21 pm, greg <g...@cosc.cant erbury.ac.nzwro te:
                        Terry Reedy wrote:
                        str.find is an historical anomaly that should not be copied.  It
                        was(is?) a wrapper for C's string find function.  C routinely uses -1to
                        mean None for functions statically typed to return ints.  The Python
                        version logically should return None and usually does for other functions.
                        >
                        ....
                        [I]t can
                        be inconvenient having a function that returns different
                        types under different circumstances.
                        ....

                        No. It has precedent and there is no cost to convenience in pure
                        Python.

                        Perhaps you are passing the result straight to a C extension, and
                        parsing it straight to integer, but I can't attest to how common that
                        use case is.

                        Comment

                        • mmgarvey@gmx.de

                          #27
                          Re: how to get the thighest bit position in big integers?

                          On Oct 8, 10:56 am, Mark Dickinson <dicki...@gmail .comwrote:
                          >I think that (0).numbits()==-1 is the natural solution.
                          >
                          Can you clarify this?  Why is -1 the natural solution? I
                          can see a case for 0, for -infinity (whatever that means),
                          or for raising an exception, but I can't see why -1 would
                          be at all useful or natural.
                          You are right. I misunderstood the definition of nbits(). I initially
                          had asked for a function get_highest_bin _num(), which is always one
                          off
                          nbits() as it seems. I correct my sentence to

                          The choice get_highest_bin _num(0) == -1 is the natural one (in
                          concerns of cryptography). This property is needed for the
                          algorithms I listed above.

                          If get_highest_bit _num(n) == nbits(n) - 1 holds for all n >= 0 then
                          we
                          both agree.

                          I cite from MRAB's post:
                          int.numbits should return the position of the highest set bit (>=0)
                          or -1 if none found.
                          It seems that MRAB had the same misunderstandin g.

                          Perhaps one should implement both: numbits() and get_highest_bit _num()
                          where
                          numbits(0) = 0 and get_highest_bit _num(0) raises an exception?

                          Comment

                          • Nick Mellor

                            #28
                            Re: how to get the thighest bit position in big integers?

                            On Oct 6, 3:40 am, Gary Herron <gher...@island training.comwro te:
                            mmgar...@gmx.de wrote:
                            Hi,
                            >
                            I'm using python to develop some proof-of-concept code for a
                            cryptographic application. My code makes extended use of python's
                            native bignum capabilities.
                            >
                            In many cryptographic applications there is the need for a function
                            'get_highest_bi t_num' that returns the position number of the highest
                            set bit of a given integer. For example:
                            >
                               get_highest_bit _num( (1 << 159))     == 159
                               get_highest_bit _num( (1 << 160) - 1) == 159
                               get_highest_bit _num( (1 << 160))     == 160
                            >
                            How about a binary search?
                            >
                            >
                            >
                            >from bisect import bisect
                            >BITS = [1<<n for n in range(1,1000)]  # Use max number of bits here.
                            >bisect(BITS, 1<<159)
                            159
                            >bisect(BITS, 1<<160-1)
                            159
                            >bisect(BITS, 1<<160)
                            160
                            >
                            I have no clue if this is faster or not.  The comparison function used
                            in the search is (potentially) slow, but the search is O(log n) on the
                            number of bits rather than O(n), so its worth a test.
                            >
                            If you run timing test, let us know the results.
                            >
                            Gary Herron
                            >
                            I implemented this the following way:
                            >
                            def get_highest_bit _num(r):
                                i = -1
                                while r 0:
                                    r >>= 1
                                    i = i + 1
                                return i
                            >
                            This works, but it is a very unsatisfying solution, because it is so
                            slow.
                            >
                            My second try was using the math.log function:
                            >
                            import math
                            r = (1 << 160) - 1
                            print highest_bit_num (r)              # prints out 159
                            print math.floor(math .log(r, 2))      # prints out 160.0
                            >
                            We see that math.log can only serve as a heuristic for the highest bit
                            position. For small r, for example r = (1 << 16) - 1, the result from
                            math.log(, 2) is correct, for big r it isn't any more.
                            >
                            My question to the group: Does anyone know of a non-hackish way to
                            determine the required bit position in python? I know that my two
                            ideas
                            can be combined to get something working. But is there a *better* way,
                            that isn't that hackish?
                            >>
                            >

                            The following might be worth a try. It's faster the fewer set bits
                            there are in the original number, and lightning fast if there are only
                            one or two:

                            def get_highest_bit _num(i):
                            while i>0: highestbit, i = i, i & (i-1)
                            return highestbit
                            >>highestbit(1< <31)
                            2147483648L
                            >>highestbit(1< <4)
                            16
                            >>highestbit(3< <4)
                            32

                            Note that it returns the value of the highest bit, not its position.

                            All it's doing is successively ANDing i with (i-1) until i is zero,
                            then returning the previous value of i.

                            It works because i & (i-1) has a useful property: it returns i less
                            its least significant set bit:

                            i=6 (binary 110) =i & (i-1)==4 (binary 100)
                            i=3328 =(binary 1101_0000_0000) then i & (i-1)==3072 (binary
                            1100_0000_0000)

                            (underscores for readability)

                            As far as I know there isn't another piece of logic that helps you
                            extract the most significant bit as simply :-)

                            Best wishes,

                            Nick

                            Comment

                            • Mensanator

                              #29
                              Re: how to get the thighest bit position in big integers?

                              On Oct 29, 1:26 am, Nick Mellor <nick-gro...@back-pain-self-help.com>
                              wrote:
                              On Oct 6, 3:40 am, Gary Herron <gher...@island training.comwro te:
                              >
                              >
                              >
                              >
                              >
                              mmgar...@gmx.de wrote:
                              Hi,
                              >
                              I'm using python to develop some proof-of-concept code for a
                              cryptographic application. My code makes extended use of python's
                              native bignum capabilities.
                              >
                              In many cryptographic applications there is the need for a function
                              'get_highest_bi t_num' that returns the position number of the highest
                              set bit of a given integer. For example:
                              >
                                 get_highest_bit _num( (1 << 159))     == 159
                                 get_highest_bit _num( (1 << 160) - 1) == 159
                                 get_highest_bit _num( (1 << 160))     == 160
                              >
                              How about a binary search?
                              >
                              >>from bisect import bisect
                              >>BITS = [1<<n for n in range(1,1000)]  # Use max number of bits here.
                              >>bisect(BITS , 1<<159)
                              159
                              >>bisect(BITS , 1<<160-1)
                              159
                              >>bisect(BITS , 1<<160)
                              160
                              >
                              I have no clue if this is faster or not.  The comparison function used
                              in the search is (potentially) slow, but the search is O(log n) on the
                              number of bits rather than O(n), so its worth a test.
                              >
                              If you run timing test, let us know the results.
                              >
                              Gary Herron
                              >
                              I implemented this the following way:
                              >
                              def get_highest_bit _num(r):
                                  i = -1
                                  while r 0:
                                      r >>= 1
                                      i = i + 1
                                  return i
                              >
                              This works, but it is a very unsatisfying solution, because it is so
                              slow.
                              >
                              My second try was using the math.log function:
                              >
                              import math
                              r = (1 << 160) - 1
                              print highest_bit_num (r)              # prints out 159
                              print math.floor(math .log(r, 2))      # prints out 160.0
                              >
                              We see that math.log can only serve as a heuristic for the highest bit
                              position. For small r, for example r = (1 << 16) - 1, the result from
                              math.log(, 2) is correct, for big r it isn't any more.
                              >
                              My question to the group: Does anyone know of a non-hackish way to
                              determine the required bit position in python? I know that my two
                              ideas
                              can be combined to get something working. But is there a *better* way,
                              that isn't that hackish?
                              >>
                              The following might be worth a try. It's faster the fewer set bits
                              there are in the original number, and lightning fast if there are only
                              one or two:
                              >
                              def get_highest_bit _num(i):
                                  while i>0: highestbit, i = i, i & (i-1)
                                  return highestbit
                              >
                              >highestbit(1<< 31)
                              2147483648L
                              >highestbit(1<< 4)
                              16
                              >highestbit(3<< 4)
                              >
                              32
                              >
                              Note that it returns the value of the highest bit, not its position.
                              >
                              All it's doing is successively ANDing i with (i-1) until i is zero,
                              then returning the previous value of i.
                              >
                              It works because i & (i-1) has a useful property: it returns i less
                              its least significant set bit:
                              >
                              i=6 (binary 110) =i & (i-1)==4 (binary 100)
                              i=3328 =(binary 1101_0000_0000) then i & (i-1)==3072 (binary
                              1100_0000_0000)
                              >
                              (underscores for readability)
                              >
                              As far as I know there isn't another piece of logic that helps you
                              extract the most significant bit as simply :-)
                              import gmpy
                              print 'highest bit position: %d' % (len(gmpy.digit s(3328,2))-1)

                              highest bit position: 11


                              or in 2.6

                              print 'highest bit position: %d' % (len(bin(3328)[2:])-1)

                              highest bit position: 11

                              >
                              Best wishes,
                              >
                              Nick

                              Comment

                              • Mensanator

                                #30
                                Re: how to get the thighest bit position in big integers?

                                On Oct 29, 4:16 pm, Glenn Linderman <v+pyt...@g.nev cal.comwrote:
                                On approximately 10/29/2008 11:51 AM, came the following characters from
                                the keyboard of Mensanator:
                                >
                                or in 2.6
                                >
                                print 'highest bit position: %d' % (len(bin(3328)[2:])-1)
                                >
                                highest bit position: 11
                                >
                                This works, but where is it documented?  
                                Good question. Now that I think about it, I believe I learned of
                                it here, never saw it in the docs.
                                Searching the Python 2.6 docs
                                for bin found lots of things that start with bin, but none of them were
                                bin.  Searching the builtins page manual didn't find it.  Is this a bug
                                in the documentation?
                                Well ,it's mentioned in "What's new in Python 2.6":
                                <quote>
                                Python 3.0 adds several new built-in functions
                                and change the semantics of some existing built-ins.
                                Entirely new functions such as bin() have simply
                                been added to Python 2.6, but existing built-ins
                                haven’t been changed;
                                </quote>

                                You would think when you add a new function, you would
                                also add it's documentation, but maybe that was an
                                oversight. I don't have 3.0, but maybe it can be found
                                in that set of docs.
                                >
                                --
                                Glenn --http://nevcal.com/
                                =============== ============
                                A protocol is complete when there is nothing left to remove.
                                -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

                                Comment

                                Working...