unsigned 32 bit arithmetic type?

  • Robin Becker

    unsigned 32 bit arithmetic type?

    Hi, just trying to avoid wheel reinvention. I have need of an unsigned 32 bit
    arithmetic type to carry out a checksum operation and wondered if anyone had
    already defined such a beast.

    Our current code works on 32-bit CPUs, but is failing with 64-bit
    comparisons; it's clearly wrong, as we are comparing a number with a
    negated number; the bits might drop off in 32 bits, but not in 64.
    --
    Robin Becker

  • Martin v. Löwis

    #2
    Re: unsigned 32 bit arithmetic type?

    Robin Becker schrieb:
    > Hi, just trying to avoid wheel reinvention. I have need of an unsigned
    > 32 bit arithmetic type to carry out a checksum operation and wondered if
    > anyone had already defined such a beast.
    >
    > Our current code works on 32-bit CPUs, but is failing with 64-bit
    > comparisons; it's clearly wrong, as we are comparing a number with a
    > negated number; the bits might drop off in 32 bits, but not in 64.

    Not sure what operations you are doing: in Python, bits never drop off
    (at least not in recent versions).

    If you need to drop bits, you need to do so explicitly, using the
    bit-mask operations. I could tell you more if you'd tell us what
    the specific operations are.

    Regards,
    Martin
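    [Editor's note: Martin's point about explicit masking can be sketched as follows. This is an illustrative example, not code from the thread; the name `add_u32` is hypothetical. AND-ing with `0xFFFFFFFF` keeps only the low 32 bits, reproducing C's unsigned wrap-around.]

    ```python
    # Python ints never overflow, so 32-bit wrap-around must be simulated
    # by masking off everything above bit 31 after each operation.
    MASK32 = 0xFFFFFFFF

    def add_u32(a, b):
        """Add two values modulo 2**32, as C unsigned arithmetic would."""
        return (a + b) & MASK32
    ```

    For example, `add_u32(0xFFFFFFFF, 1)` wraps to 0, where plain Python addition would give 0x100000000.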


    • Robin Becker

      #3
      Re: unsigned 32 bit arithmetic type?

      Martin v. Löwis wrote:
      > Robin Becker schrieb:
      >> Hi, just trying to avoid wheel reinvention. I have need of an unsigned
      >> 32 bit arithmetic type to carry out a checksum operation and wondered if
      >> anyone had already defined such a beast.
      >>
      >> Our current code works on 32-bit CPUs, but is failing with 64-bit
      >> comparisons; it's clearly wrong, as we are comparing a number with a
      >> negated number; the bits might drop off in 32 bits, but not in 64.
      >
      > Not sure what operations you are doing: in Python, bits never drop off
      > (at least not in recent versions).
      >
      > If you need to drop bits, you need to do so explicitly, using the
      > bit-mask operations. I could tell you more if you'd tell us what
      > the specific operations are.

      This code is in a contribution to the reportlab toolkit that handles TTF fonts.
      The fonts contain checksums computed using 32-bit arithmetic. The original
      C definition is as follows

      ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
      {
          ULONG Sum = 0L;
          ULONG *EndPtr = Table + ((Length + 3) & ~3) / sizeof(ULONG);

          while (Table < EndPtr)
              Sum += *Table++;
          return Sum;
      }
      so effectively we're doing only additions and letting bits roll off the end.

      Of course the actual semantics depend on what C unsigned arithmetic does,
      so we're relying on that being the same everywhere.

      This algorithm was pretty simple in Python until 2.3, when shifts over the end
      of ints started going wrong. For some reason we didn't do the obvious thing and
      just do everything in longs, masking off the upper bits. For some reason
      (probably my fault) we seem to have accumulated code like

      def _L2U32(L):
          '''convert a long to u32'''
          return unpack('l', pack('L', L))[0]


      if sys.hexversion >= 0x02030000:
          def add32(x, y):
              "Calculate (x + y) modulo 2**32"
              return _L2U32((long(x) + y) & 0xffffffffL)
      else:
          def add32(x, y):
              "Calculate (x + y) modulo 2**32"
              lo = (x & 0xFFFF) + (y & 0xFFFF)
              hi = (x >> 16) + (y >> 16) + (lo >> 16)
              return (hi << 16) | (lo & 0xFFFF)

      def calcChecksum(data):
          """Calculates TTF-style checksums"""
          if len(data) & 3: data = data + (4 - (len(data) & 3)) * "\0"
          sum = 0
          for n in unpack(">%dl" % (len(data) >> 2), data):
              sum = add32(sum, n)
          return sum

      and also silly stuff like

      def testAdd32(self):
          "Test add32"
          self.assertEquals(add32(10, -6), 4)
          self.assertEquals(add32(6, -10), -4)
          self.assertEquals(add32(_L2U32(0x80000000L), -1), 0x7FFFFFFF)
          self.assertEquals(add32(0x7FFFFFFF, 1), _L2U32(0x80000000L))

      def testChecksum(self):
          "Test calcChecksum function"
          self.assertEquals(calcChecksum(""), 0)
          self.assertEquals(calcChecksum("\1"), 0x01000000)
          self.assertEquals(calcChecksum("\x01\x02\x03\x04\x10\x20\x30\x40"), 0x11223344)
          self.assertEquals(calcChecksum("\x81"), _L2U32(0x81000000L))

      where, while it might be reasonable to do testing, it seems the tests aren't
      very sensible, e.g. what is -6 doing in a u32 test? This stuff just about works
      on a 32-bit machine, but is failing miserably on a 64-bit AMD. As far as I can
      see I just need to use masked longs throughout.

      In a C extension I can still do the computation efficiently on a 32-bit
      machine, but I need to do masking for a 64-bit machine.
      --
      Robin Becker
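      [Editor's note: the "masked longs throughout" approach Robin describes might look like this. A sketch in modern Python with illustrative names, not code from the thread; it unpacks the data as unsigned words ("I" format), so the _L2U32 sign-juggling disappears entirely.]

      ```python
      from struct import unpack

      def calc_checksum(data):
          """TTF-style checksum: sum big-endian 32-bit words modulo 2**32.
          Illustrative sketch of the masked-longs approach."""
          if len(data) & 3:                      # pad to a 4-byte boundary
              data = data + b"\0" * (4 - (len(data) & 3))
          total = 0
          # ">%dI" unpacks unsigned 32-bit words, so no sign fix-ups are needed
          for word in unpack(">%dI" % (len(data) >> 2), data):
              total = (total + word) & 0xFFFFFFFF   # drop bits above 32 explicitly
          return total
      ```

      This reproduces the thread's test values, e.g. `calc_checksum(b"\x01")` gives 0x01000000, and works identically on 32-bit and 64-bit machines because the masking no longer depends on the platform's word size.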


      • Martin v. Löwis

        #4
        Re: unsigned 32 bit arithmetic type?

        Robin Becker schrieb:
        > Of course the actual semantics is dependent on what C unsigned
        > arithmetic does so we're relying on that being the same everywhere.

        Assuming that ULONG has the same width on all systems, the outcome
        is actually mandated by the C standard: unsigned arithmetic is
        defined to operate modulo (max_uint+1) (even if that is not a power
        of two).

        > This algorithm was pretty simple in Python until 2.3 when shifts over
        > the end of ints started going wrong.

        Actually, they start going *right* :-) Addition of two positive numbers
        never gives a negative result, in mathematics.

        > where while it might be reasonable to do testing it seems the tests
        > aren't very sensible eg what is -6 doing in a u32 test? This stuff just
        > about works on a 32-bit machine, but is failing miserably on a 64-bit
        > AMD. As far as I can see I just need to use masked longs throughout.

        Exactly.

        > In a C extension I can still do the computation efficiently on a 32-bit
        > machine, but I need to do masking for a 64-bit machine.

        Well, no. You just need to find a 32-bit unsigned integer type on the
        64-bit machine. Typically, "unsigned int" should work fine (with
        only the Cray being a notable exception, AFAIK). IOW, replace ULONG
        with uint32_t wherever you really mean an unsigned 32-bit type,
        then use stdint.h where available, else define it to unsigned int
        (with a build-time or run-time test whether sizeof(unsigned int)==4).

        Regards,
        Martin
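        [Editor's note: Martin's run-time sizeof check has a direct analogue in Python's struct module, whose native format codes follow the platform's C types. This also suggests why the thread's _L2U32 trick misbehaves on 64-bit AMD: native 'l'/'L' follow the platform's long, which is 8 bytes on LP64 systems, so pack/unpack never truncates to 32 bits there. An illustrative check, not from the original post.]

        ```python
        import struct

        # Native format codes mirror the platform's C types:
        # 'I' is unsigned int, 'L' is unsigned long.
        print(struct.calcsize('I'))  # 4 on common platforms: unsigned int is 32-bit
        print(struct.calcsize('L'))  # 8 on LP64 systems (e.g. 64-bit Linux), 4 on 32-bit ones
        ```

        The fixed-size standard codes ('=I', '<I', '>I') avoid this platform dependence altogether.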


        • sturlamolden

          #5
          Re: unsigned 32 bit arithmetic type?


          Robin Becker wrote:
          > ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
          > {
          >     ULONG Sum = 0L;
          >     ULONG *EndPtr = Table + ((Length + 3) & ~3) / sizeof(ULONG);
          >
          >     while (Table < EndPtr)
          >         Sum += *Table++;
          >     return Sum;
          > }

          Is this what you want?

          import numpy
          def CalcTableChecksum(Table, Length=None):
              tmp = numpy.array(Table, dtype=numpy.uint32)
              if Length == None: Length = tmp.size
              endptr = ((Length + 3) & ~3) / 4
              return (tmp[0:endptr]).sum()

          >>> import numpy as nx
          >>> type(nx.array([1,2,3], dtype=nx.uint32)[0])





            • Robin Becker

              #7
              Re: unsigned 32 bit arithmetic type?

              sturlamolden wrote:
              > import numpy
              > def CalcTableChecksum(Table, Length=None):
              >     tmp = numpy.array(Table, dtype=numpy.uint32)
              >     if Length == None: Length = tmp.size
              >     endptr = ((Length + 3) & ~3) / 4
              >     return (tmp[0:endptr]).sum()

              it's probably wonderful, but I don't think I can ask people to add numpy to the
              list of requirements for reportlab :)

              I used to love its predecessor Numeric, but it was quite large.

              --
              Robin Becker


              • sturlamolden

                #8
                Re: unsigned 32 bit arithmetic type?


                Robin Becker wrote:
                > it's probably wonderful, but I don't think I can ask people to add numpy to the
                > list of requirements for reportlab :)

                Maybe NumPy makes it into the core Python tree one day. At some point
                Python users other than die-hard scientists and mathematicians will
                realise that for and while loops are the root of all evil when doing
                CPU-bound operations in an interpreted language. Array slicing and
                vectorised statements can be faster by astronomical proportions.


                Here is one example: http://tinyurl.com/y79zhc

                This statement, which required twenty seconds to execute

                dim = size(infocbcr);
                image = zeros(dim(1), dim(2));
                for i = 1:dim(1)
                    for j = 1:dim(2)
                        cb = double(infocbcr(i,j,2));
                        cr = double(infocbcr(i,j,3));
                        x = [(cb-media_b); (cr-media_r)];
                        % this gives a mult of 1*2 * 2*2 * 2*1
                        image(i,j) = exp(-0.5* x'*inv(brcov)*x);
                    end
                end

                could be replaced with an equivalent condensed statement that only
                required a fraction of a second:

                image = reshape(exp(-0.5*sum(((chol(brcov)')\ ...
                    ((reshape(double(infocbcr(:,:,2:3)),dim(1)*dim(2),2)')...
                    -repmat([media_b;media_r],1,dim(1)*dim(2)))).^2)'),dim(1),dim(2));

                This was Matlab, but the same holds for Python and NumPy. The overhead
                in the first code snippet comes from calling the interpreter inside a
                tight loop. That is why loops are the root of evilness when doing CPU-
                bound tasks in an interpreted language. I would think that 9 out of 10
                tasks most Python users think require a C extension are actually more
                easily solved with NumPy. This is old knowledge from the Matlab
                community: even if you think you need a "MEX file" (that is, a C
                extension for Matlab), you probably don't. Vectorise and it will be
                fast enough.
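                [Editor's note: the same loop-versus-vectorised contrast, applied to the checksum at hand, might look like this in NumPy. An illustrative sketch, not from the thread; note that summing with dtype=uint32 gives exactly the modulo-2**32 wrap-around the TTF checksum needs.]

                ```python
                import numpy as np

                words = np.array([0x01020304, 0x10203040, 0xFFFFFFFF], dtype=np.uint32)

                # Loop version: the interpreter runs once per element.
                total = 0
                for w in words:
                    total = (total + int(w)) & 0xFFFFFFFF   # mask to 32 bits by hand

                # Vectorised version: one call; the loop and the wrap-around
                # both happen in C, since the accumulator dtype is uint32.
                vec_total = words.sum(dtype=np.uint32)      # equals total
                ```

                For large arrays the vectorised call avoids a Python-level iteration per word, which is where the "astronomical" speed difference comes from.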


                • Robin Becker

                  #9
                  Re: unsigned 32 bit arithmetic type?

                  sturlamolden wrote:
                  > Robin Becker wrote:
                  >> it's probably wonderful, but I don't think I can ask people to add numpy to the
                  >> list of requirements for reportlab :)
                  > .........
                  > This was Matlab, but the same holds for Python and NumPy. The overhead
                  > in the first code snippet comes from calling the interpreter inside a
                  > tight loop. That is why loops are the root of evilness when doing CPU-
                  > bound tasks in an interpreted language. I would think that 9 out of 10
                  > tasks most Python users think require a C extension are actually more
                  > easily solved with NumPy. This is old knowledge from the Matlab
                  > community: even if you think you need a "MEX file" (that is, a C
                  > extension for Matlab), you probably don't. Vectorise and it will be
                  > fast enough.
                  I think you're preaching to the converted. The very first serious thing I did in
                  python involved a generational accounting model calculation that was translated
                  from matlab into Numeric/python. It ran about 10 times faster than matlab and
                  about 5 times faster than a matlab compiler.
                  --
                  Robin Becker
