Re: D.E. Knuth's strlen

**jacob navia** · Oct 16 '08, 05:05 PM

Re: D.E. Knuth's strlen

Jean-Marc Bourguet wrote:

jacob navia <jacob@nospam.c omwrites:
>
>

>#include <stdio.h>
>#include <string.h>
>#define H 0x8080808080808 080ULL
>#define L 0x0101010101010 101ULL
>size_t myStrlen(char *s)
>{
> unsigned long long t;
> char *save = s;
>>
> while (1) {
> // This supposes that the input string is aligned
> // or that the machine doesn't trap when reading
> // a 8 byte integer at a random position like the
> // x86
> t = *(unsigned long long *)s;
> if (H & (t - L) & ~t)
> break;
> s += sizeof(long long);
> }

>
I'd first align and then use that. You may get a trap with unaligned
access even on machine where unaligned access doesn't normally trap if you
stump on a page boundary with the next page not mapped in memory. And some
machine allows unaligned access but at a performance penality, which would
be against the goal of using that trick.
>
Yours,
>

Yes, aligning could be done first:
{
intptr_t i = (intptr_t)s;
int n = i&(sizeof(lon g long)-1);
while (n 0 && *s)
s++,n--;
}

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique

403 Forbidden

http://www.cs.virginia.edu/~lcc-win32

**Keith Thompson** · Oct 17 '08, 12:25 AM

Re: D.E. Knuth's strlen

jacob navia <jacob@nospam.c omwrites:
[...]

Yes, aligning could be done first:
{
intptr_t i = (intptr_t)s;
int n = i&(sizeof(lon g long)-1);
while (n 0 && *s)
s++,n--;
}

The first line is non-portable, of course. But if your goal is to
implement the fastest possible strlen *for a given system*, that's not
necessarily a problem.

Why is n an int rather than a size_t?

--
Keith Thompson (The_Other_Keit h) kst-u@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

**Eric Sosman** · Oct 17 '08, 01:25 AM

Re: D.E. Knuth's strlen

Keith Thompson wrote:

jacob navia <jacob@nospam.c omwrites:
[...]

>Yes, aligning could be done first:
>{
> intptr_t i = (intptr_t)s;
> int n = i&(sizeof(lon g long)-1);
> while (n 0 && *s)
> s++,n--;
>}

>
The first line is non-portable, of course. But if your goal is to
implement the fastest possible strlen *for a given system*, that's not
necessarily a problem.
>
Why is n an int rather than a size_t?

Because it's the right choice, of course.

--
Eric Sosman
esosman@ieee-dot-org.invalid

**jacob navia** · Oct 17 '08, 07:45 AM

Re: D.E. Knuth's strlen

Eric Sosman wrote:

Keith Thompson wrote:

>jacob navia <jacob@nospam.c omwrites:
>[...]

>>Yes, aligning could be done first:
>>{
>> intptr_t i = (intptr_t)s;
>> int n = i&(sizeof(lon g long)-1);
>> while (n 0 && *s)
>> s++,n--;
>>}

>>
>The first line is non-portable, of course. But if your goal is to
>implement the fastest possible strlen *for a given system*, that's not
>necessarily a problem.
>>
>Why is n an int rather than a size_t?

>
Because it's the right choice, of course.
>

In any case it can't be bigger than 7 or smaller
than 0, so it would fit in ANY (char,short,int ,long,long long)

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique

403 Forbidden

http://www.cs.virginia.edu/~lcc-win32

**jacob navia** · Oct 17 '08, 07:45 AM

Re: D.E. Knuth's strlen

Keith Thompson wrote:

jacob navia <jacob@nospam.c omwrites:
[...]

>Yes, aligning could be done first:
>{
> intptr_t i = (intptr_t)s;
> int n = i&(sizeof(lon g long)-1);
> while (n 0 && *s)
> s++,n--;
>}

>
The first line is non-portable, of course. But if your goal is to
implement the fastest possible strlen *for a given system*, that's not
necessarily a problem.
>
Why is n an int rather than a size_t?
>

n is an int and not a size_t because if it were a size_t
I would have written an infinite loop!

while (n 0) n--;

is an infinite loop if n is unsigned and you should know that.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique

403 Forbidden

http://www.cs.virginia.edu/~lcc-win32

**Chris Dollin** · Oct 17 '08, 07:45 AM

Re: D.E. Knuth's strlen

jacob navia wrote:

n is an int and not a size_t because if it were a size_t
I would have written an infinite loop!
>
while (n 0) n--;
>
is an infinite loop if n is unsigned and you should know that.

More sleep or more coffee: you need at least one of them.

I think you meant to write `>=`, not `>`.

--
'It changed the future .. and it changed us.' /Babylon 5/

Hewlett-Packard Limited Cain Road, Bracknell, registered no:
registered office: Berks RG12 1HN 690597 England

**jacob navia** · Oct 17 '08, 08:05 AM

Re: D.E. Knuth's strlen

Chris Dollin wrote:

jacob navia wrote:
>

>n is an int and not a size_t because if it were a size_t
>I would have written an infinite loop!
>>
>while (n 0) n--;
>>
>is an infinite loop if n is unsigned and you should know that.

>
More sleep or more coffee: you need at least one of them.
>
I think you meant to write `>=`, not `>`.
>

The original loop I wrote is

while (n)

and that will infinite loop. I copied it wrongly
into my message

Sorry about that.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique

403 Forbidden

http://www.cs.virginia.edu/~lcc-win32

**Chris Dollin** · Oct 17 '08, 08:15 AM

Re: D.E. Knuth's strlen

jacob navia wrote:

Chris Dollin wrote:

>jacob navia wrote:
>>

>>n is an int and not a size_t because if it were a size_t
>>I would have written an infinite loop!
>>>
>>while (n 0) n--;
>>>
>>is an infinite loop if n is unsigned and you should know that.

>>
>More sleep or more coffee: you need at least one of them.
>>
>I think you meant to write `>=`, not `>`.

>
The original loop I wrote is
>
while (n)
>
and that will infinite loop. I copied it wrongly
into my message

You're surely not saying that `while (n) n -= 1;` is an infinite
loop for unsigned n?

[Note: my use of `n -= 1` rather than `n--` is purely stylistic
and not part of my point.]

More sleep or more coffee. Perhaps it's me?

--
'It changed the future .. and it changed us.' /Babylon 5/

Hewlett-Packard Limited Cain Road, Bracknell, registered no:
registered office: Berks RG12 1HN 690597 England

**jacob navia** · Oct 17 '08, 08:25 AM

Re: D.E. Knuth's strlen

Chris Dollin wrote:

jacob navia wrote:
>

>Chris Dollin wrote:

>>jacob navia wrote:
>>>
>>>n is an int and not a size_t because if it were a size_t
>>>I would have written an infinite loop!
>>>>
>>>while (n 0) n--;
>>>>
>>>is an infinite loop if n is unsigned and you should know that.
>>More sleep or more coffee: you need at least one of them.
>>>
>>I think you meant to write `>=`, not `>`.

>The original loop I wrote is
>>
>while (n)
>>
>and that will infinite loop. I copied it wrongly
>into my message

>
You're surely not saying that `while (n) n -= 1;` is an infinite
loop for unsigned n?
>
[Note: my use of `n -= 1` rather than `n--` is purely stylistic
and not part of my point.]
>
More sleep or more coffee. Perhaps it's me?
>

You are right. More sleep and less coffee...

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique

403 Forbidden

http://www.cs.virginia.edu/~lcc-win32

**Eric Sosman** · Oct 17 '08, 12:25 PM

Re: D.E. Knuth's strlen

jacob navia wrote:

Eric Sosman wrote:

>Keith Thompson wrote:

>>jacob navia <jacob@nospam.c omwrites:
>>[...]
>>>Yes, aligning could be done first:
>>>{
>>> intptr_t i = (intptr_t)s;
>>> int n = i&(sizeof(lon g long)-1);
>>> while (n 0 && *s)
>>> s++,n--;
>>>}
>>>
>>The first line is non-portable, of course. But if your goal is to
>>implement the fastest possible strlen *for a given system*, that's not
>>necessarily a problem.
>>>
>>Why is n an int rather than a size_t?

>>
> Because it's the right choice, of course.
>>

>
In any case it can't be bigger than 7 or smaller
than 0, so it would fit in ANY (char,short,int ,long,long long)

Exactly: And that's why using "the natural size suggested
by the architecture" is the right choice.

Somebody's sure to make the calculation, so here's an
attempt to preempt it: The choice would be wrong on a machine
with 16-bit int and 262144-bit long long. Not even the DS9K
designers are *that* daft, and if they were, this technique
for strlen() would be a poor choice anyhow.

--
Eric Sosman
esosman@ieee-dot-org.invalid

**Keith Thompson** · Oct 17 '08, 02:45 PM

Re: D.E. Knuth's strlen

jacob navia <jacob@nospam.c omwrites:

Eric Sosman wrote:

>Keith Thompson wrote:

>>jacob navia <jacob@nospam.c omwrites:
>>[...]
>>>Yes, aligning could be done first:
>>>{
>>> intptr_t i = (intptr_t)s;
>>> int n = i&(sizeof(lon g long)-1);
>>> while (n 0 && *s)
>>> s++,n--;
>>>}
>>>
>>The first line is non-portable, of course. But if your goal is to
>>implement the fastest possible strlen *for a given system*, that's not
>>necessarily a problem.
>>>
>>Why is n an int rather than a size_t?

> Because it's the right choice, of course.

>
In any case it can't be bigger than 7 or smaller
than 0, so it would fit in ANY (char,short,int ,long,long long)

You're right, int is the best choice.

--
Keith Thompson (The_Other_Keit h) kst-u@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

**James Dow Allen** · Oct 18 '08, 08:05 AM

Re: D.E. Knuth's strlen

On Oct 17, 12:00 am, jacob navia <ja...@nospam.c omwrote:

jacob navia <ja...@nospam.c omwrites:

>

#define H 0x8080808080808 080ULL
#define L 0x0101010101010 101ULL
[snip]
if (H & (t - L) & ~t)

This clever method seemed familiar; I checked FSF's
strlen() and indeed it uses (almost!) the same cleverness.

I compared the routines. One of the differences between
Mr. navia's strlen() and FSF's seemed so amazing I
felt obligated to mention it here. (I'll also mention
a few other differences that might be of interest.)
I've modified the FSF code excerpts for brevity or
consistency with Mr. navia's; in particular I
write "Ulong" where FSF writes "unsigned long int".

+ whereas Mr. navia's code terminates with
while (*s) s++;
fsf unrolls this into (sizeof (Ulong)) tests and
continues in the big loop when all tests fail.

Now the amazing difference:
+ whereas Mr. navia uses
(H & (t - L) & ~t)
as his "magic" test, the fsf version omits the
" & ~t " here! There is some more complicated code,
presumably for a similar purpose, but it's
disabled via "#if 0".

This rather startled me! Were Knuth and navia spendthrifts,
throwing away cycles on some ideological whim? Or had
the millions of FSF strlen() invocations I've done to
build my website all been flawed? Was it all some
cuteness to allow the same code to run on both ones-
and twos-complement machines? Worst of all, since
it takes me at least three cups of coffee to understand
such bit-twiddling code, would I end up with
irritable bowel syndrome again?

I followed the advice occasionally offered in the ng,
and ran a simple test program. The strings I use for
my website are printable and have no "negative" characters
(or greater than 0x7f, for those not too lazy to type
"unsigned") . FSF's abbreviated magic seemed to work OK
with those. Whew!, my website was safe!

It's on "negative" characters that Mr. navia's "& ~t"
demonstrates its utility. Why is FSF's code OK then?
When the magic test fails, it performs the (sizeof (Ulong))
individual tests and if no null-character is found,
it simply resumes the main loop! Why this peculiarity??
I'll leave c.l.c denizens to their own conclusions,
but it *could* be a clever speed optimization, useful
when at least 99.2% of characters passed to strlen() are
"positive"!

In addition to the obvious correction Mr. navia admits to

Yes, aligning could be done first ...

other differences of FSF's code include:

+ fsf uses 'const' in each pointer declaration
+ fsf sets its constants via (essentially) the odd-looking:
Ulong himagic;
himagic = 0x80808080L;
if (sizeof (Ulong) 4) {
/* Do the shift in two steps to avoid a
* warning if long has 32 bits.
*/
himagic = ((himagic << 16) << 16) | himagic;
}
(I guess the warning suppressed by the double shift
would apply only when that code is unexecuted anyway.)

I suppose Mr. navia didn't check the FSF implementation.
Is this one of those deals where if the lawyers prove
you read certain source code, you're bound by certain
license restrictions?

(BTW, The only reason I had glibc source on my laptop
is that I wanted to inspect their hsearch() and see
if they ever fixed its *very* egregious performance
bug. The only file I really wanted was misc/hsearch_r.c,
but this is a *huge* file, 6410 bytes. Fortunately,
FSF is smart enough to use compression to reduce
download effort and doesn't allow access to this huge
uncompressed file. All I needed to download was
glibc-2.7.tar.gz, which is a mere 21,241,912 bytes.)

Since I've pointed out bugs in his hashlib before, I
hope it will cheer Chuck up to learn that FSF hsearch()
*still* has a design defect so laughable as to make Chuck's
seem almost well-written by comparison! The first probe
after a collision in their fancy "based-on-Knuth" reprobing
is *always* to the last table entry. (Uh, the whole idea
of fancy reprobing is to avoid secondary collisions.)

James Hussein Allen

**CBFalconer** · Oct 19 '08, 12:55 AM

Re: D.E. Knuth's strlen

James Dow Allen wrote:

>

.... snip ...

>
Since I've pointed out bugs in his hashlib before, I
hope it will cheer Chuck up to learn that FSF hsearch()
*still* has a design defect so laughable as to make Chuck's
seem almost well-written by comparison! The first probe
after a collision in their fancy "based-on-Knuth" reprobing
is *always* to the last table entry. (Uh, the whole idea
of fancy reprobing is to avoid secondary collisions.)

I cannot recall ever receiving a bug report from you on hashlib.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home .att.net>
Try the download section.

**Ben Bacarisse** · Oct 19 '08, 01:55 AM

Re: D.E. Knuth's strlen

James Dow Allen <jdallen2000@ya hoo.comwrites:

(BTW, The only reason I had glibc source on my laptop
is that I wanted to inspect their hsearch() and see
if they ever fixed its *very* egregious performance
bug.

[off-topic] I could not find your bug report, so I filed one.

The only file I really wanted was misc/hsearch_r.c,
but this is a *huge* file, 6410 bytes. Fortunately,
FSF is smart enough to use compression to reduce
download effort and doesn't allow access to this huge
uncompressed file. All I needed to download was
glibc-2.7.tar.gz, which is a mere 21,241,912 bytes.)

[off-topic] You can get just one file from the CVS browser. E.g.:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/misc/?cvsroot=glibc

Since I've pointed out bugs in his hashlib before, I
hope it will cheer Chuck up to learn that FSF hsearch()
*still* has a design defect so laughable as to make Chuck's
seem almost well-written by comparison! The first probe
after a collision in their fancy "based-on-Knuth" reprobing
is *always* to the last table entry. (Uh, the whole idea
of fancy reprobing is to avoid secondary collisions.)

I don't think it is a design defect. It looks to me to be a
common-all-garden bug. Someone thought they could reduce the hash mod
the table size so as to detect the wrap-around without another
variable, but that meant the secondary hash was not as per Knuth.

BTW, the probe is not always to the last entry, but is in all bar two
cases.

[This whole post is nearly off-topic but learners might like to look at
hsearch_r.c at the link above to see how easily one can make an error
with a hash table implementation.]

--
Ben.

Re: D.E. Knuth's strlen

Re: D.E. Knuth's strlen

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment