Re: D.E. Knuth's strlen

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Jean-Marc Bourguet

    Re: D.E. Knuth's strlen

    jacob navia <jacob@nospam.c omwrites:

    #include <stdio.h>
    #include <string.h>
    #define H 0x8080808080808 080ULL
    #define L 0x0101010101010 101ULL
    size_t myStrlen(char *s)
    {
    unsigned long long t;
    char *save = s;
    >
    while (1) {
    // This supposes that the input string is aligned
    // or that the machine doesn't trap when reading
    // a 8 byte integer at a random position like the
    // x86
    t = *(unsigned long long *)s;
    if (H & (t - L) & ~t)
    break;
    s += sizeof(long long);
    }
    I'd first align and then use that. You may get a trap with unaligned
    access even on machine where unaligned access doesn't normally trap if you
    stump on a page boundary with the next page not mapped in memory. And some
    machine allows unaligned access but at a performance penality, which would
    be against the goal of using that trick.

    Yours,

    --
    Jean-Marc
  • jacob navia

    #2
    Re: D.E. Knuth's strlen

    Jean-Marc Bourguet wrote:
    jacob navia <jacob@nospam.c omwrites:
    >
    >
    >#include <stdio.h>
    >#include <string.h>
    >#define H 0x8080808080808 080ULL
    >#define L 0x0101010101010 101ULL
    >size_t myStrlen(char *s)
    >{
    > unsigned long long t;
    > char *save = s;
    >>
    > while (1) {
    > // This supposes that the input string is aligned
    > // or that the machine doesn't trap when reading
    > // a 8 byte integer at a random position like the
    > // x86
    > t = *(unsigned long long *)s;
    > if (H & (t - L) & ~t)
    > break;
    > s += sizeof(long long);
    > }
    >
    I'd first align and then use that. You may get a trap with unaligned
    access even on machine where unaligned access doesn't normally trap if you
    stump on a page boundary with the next page not mapped in memory. And some
    machine allows unaligned access but at a performance penality, which would
    be against the goal of using that trick.
    >
    Yours,
    >
    Yes, aligning could be done first:
    {
    intptr_t i = (intptr_t)s;
    int n = i&(sizeof(lon g long)-1);
    while (n 0 && *s)
    s++,n--;
    }


    --
    jacob navia
    jacob at jacob point remcomp point fr
    logiciels/informatique

    Comment

    • Keith Thompson

      #3
      Re: D.E. Knuth's strlen

      jacob navia <jacob@nospam.c omwrites:
      [...]
      Yes, aligning could be done first:
      {
      intptr_t i = (intptr_t)s;
      int n = i&(sizeof(lon g long)-1);
      while (n 0 && *s)
      s++,n--;
      }
      The first line is non-portable, of course. But if your goal is to
      implement the fastest possible strlen *for a given system*, that's not
      necessarily a problem.

      Why is n an int rather than a size_t?

      --
      Keith Thompson (The_Other_Keit h) kst-u@mib.org <http://www.ghoti.net/~kst>
      Nokia
      "We must do something. This is something. Therefore, we must do this."
      -- Antony Jay and Jonathan Lynn, "Yes Minister"

      Comment

      • Eric Sosman

        #4
        Re: D.E. Knuth's strlen

        Keith Thompson wrote:
        jacob navia <jacob@nospam.c omwrites:
        [...]
        >Yes, aligning could be done first:
        >{
        > intptr_t i = (intptr_t)s;
        > int n = i&(sizeof(lon g long)-1);
        > while (n 0 && *s)
        > s++,n--;
        >}
        >
        The first line is non-portable, of course. But if your goal is to
        implement the fastest possible strlen *for a given system*, that's not
        necessarily a problem.
        >
        Why is n an int rather than a size_t?
        Because it's the right choice, of course.

        --
        Eric Sosman
        esosman@ieee-dot-org.invalid

        Comment

        • jacob navia

          #5
          Re: D.E. Knuth's strlen

          Eric Sosman wrote:
          Keith Thompson wrote:
          >jacob navia <jacob@nospam.c omwrites:
          >[...]
          >>Yes, aligning could be done first:
          >>{
          >> intptr_t i = (intptr_t)s;
          >> int n = i&(sizeof(lon g long)-1);
          >> while (n 0 && *s)
          >> s++,n--;
          >>}
          >>
          >The first line is non-portable, of course. But if your goal is to
          >implement the fastest possible strlen *for a given system*, that's not
          >necessarily a problem.
          >>
          >Why is n an int rather than a size_t?
          >
          Because it's the right choice, of course.
          >
          In any case it can't be bigger than 7 or smaller
          than 0, so it would fit in ANY (char,short,int ,long,long long)

          --
          jacob navia
          jacob at jacob point remcomp point fr
          logiciels/informatique

          Comment

          • jacob navia

            #6
            Re: D.E. Knuth's strlen

            Keith Thompson wrote:
            jacob navia <jacob@nospam.c omwrites:
            [...]
            >Yes, aligning could be done first:
            >{
            > intptr_t i = (intptr_t)s;
            > int n = i&(sizeof(lon g long)-1);
            > while (n 0 && *s)
            > s++,n--;
            >}
            >
            The first line is non-portable, of course. But if your goal is to
            implement the fastest possible strlen *for a given system*, that's not
            necessarily a problem.
            >
            Why is n an int rather than a size_t?
            >
            n is an int and not a size_t because if it were a size_t
            I would have written an infinite loop!

            while (n 0) n--;

            is an infinite loop if n is unsigned and you should know that.

            --
            jacob navia
            jacob at jacob point remcomp point fr
            logiciels/informatique

            Comment

            • Chris Dollin

              #7
              Re: D.E. Knuth's strlen

              jacob navia wrote:
              n is an int and not a size_t because if it were a size_t
              I would have written an infinite loop!
              >
              while (n 0) n--;
              >
              is an infinite loop if n is unsigned and you should know that.
              More sleep or more coffee: you need at least one of them.

              I think you meant to write `>=`, not `>`.

              --
              'It changed the future .. and it changed us.' /Babylon 5/

              Hewlett-Packard Limited Cain Road, Bracknell, registered no:
              registered office: Berks RG12 1HN 690597 England

              Comment

              • jacob navia

                #8
                Re: D.E. Knuth's strlen

                Chris Dollin wrote:
                jacob navia wrote:
                >
                >n is an int and not a size_t because if it were a size_t
                >I would have written an infinite loop!
                >>
                >while (n 0) n--;
                >>
                >is an infinite loop if n is unsigned and you should know that.
                >
                More sleep or more coffee: you need at least one of them.
                >
                I think you meant to write `>=`, not `>`.
                >
                The original loop I wrote is

                while (n)

                and that will infinite loop. I copied it wrongly
                into my message

                Sorry about that.


                --
                jacob navia
                jacob at jacob point remcomp point fr
                logiciels/informatique

                Comment

                • Chris Dollin

                  #9
                  Re: D.E. Knuth's strlen

                  jacob navia wrote:
                  Chris Dollin wrote:
                  >jacob navia wrote:
                  >>
                  >>n is an int and not a size_t because if it were a size_t
                  >>I would have written an infinite loop!
                  >>>
                  >>while (n 0) n--;
                  >>>
                  >>is an infinite loop if n is unsigned and you should know that.
                  >>
                  >More sleep or more coffee: you need at least one of them.
                  >>
                  >I think you meant to write `>=`, not `>`.
                  >
                  The original loop I wrote is
                  >
                  while (n)
                  >
                  and that will infinite loop. I copied it wrongly
                  into my message
                  You're surely not saying that `while (n) n -= 1;` is an infinite
                  loop for unsigned n?

                  [Note: my use of `n -= 1` rather than `n--` is purely stylistic
                  and not part of my point.]

                  More sleep or more coffee. Perhaps it's me?

                  --
                  'It changed the future .. and it changed us.' /Babylon 5/

                  Hewlett-Packard Limited Cain Road, Bracknell, registered no:
                  registered office: Berks RG12 1HN 690597 England

                  Comment

                  • jacob navia

                    #10
                    Re: D.E. Knuth's strlen

                    Chris Dollin wrote:
                    jacob navia wrote:
                    >
                    >Chris Dollin wrote:
                    >>jacob navia wrote:
                    >>>
                    >>>n is an int and not a size_t because if it were a size_t
                    >>>I would have written an infinite loop!
                    >>>>
                    >>>while (n 0) n--;
                    >>>>
                    >>>is an infinite loop if n is unsigned and you should know that.
                    >>More sleep or more coffee: you need at least one of them.
                    >>>
                    >>I think you meant to write `>=`, not `>`.
                    >The original loop I wrote is
                    >>
                    >while (n)
                    >>
                    >and that will infinite loop. I copied it wrongly
                    >into my message
                    >
                    You're surely not saying that `while (n) n -= 1;` is an infinite
                    loop for unsigned n?
                    >
                    [Note: my use of `n -= 1` rather than `n--` is purely stylistic
                    and not part of my point.]
                    >
                    More sleep or more coffee. Perhaps it's me?
                    >
                    You are right. More sleep and less coffee...


                    --
                    jacob navia
                    jacob at jacob point remcomp point fr
                    logiciels/informatique

                    Comment

                    • Eric Sosman

                      #11
                      Re: D.E. Knuth's strlen

                      jacob navia wrote:
                      Eric Sosman wrote:
                      >Keith Thompson wrote:
                      >>jacob navia <jacob@nospam.c omwrites:
                      >>[...]
                      >>>Yes, aligning could be done first:
                      >>>{
                      >>> intptr_t i = (intptr_t)s;
                      >>> int n = i&(sizeof(lon g long)-1);
                      >>> while (n 0 && *s)
                      >>> s++,n--;
                      >>>}
                      >>>
                      >>The first line is non-portable, of course. But if your goal is to
                      >>implement the fastest possible strlen *for a given system*, that's not
                      >>necessarily a problem.
                      >>>
                      >>Why is n an int rather than a size_t?
                      >>
                      > Because it's the right choice, of course.
                      >>
                      >
                      In any case it can't be bigger than 7 or smaller
                      than 0, so it would fit in ANY (char,short,int ,long,long long)
                      Exactly: And that's why using "the natural size suggested
                      by the architecture" is the right choice.

                      Somebody's sure to make the calculation, so here's an
                      attempt to preempt it: The choice would be wrong on a machine
                      with 16-bit int and 262144-bit long long. Not even the DS9K
                      designers are *that* daft, and if they were, this technique
                      for strlen() would be a poor choice anyhow.

                      --
                      Eric Sosman
                      esosman@ieee-dot-org.invalid

                      Comment

                      • Keith Thompson

                        #12
                        Re: D.E. Knuth's strlen

                        jacob navia <jacob@nospam.c omwrites:
                        Eric Sosman wrote:
                        >Keith Thompson wrote:
                        >>jacob navia <jacob@nospam.c omwrites:
                        >>[...]
                        >>>Yes, aligning could be done first:
                        >>>{
                        >>> intptr_t i = (intptr_t)s;
                        >>> int n = i&(sizeof(lon g long)-1);
                        >>> while (n 0 && *s)
                        >>> s++,n--;
                        >>>}
                        >>>
                        >>The first line is non-portable, of course. But if your goal is to
                        >>implement the fastest possible strlen *for a given system*, that's not
                        >>necessarily a problem.
                        >>>
                        >>Why is n an int rather than a size_t?
                        > Because it's the right choice, of course.
                        >
                        In any case it can't be bigger than 7 or smaller
                        than 0, so it would fit in ANY (char,short,int ,long,long long)
                        You're right, int is the best choice.

                        --
                        Keith Thompson (The_Other_Keit h) kst-u@mib.org <http://www.ghoti.net/~kst>
                        Nokia
                        "We must do something. This is something. Therefore, we must do this."
                        -- Antony Jay and Jonathan Lynn, "Yes Minister"

                        Comment

                        • James Dow Allen

                          #13
                          Re: D.E. Knuth's strlen


                          On Oct 17, 12:00 am, jacob navia <ja...@nospam.c omwrote:
                          jacob navia <ja...@nospam.c omwrites:
                          >
                          #define H 0x8080808080808 080ULL
                          #define L 0x0101010101010 101ULL
                          [snip]
                                         if (H & (t - L) & ~t)
                          This clever method seemed familiar; I checked FSF's
                          strlen() and indeed it uses (almost!) the same cleverness.

                          I compared the routines. One of the differences between
                          Mr. navia's strlen() and FSF's seemed so amazing I
                          felt obligated to mention it here. (I'll also mention
                          a few other differences that might be of interest.)
                          I've modified the FSF code excerpts for brevity or
                          consistency with Mr. navia's; in particular I
                          write "Ulong" where FSF writes "unsigned long int".

                          + whereas Mr. navia's code terminates with
                          while (*s) s++;
                          fsf unrolls this into (sizeof (Ulong)) tests and
                          continues in the big loop when all tests fail.

                          Now the amazing difference:
                          + whereas Mr. navia uses
                          (H & (t - L) & ~t)
                          as his "magic" test, the fsf version omits the
                          " & ~t " here! There is some more complicated code,
                          presumably for a similar purpose, but it's
                          disabled via "#if 0".

                          This rather startled me! Were Knuth and navia spendthrifts,
                          throwing away cycles on some ideological whim? Or had
                          the millions of FSF strlen() invocations I've done to
                          build my website all been flawed? Was it all some
                          cuteness to allow the same code to run on both ones-
                          and twos-complement machines? Worst of all, since
                          it takes me at least three cups of coffee to understand
                          such bit-twiddling code, would I end up with
                          irritable bowel syndrome again?

                          I followed the advice occasionally offered in the ng,
                          and ran a simple test program. The strings I use for
                          my website are printable and have no "negative" characters
                          (or greater than 0x7f, for those not too lazy to type
                          "unsigned") . FSF's abbreviated magic seemed to work OK
                          with those. Whew!, my website was safe!

                          It's on "negative" characters that Mr. navia's "& ~t"
                          demonstrates its utility. Why is FSF's code OK then?
                          When the magic test fails, it performs the (sizeof (Ulong))
                          individual tests and if no null-character is found,
                          it simply resumes the main loop! Why this peculiarity??
                          I'll leave c.l.c denizens to their own conclusions,
                          but it *could* be a clever speed optimization, useful
                          when at least 99.2% of characters passed to strlen() are
                          "positive"!

                          In addition to the obvious correction Mr. navia admits to
                          Yes, aligning could be done first ...
                          other differences of FSF's code include:

                          + fsf uses 'const' in each pointer declaration
                          + fsf sets its constants via (essentially) the odd-looking:
                          Ulong himagic;
                          himagic = 0x80808080L;
                          if (sizeof (Ulong) 4) {
                          /* Do the shift in two steps to avoid a
                          * warning if long has 32 bits.
                          */
                          himagic = ((himagic << 16) << 16) | himagic;
                          }
                          (I guess the warning suppressed by the double shift
                          would apply only when that code is unexecuted anyway.)

                          I suppose Mr. navia didn't check the FSF implementation.
                          Is this one of those deals where if the lawyers prove
                          you read certain source code, you're bound by certain
                          license restrictions?

                          (BTW, The only reason I had glibc source on my laptop
                          is that I wanted to inspect their hsearch() and see
                          if they ever fixed its *very* egregious performance
                          bug. The only file I really wanted was misc/hsearch_r.c,
                          but this is a *huge* file, 6410 bytes. Fortunately,
                          FSF is smart enough to use compression to reduce
                          download effort and doesn't allow access to this huge
                          uncompressed file. All I needed to download was
                          glibc-2.7.tar.gz, which is a mere 21,241,912 bytes.)

                          Since I've pointed out bugs in his hashlib before, I
                          hope it will cheer Chuck up to learn that FSF hsearch()
                          *still* has a design defect so laughable as to make Chuck's
                          seem almost well-written by comparison! The first probe
                          after a collision in their fancy "based-on-Knuth" reprobing
                          is *always* to the last table entry. (Uh, the whole idea
                          of fancy reprobing is to avoid secondary collisions.)

                          James Hussein Allen

                          Comment

                          • CBFalconer

                            #14
                            Re: D.E. Knuth's strlen

                            James Dow Allen wrote:
                            >
                            .... snip ...
                            >
                            Since I've pointed out bugs in his hashlib before, I
                            hope it will cheer Chuck up to learn that FSF hsearch()
                            *still* has a design defect so laughable as to make Chuck's
                            seem almost well-written by comparison! The first probe
                            after a collision in their fancy "based-on-Knuth" reprobing
                            is *always* to the last table entry. (Uh, the whole idea
                            of fancy reprobing is to avoid secondary collisions.)
                            I cannot recall ever receiving a bug report from you on hashlib.

                            --
                            [mail]: Chuck F (cbfalconer at maineline dot net)
                            [page]: <http://cbfalconer.home .att.net>
                            Try the download section.

                            Comment

                            • Ben Bacarisse

                              #15
                              Re: D.E. Knuth's strlen

                              James Dow Allen <jdallen2000@ya hoo.comwrites:
                              (BTW, The only reason I had glibc source on my laptop
                              is that I wanted to inspect their hsearch() and see
                              if they ever fixed its *very* egregious performance
                              bug.
                              [off-topic] I could not find your bug report, so I filed one.
                              The only file I really wanted was misc/hsearch_r.c,
                              but this is a *huge* file, 6410 bytes. Fortunately,
                              FSF is smart enough to use compression to reduce
                              download effort and doesn't allow access to this huge
                              uncompressed file. All I needed to download was
                              glibc-2.7.tar.gz, which is a mere 21,241,912 bytes.)
                              [off-topic] You can get just one file from the CVS browser. E.g.:

                              Since I've pointed out bugs in his hashlib before, I
                              hope it will cheer Chuck up to learn that FSF hsearch()
                              *still* has a design defect so laughable as to make Chuck's
                              seem almost well-written by comparison! The first probe
                              after a collision in their fancy "based-on-Knuth" reprobing
                              is *always* to the last table entry. (Uh, the whole idea
                              of fancy reprobing is to avoid secondary collisions.)
                              I don't think it is a design defect. It looks to me to be a
                              common-all-garden bug. Someone thought they could reduce the hash mod
                              the table size so as to detect the wrap-around without another
                              variable, but that meant the secondary hash was not as per Knuth.

                              BTW, the probe is not always to the last entry, but is in all bar two
                              cases.

                              [This whole post is nearly off-topic but learners might like to look at
                              hsearch_r.c at the link above to see how easily one can make an error
                              with a hash table implementation.]

                              --
                              Ben.

                              Comment

                              Working...