Highly efficient string reversal code

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Harald van =?UTF-8?b?RMSzaw==?=

    Re: Highly efficient string reversal code

    On Sat, 27 Sep 2008 17:15:00 -0700, vippstar wrote:
    On Sep 27, 11:27 pm, CBFalconer <cbfalco...@yah oo.comwrote:
    >vipps...@gmail .com wrote:
    Harald van Djik <true...@gmail. comwrote:
    >Barry Schwarz wrote:
    >>CBFalconer <cbfalco...@yah oo.comwrote:
    >>>No it doesn't. foo is a char array. foo[1] is a char. By
    >>>definition , a char cannot have unused bits. Therefore *(p1 + 1)
    >>>is a valid char even though uninitialized.
    >>
    >>The issue is not whether foo[1] could possibly be a trap
    >>representatio n (or otherwise invalid) but whether it is
    >>indetermina te or not. Evaluating an indeterminate value invokes
    >>undefined behavior.
    [snip]
    I think mr Schwarz agreed with what I said, in which case he's also
    correct.
    [snip]
    Where have I claimed that it's undefined because the program accesses
    that value?
    Ignoring some of the confusion over wording, you didn't. I was replying to
    a message of Barry Schwarz, who did, and you thought you and he were in
    agreement. Are we all clear now?

    Comment

    • Antoninus Twink

      Re: Highly efficient string reversal code

      On 22 Sep 2008 at 22:41, Richard wrote:
      You know I really had to look hard at this to understand it. Horrible
      variables names almost purposely made "no standard" as far as C around
      the world goes for such a function. Unnecessary length check.
      Unnecessary size_t. Multiple assignments on one line making perusal
      with a debugger almost impossible.
      Yep - CBF is a living master-class in anti-style.
      Much nicer IMO:
      >
      #include <string.h>
      >
      char * my_strrev(char *s){
      char t;
      char *e = s+strlen(s)-1;
      while(e>s){
      t = *s;
      *s++ = *e;
      *e-- = t;
      }
      }
      Much nicer! Of course the "UB" that got the regs in a state will never
      be a problem in the real world.

      As efficiency was a concern for the OP, it might be worth dealing
      specially with the favorable case when both the string length and its
      starting location match up with the machine's natural alignment, so that
      one can use only aligned accesses.

      For example, on a 32-bit machine this might be more efficient (though of
      course only profiling will tell for sure):



      #define SWAP32(x) ((((x) & 0xff000000) >24) | (((x) & 0x00ff0000) >8) | \
      (((x) & 0x0000ff00) << 8) | (((x) & 0x000000ff) << 24))

      char *my_strrev(char *s)
      {
      size_t len = strlen(s);
      if(((unsigned long) s & len & 0x11)==0) {
      /* s aligned on 4-byte boundary, and length a multiple of 4 */
      unsigned *start = (unsigned *) s;
      unsigned *end = start + (len>>2) - 1;
      unsigned temp;
      while(end start) {
      temp = SWAP32(*start);
      *start++ = SWAP32(*end);
      *end-- = temp;
      }
      }
      else {
      /* use generic string reverser */
      }
      return s;
      }

      Comment

      • vippstar@gmail.com

        Re: Highly efficient string reversal code

        On Sep 28, 3:05 pm, Harald van D©¦k <true...@gmail. comwrote:
        On Sat, 27 Sep 2008 17:15:00 -0700, vippstar wrote:
        On Sep 27, 11:27 pm, CBFalconer <cbfalco...@yah oo.comwrote:
        vipps...@gmail. com wrote:
        Harald van Djik <true...@gmail. comwrote:
        Barry Schwarz wrote:
        >CBFalconer <cbfalco...@yah oo.comwrote:
        >>No it doesn't. foo is a char array. foo[1] is a char. By
        >>definition, a char cannot have unused bits. Therefore *(p1 + 1)
        >>is a valid char even though uninitialized.
        >
        >The issue is not whether foo[1] could possibly be a trap
        >representati on (or otherwise invalid) but whether it is
        >indeterminat e or not. Evaluating an indeterminate value invokes
        >undefined behavior.
        >
        [snip]
        >
        I think mr Schwarz agreed with what I said, in which case he's also
        correct.
        >
        [snip]
        >
        Where have I claimed that it's undefined because the program accesses
        that value?
        >
        Ignoring some of the confusion over wording, you didn't. I was replying to
        a message of Barry Schwarz, who did, and you thought you and he were in
        agreement. Are we all clear now?
        Yes, sorry for any confusion.

        Comment

        • Ben Bacarisse

          Re: Highly efficient string reversal code

          Antoninus Twink <nospam@nospam. invalidwrites:
          On 22 Sep 2008 at 22:41, Richard wrote:
          >You know I really had to look hard at this to understand it. Horrible
          >variables names almost purposely made "no standard" as far as C around
          >the world goes for such a function. Unnecessary length check.
          >Unnecessary size_t. Multiple assignments on one line making perusal
          >with a debugger almost impossible.
          <snip>
          As efficiency was a concern for the OP, it might be worth dealing
          specially with the favorable case when both the string length and its
          starting location match up with the machine's natural alignment, so that
          one can use only aligned accesses.
          >
          For example, on a 32-bit machine this might be more efficient (though of
          course only profiling will tell for sure):
          >
          >
          >
          #define SWAP32(x) ((((x) & 0xff000000) >24) | (((x) & 0x00ff0000) >8) | \
          (((x) & 0x0000ff00) << 8) | (((x) & 0x000000ff) << 24))
          >
          char *my_strrev(char *s)
          {
          size_t len = strlen(s);
          if(((unsigned long) s & len & 0x11)==0) {
          /* s aligned on 4-byte boundary, and length a multiple of 4 */
          unsigned *start = (unsigned *) s;
          unsigned *end = start + (len>>2) - 1;
          unsigned temp;
          while(end start) {
          temp = SWAP32(*start);
          *start++ = SWAP32(*end);
          *end-- = temp;
          }
          }
          else {
          /* use generic string reverser */
          }
          return s;
          }
          I presume you share Richard's preference for interactive debugging?

          --
          Ben.

          Comment

          • Antoninus Twink

            Re: Highly efficient string reversal code

            On 28 Sep 2008 at 21:23, Ben Bacarisse wrote:
            I presume you share Richard's preference for interactive debugging?
            Hmm? Apart from the fact that a | has wrongly become a & in this line...
            > if(((unsigned long) s & len & 0x11)==0) {
            ....(and as a result of the correction extra parentheses are needed), I
            don't see any obvious bug.

            Comment

            • Ben Bacarisse

              Re: Highly efficient string reversal code

              Antoninus Twink <nospam@nospam. invalidwrites:
              On 28 Sep 2008 at 21:23, Ben Bacarisse wrote:
              >I presume you share Richard's preference for interactive debugging?
              >
              Hmm? Apart from the fact that a | has wrongly become a & in this line...
              >
              >> if(((unsigned long) s & len & 0x11)==0) {
              >
              ...(and as a result of the correction extra parentheses are needed), I
              don't see any obvious bug.
              Nothing happens when len == 4.

              --
              Ben.

              Comment

              • Antoninus Twink

                Re: Highly efficient string reversal code

                On 28 Sep 2008 at 22:17, Ben Bacarisse wrote:
                Nothing happens when len == 4.
                Oops, yes, thanks.

                I did the usual amount of debugging before posting to Usenet (i.e.
                zero)...

                Comment

                • Tim Rentsch

                  Re: Highly efficient string reversal code

                  Barry Schwarz <schwarzb@dqel. comwrites:
                  On Fri, 26 Sep 2008 21:20:11 -0400, CBFalconer <cbfalconer@yah oo.com>
                  wrote:
                  [...]
                  >
                  No it doesn't. foo is a char array. foo[1] is a char. By
                  definition, a char cannot have unused bits. Therefore *(p1 + 1)
                  is a valid char even though uninitialized.
                  >
                  The issue is not whether foo[1] could possibly be a trap
                  representation (or otherwise invalid) but whether it is indeterminate
                  or not. Evaluating an indeterminate value invokes undefined behavior.
                  I believe you're wrong. Types such as unsigned char that don't
                  have trap representations only have unspecified values if
                  accessed, not undefined behavior. Please support your claim
                  with a section citation.

                  Comment

                  • Barry Schwarz

                    Re: Highly efficient string reversal code

                    On 09 Oct 2008 00:12:21 -0700, Tim Rentsch <txr@alumnus.ca ltech.edu>
                    wrote:
                    >Barry Schwarz <schwarzb@dqel. comwrites:
                    >
                    >On Fri, 26 Sep 2008 21:20:11 -0400, CBFalconer <cbfalconer@yah oo.com>
                    >wrote:
                    >[...]
                    >>
                    >No it doesn't. foo is a char array. foo[1] is a char. By
                    >definition, a char cannot have unused bits. Therefore *(p1 + 1)
                    >is a valid char even though uninitialized.
                    >>
                    >The issue is not whether foo[1] could possibly be a trap
                    >representati on (or otherwise invalid) but whether it is indeterminate
                    >or not. Evaluating an indeterminate value invokes undefined behavior.
                    >
                    >I believe you're wrong. Types such as unsigned char that don't
                    >have trap representations only have unspecified values if
                    >accessed, not undefined behavior. Please support your claim
                    >with a section citation.
                    The fact that char (all three types) does not have traps is covered in
                    6.2.6.1-5 and again in the 11th clause of J.2. That is completely
                    separate from the issue of indeterminate values which is covered in
                    the 10th clause of J.2 with references to 6.2.4, 6.7.8, and 6.8.
                    Unfortunately, those sections only state that the value is
                    indeterminate. And I could not find anything n1336 that improves the
                    situation.

                    --
                    Remove del for email

                    Comment

                    • Tim Rentsch

                      Re: Highly efficient string reversal code

                      Barry Schwarz <schwarzb@dqel. comwrites:
                      On 09 Oct 2008 00:12:21 -0700, Tim Rentsch <txr@alumnus.ca ltech.edu>
                      wrote:
                      >
                      Barry Schwarz <schwarzb@dqel. comwrites:
                      On Fri, 26 Sep 2008 21:20:11 -0400, CBFalconer <cbfalconer@yah oo.com>
                      wrote:
                      [...]
                      >
                      No it doesn't. foo is a char array. foo[1] is a char. By
                      definition, a char cannot have unused bits. Therefore *(p1 + 1)
                      is a valid char even though uninitialized.
                      >
                      The issue is not whether foo[1] could possibly be a trap
                      representation (or otherwise invalid) but whether it is indeterminate
                      or not. Evaluating an indeterminate value invokes undefined behavior.
                      I believe you're wrong. Types such as unsigned char that don't
                      have trap representations only have unspecified values if
                      accessed, not undefined behavior. Please support your claim
                      with a section citation.
                      >
                      The fact that char (all three types) does not have traps is covered in
                      6.2.6.1-5 and again in the 11th clause of J.2. That is completely
                      separate from the issue of indeterminate values which is covered in
                      the 10th clause of J.2 with references to 6.2.4, 6.7.8, and 6.8.
                      Unfortunately, those sections only state that the value is
                      indeterminate. And I could not find anything n1336 that improves the
                      situation.
                      6.8 p 3 says "(including storing an indeterminate value in
                      objects without an initializer)". Since (unsigned char) doesn't
                      have any trap representations , storing an indeterminate value
                      means storing an unspecified value. The comment in J.2 item 10
                      is a non-normative simplification.

                      On implementations where other types (e.g., unsigned int) don't
                      have trap representations , leaving such variables uninitialized
                      also produces only unspecified values, not undefined behavior.

                      Incidentally, I think it's generally agreed that (char) and
                      (signed char) can have trap representations , even though
                      (unsigned char) cannot.

                      Comment

                      Working...