Can I Trust Pointer Arithmetic In Re-Allocated Memory?

  • Bill Reid

    Can I Trust Pointer Arithmetic In Re-Allocated Memory?

    Bear with me, as I am not a "professional" programmer, but I was
    working on part of a program that reads parts of four text files into
    a buffer which I re-allocate the size as I read each file. I read some
    of the items from the bottom up of the buffer, and some from the
    top down, moving the bottom items back to the new re-allocated
    bottom on every file read.

    Then when I've read all four files, I sort the top and bottom items
    separately using qsort(), which takes a pointer to a list of items, and
    write the two sorted lists to two new files.

    Problem is, I worry that if I just supply a pointer to the first item
    in the bottom list to qsort(), it might point out to bozo-land during
    the sort because I thought that dynamically re-allocated memory
    is not necessarily contiguous. So I've done a little two step where
    I write the bottom list to another buffer to do the sorting and writing,
    and everything works great, but I'm wondering if I'm wasting time
    and worrying about nothing...after all, if I can't trust a pointer to an
    arbitrary point in the list, how can I trust a pointer to the start of
    the list?

    Any light you can shed on how pointers are handled in dynamically
    allocated memory would be interesting and helpful...thanks.

    ---
    William Ernest Reid



  • Barry Schwarz

    #2
    Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

    On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
    <hormelfree@happyhealthy.net> wrote:
    >Bear with me, as I am not a "professional" programmer, but I was
    >working on part of program that reads parts of four text files into
    >a buffer which I re-allocate the size as I read each file. I read some
    >of the items from the bottom up of the buffer, and some from the
    >top down, moving the bottom items back to the new re-allocated
    >bottom on every file read.
    I don't quite follow this description.
    >
    >Then when I've read all four files, I sort the top and bottom items
    >separately using qsort(), which takes a pointer to a list of items, and
    >write the two sorted lists to two new files.
    >
    >Problem is, I worry that if I just supply a pointer to the first item
    >in the bottom list to qsort(), it might point out to bozo-land during
    >the sort because I thought that dynamically re-allocated memory
    >is not necessarily contiguous. So I've done a little two step where
    The block of memory whose non-NULL address is returned from
    malloc/realloc/calloc is guaranteed to be contiguous. Your memory is
    allocated from address to address+size-1. Furthermore, calculating
    the value address+size is always allowed but you may not dereference
    this address.
    >I write the bottom list to another buffer to do the sorting and writing,
    >and everything works great, but I'm wondering if I'm wasting time
    >and worrying about nothing...after all, if I can't trust a pointer to an
    >arbitrary point in the list, how can I trust a pointer to the start of
    >the list?
    >
    >Any light you can shed on how pointers are handled in dynamically
    >allocated memory would be interesting and helpful...thanks.
    A pointer value between the limits mentioned above is within range of
    the allocated memory. You have to ensure alignment, but if the pointer
    has the correct type the compiler will do this for you.
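A minimal sketch of the guarantee described above, assuming a block of plain ints. The helper name `grow_and_fill` is hypothetical and is not taken from the original program. It shows that the block returned by realloc() is one contiguous object, and that the one-past-the-end address may be computed and compared but not dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: grow a block of ints to new_count elements and
 * fill it using pointer arithmetic. Returns the (possibly moved) block,
 * or NULL on failure, in which case the old block is still valid. */
int *grow_and_fill(int *old, size_t new_count)
{
    assert(new_count > 0);

    /* realloc may move the block, so only the returned pointer is usable. */
    int *block = realloc(old, new_count * sizeof *block);
    if (block == NULL)
        return NULL;

    /* One contiguous object: block .. block + new_count - 1 are valid.
     * block + new_count may be computed and compared, never dereferenced. */
    int *end = block + new_count;
    for (int *p = block; p != end; ++p)
        *p = (int)(p - block);

    return block;
}
```

Any pointer into the old block is stale after a successful call, so positions are better remembered as indices than as saved pointers.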


    Remove del for email


    • Bill Reid

      #3
      Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


      Barry Schwarz <schwarzb@doezl.net> wrote in message
      news:k93od21vgd6n6fhrg6tooem3r5j06ejrq4@4ax.com...
      On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
      <hormelfree@happyhealthy.net> wrote:
      >
      Bear with me, as I am not a "professional" programmer, but I was
      working on part of program that reads parts of four text files into
      a buffer which I re-allocate the size as I read each file. I read some
      of the items from the bottom up of the buffer, and some from the
      top down, moving the bottom items back to the new re-allocated
      bottom on every file read.
      >
      I don't quite follow this description.
      >
      Yeah, it's a little confusing, and not that relevant to what I'm
      asking...the
      bottom line is I want to separately sort two parts of a list...

      Then when I've read all four files, I sort the top and bottom items
      separately using qsort(), which takes a pointer to a list of items, and
      write the two sorted lists to two new files.

      Problem is, I worry that if I just supply a pointer to the first item
      in the bottom list to qsort(), it might point out to bozo-land during
      the sort because I thought that dynamically re-allocated memory
      is not necessarily contiguous. So I've done a little two step where
      >
      The block of memory whose non-NULL address is returned from
      malloc/realloc/calloc is guaranteed to be contiguous.
      OK, that's the answer, I was just plain wrong that the memory
      might not be contiguous...I've probably only read that guarantee
      about 100000000000 times but just forgot it.

      I think I got that confused with the idea that the re-allocated
      block may have a different location than the original malloc, which
      would mean...
      Your memory is
      allocated from address to address+size-1. Furthermore, calculating
      the value address+size is always allowed but you may not dereference
      this address.
      >
      ....you wouldn't want to dereference an address, right.
      I write the bottom list to another buffer to do the sorting and writing,
      and everything works great, but I'm wondering if I'm wasting time
      and worrying about nothing...after all, if I can't trust a pointer to an
      arbitrary point in the list, how can I trust a pointer to the start of
      the list?

      Any light you can shed on how pointers are handled in dynamically
      allocated memory would be interesting and helpful...thanks.
      >
      A pointer value between the limits mentioned above is within range of
      the allocated memory. You have to ensure alignment, but if the pointer
      has the correct type the compiler will do this for you.
      >
      OK, so this should be completely legal and flawless:

      /* sort the symbol list alphabetically */
      qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

      then...

      /* sort the no-symbol list alphabetically */
      qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);

      First qsort() sorts down to the end of the symbols part of the list,
      the second sorts down from the start of the no-symbols part of the
      list to the end of the list. I guess it was the (void *) cast that scared
      me...thanks.

      ---
      William Ernest Reid




      • Keith Thompson

        #4
        Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

        "Bill Reid" <hormelfree@happyhealthy.net> writes:
        Barry Schwarz <schwarzb@doezl.net> wrote in message
        news:k93od21vgd6n6fhrg6tooem3r5j06ejrq4@4ax.com...
        >On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
        ><hormelfree@happyhealthy.net> wrote:
        >>
        >Bear with me, as I am not a "professional" programmer, but I was
        >working on part of program that reads parts of four text files into
        >a buffer which I re-allocate the size as I read each file. I read some
        >of the items from the bottom up of the buffer, and some from the
        >top down, moving the bottom items back to the new re-allocated
        >bottom on every file read.
        >>
        >I don't quite follow this description.
        >>
        Yeah, it's a little confusing, and not that relevant to what I'm
        asking...the
        bottom line is I want to separately sort two parts of a list...
        >
        >Then when I've read all four files, I sort the top and bottom items
        >separately using qsort(), which takes a pointer to a list of items, and
        >write the two sorted lists to two new files.
        >
        >Problem is, I worry that if I just supply a pointer to the first item
        >in the bottom list to qsort(), it might point out to bozo-land during
        >the sort because I thought that dynamically re-allocated memory
        >is not necessarily contiguous. So I've done a little two step where
        >>
        >The block of memory whose non-NULL address is returned from
        >malloc/realloc/calloc is guaranteed to be contiguous.
        >
        OK, that's the answer, I was just plain wrong that the memory
        might not be contiguous...I've probably only read that guarantee
        about 100000000000 times but just forgot it.
        >
        I think I got that confused with the idea that the re-allocated
        block may have a different location than the original malloc, which
        would mean...
        One thing that I found a little confusing in your original message is
        that you talked about "re-allocated" memory, but you didn't mention
        the "realloc" function. The more specific your description, the more
        likely it is that we can help.

        [...]
        OK, so this should be completely legal and flawless:
        >
        /* sort the symbol list alphabetically */
        qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);
        >
        then...
        >
        /* sort the no-symbol list alphabetically */
        qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);
        Um, no.

        Don't be afraid of whitespace. I put blanks around most operator
        symbols, and after every comma. If I have to split something across
        lines, that's ok. So I'd write your qsort call as:

            qsort((void *)curr_instrs + num_symbols,
                  num_no_symbols,
                  128,
                  sort_alpha_list);

        The third argument, 128, is a "magic number". It's very difficult to
        tell what it means or whether it's even correct. Define a constant:
        #define WHATEVER 128
        so you only need to change it in one place (but pick a better name, of
        course).

        The first argument to qsort is:

        (void *)curr_instrs + num_symbols

        You can't do pointer arithmetic on a void* value. (Some compilers may
        allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
        replace "-ansi" with "-std=c99").

        If you're trying to get the address pointed to by curr_instrs plus an
        offset of num_symbols bytes, you'll need to do the arithmetic using
        char*:

            qsort((char*)curr_instrs + num_symbols,
                  /* other args */);

        assuming that curr_instrs isn't already a char*. Note that I didn't
        cast the expression to void*; any pointer-to-object type can be
        converted to void*, or vice versa.
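A small sketch of the byte-offset case described above. The helper name `offset_bytes` is hypothetical. It assumes the caller really wants an offset measured in bytes, which is what arithmetic through char* gives in standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the address byte_offset bytes past base.
 * Arithmetic on void * is not standard C, so go through char *. */
void *offset_bytes(void *base, size_t byte_offset)
{
    assert(base != NULL);

    /* The char * result converts back to void * without a cast. */
    return (char *)base + byte_offset;
}
```

For an array of ints, `offset_bytes(arr, 2 * sizeof arr[0])` yields the same address as `&arr[2]`; a typed pointer does that scaling automatically.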

        --
        Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
        San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
        We must do something. This is something. Therefore, we must do this.


        • Bill Reid

          #5
          Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


          Keith Thompson <kst-u@mib.org> wrote in message
          news:lnk65fydta.fsf@nuthaus.mib.org...
          "Bill Reid" <hormelfree@happyhealthy.net> writes:
          Barry Schwarz <schwarzb@doezl.net> wrote in message
          news:k93od21vgd6n6fhrg6tooem3r5j06ejrq4@4ax.com...
          On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
          <hormelfree@happyhealthy.net> wrote:
          >
          Bear with me, as I am not a "professional" programmer, but I was
          working on part of program that reads parts of four text files into
          a buffer which I re-allocate the size as I read each file. I read
          some
          of the items from the bottom up of the buffer, and some from the
          top down, moving the bottom items back to the new re-allocated
          bottom on every file read.
          >
          I don't quite follow this description.
          >
          Yeah, it's a little confusing, and not that relevant to what I'm
          asking...the
          bottom line is I want to separately sort two parts of a list...

          Then when I've read all four files, I sort the top and bottom items
          separately using qsort(), which takes a pointer to a list of items,
          and
          write the two sorted lists to two new files.

          Problem is, I worry that if I just supply a pointer to the first item
          in the bottom list to qsort(), it might point out to bozo-land during
          the sort because I thought that dynamically re-allocated memory
          is not necessarily contiguous. So I've done a little two step where
          >
          The block of memory whose non-NULL address is returned from
          malloc/realloc/calloc is guaranteed to be contiguous.
          OK, that's the answer, I was just plain wrong that the memory
          might not be contiguous...I've probably only read that guarantee
          about 100000000000 times but just forgot it.

          I think I got that confused with the idea that the re-allocated
          block may have a different location than the original malloc, which
          would mean...
          >
          One thing that I found a little confusing in your original message is
          that you talked about "re-allocated" memory, but you didn't mention
          the "realloc" function. The more specific your description, the more
          likely it is that we can help.
          >
          Well, OK, maybe, here's canonical specificity:

          /* now re-allocate memory for the instrument strings */
          if((curr_instrs =(instr_strs *)
          realloc(curr_instrs,num_instrs*sizeof(instr_strs)))==NULL) {
          printf("Not enough memory for instruments buffer\n");
          goto CloseFiles;
          }

          Does that help you help me?
          [...]
          >
          OK, so this should be completely legal and flawless:

          /* sort the symbol list alphabetically */
          qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

          then...

          /* sort the no-symbol list alphabetically */
          qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);
          >
          Um, no.
          >
          By "legal and flawless" I DID mean "100% guaranteed functional",
          not "pleasing to thine eyes"...
          Don't be afraid of whitespace. I put blanks around most operator
          symbols, and after every comma. If I have to split something across
          lines, that's ok. So I'd write your qsort call as:
          >
              qsort((void *)curr_instrs + num_symbols,
                    num_no_symbols,
                    128,
                    sort_alpha_list);
          >
          That's the way YOU'D do it, I do it differently, and since I'm the only
          one reading it (except in this one rare instance, or occasionally I'll post
          some code somewhere on the net), I can read it just fine, and of
          course it compiles all the same...
          The third argument, 128, is a "magic number". It's very difficult to
          tell what it means or whether it's even correct. Define a constant:
          #define WHATEVER 128
          In qsort(), it's basically 128 (character) bytes.

          I've actually got "128" defined globally (and I do mean globally, for
          several hundred thousand lines of code) for the purposes of reading
          and writing strings of certain lengths. And those damned defines
          have managed to screw me up royally several times, including a
          really irritating "intermittent" problem I had when I first wrote this
          particular section of code. So lately I've been using them less
          and less...
          so you only need to change it in one place (but pick a better name, of
          course).
          >
          Even at file scope right now I'm more comfortable with the way it
          is...
          The first argument to qsort is:
          >
          (void *)curr_instrs + num_symbols
          >
          You can't do pointer arithmetic on a void* value. (Some compilers may
          allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
          replace "-ansi" with "-std=c99").
          >
          Then how does qsort() do it? I'm assuming now that it must just
          use pointer arithmetic internally, because it doesn't seem to want or
          recognize my typedef of a 128-character string:

          typedef char instr_strs[128];
          instr_strs *curr_instrs;
          If you're trying to get the address pointed to by curr_instrs plus an
          offset of num_symbols bytes, you'll need to do the arithmetic using
          char*:
          >
              qsort((char*)curr_instrs + num_symbols,
                    /* other args */);
          >
          assuming that curr_instrs isn't already a char*.
          Nope, a pointer to the first of many 128-character strings, as above, so
          are you saying the pointer cast should be (instr_strs *)? I have no problem
          with that, as long as it works, and I must stress again at this point that
          the current code:

          /* sort the symbol list alphabetically */
          qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

          Has worked flawlessly for months now; it's part of a particular section
          of code that downloads about 3/4 meg of raw data from the net every
          day at a specific time, parses out about 100,000 data items, and writes
          them to a custom database in a matter of seconds.

          The only reason I asked the original question was because I went
          back and reviewed the code and wondered if I could shave a few
          more milliseconds off the execution time...

          Note that I didn't
          cast the expression to void*; any pointer-to-object type can be
          converted to void*, or vice versa.
          >
          Yeah, I noticed that, I just use (void *) because that's what
          I thought qsort() wanted, and it definitely WORKS that way
          (I've used qsort() dozens of times EXACTLY that way without
          problems).

          Now to get back to this:
          If you're trying to get the address pointed to by curr_instrs plus an
          offset of num_symbols bytes, you'll need to do the arithmetic using
          char*:
          >
              qsort((char*)curr_instrs + num_symbols,
                    /* other args */);
          I think I see what you're saying, maybe...and maybe not...

          If curr_instrs is pointer to a 128-character string type, wouldn't
          curr_instrs+num_symbols then point to a location offset from
          curr_instrs by (num_symbols*128 bytes)? And if so, what's
          the point of cast (char *) if qsort() already works by sorting
          some specified number of sequences of some specified
          number of character bytes?

          I thought I had the answer to my original question, and then it
          slipped away from me...

          ---
          William Ernest Reid




          • Keith Thompson

            #6
            Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

            "Bill Reid" <hormelfree@happyhealthy.net> writes:
            Keith Thompson <kst-u@mib.org> wrote in message
            news:lnk65fydta.fsf@nuthaus.mib.org...
            [...]
            >One thing that I found a little confusing in your original message is
            >that you talked about "re-allocated" memory, but you didn't mention
            >the "realloc" function. The more specific your description, the more
            >likely it is that we can help.
            >>
            Well, OK, maybe, here's canonical specificity:
            >
            /* now re-allocate memory for the instrument strings */
            if((curr_instrs =(instr_strs *)
            realloc(curr_instrs,num_instrs*sizeof(instr_strs)))==NULL) {
            printf("Not enough memory for instruments buffer\n");
            goto CloseFiles;
            }
            >
            Does that help you help me?
            A little, but there are still a bunch of identifiers whose
            declarations I haven't seen.

            I will make one comment: Don't cast the result of malloc() or
            realloc(). See section 7 of the comp.lang.c FAQ,
            <http://www.c-faq.com/>, particularly question 7.7b.
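A sketch of what the cast-free call could look like, using the typedef posted earlier in the thread. The helper name `resize_instrs` and the macro `INSTR_LEN` are assumptions made for illustration. Going beyond the point made above, the sketch also keeps the result in a temporary, because realloc() leaves the old block allocated when it fails and assigning NULL straight to the only pointer would lose it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define INSTR_LEN 128                 /* assumed name for the record width */
typedef char instr_strs[INSTR_LEN];   /* typedef as posted in the thread */

/* Hypothetical helper: resize *list to hold new_count records.
 * Returns 1 on success; on failure returns 0 and leaves *list untouched. */
int resize_instrs(instr_strs **list, size_t new_count)
{
    assert(list != NULL && new_count > 0);

    /* No cast: void * converts implicitly. sizeof **list follows the type. */
    instr_strs *tmp = realloc(*list, new_count * sizeof **list);
    if (tmp == NULL)
        return 0;                     /* old block is still valid */

    *list = tmp;
    return 1;
}
```

Without `#include <stdlib.h>` the cast can hide the missing declaration of realloc(), which is one of the arguments the FAQ makes against it.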
            >[...]
            >>
            OK, so this should be completely legal and flawless:
            >
            /* sort the symbol list alphabetically */
            qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);
            >
            then...
            >
            /* sort the no-symbol list alphabetically */
            qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);
            >>
            >Um, no.
            >>
            By "legal and flawless" I DID mean "100% guaranteed functional",
            not "pleasing to thine eyes"...
            The code isn't "100% guaranteed functional". You're performing
            arithmetic on a void*. That's not allowed in standard C.
            >Don't be afraid of whitespace. I put blanks around most operator
            >symbols, and after every comma. If I have to split something across
            >lines, that's ok. So I'd write your qsort call as:
            >>
            > qsort((void *)curr_instrs + num_symbols,
            >       num_no_symbols,
            >       128,
            >       sort_alpha_list);
            >>
            That's the way YOU'D do it, I do it differently, and since I'm the only
            one reading it (except in this one rare instance, or occasionally I'll post
            some code somewhere on the net), I can read it just fine, and of
            course it compiles all the same...
            Ok, but I find it more difficult to read without the whitespace.
            Whenever you post code here, you can expect comments on its style.
            You're under no obligation to pay attention.
            >The third argument, 128, is a "magic number". It's very difficult to
            >tell what it means or whether it's even correct. Define a constant:
            > #define WHATEVER 128
            >
            In qsort(), it's basically 128 (character) bytes.
            Ok, but why 128 rather than 127, or 100, or 256? That's a rhetorical
            question; you don't need to answer it, but ideally your code should.
            (And yes, it's a style issue.)
            I've actually got "128" defined globally (and I do mean globally, for
            several hundred thousand lines of code) for the purposes of reading
            and writing strings of certain lengths. And those damned defines
            have managed to screw me up royally several times, including a
            really irritating "intermittent" problem I had when I first wrote this
            particular section of code. So lately I've been using them less
            and less...
            >
            >so you only need to change it in one place (but pick a better name, of
            >course).
            >>
            Even at file scope right now I'm more comfortable with the way it
            is...
            Ok, it's your code, but I'm quite surprised that defining symbolic
            constants would cause more problems than it would solve.

            If someone else needs to maintain your code (and "someone else" could
            be you a year from now), it's not going to be obvious that the 128 in
            this function corresponds to the 128 (or 127) in another function, but
            the 128 in that function over there is just coincidental. There's a
            good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.
            >The first argument to qsort is:
            >>
            > (void *)curr_instrs + num_symbols
            >>
            >You can't do pointer arithmetic on a void* value. (Some compilers may
            >allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
            >replace "-ansi" with "-std=c99").
            >>
            Then how does qsort() do it? I'm assuming now that it must just
            use pointer arithmetic internally, because it doesn't seem to want or
            recognize my typedef of a 128-character string:
            >
            typedef char instr_strs[128];
            instr_strs *curr_instrs;
            qsort behaves in a manner consistent with its specification. That's
            all you really need to know. It needn't even be implemented in C, and
            if it is, it's free to use compiler-specific extensions.

            But if it's implemented in standard C (which is entirely possible), it
            presumably would convert the void* arguments to char* before
            performing arithmetic on them. (Since char* and void* have the same
            representation, the conversion doesn't cost anything at run time.)
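To illustrate the idea, here is a sketch of how a sort written in standard C might address and swap elements of unknown type. It is not taken from any real library's qsort(); the helper name `swap_elems` is hypothetical. It uses unsigned char*, which works like char* for byte-wise access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap elements i and j of an array whose elements
 * are each `size` bytes wide, knowing nothing else about their type. */
void swap_elems(void *base, size_t size, size_t i, size_t j)
{
    assert(base != NULL && size > 0);

    /* Convert once, then scale the index by the element size by hand. */
    unsigned char *a = (unsigned char *)base + i * size;
    unsigned char *b = (unsigned char *)base + j * size;

    for (size_t k = 0; k < size; ++k) {
        unsigned char t = a[k];
        a[k] = b[k];
        b[k] = t;
    }
}
```

The caller supplies the element size, just as the third argument to qsort() does, so the routine never needs to know about a typedef such as instr_strs.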
            >If you're trying to get the address pointed to by curr_instrs plus an
            >offset of num_symbols bytes, you'll need to do the arithmetic using
            >char*:
            >>
            > qsort((char*)curr_instrs + num_symbols,
            >       /* other args */);
            >>
            >assuming that curr_instrs isn't already a char*.
            >
            Nope, a pointer to the first of many 128-character strings, as above, so
            are you saying the pointer cast should be (instr_strs *)? I have no problem
            with that, as long as it works, and I must stress again at this point that
            the current code:
            >
            /* sort the symbol list alphabetically */
            qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);
            >
            Has worked flawlessly for months now; it's part of a particular section
            of code that downloads about 3/4 meg of raw data from the net every
            day at a specific time, parses out about 100,000 data items, and writes
            them to a custom database in a matter of seconds.
            That qsort() call isn't the one with the problem.

            Incidentally, a piece of code is either correct or not. The number of
            times it "works" really doesn't prove anything. If your compiler
            accepts some non-portable code, it's probably going to keep working
            the same way indefinitely -- but it will fail the first time you
            compile it with a different compiler, or with the same compiler and
            different options. Correctness is not statistical.

            One pitfall of C is that there are a lot of errors that your compiler
            isn't required to tell you about. Many things invoke undefined
            behavior; they may appear to work, but the language doesn't guarantee
            anything. Other things may be compiler-specific extensions. The
            language requires a conforming implementation to issue a diagnostic
            message for many of these -- but many compilers (including gcc) are
            not conforming in their default mode. Typically you can use
            command-line options to enable a conforming mode and provide
            additional warnings.
            The only reason I asked the original question was because I went
            back and reviewed the code and wondered if I could shave a few
            more milliseconds off the execution time...
            >
            Note that I didn't
            >cast the expression to void*; any pointer-to-object type can be
            >converted to void*, or vice versa.
            >>
            Yeah, I noticed that, I just use (void *) because that's what
            I thought qsort() wanted, and it definitely WORKS that way
            (I've used qsort() dozens of times EXACTLY that way without
            problems).
            Yes, it works, but it's not necessary. As a general rule, casts
            should be avoided unless they're actually required. A cast is, among
            other things, a promise to the compiler that you know what you're
            doing, and will often inhibit warnings and error messages. In this
            case, the argument will be implicitly converted to void* without the
            cast (assuming you have a visible prototype for qsort() -- i.e., you
            haven't forgotten the "#include <stdlib.h>".) The code is perfectly
            correct either way, but the form with the cast is more "brittle". If
            the cast had specified the wrong type, for example, the compiler
            likely wouldn't have told you about the error.
            Now to get back to this:
            >
            >If you're trying to get the address pointed to by curr_instrs plus an
            >offset of num_symbols bytes, you'll need to do the arithmetic using
            >char*:
            >>
            > qsort((char*)curr_instrs + num_symbols,
            >       /* other args */);
            >
            I think I see what you're saying, maybe...and maybe not...
            >
            If curr_instrs is pointer to a 128-character string type, wouldn't
            curr_instrs+num_symbols then point to a location offset from
            curr_instrs by (num_symbols*128 bytes)? And if so, what's
            the point of cast (char *) if qsort() already works by sorting
            some specified number of sequences of some specified
            number of character bytes?
            I haven't seen the full context of your code (or if I have, I've
            forgotten it). Your original code had

            (void*)curr_instrs + num_symbols

            which is illegal, because you can't perform pointer arithmetic on
            void* (the cast applies to "curr_instrs", not to "curr_instrs +
            num_symbols"). Pointer arithmetic, as you probably know, is scaled by
            the size of the pointed-to type.

            Are you using gcc? If so, it supports arithmetic on void* as an
            extension; it acts like arithmetic on char*. (IMHO, this extension is
            a bad idea.) By casting curr_instrs to void*, you cause the "+
            num_symbols" to denote an offset of num_symbols *bytes*. I had
            guessed that that's what you wanted, but apparently it isn't.

            I think what you *really* wanted was for the addition to be scaled by
            sizeof *curr_instrs (128 bytes?). If so, you probably meant to use

            (void*)(curr_instrs + num_symbols)

            which should work. But since the argument will be implicitly
            converted to void* anyway, all you need is

            curr_instrs + num_symbols

            In other words, all you need to do is drop the cast. This avoids
            depending on a compiler-specific extension *and* corrects a bug. It's
            also a very nice demonstration of why unnecessary casts should be
            avoided.
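Putting this together, a sketch of the two sorts with the cast dropped might look like the following. The comparator body is a stand-in, since the thread never shows sort_alpha_list; it assumes each 128-byte record holds a NUL-terminated string. `INSTR_LEN` and `sort_two_parts` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define INSTR_LEN 128                 /* assumed name for the thread's 128 */
typedef char instr_strs[INSTR_LEN];

/* Stand-in comparator: assumes NUL-terminated strings in each record. */
static int sort_alpha_list(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Sort the first num_symbols records and the following num_no_symbols
 * records independently, in place, inside one contiguous block. */
void sort_two_parts(instr_strs *curr_instrs,
                    size_t num_symbols, size_t num_no_symbols)
{
    assert(curr_instrs != NULL);

    qsort(curr_instrs, num_symbols, sizeof *curr_instrs, sort_alpha_list);

    /* The addition is scaled by sizeof(instr_strs), i.e. whole records. */
    qsort(curr_instrs + num_symbols, num_no_symbols,
          sizeof *curr_instrs, sort_alpha_list);
}
```

Because both calls work on the same block, no second buffer or copy is needed for the bottom list.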

            --
            Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
            San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
            We must do something. This is something. Therefore, we must do this.

            • Bill Reid

              #7
              Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


              Keith Thompson <kst-u@mib.org> wrote in message
              news:lnzmebw00v.fsf@nuthaus.mib.org...
              "Bill Reid" <hormelfree@happyhealthy.net> writes:
              Keith Thompson <kst-u@mib.org> wrote in message
              news:lnk65fydta.fsf@nuthaus.mib.org...
              [...]
              One thing that I found a little confusing in your original message is
              that you talked about "re-allocated" memory, but you didn't mention
              the "realloc" function. The more specific your description, the more
              likely it is that we can help.
              >
              Well, OK, maybe, here's canonical specificity:

              /* now re-allocate memory for the instrument strings */
              if((curr_instrs=(instr_strs *)
              realloc(curr_instrs,num_instrs*sizeof(instr_strs)))==NULL) {
              printf("Not enough memory for instruments buffer\n");
              goto CloseFiles;
              }

              Does that help you help me?
              >
              A little, but there are still a bunch of identifiers whose
              declarations I haven't seen.
              >
              Exactly why I didn't want to post any code in the first place,
              just wanted to ask a verbal question (see Subject). I'm calling
              on all types of custom libraries for this data downloading function,
              and the more you see, the more you won't recognize...
              I will make one comment: Don't cast the result of malloc() or
              realloc(). See section 7 of the comp.lang.c FAQ,
              <http://www.c-faq.com/>, particularly question 7.7b.
              >
              OK, I think I've heard some type of debate about this, I thought
              (based on the DOCUMENTATION THAT CAME WITH MY
              FRIGGIN' DEVELOPMENT PACKAGE) that was what you
              were supposed to do; it has seemed to work OK...
              [...]
              >
              OK, so this should be completely legal and flawless:

              /* sort the symbol list alphabetically */
              qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

              then...

              /* sort the no-symbol list alphabetically */
              qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);
              >
              Um, no.
              >
              By "legal and flawless" I DID mean "100% guaranteed functional",
              not "pleasing to thine eyes"...
              >
              The code isn't "100% guaranteed functional". You're performing
              arithmetic on a void*. That's not allowed in standard C.
              >
              Yes, sort of, I recognized my mistake after I last hit "send", and
              way below, you hit on the actual error I made...
              Don't be afraid of whitespace. I put blanks around most operator
              symbols, and after every comma. If I have to split something across
              lines, that's ok. So I'd write your qsort call as:
              >
              qsort((void *)curr_instrs + num_symbols,
                    num_no_symbols,
                    128,
                    sort_alpha_list);
              >
              That's the way YOU'D do it, I do it differently, and since I'm the only
              one reading it (except in this one rare instance, or occasionally I'll
              post some code somewhere on the net), I can read it just fine, and of
              course it compiles all the same...
              >
              Ok, but I find it more difficult to read without the whitespace.
              Whenever you post code here, you can expect comments on its style.
              You're under no obligation to pay attention.
              >
              The third argument, 128, is a "magic number". It's very difficult to
              tell what it means or whether it's even correct. Define a constant:
              #define WHATEVER 128
              In qsort(), it's basically 128 (character) bytes.
              >
              Ok, but why 128 rather than 127, or 100, or 256? That's a rhetorical
              question; you don't need to answer it, but ideally your code should.
              (And yes, it's a style issue.)
              >
              I have my reasons, the most important of which I would think would
              be obvious, and a secondary reason which should also be both apparent
              and not really important at the same time...
              I've actually got "128" defined globally (and I do mean globally, for
              several hundred thousand lines of code) for the purposes of reading
              and writing strings of certain lengths. And those damned defines
              have managed to screw me up royally several times, including a
              really irritating "intermittent" problem I had when I first wrote this
              particular section of code. So lately I've been using them less
              and less...
              so you only need to change it in one place (but pick a better name, of
              course).
              >
              Even at file scope right now I'm more comfortable with the way it
              is...
              >
              Ok, it's your code, but I'm quite surprised that defining symbolic
              constants would cause more problems than it would solve.
              >
              As I've alluded, it partly depends on the scope. What I've found
              the hard way is that you should really know EXACTLY what you
              need AT THE MOST LOCAL LEVEL OF SCOPE. And I
              found I kept forgetting what my global defines were, and I would
              use an inappropriate one where I knew EXACTLY what I needed
              RIGHT THERE. Among other problems...

              In this case, maybe a sizeof() would be better...
              If someone else needs to maintain your code (and "someone else" could
              be you a year from now), it's not going to be obvious that the 128 in
              this function corresponds to the 128 (or 127) in another function, but
              the 128 in that function over there is just coincidental. There's a
              good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.
              >
              Everything is local to a single 1000-line function for this data downloading
              operation; any (many) calls out to my custom libraries don't care about
              string lengths because they do a strlen() on the passed string pointer
              on entry.
              The first argument to qsort is:
              >
              (void *)curr_instrs + num_symbols
              >
              You can't do pointer arithmetic on a void* value. (Some compilers may
              allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
              replace "-ansi" with "-std=c99").
              >
              Then how does qsort() do it? I'm assuming now that it must just
              use pointer arithmetic internally, because it doesn't seem to want or
              recognize my typedef of a 128-character string:

              typedef char instr_strs[128];
              instr_strs *curr_instrs;
              >
              qsort behaves in a manner consistent with its specification. That's
              all you really need to know. It needn't even be implemented in C, and
              if it is, it's free to use compiler-specific extensions.
              >
              But if it's implemented in standard C (which is entirely possible), it
              presumably would convert the void* arguments to char* before
              performing arithmetic on them. (Since char* and void* have the same
              representation, the conversion doesn't cost anything at run time.)
              >
              Yeah, apparently it only processes a char* at a time, and the
              declaration of void* just prevents somebody from stupidly passing
              the wrong starting point, or something...
              If you're trying to get the address pointed to by curr_instrs plus an
              offset of num_symbols bytes, you'll need to do the arithmetic using
              char*:
              >
              qsort((char*)curr_instrs + num_symbols,
              /* other args */);
              >
              assuming that curr_instrs isn't already a char*.
              Nope, a pointer to the first of many 128-character strings, as above, so
              are you saying the pointer cast should be (instr_strs *)? I have no
              problem with that, as long as it works, and I must stress again at this
              point that the current code:

              /* sort the symbol list alphabetically */
              qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

              Has worked flawlessly for months now; it's part of a particular section
              of code that downloads about 3/4 meg of raw data from the net every
              day at a specific time, parses out about 100,000 data items, and writes
              them to a custom database in a matter of seconds.
              >
              That qsort() call isn't the one with the problem.
              >
              Exactly. It was the one I was going to add that had a problem...
              Incidentally, a piece of code is either correct or not. The number of
              times it "works" really doesn't prove anything. If your compiler
              accepts some non-portable code, it's probably going to keep working
              the same way indefinitely -- but it will fail the first time you
              compile it with a different compiler, or with the same compiler and
              different options. Correctness is not statistical.
              >
              Try telling that to a third-grade teacher grading tests...
              One pitfall of C is that there are a lot of errors that your compiler
              isn't required to tell you about. Many things invoke undefined
              behavior; they may appear to work, but the language doesn't guarantee
              anything. Other things may be compiler-specific extensions. The
              language requires a conforming implementation to issue a diagnostic
              message for many of these -- but many compilers (including gcc) are
              not conforming in their default mode. Typically you can use
              command-line options to enable a conforming mode and provide
              additional warnings.
              >
              Most importantly in my case, a compiler is not a mind-reader...
              The only reason I asked the original question was because I went
              back and reviewed the code and wondered if I could shave a few
              more milliseconds off the execution time...

              Note that I didn't
              cast the expression to void*; any pointer-to-object type can be
              converted to void*, or vice versa.
              >
              Yeah, I noticed that, I just use (void *) because that's what
              I thought qsort() wanted, and it definitely WORKS that way
              (I've used qsort() dozens of times EXACTLY that way without
              problems).
              >
              Yes, it works, but it's not necessary.
              Are you sure? Somehow, it seems like I tried it without the cast, and
              got an error, but if that actually happened, it was years, hell, decades
              ago.

              Like most people, I'm a victim of experience: I just keep doing what
              works...
              As a general rule, casts
              should be avoided unless they're actually required.
              The question here would be: does qsort() require it? Here's the
              documentation:

              Syntax

              #include <stdlib.h>
              void qsort(void *base, size_t nelem, size_t width,
              int (_USERENTRY *fcmp)(const void *, const void *));

              and the example from the documentation:

              int sort_function( const void *a, const void *b);
              char list[5][4] = { "cat", "car", "cab", "cap", "can" };

              int main(void)
              {
              int x;

              qsort((void *)list, 5, sizeof(list[0]), sort_function);
              for (x = 0; x < 5; x++)
              printf("%s\n", list[x]);
              return 0;
              }

              int sort_function( const void *a, const void *b)
              {
              return( strcmp((char *)a,(char *)b) );
              }

              Unlike some example documentation that I can think of, THAT one
              actually works as advertised...but doesn't mean that the cast is
              required...
              A cast is, among
              other things, a promise to the compiler that you know what you're
              doing, and will often inhibit warnings and error messages. In this
              case, the argument will be implicitly converted to void* without the
              cast (assuming you have a visible prototype for qsort() -- i.e., you
              haven't forgotten the "#include <stdlib.h>".) The code is perfectly
              correct either way, but the form with the cast is more "brittle". If
              the cast had specified the wrong type, for example, the compiler
              likely wouldn't have told you about the error.
              >
              Well, OK, I just compiled it without the cast, and it came
              up clean. Then I immediately pasted the cast back in place, since
              this is "production code", and from a perfectly practical standpoint,
              the void* cast is 100% functional IN THIS CASE, so I'm loath
              to mess anything up...
              Now to get back to this:
              If you're trying to get the address pointed to by curr_instrs plus an
              offset of num_symbols bytes, you'll need to do the arithmetic using
              char*:
              >
              qsort((char*)curr_instrs + num_symbols,
              /* other args */);
              I think I see what you're saying, maybe...and maybe not...

              If curr_instrs is a pointer to a 128-character string type, wouldn't
              curr_instrs+num_symbols then point to a location offset from
              curr_instrs by (num_symbols*128 bytes)? And if so, what's
              the point of the (char *) cast if qsort() already works by sorting
              some specified number of sequences of some specified
              number of character bytes?
              >
              I haven't seen the full context of your code (or if I have, I've
              forgotten it). Your original code had
              >
              (void*)curr_instrs + num_symbols
              >
              which is illegal, because you can't perform pointer arithmetic on
              void* (the cast applies to "curr_instrs", not to "curr_instrs +
              num_symbols"). Pointer arithmetic, as you probably know, is scaled by
              the size of the pointed-to type.
              >
              Actually, SEQUENCE POINTS!!!! Yes, I know now this is
              wrong...
              Are you using gcc? If so, it supports arithmetic on void* as an
              extension; it acts like arithmetic on char*. (IMHO, this extension is
              a bad idea.) By casting curr_instrs to void*, you cause the "+
              num_symbols" to denote an offset of num_symbols *bytes*. I had
              guessed that that's what you wanted, but apparently it isn't.
              >
              Nope, 128-character strings...
              I think what you *really* wanted was for the addition to be scaled by
              sizeof *curr_instrs (128 bytes?). If so, you probably meant to use
              >
              (void*)(curr_instrs + num_symbols)
              >
              which should work.
              EXACTLY!
              But since the argument will be implicitly
              converted to void* anyway, all you need is
              >
              curr_instrs + num_symbols
              >
              In other words, all you need to do is drop the cast. This avoids
              depending on a compiler-specific extension *and* corrects a bug. It's
              also a very nice demonstration of why unnecessary casts should be
              avoided.
              >
              OK, I WILL try that to replace this nonsense, along with the unneeded
              malloc and free for the no-symbols list:

              /* put bottom of instruments list into no-symbols list */
              swap_idx=0;
              no_symbol_idx=num_symbols;
              while(no_symbol_idx<num_instrs) {
                  strcpy(curr_no_symbls[swap_idx],
                         curr_instrs[no_symbol_idx]);
                  no_symbol_idx++;
                  swap_idx++;
              }

              /* sort the no-symbol list alphabetically */
              qsort((void *)curr_no_symbls,num_no_symbls,128,sort_alpha_list);

              Hopefully everything will go well at 6pm EST when it downloads
              the data...

              ---
              William Ernest Reid



              • Keith Thompson

                #8
                Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

                "Bill Reid" <hormelfree@happyhealthy.net> writes:
                Keith Thompson <kst-u@mib.org> wrote in message
                news:lnzmebw00v.fsf@nuthaus.mib.org...
                >"Bill Reid" <hormelfree@happyhealthy.net> writes:
                [...]
                >I will make one comment: Don't cast the result of malloc() or
                >realloc(). See section 7 of the comp.lang.c FAQ,
                ><http://www.c-faq.com/>, particularly question 7.7b.
                >>
                OK, I think I've heard some type of debate about this, I thought
                (based on the DOCUMENTATION THAT CAME WITH MY
                FRIGGIN' DEVELOPMENT PACKAGE) that was what you
                were supposed to do; it has seemed to work OK...
                Then the DOCUMENTATION THAT CAME WITH YOUR FRIGGIN' DEVELOPMENT
                PACKAGE is advising you to do something that's unnecessary and
                potentially dangerous. (Unless it's intended to be called from C++,
                which doesn't do implicit conversions to and from void* as freely as C
                does, but that's a different language.)
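For C, a cast-free realloc() call might look like the sketch below. The typedef is the one quoted in this thread; resize_instrs is a hypothetical helper name, and the temporary pointer is an extra precaution of mine (not something discussed above) so that the old block isn't lost if realloc() fails.

```c
#include <assert.h>
#include <stdlib.h>

typedef char instr_strs[128];   /* element type quoted in this thread */

/* Hypothetical helper: resize the buffer to num_instrs elements.
   Returns 0 on success, -1 on failure (the old buffer is left untouched). */
static int resize_instrs(instr_strs **buf, size_t num_instrs)
{
    instr_strs *tmp;

    assert(buf != NULL && num_instrs > 0);
    /* No cast: void* converts implicitly, and with <stdlib.h> included
       the compiler knows that realloc returns void*. */
    tmp = realloc(*buf, num_instrs * sizeof **buf);
    if (tmp == NULL)
        return -1;              /* *buf still points at the old block */
    *buf = tmp;
    return 0;
}
```

Writing sizeof **buf rather than sizeof(instr_strs) means the size expression follows the pointer's type automatically.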

                [...]
                >Ok, it's your code, but I'm quite surprised that defining symbolic
                >constants would cause more problems than it would solve.
                >>
                As I've alluded, it partly depends on the scope. What I've found
                the hard way is that you should really know EXACTLY what you
                need AT THE MOST LOCAL LEVEL OF SCOPE. And I
                found I kept forgetting what my global defines were, and I would
                use an inappropriate one where I knew EXACTLY what I needed
                RIGHT THERE. Among other problems...
                Yes, one problem with macros is that they're not scoped.

                If you want an integer constant (within the range of type int), there
                is a trick you can use in C to limit it to the scope you want:

                enum { WHATEVER = 128 };

                It's arguably an abuse of the "enum" feature (you're doing it for the
                sake of the constant, and not actually using the type), but it does
                work, and it's not an uncommon idiom.
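A small sketch of the idiom at block scope follows. The function store_name and the constant name FIELD_WIDTH are invented for illustration, and the caller is assumed to pass a destination of at least 128 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function: copies src into a fixed-width field (which must
   hold at least 128 bytes), truncating if necessary, and returns the
   number of characters stored. */
static size_t store_name(char *field, const char *src)
{
    /* Visible only inside this function; a #define written here would
       stay defined for the rest of the translation unit. */
    enum { FIELD_WIDTH = 128 };
    size_t len;

    assert(field != NULL && src != NULL);
    len = strlen(src);
    if (len >= FIELD_WIDTH)
        len = FIELD_WIDTH - 1;  /* leave room for the terminator */
    memcpy(field, src, len);
    field[len] = '\0';
    return len;
}
```

Another function in the same file is free to declare its own FIELD_WIDTH with a different value, which is exactly what a macro can't offer.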

                Or you can use a macro and be careful about how you use it.
                In this case, maybe a sizeof() would be better...
                Probably so.
                >If someone else needs to maintain your code (and "someone else" could
                >be you a year from now), it's not going to be obvious that the 128 in
                >this function corresponds to the 128 (or 127) in another function, but
                >the 128 in that function over there is just coincidental. There's a
                >good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.
                >>
                Everything is local to a single 1000-line function for this data downloading
                operation; any (many) calls out to my custom libraries don't care about
                string lengths because they do a strlen() on the passed string pointer
                on entry.
                >
                >The first argument to qsort is:
                >>
                > (void *)curr_instrs + num_symbols
                >>
                >You can't do pointer arithmetic on a void* value. (Some compilers may
                >allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
                >replace "-ansi" with "-std=c99").
                >>
                Then how does qsort() do it? I'm assuming now that it must just
                use pointer arithmetic internally, because it doesn't seem to want or
                recognize my typedef of a 128-character string:
                >
                typedef char instr_strs[128];
                instr_strs *curr_instrs;
                >>
                >qsort behaves in a manner consistent with its specification. That's
                >all you really need to know. It needn't even be implemented in C, and
                >if it is, it's free to use compiler-specific extensions.
                >>
                >But if it's implemented in standard C (which is entirely possible), it
                >presumably would convert the void* arguments to char* before
                >performing arithmetic on them. (Since char* and void* have the same
                >representation, the conversion doesn't cost anything at run time.)
                >>
                Yeah, apparently it only processes a char* at a time, and the
                declaration of void* just prevents somebody from stupidly passing
                the wrong starting point, or something...
                void* is a generic pointer type. In fact, it's *the* generic pointer
                type (pointer-to-object, actually; you can't portably use it for
                pointers to functions). That's why qsort() uses it. (Earlier
                versions of qsort(), before the 1989 ANSI standard, probably would
                have used char*.)

                I'm not sure what you mean by "it only processes a char* at a time".
                qsort() works with whatever size of data you tell it to. It likely
                uses memcpy() or something similar to copy data around within the
                array.
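As a rough illustration of how standard C code can work on "a void* plus a width", here is a sketch of a generic element swap. It is not how any particular library implements qsort(); swap_elems is a made-up helper, and it swaps byte by byte instead of calling memcpy() so that no temporary buffer is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap elements i and j of an array whose element
   size is only known at run time, as a generic routine must. */
static void swap_elems(void *base, size_t width, size_t i, size_t j)
{
    unsigned char *bytes = base;    /* void* -> unsigned char*: no cast */
    unsigned char *pi, *pj, tmp;
    size_t k;

    assert(base != NULL);
    pi = bytes + i * width;         /* byte arithmetic, scaled by hand */
    pj = bytes + j * width;
    for (k = 0; k < width; k++) {
        tmp = pi[k];
        pi[k] = pj[k];
        pj[k] = tmp;
    }
}
```

The same function works for 4-byte strings, ints, or 128-byte records, because the only thing it knows about an element is its size.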

                [...]
                >Incidentally, a piece of code is either correct or not. The number of
                >times it "works" really doesn't prove anything. If your compiler
                >accepts some non-portable code, it's probably going to keep working
                >the same way indefinitely -- but it will fail the first time you
                >compile it with a different compiler, or with the same compiler and
                >different options. Correctness is not statistical.
                >>
                Try telling that to a third-grade teacher grading tests...
                I'm not sure I see the point.

                [...]
                Yeah, I noticed that, I just use (void *) because that's what
                I thought qsort() wanted, and it definitely WORKS that way
                (I've used qsort() dozens of times EXACTLY that way without
                problems).
                >>
                >Yes, it works, but it's not necessary.
                >
                Are you sure? Somehow, it seems like I tried it without the cast, and
                got an error, but if that actually happened, it was years, hell, decades
                ago.
                Yes, I'm sure.
                Like most people, I'm a victim of experience: I just keep doing what
                works...
                >
                >As a general rule, casts
                >should be avoided unless they're actually required.
                >
                The question here would be: does qsort() require it? Here's the
                documentation:
                >
                Syntax
                >
                #include <stdlib.h>
                void qsort(void *base, size_t nelem, size_t width,
                int (_USERENTRY *fcmp)(const void *, const void *));
                >
                and the example from the documentation:
                >
                int sort_function( const void *a, const void *b);
                char list[5][4] = { "cat", "car", "cab", "cap", "can" };
                >
                int main(void)
                {
                int x;
                >
                qsort((void *)list, 5, sizeof(list[0]), sort_function);
                for (x = 0; x < 5; x++)
                printf("%s\n", list[x]);
                return 0;
                }
                >
                int sort_function( const void *a, const void *b)
                {
                return( strcmp((char *)a,(char *)b) );
                }
                >
                Unlike some example documentation that I can think of, THAT one
                actually works as advertised...but doesn't mean that the cast is
                required...
                Yes, it works with a cast. It also works without a cast, and there's
                just no reason to use one.

                What you quoted above is not *the* documentation for qsort(). You'll
                find that in the C standard, and it doesn't say anything about casting
                arguments.
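For comparison, the vendor example can be written with no casts at all. This is only a sketch: sort_function keeps the name from the quoted documentation, sort_words is a wrapper invented here so the call can stand alone, and the vendor-specific _USERENTRY decoration is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same comparison as the quoted example, but without casts: the const
   void* parameters convert implicitly and constness is preserved. */
static int sort_function(const void *a, const void *b)
{
    const char *sa = a;
    const char *sb = b;
    return strcmp(sa, sb);
}

/* Hypothetical wrapper around the call itself. */
static void sort_words(char (*list)[4], size_t count)
{
    assert(list != NULL);
    qsort(list, count, sizeof list[0], sort_function);  /* no (void *) */
}
```

Note that the original comparison casts to (char *), which quietly discards the const qualifier; the version above doesn't need to.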
                >A cast is, among
                >other things, a promise to the compiler that you know what you're
                >doing, and will often inhibit warnings and error messages. In this
                >case, the argument will be implicitly converted to void* without the
                >cast (assuming you have a visible prototype for qsort() -- i.e., you
                >haven't forgotten the "#include <stdlib.h>".) The code is perfectly
                >correct either way, but the form with the cast is more "brittle". If
                >the cast had specified the wrong type, for example, the compiler
                >likely wouldn't have told you about the error.
                >>
                Well, OK, I just compiled it without the cast, and it came
                up clean. Then I immediately pasted the cast back in place, since
                this is "production code", and from a perfectly practical standpoint,
                the void* cast is 100% functional IN THIS CASE, so I'm loath
                to mess anything up...
                Sure, if it already works, any change you make has a chance of
                breaking something. But keep this in mind for any new code you write,
                and when tracking down bugs in existing code. And if you're fixing a
                piece of code anyway, you might as well remove any unnecessary casts
                while you're at it; it will make the code more robust in the long run.

                [...]
                >I haven't seen the full context of your code (or if I have, I've
                >forgotten it). Your original code had
                >>
                > (void*)curr_instrs + num_symbols
                >>
                >which is illegal, because you can't perform pointer arithmetic on
                >void* (the cast applies to "curr_instrs", not to "curr_instrs +
                >num_symbols"). Pointer arithmetic, as you probably know, is scaled by
                >the size of the pointed-to type.
                >>
                Actually, SEQUENCE POINTS!!!! Yes, I know now this is
                wrong...
                No, sequence points aren't involved. It's just a matter of operator
                precedence (how an expression is parsed, and which operations apply to
                which operands).
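The difference shows up clearly if char* is used in place of void* (arithmetic on char* is standard C, so the example stays portable). The function names below are invented for illustration, and the typedef is the one quoted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>

typedef char instr_strs[128];

/* The cast binds tighter than +, so this parses as ((char *)p) + n:
   the result is n BYTES past p. */
static char *offset_in_bytes(instr_strs *p, size_t n)
{
    return (char *)p + n;
}

/* The parentheses make the addition happen first, on an instr_strs*,
   so the result is n ELEMENTS (n * 128 bytes) past p. */
static char *offset_in_elements(instr_strs *p, size_t n)
{
    return (char *)(p + n);
}

/* Distance in bytes between the two results above: n * 128 - n. */
static ptrdiff_t gap_in_bytes(instr_strs *p, size_t n)
{
    assert(p != NULL);
    return offset_in_elements(p, n) - offset_in_bytes(p, n);
}
```

Both expressions are evaluated in the same order at run time; only the way they are parsed differs, which is why sequence points don't enter into it.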

                [snip]

                --
                Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
                We must do something. This is something. Therefore, we must do this.

                • Barry Schwarz

                  #9
                  Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

                  On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
                  <hormelfree@happyhealthy.net> wrote:
                  >
                  >Barry Schwarz <schwarzb@doezl.net> wrote in message
                  >news:k93od21vgd6n6fhrg6tooem3r5j06ejrq4@4ax.com...
                  >On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
                  ><hormelfree@happyhealthy.net> wrote:
                  >>
                  snip
                  >Your memory is
                  >allocated from address to address+size-1. Furthermore, calculating
                  >the value address+size is always allowed but you may not dereference
                  >this address.
                  >>
                  >...you wouldn't want to dereference an address, right.
                  It's a very common thing to do. How else do you get the value at that
                  address? All subscripts involve an implied dereference.
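A short sketch of both points, using a made-up helper sum_ints: the one-past-the-end address is computed and compared but never read, while every element inside the array is read through an ordinary dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum n ints, using the one-past-the-end pointer
   as the loop bound. */
static long sum_ints(const int *a, size_t n)
{
    const int *end;
    const int *p;
    long total = 0;

    assert(a != NULL);
    end = a + n;              /* legal to form and compare, not to read */
    for (p = a; p != end; p++)
        total += *p;          /* same as a[i], since a[i] means *(a + i) */
    return total;
}
```

Reading *end would be the error; reading *p for every p before end is the whole point of having the array.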



                  Remove del for email

                  • Bill Reid

                    #10
                    Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


                    Keith Thompson <kst-u@mib.org> wrote in message
                    news:lny7tuvrj4.fsf@nuthaus.mib.org...
                    "Bill Reid" <hormelfree@happyhealthy.net> writes:
                    Keith Thompson <kst-u@mib.org> wrote in message
                    news:lnzmebw00v.fsf@nuthaus.mib.org...
                    "Bill Reid" <hormelfree@happyhealthy.net> writes:
                    [...]
                    I will make one comment: Don't cast the result of malloc() or
                    realloc(). See section 7 of the comp.lang.c FAQ,
                    <http://www.c-faq.com/>, particularly question 7.7b.
                    >
                    OK, I think I've heard some type of debate about this, I thought
                    (based on the DOCUMENTATION THAT CAME WITH MY
                    FRIGGIN' DEVELOPMENT PACKAGE) that was what you
                    were supposed to do; it has seemed to work OK...
                    >
                    Then the DOCUMENTATION THAT CAME WITH YOUR FRIGGIN' DEVELOPMENT
                    PACKAGE is advising you to do something that's unnecessary and
                    potentially dangerous. (Unless it's intended to be called from C++,
                    which doesn't do implicit conversions to and from void* as freely as C
                    does, but that's a different language.)
                    >
                    Mmmmm, well it's actually a C++ package (with a lot of "Object Pascal"
                    crap laying around apparently just in a vain attempt to create a Microsoft
                    style monopoly--three guesses who made it), and I do call back and
                    forth between C and C++ and D----i, so maybe I DO want to keep
                    the "unneeded" casts...

                    To its credit, it seems to always issue warnings for any declarations
                    not in scope, so 7.7b has little to no practical relevance...
                    >
                    Ok, it's your code, but I'm quite surprised that defining symbolic
                    constants would cause more problems than it would solve.
                    >
                    As I've alluded, it partly depends on the scope. What I've found
                    the hard way is that you should really know EXACTLY what you
                    need AT THE MOST LOCAL LEVEL OF SCOPE. And I
                    found I kept forgetting what my global defines were, and I would
                    use an inappropriate one where I knew EXACTLY what I needed
                    RIGHT THERE. Among other problems...
                    >
                    Yes, one problem with macros is that they're not scoped.
                    >
                    If you want an integer constant (within the range of type int), there
                    is a trick you can use in C to limit it to the scope you want:
                    >
                    enum { WHATEVER = 128 };
                    >
                    It's arguably an abuse of the "enum" feature (you're doing it for the
                    sake of the constant, and not actually using the type), but it does
                    work, and it's not an uncommon idiom.
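The enum idiom described above can be sketched as follows. The function names and the constant name BUF_LEN are hypothetical, chosen only for illustration; the point is that each enum constant is visible only inside the block that declares it, so two functions can use the same name with different values without clashing the way two #defines would.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into a fixed-size local buffer,
   truncating if needed, and returns the length that was kept. */
size_t clipped_length(const char *src)
{
    enum { BUF_LEN = 128 };          /* integer constant, block scope only */
    char line[BUF_LEN];
    size_t n = strlen(src);

    if (n > sizeof line - 1)
        n = sizeof line - 1;         /* leave room for the terminator */
    memcpy(line, src, n);
    line[n] = '\0';
    return strlen(line);
}

/* A second function may reuse the name with an unrelated value. */
size_t short_clipped_length(const char *src)
{
    enum { BUF_LEN = 8 };
    char line[BUF_LEN];
    size_t n = strlen(src);

    if (n > sizeof line - 1)
        n = sizeof line - 1;
    memcpy(line, src, n);
    line[n] = '\0';
    return strlen(line);
}
```

Using sizeof line rather than repeating the number also keeps the bound tied to the array itself, which is the sizeof() suggestion made elsewhere in this exchange.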
                    >
                    I'm not sure about using that particular trick, but I will say I did a
                    major overhaul of my code a few years back where I ditched about
                    80% of my defines and replaced them with enums and have saved
                    tremendous amounts of wasted effort as a result.
                    Or you can use a macro and be careful about how you use it.
                    >
                    The real point is always that you always have to be careful and
                    there is no magic trick that will completely relieve you of the duty
                    to know what the hell you are doing.
                    In this case, maybe a sizeof() would be better...
                    >
                    Probably so.
                    >
                    If someone else needs to maintain your code (and "someone else" could
                    be you a year from now), it's not going to be obvious that the 128 in
                    this function corresponds to the 128 (or 127) in another function, but
                    the 128 in that function over there is just coincidental. There's a
                    good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.
                    >
                    No offense but that is sooooo "old school" and "Mickey Mouse"...it
                    might have impressed me in 1975 writing a "hello world!" program in
                    my diapers, but I have much bigger fish to fry these days...I try to use
                    what tools are available in the best way possible, defines still have a
                    place in my code and always will, but I'm not kidding when I say
                    I got sick and tired of dealing with them, with one very important
                    exception; this is from the top of my c_inclds.h file that is included
                    in every C program I write:

                    #ifndef c_incldsH
                    #define c_incldsH

                    /* boolean boo-yah */
                    #define TRUE 1 /* what about negative logic? */
                    #define FALSE 0 /* not to mention situational ethics... */

                    After that there are about another 50 defines, including line length
                    maxs and crap like that, that I'd just as soon flush down the bit-crapper
                    than ever use again...
                    >
                    qsort behaves in a manner consistent with its specification. That's
                    all you really need to know.
                    Again, I may become a "victim" of the "documentation"...but what're
                    ya goin' to do? As I've said, if it gets the job done flawlessly after
                    being compiled, I don't care what anything does...
                    >
                    Incidentally, a piece of code is either correct or not. The number of
                    times it "works" really doesn't prove anything. If your compiler
                    accepts some non-portable code, it's probably going to keep working
                    the same way indefinitely -- but it will fail the first time you
                    compile it with a different compiler, or with the same compiler and
                    different options.
                    Yeah, but I can do everything as "correctly" as possible and will
                    still have portability issues, so again, what're ya goin' to do?
                    Correctness is not statistical.
                    >
                    Try telling that to a third-grade teacher grading tests...
                    >
                    I'm not sure I see the point.
                    >
                    In all walks of life, and in so much of my own work, everything is
                    "graded". Some things are measurably "better" than others, you know,
                    like Japanese cars are better than American cars, because, you know,
                    they actually use this thing called "statistical quality control" and
                    other disciplines, while Americans don't so much, even though it was
                    invented here...

                    I value speed and flawless execution in computer programs, and
                    have implemented a methodology for some level of portability, modularity,
                    and maintainability, but those are secondary concerns...
                    Yeah, I noticed that, I just use (void *) because that's what
                    I thought qsort() wanted, and it definitely WORKS that way
                    (I've used qsort() dozens of times EXACTLY that way without
                    problems).
                    >
                    Yes, it works, but it's not necessary.
                    How about if I call it from C++ like you mentioned about malloc()?
                    I believe I actually do call malloc() in some xxx.cpp files...
                    >
                    Like most people, I'm a victim of experience: I just keep doing what
                    works...
                    As a general rule, casts
                    should be avoided unless they're actually required.
                    The question here would be: does qsort() require it? Here's the
                    documentation:

                    Syntax

                    #include <stdlib.h>
                    void qsort(void *base, size_t nelem, size_t width,
                    int (_USERENTRY *fcmp)(const void *, const void *));

                    and the example from the documentation:

                    int sort_function( const void *a, const void *b);
                    char list[5][4] = { "cat", "car", "cab", "cap", "can" };

                    int main(void)
                    {
                    int x;

                    qsort((void *)list, 5, sizeof(list[0]), sort_function);
                    for (x = 0; x < 5; x++)
                    printf("%s\n", list[x]);
                    return 0;
                    }

                    int sort_function( const void *a, const void *b)
                    {
                    return( strcmp((char *)a,(char *)b) );
                    }

                    Unlike some example documentation that I can think of, THAT one
                    actually works as advertised...but doesn't mean that the cast is
                    required...
                    >
                    Yes, it works with a cast. It also works without a cast, and there's
                    just no reason to use one.
                    >
                    What you quoted above is not *the* documentation for qsort(). You'll
                    find that in the C standard, and it doesn't say anything about casting
                    arguments.
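A minimal sketch of the same sort with no casts at all, assuming the 5x4 character array from the vendor example. The names compare_words and sort_words are hypothetical. In C, any object pointer converts to void * implicitly when passed to qsort(), and a const void * converts back to const char * by plain initialization inside the comparison function.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback in the form qsort() expects. */
static int compare_words(const void *a, const void *b)
{
    const char *left = a;    /* implicit conversion from const void * */
    const char *right = b;
    return strcmp(left, right);
}

/* Sorts an array of 4-byte strings in place. */
void sort_words(char (*words)[4], size_t count)
{
    /* words converts to void * without a cast */
    qsort(words, count, sizeof words[0], compare_words);
}
```

Called as sort_words(list, 5) on the array from the documentation, this leaves the words in alphabetical order. Compiled as C++ the two initializations in compare_words would need casts, which is the language difference discussed above.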
                    >
                    Again, might be the C++ thing, or an urban legend or something...
                    >
                    Well, OK, I just compiled it without the cast, and it came
                    up clean. Then I immediately pasted the cast back in place, since
                    this is "production code", and from a perfectly practical standpoint,
                    the void* cast is 100% functional IN THIS CASE, so I'm loath
                    to mess anything up...
                    >
                    Sure, if it already works, any change you make has a chance of
                    breaking something. But keep this in mind for any new code you write,
                    and when tracking down bugs in existing code. And if you're fixing a
                    piece of code anyway, you might as well remove any unnecessary casts
                    while you're at it; it will make the code more robust in the long run.
                    >
                    Unless I call it from C++?
                    I haven't seen the full context of your code (or if I have, I've
                    forgotten it). Your original code had
                    >
                    (void*)curr_instrs + num_symbols
                    >
                    which is illegal, because you can't perform pointer arithmetic on
                    void* (the cast applies to "curr_instrs", not to "curr_instrs +
                    num_symbols"). Pointer arithmetic, as you probably know, is scaled by
                    the size of the pointed-to type.
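To make the precedence point concrete, here is a small sketch. The type of curr_instrs is never shown in the thread, so an int array stands in for it, and element_address, items, and skip are hypothetical names. The addition is done on the typed pointer, where it is scaled by the element size, and only the finished result becomes a void *.

```c
#include <stddef.h>

/* Returns a generic pointer to element `skip` of an int array. */
void *element_address(int *items, size_t skip)
{
    /* items + skip advances by skip * sizeof(int) bytes because the
       arithmetic is performed on an int *. The conversion to void *
       happens afterwards, on return, and needs no cast in C.
       Writing (void *)items + skip instead would parse as
       ((void *)items) + skip, which standard C does not allow. */
    return items + skip;
}
```

If an explicit cast is wanted anyway, parenthesizing the sum, as in (void *)(items + skip), gives the same result.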
                    >
                    Actually, SEQUENCE POINTS!!!! Yes, I know now this is
                    wrong...
                    >
                    No, sequence points aren't involved. It's just a matter of operator
                    precedence (how an expression is parsed, and which operations apply to
                    which operands).
                    >
                    Oh, I thought that was "sequence points", but yeah, what I wrote
                    wouldn't work right.

                    Oh, while I've got you here, here's another issue I noticed that I'm
                    not sure about concerning realloc(). Here's the NON-documentation:

                    Syntax

                    #include <stdlib.h>
                    void *realloc(void *block, size_t size);

                    ....

                    If block is a NULL pointer, realloc works just like malloc.

                    ....

                    I read this years ago, and thought "Great, I don't necessarily have to
                    malloc something first, I can use realloc in a loop and the first pass
                    through the loop it'll just be like malloc."

                    Problem is, it didn't seem to work out that way, and I'm not sure
                    what I did wrong, but I think I tried a number of things, such as
                    explicitly initializing my memory pointer to NULL, and always got
                    an error...is it actually possible to use realloc() to act like malloc
                    with a NULL pointer?

                    ---
                    William Ernest Reid




                    • Keith Thompson

                      #11
                      Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

                      "Bill Reid" <hormelfree@happyhealthy.net> writes:
                      Keith Thompson <kst-u@mib.org> wrote in message
                      news:lny7tuvrj4.fsf@nuthaus.mib.org...
                      [...]
                      Mmmmm, well it's actually a C++ package (with a lot of "Object Pascal"
                      crap laying around apparently just in a vain attempt to create a Microsoft
                      style monopoly--three guesses who made it), and I do call back and
                      forth between C and C++ and D----i, so maybe I DO want to keep
                      the "unneeded" casts...
                      If you have a genuine need to compile the same code as both C and C++,
                      that's a valid reason to cast the result of the *alloc() functions.

                      Very very few people have such a genuine need. We can count the ones
                      we've seen here on the fingers of P.J. Plauger's right hand (and even
                      that's overkill).

                      C++ provides mechanisms for interfacing to C code. Unless you're
                      providing a library to be used with either C or C++ code, you're
                      probably better off picking a language for each piece of your program
                      and using the appropriate compiler for it.

                      [...]
                      How about if I call it from C++ like you mentioned about malloc()?
                      I believe I actually do call malloc() in some xxx.cpp files...
                      Why? C++ has "new" and "delete". But in any case, C++ is a different
                      language, and comp.lang.c++ down the hall on the left, just past the
                      water cooler.

                      [...]
                      >Yes, it works with a cast. It also works without a cast, and there's
                      >just no reason to use one.
                      >>
                      >What you quoted above is not *the* documentation for qsort(). You'll
                      >find that in the C standard, and it doesn't say anything about casting
                      >arguments.
                      >>
                      Again, might be the C++ thing, or an urban legend or something...
                      The Solaris man page has similar wording.

                      [...]
                      Oh, while I've got you here, here's another issue I noticed that I'm
                      not sure about concerning realloc(). Here's the NON-documentation:
                      >
                      Syntax
                      >
                      #include <stdlib.h>
                      void *realloc(void *block, size_t size);
                      >
                      ...
                      >
                      If block is a NULL pointer, realloc works just like malloc.
                      >
                      ...
                      >
                      I read this years ago, and thought "Great, I don't necessarily have to
                      malloc something first, I can use realloc in a loop and the first pass
                      through the loop it'll just be like malloc."
                      Yes. If it doesn't work that way, your implementation is broken.
                      (But that's an unlikely bug, since the behavior is clearly documented
                      in the standard.)
                      Problem is, it didn't seem to work out that way, and I'm not sure
                      what I did wrong, but I think I tried a number of things, such as
                      explicitly initializing my memory pointer to NULL, and always got
                      an error...is it actually possible to use realloc() to act like malloc
                      with a NULL pointer?
                      Yes. I can't guess why you were unable to get it to work.
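The code that failed is never shown in the thread, so the cause stays unknown. As a sketch of the pattern being confirmed here, the hypothetical helper below grows an array one element at a time using only realloc(). It assumes the caller initializes the pointer to NULL and the count to 0 before the first call; an uninitialized pointer passed to realloc() would be undefined behavior and is one possible way to get a crash.

```c
#include <stdlib.h>

/* Hypothetical helper: appends value to a growing int array.
   *items must be NULL (with *count == 0) before the first call.
   Returns 1 on success, 0 if memory could not be obtained. */
int append_int(int **items, size_t *count, int value)
{
    /* With *items == NULL this call behaves like malloc(). */
    int *grown = realloc(*items, (*count + 1) * sizeof **items);

    if (grown == NULL)
        return 0;            /* the old block is still valid and unchanged */

    grown[*count] = value;
    *items = grown;          /* the block may have moved: keep the new address */
    *count += 1;
    return 1;
}
```

Assigning the result to a temporary first, instead of writing *items = realloc(*items, ...), avoids losing the only pointer to the old block when realloc() fails.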

                      --
                      Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                      San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
                      We must do something. This is something. Therefore, we must do this.


                      • Bill Reid

                        #12
                        Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


                        Keith Thompson <kst-u@mib.org> wrote in message
                        news:ln64gyvgtc.fsf@nuthaus.mib.org...
                        "Bill Reid" <hormelfree@happyhealthy.net> writes:
                        Keith Thompson <kst-u@mib.org> wrote in message
                        news:lny7tuvrj4.fsf@nuthaus.mib.org...
                        [...]
                        Mmmmm, well it's actually a C++ package (with a lot of "Object Pascal"
                        crap laying around apparently just in a vain attempt to create a Microsoft
                        style monopoly--three guesses who made it), and I do call back and
                        forth between C and C++ and D----i, so maybe I DO want to keep
                        the "unneeded" casts...
                        >
                        If you have a genuine need to compile the same code as both C and C++,
                        that's a valid reason to cast the result of the *alloc() functions.
                        >
                        Nope, I don't think I ever do that. However, I do occasionally
                        call malloc in a xxx.cpp file, which is compiled by C++...
                        Very very few people have such a genuine need.
                        Yes, hard to imagine what the point of that would be, 'cept maybe
                        even greater programming confusion than I have!
                        We can count the ones
                        we've seen here on the fingers of P.J. Plauger's right hand (and even
                        that's overkill).
                        >
                        C++ provides mechanisms for interfacing to C code. Unless you're
                        providing a library to be used with either C or C++ code, you're
                        probably better off picking a language for each piece of your program
                        and using the appropriate compiler for it.
                        >
                        The only libraries I provide are for myself, and as you note it is
                        generally fairly painless to call into C++ object files from C and vice
                        versa.
                        How about if I call it from C++ like you mentioned about malloc()?
                        I believe I actually do call malloc() in some xxx.cpp files...
                        >
                        Why? C++ has "new" and "delete".
                        Good question, maybe there was a good reason, maybe not, but
                        since I'm not looking at that particular code right now, it probably
                        had to do with keeping certain data structures as similar as possible
                        when used in C++ as they are when used in C, and something about
                        "new" just "scared" me...
                        But in any case, C++ is a different
                        language, and comp.lang.c++ down the hall on the left, just past the
                        water cooler.
                        >
                        Well, I didn't bring it up, but my code base is about 50/50...
                        >
                        Yes, it works with a cast. It also works without a cast, and there's
                        just no reason to use one.
                        >
                        What you quoted above is not *the* documentation for qsort(). You'll
                        find that in the C standard, and it doesn't say anything about casting
                        arguments.
                        >
                        Again, might be the C++ thing, or an urban legend or something...
                        >
                        The Solaris man page has similar wording.
                        >
                        Well, the Solaris man page would be just the old ucb man page,
                        right? In any event, I am highly displeased with this particular
                        development package, and high on my list of specific displeasures is
                        the documentation.
                        It is in some cases wrong, many cases stupidly written, incomplete,
                        and just plain difficult to use. So I'm not at all surprised that they
                        included an unnecessary cast in the example, but at least the
                        example works, as I said...
                        Oh, while I've got you here, here's another issue I noticed that I'm
                        not sure about concerning realloc(). Here's the NON-documentation:

                        Syntax

                        #include <stdlib.h>
                        void *realloc(void *block, size_t size);

                        ...

                        If block is a NULL pointer, realloc works just like malloc.

                        ...

                        I read this years ago, and thought "Great, I don't necessarily have to
                        malloc something first, I can use realloc in a loop and the first pass
                        through the loop it'll just be like malloc."
                        >
                        Yes. If it doesn't work that way, your implementation is broken.
                        (But that's an unlikely bug, since the behavior is clearly documented
                        in the standard.)
                        >
                        I would think it unlikely it is broken, the package is irritatingly bad
                        in many ways but seems to generally put out clean functioning programs
                        after fighting the "tools", but who knows. I may have just done something
                        stupid, wouldn't be the first time...
                        Problem is, it didn't seem to work out that way, and I'm not sure
                        what I did wrong, but I think I tried a number of things, such as
                        explicitly initializing my memory pointer to NULL, and always got
                        an error...is it actually possible to use realloc() to act like malloc
                        with a NULL pointer?
                        >
                        Yes. I can't guess why you were unable to get it to work.
                        >
                        Maybe I'll try it again. I made the changes to my data downloading
                        code yesterday, including deleting the "unnecessary casts", ran some tests,
                        everything worked fine, put it "into production", 6:15pm EST rolled
                        around and it did its thing apparently flawlessly, only about three
                        milliseconds quicker...

                        ---
                        William Ernest Reid




                        • Bill Reid

                          #13
                          Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


                          Barry Schwarz <schwarzb@doezl.net> wrote in message
                          news:nm7qd2dura0jcplvn9s0g5o473eb149mo3@4ax.com...
                          On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
                          <hormelfree@happyhealthy.net> wrote:
                          Barry Schwarz <schwarzb@doezl.net> wrote in message
                          news:k93od21vgd6n6fhrg6tooem3r5j06ejrq4@4ax.com...
                          On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
                          <hormelfree@happyhealthy.net> wrote:
                          >
                          Your memory is
                          allocated from address to address+size-1. Furthermore, calculating
                          the value address+size is always allowed but you may not dereference
                          this address.
                          >
                          ...you wouldn't want to dereference an address, right.
                          >
                          It's a very common thing to do. How else do you get the value at that
                          address? All subscripts involve an implied dereference.
                          >
                          OK, you were talking about dereferencing an address one element
                          past the end of the block, I thought you were talking about something like
                          saving the pointer, then trying to use it again after another realloc().
                          That WOULD be a recipe for disaster, right?
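On the stale-pointer worry: after a successful realloc() the block may have moved, so any address computed from the old base should be treated as invalid. One common way around that, sketched here under the assumption of an int buffer and with hypothetical names, is to remember an element's index rather than its address and recompute the pointer from the new base.

```c
#include <stdlib.h>

/* Grows *buffer to hold new_count ints and returns the address of the
   element at index `offset` inside the (possibly moved) block.
   Returns NULL and leaves *buffer untouched if realloc() fails.
   The caller must ensure offset < new_count. */
int *grow_and_locate(int **buffer, size_t new_count, size_t offset)
{
    int *grown = realloc(*buffer, new_count * sizeof **buffer);

    if (grown == NULL)
        return NULL;

    *buffer = grown;
    return grown + offset;   /* recomputed from the new base address */
}
```

Within a single allocation the memory is contiguous, so a pointer such as grown + offset is a perfectly good starting address to hand to qsort() for the tail of the buffer, as long as it is computed after the last realloc().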

                          So I'm not sure what distinction you're trying to make about
                          subscript "implied" dereferencing. Isn't "address+size" equivalent to
                          "address[size]"? Again, the only problem in doing anything with a
                          dereference of that address is that you're one element past the
                          end of the block...but that might actually work for you if you're Russian...

                          ---
                          William Ernest Reid




                          • Flash Gordon

                            #14
                            Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?

                            Bill Reid wrote:
                            Barry Schwarz <schwarzb@doezl.net> wrote in message
                            news:nm7qd2dura0jcplvn9s0g5o473eb149mo3@4ax.com...
                            >On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
                            ><hormelfree@happyhealthy.net> wrote:
                            >>Barry Schwarz <schwarzb@doezl.net> wrote in message
                            >>news:k93od21vgd6n6fhrg6tooem3r5j06ejrq4@4ax.com...
                            >>>On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
                            >>><hormelfree@happyhealthy.net> wrote:
                            >>>Your memory is
                            >>>allocated from address to address+size-1. Furthermore, calculating
                            >>>the value address+size is always allowed but you may not dereference
                            >>>this address.
                            >>>>
                            >>...you wouldn't want to dereference an address, right.
                            >It's a very common thing to do. How else do you get the value at that
                            >address? All subscripts involve an implied dereference.
                            >>
                            OK, you were talking about dereferencing an address one element
                            past the end of the block, I thought you were talking about something like
                            saving the pointer, then trying to use it again after another realloc().
                            That WOULD be a recipe for disaster, right?
                            >
                            So I'm not sure what distinction you're trying to make about
                            subscript "implied" dereferencing. Isn't "address+size" equivalent to
                            "address[size]"?
                            No. "address[size]" and "*(address+size)" are equivalent. So the first
                            form does a dereference. "address+size" on the other hand does *not* do
                            a dereference, implied or otherwise.
                            Again, the only problem in doing anything with a
                            dereference of that address is that you're one element past the
                            end of the block...but that might actually work for you if you're Russian...
                            Never dereference beyond the end of the block. It is "not allowed" by
                            the standard, i.e. anything can happen including, unfortunately, what
                            you happen to expect.
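The distinction can be sketched in a few lines. The function name sum_ints is hypothetical. The one-past-the-end address is computed and compared against, which is allowed, but it is never dereferenced; every read goes through a pointer that is strictly before it.

```c
#include <stddef.h>

/* Sums count ints starting at items. */
long sum_ints(const int *items, size_t count)
{
    const int *end = items + count;   /* one past the last element: legal to form */
    long total = 0;

    for (const int *p = items; p != end; ++p)
        total += *p;                  /* p never equals end here, so the read is valid */

    /* items[count] would mean *(items + count), i.e. *end, and reading
       that is the out-of-bounds dereference warned about above. */
    return total;
}
```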


                            • Bill Reid

                              #15
                              Re: Can I Trust Pointer Arithmetic In Re-Allocated Memory?


                              Bill Reid <hormelfree@happyhealthy.net> wrote in message
                              news:uTtDg.241794$mF2.19376@bgtnsc04-news.ops.worldnet.att.net...
                              Keith Thompson <kst-u@mib.org> wrote in message
                              news:ln64gyvgtc.fsf@nuthaus.mib.org...
                              "Bill Reid" <hormelfree@happyhealthy.net> writes:
                              >
                              Oh, while I've got you here, here's another issue I noticed that I'm
                              not sure about concerning realloc(). Here's the NON-documentation:
                              >
                              Syntax
                              >
                              #include <stdlib.h>
                              void *realloc(void *block, size_t size);
                              >
                              ...
                              >
                              If block is a NULL pointer, realloc works just like malloc.
                              >
                              ...
                              >
                              I read this years ago, and thought "Great, I don't necessarily have to
                              malloc something first, I can use realloc in a loop and the first pass
                              through the loop it'll just be like malloc."
                              Yes. If it doesn't work that way, your implementation is broken.
                              (But that's an unlikely bug, since the behavior is clearly documented
                              in the standard.)
                              I would think it unlikely it is broken, the package is irritatingly bad
                              in many ways but seems to generally put out clean functioning programs
                              after fighting the "tools", but who knows. I may have just done something
                              stupid, wouldn't be the first time...
                              >
                              Problem is, it didn't seem to work out that way, and I'm not sure
                              what I did wrong, but I think I tried a number of things, such as
                              explicitly initializing my memory pointer to NULL, and always got
                              an error...is it actually possible to use realloc() to act like malloc
                              with a NULL pointer?
                              Yes. I can't guess why you were unable to get it to work.
                              Maybe I'll try it again.
                              Oooooh, that was gnarly...

                              What I forgot was that if I don't malloc() the block first, if I
                              realloc() in a loop I get a memory access exception. I hate it
                              when that happens...

                              Maybe it IS a bug in the compiler, if it wasn't so easy to work
                              around, I might actually worry about it more. As it is, I did a
                              search on the compiler maker's web-site for any information
                              on known bugs, came up with nothing, and left a question on
                              the discussion forum about it, see if anybody knows anything...

                              ---
                              William Ernest Reid


