Dynamically resizing a buffer

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Philip Potter

    Dynamically resizing a buffer

    Hello clc,

    I have a buffer in a program which I write to. The buffer has
    write-only, unsigned-char-at-a-time access, and the amount of space
    required isn't known a priori. Therefore I want the buffer to
    dynamically grow using realloc().

    A comment by Richard Heathfield in a thread here suggested that a good
    algorithm for this is to use realloc() to double the size of the buffer,
    but if realloc() fails request smaller size increments until realloc()
    succeeds or until realloc() has failed to increase the buffer by even
    one byte.

    The basic idea is below. The key function is MyBuffer_writeb yte(), which
    expects the incoming MyBuffer object to be in a consistent state.

    Are there any improvements I could make to this code? To me it feels
    clumsy, especially with the break in the 3-line while loop.

    struct mybuffer_t {
    unsigned char *data;
    size_t size; /* size of buffer allocated */
    size_t index; /* index of first unwritten member of data */
    };

    typedef struct mybuffer_t MyBuffer;

    void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
    if(buf->size == buf->index) {
    /* need to allocate more space */
    size_t inc = buf->size;
    unsigned char *tmp;
    while(inc>0) {
    tmp = realloc(buf->data, buf->size + inc);
    if (tmp!=NULL) break; /* break to preserve the size of inc*/
    inc/=2;
    }
    if(tmp==NULL) {
    /* couldn't allocate any more space, print error and exit */
    exit(EXIT_FAILU RE);
    }
    buf->size += inc;
    }
    buf->data[buf->index++] = byte;
    }

    --
    Philip Potter pgp <atdoc.ic.ac. uk
  • Mark Bluemel

    #2
    Re: Dynamically resizing a buffer

    Philip Potter wrote:
    >
    struct mybuffer_t {
    unsigned char *data;
    size_t size; /* size of buffer allocated */
    size_t index; /* index of first unwritten member of data */
    };
    >
    typedef struct mybuffer_t MyBuffer;
    >
    void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
    if(buf->size == buf->index) {
    /* need to allocate more space */
    size_t inc = buf->size;
    unsigned char *tmp;
    while(inc>0) {
    tmp = realloc(buf->data, buf->size + inc);
    if (tmp!=NULL) break; /* break to preserve the size of inc*/
    inc/=2;
    }
    if(tmp==NULL) {
    /* couldn't allocate any more space, print error and exit */
    exit(EXIT_FAILU RE);
    }
    You never replace buf->data... It will continue to point to the original
    space, which may have been freed ...
    buf->size += inc;
    }
    buf->data[buf->index++] = byte;
    }
    >
    I'd split out the resizing to a separate function, as it's a
    generalizable technique.

    You could combine the two forms of loop control (size of increment,
    success/failure of allocation) in the while.

    This (untested) is my rough hack :-

    void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
    if(buf->size == buf->index) {
    if(!resizeBuffe r(buf)){
    exit(EXIT_FAILU RE); /* or some better error handling */
    }
    }
    buf->data[buf->index++] = byte;
    }

    /* returns new size */
    int resizeBuffer(My Buffer *buf) {
    size_t inc = buf->size;
    unsigned char *tmp = NULL;
    while(inc>0 &&
    (tmp = realloc(buf->data, buf->size + inc)) == NULL) {
    inc/=2;
    }
    if(tmp!=NULL) {
    buf->data = tmp;
    buf->size += inc;
    return buf->size;
    } else {
    return 0;
    }

    }

    Comment

    • Richard

      #3
      Re: Dynamically resizing a buffer

      Philip Potter <pgp@see.sig.in validwrites:
      Hello clc,
      >
      I have a buffer in a program which I write to. The buffer has
      write-only, unsigned-char-at-a-time access, and the amount of space
      required isn't known a priori. Therefore I want the buffer to
      dynamically grow using realloc().
      >
      A comment by Richard Heathfield in a thread here suggested that a good
      algorithm for this is to use realloc() to double the size of the
      buffer, but if realloc() fails request smaller size increments until
      realloc() succeeds or until realloc() has failed to increase the
      buffer by even one byte.
      Double the size of the buffer? Never IMO. Not if you "designed" the buffer
      to be "about right" in the first place. Suppose it doesn't fail but does
      exhaust memory? Well, bang goes your next malloc.

      Keep it simple. Have a delta increase and use that. I would suggest
      something like 10-20% of the initial size.

      Clearly if you KNOW that the buffer is going to double/quadruple etc
      then malloc it to start with.
      >
      The basic idea is below. The key function is MyBuffer_writeb yte(),
      which expects the incoming MyBuffer object to be in a consistent
      state.
      >
      Are there any improvements I could make to this code? To me it feels
      clumsy, especially with the break in the 3-line while loop.
      >
      struct mybuffer_t {
      unsigned char *data;
      size_t size; /* size of buffer allocated */
      size_t index; /* index of first unwritten member of data */
      };
      >
      typedef struct mybuffer_t MyBuffer;
      >
      void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
      if(buf->size == buf->index) {
      /* need to allocate more space */
      size_t inc = buf->size;
      unsigned char *tmp;
      while(inc>0) {
      tmp = realloc(buf->data, buf->size + inc);
      if (tmp!=NULL) break; /* break to preserve the size of inc*/
      inc/=2;
      }
      if(tmp==NULL) {
      /* couldn't allocate any more space, print error and exit */
      exit(EXIT_FAILU RE);
      }
      buf->size += inc;
      }
      buf->data[buf->index++] = byte;
      }
      --

      Comment

      • Philip Potter

        #4
        Re: Dynamically resizing a buffer

        Richard wrote:
        Philip Potter <pgp@see.sig.in validwrites:
        >
        >Hello clc,
        >>
        >I have a buffer in a program which I write to. The buffer has
        >write-only, unsigned-char-at-a-time access, and the amount of space
        >required isn't known a priori. Therefore I want the buffer to
        >dynamically grow using realloc().
        >>
        >A comment by Richard Heathfield in a thread here suggested that a good
        >algorithm for this is to use realloc() to double the size of the
        >buffer, but if realloc() fails request smaller size increments until
        >realloc() succeeds or until realloc() has failed to increase the
        >buffer by even one byte.
        >
        Double the size of the buffer? Never IMO. Not if you "designed" the buffer
        to be "about right" in the first place. Suppose it doesn't fail but does
        exhaust memory? Well, bang goes your next malloc.
        >
        Keep it simple. Have a delta increase and use that. I would suggest
        something like 10-20% of the initial size.
        >
        Clearly if you KNOW that the buffer is going to double/quadruple etc
        then malloc it to start with.
        I've already stated that the final amount of space required isn't known
        a priori. It could be thousands of bytes, it could be millions. A
        linearly-growing buffer is not appropriate for this situation, which is
        why I chose an exponentially-growing buffer.

        Phil

        --
        Philip Potter pgp <atdoc.ic.ac. uk

        Comment

        • Philip Potter

          #5
          Re: Dynamically resizing a buffer

          Mark Bluemel wrote:
          Philip Potter wrote:
          >
          You never replace buf->data... It will continue to point to the original
          space, which may have been freed ...
          Whoops! That bug was also in my project code...
          > buf->size += inc;
          > }
          > buf->data[buf->index++] = byte;
          >}
          >>
          >
          I'd split out the resizing to a separate function, as it's a
          generalizable technique.
          >
          You could combine the two forms of loop control (size of increment,
          success/failure of allocation) in the while.
          I think this is clumsy too, but less so :)
          This (untested) is my rough hack :-
          >
          void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
          if(buf->size == buf->index) {
          if(!resizeBuffe r(buf)){
          exit(EXIT_FAILU RE); /* or some better error handling */
          }
          }
          buf->data[buf->index++] = byte;
          }
          >
          /* returns new size */
          int resizeBuffer(My Buffer *buf) {
          size_t inc = buf->size;
          unsigned char *tmp = NULL;
          while(inc>0 &&
          (tmp = realloc(buf->data, buf->size + inc)) == NULL) {
          inc/=2;
          }
          if(tmp!=NULL) {
          buf->data = tmp;
          buf->size += inc;
          return buf->size;
          } else {
          return 0;
          }
          >
          }
          Surely resizeBuffer() should return size_t?

          --
          Philip Potter pgp <atdoc.ic.ac. uk

          Comment

          • Mark Bluemel

            #6
            Re: Dynamically resizing a buffer

            Philip Potter wrote:
            Mark Bluemel wrote:
            >I'd split out the resizing to a separate function, as it's a
            >generalizabl e technique.
            >
            >You could combine the two forms of loop control (size of increment,
            >success/failure of allocation) in the while.
            >
            I think this is clumsy too, but less so :)
            I'm inclined to agree. I'm still trying to think of a more elegant solution.
            >
            >This (untested) is my rough hack :-
            <snip>
            >/* returns new size */
            >int resizeBuffer(My Buffer *buf) {
            <snip>
            Surely resizeBuffer() should return size_t?
            Indeed it should.

            Comment

            • Mark Bluemel

              #7
              Re: Dynamically resizing a buffer

              Philip Potter wrote:
              A comment by Richard Heathfield in a thread here suggested that a good
              algorithm for this is to use realloc() to double the size of the buffer,
              but if realloc() fails request smaller size increments until realloc()
              succeeds or until realloc() has failed to increase the buffer by even
              one byte.
              >
              The basic idea is below. The key function is MyBuffer_writeb yte(), which
              expects the incoming MyBuffer object to be in a consistent state.
              >
              Are there any improvements I could make to this code? To me it feels
              clumsy, especially with the break in the 3-line while loop.
              >
              struct mybuffer_t {
              unsigned char *data;
              size_t size; /* size of buffer allocated */
              size_t index; /* index of first unwritten member of data */
              };
              >
              typedef struct mybuffer_t MyBuffer;
              >
              void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
              if(buf->size == buf->index) {
              /* need to allocate more space */
              size_t inc = buf->size;
              unsigned char *tmp;
              while(inc>0) {
              tmp = realloc(buf->data, buf->size + inc);
              if (tmp!=NULL) break; /* break to preserve the size of inc*/
              inc/=2;
              }
              if(tmp==NULL) {
              /* couldn't allocate any more space, print error and exit */
              exit(EXIT_FAILU RE);
              }
              buf->size += inc;
              }
              buf->data[buf->index++] = byte;
              }
              >
              Here's a simple alternative :-

              void MyBuffer_writeb yte(MyBuffer *buf, unsigned char byte) {
              if(buf->size == buf->index) {
              /* need to allocate more space */
              size_t inc = buf->size;
              unsigned char *tmp;
              while(inc>0) {
              tmp = realloc(buf->data, buf->size + inc);
              if (tmp!=NULL) {
              buf->data = tmp;
              buf->size += inc;
              inc = 0; /* force exit */
              } else {
              inc /= 2;
              }
              }
              if(tmp==NULL) {
              /* couldn't allocate any more space, print error and exit */
              exit(EXIT_FAILU RE);
              }
              }
              buf->data[buf->index++] = byte;
              }

              Comment

              • cr88192

                #8
                Re: Dynamically resizing a buffer


                "Philip Potter" <pgp@see.sig.in validwrote in message
                news:fah5a6$qif $1@aioe.org...
                Hello clc,
                >
                I have a buffer in a program which I write to. The buffer has write-only,
                unsigned-char-at-a-time access, and the amount of space required isn't
                known a priori. Therefore I want the buffer to dynamically grow using
                realloc().
                >
                A comment by Richard Heathfield in a thread here suggested that a good
                algorithm for this is to use realloc() to double the size of the buffer,
                but if realloc() fails request smaller size increments until realloc()
                succeeds or until realloc() has failed to increase the buffer by even one
                byte.
                >
                doubling is probably too steep IMO.

                I usually use 50% each time ('size2=size+(s ize>>1);').
                25% may also be sane ('size2=size+(s ize>>2);').
                33% may also be good.

                33% is ugly though:
                size2=size+(siz e>>2)+(size>>4) +(size>>6); //bulky
                size2=size+size *33/100; //critical buffer size limit

                but is a nice 4/3 ratio...

                <snip>


                Comment

                • Eric Sosman

                  #9
                  Re: Dynamically resizing a buffer

                  Philip Potter wrote:
                  [...]
                  >
                  Are there any improvements I could make to this code? To me it feels
                  clumsy, especially with the break in the 3-line while loop.
                  [...]
                  while(inc>0) {
                  tmp = realloc(buf->data, buf->size + inc);
                  if (tmp!=NULL) break; /* break to preserve the size of inc*/
                  inc/=2;
                  }
                  if(tmp==NULL) {
                  /* couldn't allocate any more space, print error and exit */
                  exit(EXIT_FAILU RE);
                  }
                  buf->size += inc;
                  The loop has two termination conditions -- realloc()
                  succeeds, or increment goes to zero -- so I don't see how
                  you can avoid two tests. But as written there's an after-
                  the-loop test to (re-)discover why the loop ended. That,
                  at least, is easy to get rid of:

                  while ((tmp = realloc(buf->data, buf->size + inc)) == NULL) {
                  if ((inc /= 2) == 0)
                  exit(EXIT_FAILU RE);
                  }
                  buf->data = tmp;
                  buf->size += inc;

                  --
                  Eric Sosman
                  esosman@ieee-dot-org.invalid

                  Comment

                  • Richard Tobin

                    #10
                    Re: Dynamically resizing a buffer

                    In article <rcd4xfyfo0.fsf @homelinux.net> , Richard <rgrdev@gmail.c omwrote:
                    >Double the size of the buffer? Never IMO. Not if you "designed" the buffer
                    >to be "about right" in the first place. Suppose it doesn't fail but does
                    >exhaust memory? Well, bang goes your next malloc.
                    >
                    >Keep it simple. Have a delta increase and use that. I would suggest
                    >something like 10-20% of the initial size.
                    If you don't know the likely size (e.g. when reading some textual
                    data) it makes more sense to increase by a proportion of the current
                    size: if you started at 10 bytes are have now reached 100 it makes no
                    sense to go on incrementing by one or two.

                    Of course that's what the original proposal did: increase by 100% of
                    the current size. If you start from one, this means you only allocate
                    power-of-two sized blocks.

                    A problem with any approach is that it might interact badly with the
                    malloc() implementation. Imagine an implementation that always
                    allocates powers of two but uses sizeof(void *) bytes of it to record
                    the size - if you always allocated powers of two you would end up
                    allocating nearly four times as much as you need.

                    I recommend separating out the increment algorithm so that it can
                    easily be changed if it proves to be bad on some platform.

                    -- Richard
                    --
                    "Considerat ion shall be given to the need for as many as 32 characters
                    in some alphabets" - X3.4, 1963.

                    Comment

                    • Richard Heathfield

                      #11
                      Re: Dynamically resizing a buffer

                      Philip Potter said:
                      Richard wrote:
                      >Philip Potter <pgp@see.sig.in validwrites:
                      <snip>
                      >>A comment by Richard Heathfield in a thread here suggested that a
                      >>good algorithm for this is to use realloc() to double the size of
                      >>the buffer, but if realloc() fails request smaller size increments
                      >>until realloc() succeeds or until realloc() has failed to increase
                      >>the buffer by even one byte.
                      >>
                      >Double the size of the buffer? Never IMO. Not if you "designed" the
                      >buffer to be "about right" in the first place. Suppose it doesn't
                      >fail but does exhaust memory? Well, bang goes your next malloc.
                      >
                      >Keep it simple. Have a delta increase and use that. I would suggest
                      >something like 10-20% of the initial size.
                      >
                      >Clearly if you KNOW that the buffer is going to double/quadruple etc
                      >then malloc it to start with.
                      >
                      I've already stated that the final amount of space required isn't
                      known a priori. It could be thousands of bytes, it could be millions.
                      A linearly-growing buffer is not appropriate for this situation, which
                      is why I chose an exponentially-growing buffer.
                      Right. A linear increase is most unwise.

                      One of the problems with killfiling trolls is that one doesn't get to
                      see (and thus have the opportunity to correct) the ludicrous "advice"
                      they dish out, unless it happens to be quoted in a reply.

                      Incidentally, if memory is tight (relative to the length of the line
                      you're reading), memory exhaustion /is/ a risk, but one that is easily
                      dealt with. Once you know that the line has been read completely, you
                      can "shrink" the allocation to be an exact fit, thus taking up no more
                      memory than you actually need.

                      --
                      Richard Heathfield <http://www.cpax.org.uk >
                      Email: -www. +rjh@
                      Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
                      "Usenet is a strange place" - dmr 29 July 1999

                      Comment

                      • Richard Heathfield

                        #12
                        Re: Dynamically resizing a buffer

                        Eric Sosman said:

                        <snip>
                        while ((tmp = realloc(buf->data, buf->size + inc)) == NULL) {
                        if ((inc /= 2) == 0)
                        exit(EXIT_FAILU RE);
                        Do you really think that's a good idea? We've had this whole discussion
                        recently, I know, but it bears reiterating nonetheless that it is not a
                        library function's job to decide whether to terminate a program.

                        --
                        Richard Heathfield <http://www.cpax.org.uk >
                        Email: -www. +rjh@
                        Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
                        "Usenet is a strange place" - dmr 29 July 1999

                        Comment

                        • Richard

                          #13
                          Re: Dynamically resizing a buffer

                          Richard Heathfield <rjh@see.sig.in validwrites:
                          Philip Potter said:
                          >
                          >Richard wrote:
                          >>Philip Potter <pgp@see.sig.in validwrites:
                          >
                          <snip>
                          >
                          >>>A comment by Richard Heathfield in a thread here suggested that a
                          >>>good algorithm for this is to use realloc() to double the size of
                          >>>the buffer, but if realloc() fails request smaller size increments
                          >>>until realloc() succeeds or until realloc() has failed to increase
                          >>>the buffer by even one byte.
                          >>>
                          >>Double the size of the buffer? Never IMO. Not if you "designed" the
                          >>buffer to be "about right" in the first place. Suppose it doesn't
                          >>fail but does exhaust memory? Well, bang goes your next malloc.
                          > >
                          >>Keep it simple. Have a delta increase and use that. I would suggest
                          >>something like 10-20% of the initial size.
                          > >
                          >>Clearly if you KNOW that the buffer is going to double/quadruple etc
                          >>then malloc it to start with.
                          >>
                          >I've already stated that the final amount of space required isn't
                          >known a priori. It could be thousands of bytes, it could be millions.
                          >A linearly-growing buffer is not appropriate for this situation, which
                          >is why I chose an exponentially-growing buffer.
                          I missed that point. I also caveated my reply in the second
                          sentence. Apologies if you thought my reply was misleading.

                          "Not if you "designed" the buffer to be "about right" in the first place."
                          >
                          Right. A linear increase is most unwise.
                          >
                          One of the problems with killfiling trolls is that one doesn't get to
                          see (and thus have the opportunity to correct) the ludicrous "advice"
                          they dish out, unless it happens to be quoted in a reply.
                          Hilarious advice from someone who frequently posts incorrect advice.
                          >
                          Incidentally, if memory is tight (relative to the length of the line
                          you're reading), memory exhaustion /is/ a risk, but one that is easily
                          dealt with. Once you know that the line has been read completely, you
                          can "shrink" the allocation to be an exact fit, thus taking up no more
                          memory than you actually need.
                          And in the next episode, RH explains why "using less memory is better
                          for the system performance".

                          Comment

                          • Mark Bluemel

                            #14
                            Re: Dynamically resizing a buffer

                            Richard Heathfield wrote:
                            Eric Sosman said:
                            >
                            <snip>
                            >
                            >while ((tmp = realloc(buf->data, buf->size + inc)) == NULL) {
                            >if ((inc /= 2) == 0)
                            >exit(EXIT_FAIL URE);
                            >
                            Do you really think that's a good idea? We've had this whole discussion
                            recently, I know, but it bears reiterating nonetheless that it is not a
                            library function's job to decide whether to terminate a program.
                            In Eric's (and my:-) defence, I think he was just making the minimum
                            change to the OP's code to address the specific issue the OP had raised
                            (about the use of "break").

                            As I remarked in my earlier, less than elegant reply, I think the
                            resizing should be split out to a separate function which returns some
                            value which indicates success or failure.

                            Equally the "MyBuffer_write byte" routine might also be better returning
                            a success/failure result rather than terminating the program on failure.

                            Comment

                            • Philip Potter

                              #15
                              Re: Dynamically resizing a buffer

                              Richard wrote:
                              Richard Heathfield <rjh@see.sig.in validwrites:
                              >
                              >Philip Potter said:
                              >>I've already stated that the final amount of space required isn't
                              >>known a priori. It could be thousands of bytes, it could be millions.
                              >>A linearly-growing buffer is not appropriate for this situation, which
                              >>is why I chose an exponentially-growing buffer.
                              >
                              I missed that point. I also caveated my reply in the second
                              sentence. Apologies if you thought my reply was misleading.
                              >
                              "Not if you "designed" the buffer to be "about right" in the first place."
                              And if you knew the right size, why would you need a dynamic buffer in
                              the first place? The only situation in which a linear expansion is
                              useful is when you know an upper bound on the size requirements; in
                              which case why don't you just allocate that upper bound?
                              >Right. A linear increase is most unwise.
                              >>
                              >One of the problems with killfiling trolls is that one doesn't get to
                              >see (and thus have the opportunity to correct) the ludicrous "advice"
                              >they dish out, unless it happens to be quoted in a reply.
                              >
                              Hilarious advice from someone who frequently posts incorrect advice.
                              On the contrary, I find RH's advice to be some of the best here -- along
                              with a few others. He can be quite blunt with it sometimes, but we can't
                              have everything :)
                              >Incidentally , if memory is tight (relative to the length of the line
                              >you're reading), memory exhaustion /is/ a risk, but one that is easily
                              >dealt with. Once you know that the line has been read completely, you
                              >can "shrink" the allocation to be an exact fit, thus taking up no more
                              >memory than you actually need.
                              >
                              And in the next episode, RH explains why "using less memory is better
                              for the system performance".
                              ....and his advice continues to be correct.

                              Phil

                              --
                              Philip Potter pgp <atdoc.ic.ac. uk

                              Comment

                              Working...