Continuously concatenating binary data

  • Unforgiven

    Continuously concatenating binary data

    I have an application, where I continuously get new binary data input, in
    the form of a char*. This data comes from the Windows Multimedia wave input
    functions, but that's not important. What it means is that every 2 seconds,
    I need to add 22050 bytes to an ever expanding buffer. I have no idea at the
    beginning how large this buffer would need to be.

    Now there are several possibilities to do this, as I see it:
    1. Just make the buffer a void* (or char*), and realloc it every 2 seconds,
    copying the new data to the end. This isn't a good idea of course, because
    realloc will become very expensive as the buffer grows.
    2. Use something like this, with ssBuffer an ostringstream:
    ssBuffer << newdata;
    Then just read out the entire stream at the end.
    I don't know how ostringstream manages buffer growth, so this might not be
    any better (performance-wise) than the realloc approach.
    3. Do the same as above, but with an ofstream. This can handle really huge
    input (although I don't expect input to be more than 10-15 seconds of audio
    data ever), and should be reasonably efficient since Windows buffers file
    I/O, but it does require the user to have write permission wherever I'm going
    to put this file.
    4. Copy every 2 seconds of data into its own 'minibuffer', add those to a
    std::list, and at the end create a large buffer only once, copying all
    individual pieces into it.
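
A minimal sketch of option 2, assuming the block size is known at each call. Note that `ssBuffer << newdata` with a `char*` treats the data as a C string and stops at the first zero byte, which audio samples will contain, so the sketch uses `write()` instead. The function names `appendBlock` and `collect` are made up for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Append one block of raw bytes to the stream.
// write() copies exactly `size` bytes, so embedded zero bytes survive;
// operator<< on a char* would stop at the first '\0'.
void appendBlock(std::ostringstream& ss, const char* data, std::size_t size)
{
    ss.write(data, static_cast<std::streamsize>(size));
}

// Read the whole recording back out as one contiguous std::string.
// str() returns a copy of everything written so far.
std::string collect(const std::ostringstream& ss)
{
    return ss.str();
}
```

How the underlying stringbuf grows is left to the standard library, so this does not by itself answer the performance question.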

    What would be the best approach in your opinions? Or perhaps you have an
    even better one that I didn't think of.

    Thanks in advance.

    --
    Unforgiven

    A: Top Posting!
    Q: What is the most annoying thing on Usenet?

  • Victor Bazarov

    #2
    Re: Continuously concatenating binary data

    "Unforgiven" <jaapd3000@hotmail.com> wrote...
    > I have an application, where I continuously get new binary data input, in
    > the form of a char*. This data comes from the Windows Multimedia wave input
    > functions, but that's not important. What it means is that every 2 seconds,
    > I need to add 22050 bytes to an ever expanding buffer. I have no idea at the
    > beginning how large this buffer would need to be.

    What do you need the buffer for? Do you use it right away? Does
    the buffer have to be contiguous during your input?

    If not, use a list<your22050bytes>. I suspect that even if you do
    need to use the "stream" right away, the list is quick enough for
    all your streaming needs.
    > [...]

    Victor




    • Nitin Rajput

      #3
      Re: Continuously concatenating binary data

      I think a vector<char> should be good enough. Vector element access
      should be no more than about twice as slow as raw array access, so it
      is pretty fast, and a vector can grow as more data comes in.

      You can look at the vector allocation strategy: most implementations
      roughly double the capacity whenever the vector runs out of room.
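
A rough sketch of the vector<char> approach, assuming each incoming block arrives as a pointer plus a byte count. The growth factor is chosen by the standard library implementation, so the hypothetical helper `countReallocations` simply measures how often the capacity changes rather than assuming a particular factor.

```cpp
#include <cstddef>
#include <vector>

// Append `size` bytes to the end of the buffer. The vector reallocates
// only when its capacity is exhausted.
void appendBlock(std::vector<char>& buffer, const char* data, std::size_t size)
{
    buffer.insert(buffer.end(), data, data + size);
}

// Append `blocks` zero-filled blocks of `blockSize` bytes and count how
// many of those appends changed the vector's capacity.
std::size_t countReallocations(std::size_t blocks, std::size_t blockSize)
{
    std::vector<char> buffer;
    const std::vector<char> block(blockSize, 0);
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < blocks; ++i) {
        const std::size_t before = buffer.capacity();
        appendBlock(buffer, block.data(), block.size());
        if (buffer.capacity() != before) {
            ++reallocations;
        }
    }
    return reallocations;
}
```

Calling `countReallocations(64, 22050)` on your own compiler shows how many of the 64 appends actually triggered a reallocation.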

      -nitin


      • K_Lee

        #4
        Re: Continuously concatenating binary data

        Just a thought:

        If your users have little memory or record a large
        amount of data, all your malloc/realloc calls will turn into
        swap disk I/O.

        That would be no different from the stream approach.
        In fact, a stream gives you better control over the amount of
        memory your app needs.


        --
        The source is out there. Browse and document open/share source
        projects such as Apache, Tcl, Ethereal, Mozilla, .Net SSCLI.


        "Victor Bazarov" <v.Abazarov@attAbi.com> wrote in message news:<J2yjb.292805$mp.232723@rwcrnsc51.ops.asp.att.net>...
        > "Unforgiven" <jaapd3000@hotmail.com> wrote...
        > > I have an application, where I continuously get new binary data input, in
        > > the form of a char*. This data comes from the Windows Multimedia wave input
        > > functions, but that's not important. What it means is that every 2 seconds,
        > > I need to add 22050 bytes to an ever expanding buffer. I have no idea at the
        > > beginning how large this buffer would need to be.
        >
        > What do you need the buffer for? Do you use it right away? Does
        > the buffer have to be contiguous during your input?
        >
        > If not, use a list<your22050bytes>. I suspect that even if you do
        > need to use the "stream" right away, the list is quick enough for
        > all your streaming needs.
        >
        > > [...]
        >
        > Victor


        • lilburne

          #5
          Re: Continuously concatenating binary data

          Nitin Rajput wrote:
          > I think having a vector<char> should be good enough. vectors should
          > not be more than twice worse than array accesses - They are pretty
          > fast. Also they would allow you to expand as more data comes in.
          >
          > You can look at the vector allocation strategy - it doubles its size
          > wheneve there is an overflow kindof situation.
          >

          A raw char vector is probably not a good idea. As the vector
          grows you not only start moving large amounts of data about,
          but run the risk of being unable to allocate enough
          contiguous memory.

          Vectors are alright if you know in advance that the number
          of elements going to be used is reasonably small (a few
          thousand at most).

          A list of vectors holding each 2 seconds worth of data is
          probably sufficient in this case.
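
One way this could look, as a sketch only: each incoming block is stored as its own vector in a std::list, and a single contiguous buffer is built once recording has finished. The names `Chunk`, `appendChunk` and `flatten` are invented for the example.

```cpp
#include <cstddef>
#include <list>
#include <vector>

using Chunk = std::vector<char>;

// Store one incoming block as its own chunk. Chunks already in the list
// are never moved or copied when a new one is added.
void appendChunk(std::list<Chunk>& chunks, const char* data, std::size_t size)
{
    chunks.push_back(Chunk(data, data + size));
}

// Build one contiguous buffer after the last block has arrived.
std::vector<char> flatten(const std::list<Chunk>& chunks)
{
    std::size_t total = 0;
    for (const Chunk& chunk : chunks) {
        total += chunk.size();
    }

    std::vector<char> result;
    result.reserve(total);  // one allocation of exactly the final size
    for (const Chunk& chunk : chunks) {
        result.insert(result.end(), chunk.begin(), chunk.end());
    }
    return result;
}
```

The cost is that the data briefly exists twice while `flatten` runs, once in the list and once in the result.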


          • Unforgiven

            #6
            Re: Continuously concatenating binary data

            lilburne wrote:
            > Nitin Rajput wrote:
            >
            >> I think having a vector<char> should be good enough. vectors should
            >> not be more than twice worse than array accesses - They are pretty
            >> fast. Also they would allow you to expand as more data comes in.
            >>
            >> You can look at the vector allocation strategy - it doubles its size
            >> wheneve there is an overflow kindof situation.
            >>
            >
            > A raw char vector is probably not a good idea. As the vector
            > grows you not only start moving large amounts of data about,
            > but run the risk of being unable to allocate enough
            > contiguous memory.

            This is one of the reasons I didn't even give a vector as an option.
            Doubling size may limit reallocs, but when you start to get into the really
            big amounts of data, it could potentially waste a *lot* of memory.
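
If the estimate from the first post holds (22050 bytes every 2 seconds, rarely more than 10-15 seconds in total), one way to sidestep the growth question is to reserve the expected size up front. This is only a sketch under that assumption; the names `bytesForSeconds` and `makeBuffer` are hypothetical, and a longer recording would still fall back to the vector's normal growth.

```cpp
#include <cstddef>
#include <vector>

// Figures taken from the original post: 22050 bytes arrive every 2 seconds.
const std::size_t kBytesPerBlock = 22050;
const std::size_t kSecondsPerBlock = 2;

// Bytes needed for `seconds` of audio, rounded up to whole blocks.
std::size_t bytesForSeconds(std::size_t seconds)
{
    const std::size_t blocks =
        (seconds + kSecondsPerBlock - 1) / kSecondsPerBlock;
    return blocks * kBytesPerBlock;
}

// Create an empty buffer whose capacity already covers the expected
// recording length, so appends within that length do not reallocate.
std::vector<char> makeBuffer(std::size_t expectedSeconds)
{
    std::vector<char> buffer;
    buffer.reserve(bytesForSeconds(expectedSeconds));
    return buffer;
}
```

For 15 seconds this rounds up to 8 blocks, which is 176400 bytes, well under a megabyte.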

            Another problem with any approach that uses contiguous memory (which would
            include C-style arrays, std::vector and I suppose also memory-based streams
            such as std::ostringstream) is that freeing memory (a realloc is basically a
            malloc, memcpy, free sequence) tends to be very expensive on Windows. I
            believe it has to do with the memory manager wanting to pack the heap after
            each free (I once had to deallocate a 300MB (don't ask) 4-dimensional jagged
            array of bools (bool****) and it took nearly 5 minutes on a Pentium III
            600MHz)

            Contiguous memory should not be much of a problem. All we need is contiguous
            address space, not actual contiguous memory, thanks to the virtue of virtual
            memory. And because the heap is packed at least every so often, it
            shouldn't give any problems soon.

            --
            Unforgiven

            A: Top Posting!
            Q: What is the most annoying thing on Usenet?
