string I/O in binary file

  • jack

    string I/O in binary file

    Hi there,

    I have an NxK matrix of strings (C++).
    I'd like to write it to a binary file
    and later read it back to C++ (since the
    dimension N is huge, I'd like to save
    some time reading it every time).
    What's the best way (neat and fast) to do this without converting to C
    character
    array or create extra storage for the size
    of each string?

    Thank you for your help.


  • Kevin Goodsell

    #2
    Re: string I/O in binary file

    jack wrote:
    > Hi there,
    >
    > I have an NxK matrix of strings (C++).
    > I'd like to write it to a binary file
    > and later read it back to C++ (since the
    > dimension N is huge, I'd like to save
    > some time reading it every time).
    > What's the best way (neat and fast) to do this without converting to C
    > character
    > array or create extra storage for the size
    > of each string?

    Is there something wrong with the line wrap on your news client, or are
    you inserting your own line breaks? If so, why? Just let your client
    handle it. It will look much nicer (assuming you set a reasonable line
    length, like maybe 70 characters).

    Basically, your problem is not well-defined. You need to indicate the
    bounds of the strings in some way if you expect to be able to find where
    one ends and the next begins. Usually this is done with a delimiter
    (like in C) or a size. There's no simple way to do it without either of
    these.
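
A minimal sketch of the size-based option, assuming the matrix is a rectangular `std::vector<std::vector<std::string>>` and that the file is read back on the same kind of machine that wrote it (the sizes are stored in native byte order). The names `Matrix`, `write_matrix`, `read_matrix`, `write_u64` and `read_u64` are made up for illustration, and error checking is left out:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<std::string>>;

// Write one 64-bit unsigned value in the machine's native byte order.
void write_u64(std::ostream& out, std::uint64_t v) {
    out.write(reinterpret_cast<const char*>(&v), sizeof v);
}

// Read one 64-bit unsigned value written by write_u64.
std::uint64_t read_u64(std::istream& in) {
    std::uint64_t v = 0;
    in.read(reinterpret_cast<char*>(&v), sizeof v);
    return v;
}

// Layout: N, K, then for each string its byte count followed by its bytes.
void write_matrix(std::ostream& out, const Matrix& m) {
    const std::uint64_t rows = m.size();
    const std::uint64_t cols = rows ? m[0].size() : 0;
    write_u64(out, rows);
    write_u64(out, cols);
    for (const auto& row : m) {
        assert(row.size() == cols);  // the matrix must be rectangular
        for (const auto& s : row) {
            write_u64(out, s.size());
            out.write(s.data(), static_cast<std::streamsize>(s.size()));
        }
    }
}

// Rebuild the matrix from the layout produced by write_matrix.
Matrix read_matrix(std::istream& in) {
    const std::uint64_t rows = read_u64(in);
    const std::uint64_t cols = read_u64(in);
    Matrix m(rows, std::vector<std::string>(cols));
    for (auto& row : m) {
        for (auto& s : row) {
            s.resize(read_u64(in));
            if (!s.empty()) {
                in.read(&s[0], static_cast<std::streamsize>(s.size()));
            }
        }
    }
    return m;
}
```

With a real file you would pass a `std::ofstream` or `std::ifstream` opened with `std::ios::binary`. The stored size is what lets the reader handle strings that contain any byte value, including newlines and NULs.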

    -Kevin
    --
    My email address is valid, but changes periodically.
    To contact me please use the address from a recent posting.

      • John Harrison

        #4
        Re: string I/O in binary file


        "jack" <idlor@yahoo.com> wrote in message
        news:40734957$0$1626$61fed72c@news.rcn.com...
        > Hi there,
        >
        > I have an NxK matrix of strings (C++).
        > I'd like to write it to a binary file
        > and later read it back to C++ (since the
        > dimension N is huge, I'd like to save
        > some time reading it every time).
        > What's the best way (neat and fast) to do this without converting to C
        > character
        > array or create extra storage for the size
        > of each string?
        >
        > Thank you for your help.
        >

        I don't see what advantage you think a C character array will give you. You
        still have the same problem: you are trying to write parts of a large
        file where each part is a different size. One way around this is to assume
        some fixed maximum size, which certainly makes things simpler.
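
A rough sketch of the fixed-maximum-size idea, assuming every string fits in a slot of `kSlot` bytes and contains no NUL bytes (the padding byte). The names `kSlot`, `put_cell` and `get_cell` are hypothetical, and the cell index is assumed to be `row * K + column`:

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical fixed slot size; every string must fit in this many bytes.
constexpr std::size_t kSlot = 16;

// Write the string for cell `index` into its slot, padded with NUL bytes.
void put_cell(std::iostream& file, std::size_t index, const std::string& s) {
    assert(s.size() <= kSlot);
    char slot[kSlot] = {};  // zero-filled, so unused bytes act as padding
    s.copy(slot, s.size());
    file.seekp(static_cast<std::streamoff>(index * kSlot));
    file.write(slot, kSlot);
}

// Read the string for cell `index` back out of its slot.
std::string get_cell(std::iostream& file, std::size_t index) {
    char slot[kSlot] = {};
    file.seekg(static_cast<std::streamoff>(index * kSlot));
    file.read(slot, kSlot);
    // The string ends at the first padding byte, or fills the whole slot.
    std::size_t len = 0;
    while (len < kSlot && slot[len] != '\0') ++len;
    return std::string(slot, len);
}
```

Because every slot has the same size, any cell can be located with one seek and overwritten in place with a string of a different length. The cost is wasted space for short strings and a hard limit on long ones.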

        Another issue that greatly affects how you should do this is whether the
        parts of the file might change size: you read some strings of one size,
        then change them and want to write strings of a different size to
        the same place in the file.

        Undoubtedly the fastest and neatest way to do this is to find some third
        party library that has already solved the problem for you. This sort of
        thing has been done countless times before. You could for instance look at a
        B-Tree implementation. A B-Tree is a kind of indexed data structure
        specialised to be stored in a file. In your case the index would presumably
        be the first index of your matrix.

        It's hard to give more specific advice because your problem is only vaguely
        specified. Why not post back explaining *what* you are trying to achieve,
        not *how* you are trying to achieve it?

        john


          • Howie

            #6
            Re: string I/O in binary file

            On Tue, 6 Apr 2004 20:20:35 -0400, "jack" <idlor@yahoo.com> wrote:

            >Hi there,
            >
            >I have an NxK matrix of strings (C++).
            >I'd like to write it to a binary file
            >and later read it back to C++ (since the
            >dimension N is huge, I'd like to save
            >some time reading it every time).
            >What's the best way (neat and fast) to do this without converting to C
            >character
            >array or create extra storage for the size
            >of each string?
            >
            >Thank you for your help.
            >

            I have done this several times for different reasons.

            Because some strings may be empty, I used separators, as in a
            CSV file. You need a simple parser to read the data back:
            stringvalue-fieldseparator-stringvalue-fieldseparator-stringvalue-rowseparator;
            stringvalue-fieldseparator-stringvalue-fieldseparator-fieldseparator-fieldseparator-stringvalue-rowseparator;
            stringvalue-fieldseparator-stringvalue-fieldseparator-stringvalue-rowseparator;
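
The separator layout above could be sketched roughly as follows. The separator characters are an assumption (the ASCII "unit separator" and "record separator" control codes), the names `write_separated` and `read_separated` are made up, and the strings themselves must not contain either separator:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<std::string>>;

// Assumed separators: ASCII unit separator and record separator.
constexpr char kFieldSep = '\x1f';
constexpr char kRowSep = '\x1e';

// Write each row as field, separator, field, ..., then a row separator.
void write_separated(std::ostream& out, const Matrix& m) {
    for (const auto& row : m) {
        for (std::size_t i = 0; i < row.size(); ++i) {
            // A string containing a separator would corrupt the layout.
            assert(row[i].find(kFieldSep) == std::string::npos);
            assert(row[i].find(kRowSep) == std::string::npos);
            if (i != 0) out.put(kFieldSep);
            out << row[i];
        }
        out.put(kRowSep);
    }
}

// The "simple parser": walk the bytes and cut at each separator.
Matrix read_separated(std::istream& in) {
    Matrix m;
    std::vector<std::string> row;
    std::string field;
    char c;
    while (in.get(c)) {
        if (c == kFieldSep || c == kRowSep) {
            row.push_back(field);  // empty when two separators are adjacent
            field.clear();
            if (c == kRowSep) {
                m.push_back(row);
                row.clear();
            }
        } else {
            field += c;
        }
    }
    return m;
}
```

Two adjacent field separators come back as an empty string, which is how the empty fields in the second example row are preserved. One limitation of this sketch is that a row with no fields at all reads back as a row holding one empty string.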

            Another way is to write the string length before the string value
            and to write a length of -1 at the end of each row. When reading,
            you read the length first: it tells you either how long the next
            string is or, if it is -1, that you must move on to the next row.
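
A possible shape for the length-plus-marker layout, assuming a 32-bit signed length stored in native byte order (so the file is only readable on a similar machine). The names `write_prefixed`, `read_prefixed` and `write_i32` are hypothetical, and error handling is omitted:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<std::string>>;

// Write one 32-bit signed value in the machine's native byte order.
void write_i32(std::ostream& out, std::int32_t v) {
    out.write(reinterpret_cast<const char*>(&v), sizeof v);
}

// Each string is stored as its length followed by its bytes;
// a length of -1 marks the end of a row.
void write_prefixed(std::ostream& out, const Matrix& m) {
    const auto max_len =
        static_cast<std::size_t>(std::numeric_limits<std::int32_t>::max());
    for (const auto& row : m) {
        for (const auto& s : row) {
            assert(s.size() <= max_len);  // length must fit in 32 bits
            write_i32(out, static_cast<std::int32_t>(s.size()));
            out.write(s.data(), static_cast<std::streamsize>(s.size()));
        }
        write_i32(out, -1);
    }
}

// Read lengths until the stream ends; -1 closes the current row.
Matrix read_prefixed(std::istream& in) {
    Matrix m;
    std::vector<std::string> row;
    std::int32_t len = 0;
    while (in.read(reinterpret_cast<char*>(&len), sizeof len)) {
        if (len < 0) {
            m.push_back(row);
            row.clear();
        } else {
            std::string s(static_cast<std::size_t>(len), '\0');
            if (len > 0) in.read(&s[0], len);
            row.push_back(s);
        }
    }
    return m;
}
```

Unlike the separator approach, the strings may contain any bytes, and rows of different lengths (even empty rows) survive the round trip.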

            Howie


