std::getline() behaves differently between platforms?

  • JML

    std::getline() behaves differently between platforms?

    Hi,

    I have some code which parses a text file and creates objects based on
    what is in the text file. The code works just fine on Windows, but when
    I compile it using Xcode on OS X the parsing goes all wrong. Are there
    any known differences in file handling on OS X?

    My code is quite long, but one of the defective parts looks like this
    (sorry about the indentation - I'm new to posting code on a newsgroup):

    //Begin adding NPCs, exits and collision boxes
    while ( std::getline(filestream, str) ) {

        //Look for regular NPC
        if (str == "[NPC]") {
            std::cout << "Found NPC. \n";
            std::string name;
            int x, y, w, h, co_x, co_y, co_w, co_h;
            filestream >> name >> x >> y >> w >> h >> co_x >> co_y >> co_w >> co_h;
            m_NPCList.push_back( CActor( this, name, x, y, w, h, co_x, co_y, co_w,
                                         co_h ) );

            //Look for a path
            bool foundPath = false;
            std::getline(filestream, str); //Finish previous line
            std::getline(filestream, str);

            if ( str == "[PATH:LOOP]" ) {
                std::cout << "Looped path!\n";
                foundPath = true;
                m_NPCList[m_NPCList.size()-1].SetFollowMethod( 0 );
            }

            if ( foundPath ) {
                std::getline(filestream, str);
                while ( str != "[END]" ) {
                    int x, y;
                    std::string::size_type loc = str.find( " ", 0 );
                    std::istringstream x_string(str.substr(0, loc));
                    std::istringstream y_string(str.substr(loc+1, str.length()-1));
                    x_string >> x;
                    y_string >> y;
                    m_NPCList[m_NPCList.size()-1].AddPathNode( CPoint( x, y ) );
                    std::getline(filestream, str);
                }
            }
        }
    }

    On Windows the code parses a file with this content just fine:
    [NPC]
    Batman 100 100 32 32 8 16 16 8
    [PATH:LOOP]
    100 100
    200 100
    200 200
    100 200
    [END]

    But on OS X it goes wrong at around here:
    std::getline(filestream, str); //Finish previous line
    std::getline(filestream, str);
    if ( str == "[PATH:LOOP]" ) {
  • hsmit.home@gmail.com

    #2
    Re: std::getline() behaves differently between platforms?

    On Nov 20, 1:36 pm, JML wrote:
    [...]

    I didn't read your entire message (busy at work right now), but
    these types of problems generally occur due to line terminator
    differences between platforms: \r\n for Windows, and I'm not sure what
    it is for Mac OS X (\n\r???, or simply \n). If you have a hex editor,
    have a look at how the line terminators are set in the file.

    You may also try opening the file in text mode and see what happens,
    and then try opening it in binary mode and see what happens.
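
    If no hex editor is at hand, the same check can be sketched in code.
    This is only an illustration: it assumes the whole file has already
    been read in binary mode into a std::string, and LineEndingCounts and
    countLineEndings are made-up names.

```cpp
#include <cstddef>
#include <string>

// Tally of line-ending styles found in a buffer (illustrative helper type).
struct LineEndingCounts {
    std::size_t crlf = 0;  // "\r\n", the Windows convention
    std::size_t lf = 0;    // lone "\n", the Unix convention
    std::size_t cr = 0;    // lone "\r", the classic Mac OS convention
};

// Counts each line-ending style in raw (binary-mode) file contents.
LineEndingCounts countLineEndings(const std::string& data) {
    LineEndingCounts counts;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\r') {
            if (i + 1 < data.size() && data[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the '\n' that belongs to this CRLF pair
            } else {
                ++counts.cr;
            }
        } else if (data[i] == '\n') {
            ++counts.lf;
        }
    }
    return counts;
}
```

    A file saved on Windows should show up almost entirely in the crlf
    count; if that same file is then read on OS X, every line handed back
    by std::getline will presumably end in a stray '\r'.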

    Now back to work...


    • Ole Nielsby

      #3
      Re: std::getline() behaves differently between platforms?

      <hsmit.home@gmail.com> wrote:
      [...]
      > generally these type of problems occur due to line terminator
      > differences between platforms. \r\n for windows and I'm not
      > sure what it is for MAC OS X (\n\r???, or simply \n).

      AFAIK it's \r\n for windows, \n for unix, and \r for Mac.

      I can think of 3 options for making it work:

      1. Demand that files be converted to the line-ending convention of the platform,
      and open the files in text mode. (Drawback: files have to be converted
      when moved between platforms)

      2. Settle on one of the conventions and open the files in binary mode.
      (Drawback: files must be authored in an editor that supports that
      convention, or converted).

      3. (what I'd prefer) Write a line-getter that copes with all 3 conventions.
      (Drawback: slightly messy and bloated)



      • James Kanze

        #4
        Re: std::getline() behaves differently between platforms?

        On Nov 20, 4:21 pm, "Ole Nielsby"
        <ole.niel...@tekare-you-spamminglogisk.dk> wrote:
        > <hsmit.h...@gmail.com> wrote:
        > [...]
        >> generally these type of problems occur due to line terminator
        >> differences between platforms. \r\n for windows and I'm not
        >> sure what it is for MAC OS X (\n\r???, or simply \n).
        > AFAIK it's \r\n for windows, \n for unix, and \r for Mac.

        More correctly, it's CRLF for Windows, LF for Unix, and (was, at
        least) CR for Mac. At the disk level. Within a program, the
        line terminator is always '\n'. (Note that to add to the fun,
        some Windows programs use CRLF as a line *separator*, not a line
        *terminator*.)

        > I can think of 3 options for making it work:
        > 1. Demand that files be converted to the lf convention of the platform,
        > and open the files in text mode. (Drawback: files have to be converted
        > when moved between platforms)
        > 2. Settle on one of the conventions and open the files in binary mode.
        > (Drawback: files must be authored in an editor that supports that
        > convention, or converted).
        > 3. (what I'd prefer) Write a line-getter that copes with all 3 conventions.
        > (Drawback: slightly messy and bloated)

        In general, I would say that any serious program today that
        handles text input generated by an editor should use 3 (except
        that you probably don't need to worry about a special case for
        Mac). From experience, if I'm editing files, I'll use 2, simply
        because Windows programs seem to be (on the average) more
        tolerant about this than Unix programs. But if you're asking
        your users to edit the files, then it's probably not an option:
        while every serious developer I've ever met uses either emacs or
        vim, neither is really very popular among everyday users (to
        put it mildly).

        Note that using 3 isn't anywhere near as messy and bloated as it
        sounds. CR, as you mentioned, maps to '\r', and isspace('\r')
        should return true. And since you'll normally want to trim
        trailing whitespace from the lines anyway...

        The real problem with 3 is that you can't use it reading from
        cin.

        (Note that 1 is fine *if* you're actually moving the files, as
        individual files. FTP also supports two modes. It becomes
        more of a problem if you're transferring them in a compressed
        archive, and downright impossible if you're reading them over a
        remote mounted file system.)

        --
        James Kanze (GABI Software) email: james.kanze@gmail.com
        Conseils en informatique orientée objet/
        Beratung in objektorientierter Datenverarbeitung
        9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


        • JML

          #5
          Re: std::getline() behaves differently between platforms?

          James Kanze wrote:
          > More correctly, it's CRLF for Windows, LF for Unix, and (was, at
          > least) CR for Mac. At the disk level. Within a program, the
          > line terminator is always '\n'. (Note that to add to the fun,
          > some Windows programs use CRLF as a line *separator*, not a line
          > *terminator*.)

          I tried writing a new .txt file with an editor in OS X, and then the
          file parses just fine on the OS X build. Yay for that, but it is not an
          optimal solution of course. I think the solution must be to write a new
          line-getter that handles the different line terminators.

          >> 3. (what I'd prefer) Write a line-getter that copes with all 3 conventions.
          >> (Drawback: slightly messy and bloated)
          >
          > In general, I would say that any serious program today that
          > handles text input generated by an editor should use 3 (except
          > that you probably don't need to worry about a special case for
          > Mac).
          > Note that using 3 isn't anywhere near as messy and bloated as it
          > sounds. CR, as you mentioned, maps to '\r', and isspace('\r')
          > should return true. And since you'll normally want to trim
          > trailing whitespace from the lines anyway...

          As I'm not the most experienced guy in writing text parsers in C++,
          could you guys give me some more concrete advice on how to create a
          better line-getter? :)


          • Bo Persson

            #6
            Re: std::getline() behaves differently between platforms?

            JML wrote:
            :: James Kanze wrote:
            ::: More correctly, it's CRLF for Windows, LF for Unix, and (was, at
            ::: least) CR for Mac. At the disk level. Within a program, the
            ::: line terminator is always '\n'. (Note that to add to the fun,
            ::: some Windows programs use CRLF as a line *separator*, not a line
            ::: *terminator*.)
            ::
            :: I tried writing a new .txt file with an editor in OS X, and then
            :: the file parses just fine on the OS X build. Yay for that, but it
            :: is not an optimal solution of course. I think the solution must be
            :: to write a new line-getter, that handles the different line
            :: terminators.

            Why?

            Isn't the solution to have proper text files on each platform?

            Note that, on some systems, the physical file doesn't store any
            terminator at all. It uses a hidden length value instead. It might
            also store the text in EBCDIC.

            Even so, std::getline works with a '\n' line terminator.


            Bo Persson



            • James Kanze

              #7
              Re: std::getline() behaves differently between platforms?

              On Nov 21, 10:13 pm, "Bo Persson" <b...@gmb.dk> wrote:
              > JML wrote:
              > :: James Kanze wrote:
              > ::: More correctly, it's CRLF for Windows, LF for Unix, and (was, at
              > ::: least) CR for Mac. At the disk level. Within a program, the
              > ::: line terminator is always '\n'. (Note that to add to the fun,
              > ::: some Windows programs use CRLF as a line *separator*, not a line
              > ::: *terminator*.)
              > :: I tried writing a new .txt file with an editor in OS X, and then
              > :: the file parses just fine on the OS X build. Yay for that, but it
              > :: is not an optimal solution of course. I think the solution must be
              > :: to write a new line-getter, that handles the different line
              > :: terminators.
              > Why?
              > Isn't the solution to have proper text files on each platform?

              And what do you do with file systems that are mounted on two
              different platforms? Most of the time I'm working under
              Windows, the files are actually on a Unix machine somewhere,
              being served up by Samba. And they're being read by Unix
              machines at the same time.

              > Note that, on some systems, the physical file doesn't store any
              > terminator at all. It uses a hidden length value instead. It might
              > also store the text in EBCDIC.

              True. You don't remote mount file systems with those, and the
              file transfer program does (or should) take care of any mapping;
              that's why FTP also has two modes.

              > Even so, std::getline works with a '\n' line terminator.

              I'll admit that I don't see too much of a problem either. In
              practice, there are two situations. If you've copied the files,
              you should have remapped during the copy, so you'll have the
              native separator. And remote mounting, in practice, means just
              Unix and Windows: the Windows implementations I know of have no
              problem with Unix line endings, and the Unix implementations
              simply pass the CR up as a '\r', which is white space, and
              should just get ignored. In practice, it's just not a problem.
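
              A small sketch of where the '\r' is and is not ignored, using
              an in-memory stream in place of a CRLF file read on Unix. The
              names ParseResult and parseCrlfSample are made up for
              illustration. Formatted extraction with >> skips the '\r' as
              whitespace, but std::getline keeps it, so an exact comparison
              such as str == "[END]" fails unless the line is trimmed first.

```cpp
#include <sstream>
#include <string>

// Illustrative result type: two numbers read with operator>> and the
// following line exactly as std::getline returned it.
struct ParseResult {
    int x = 0;
    int y = 0;
    std::string tag;
};

ParseResult parseCrlfSample() {
    // Stands in for a CRLF-terminated file read on a Unix system.
    std::istringstream in("100 200\r\n[END]\r\n");
    ParseResult r;

    in >> r.x >> r.y;         // stops before '\r'; the numbers are unaffected

    std::string rest;
    std::getline(in, rest);   // finishes the first line; rest holds only "\r"
    std::getline(in, r.tag);  // r.tag is "[END]\r", not "[END]"
    return r;
}
```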

