Size of file

  • MisterE

    Size of file

    I hear that this isn't always valid:

    FILE *in;
    long size;
    in = fopen("foo.bar", "rb");
    fseek(in,0,SEEK_END);
    size = ftell(in);
    fseek(in,0,SEEK_SET);

    then fread size many bytes into memory.

    Apparently fseek is not guaranteed to work because of 0xFF EOF or other
    characters. Is this true only in text mode, or also in binary mode? Is there
    any way to get a file size without having to read bytes one at a time? Is it
    best to just fread until it fails?


  • Bartc

    #2
    Re: Size of file


    "MisterE" <MisterE@nimga.com> wrote in message
    news:48e55844$0$4454$afc38c87@news.optusnet.com.au...
    >I hear that this isn't always valid:
    >
    >FILE *in;
    >long size;
    >in = fopen("foo.bar", "rb");
    >fseek(in,0,SEEK_END);
    >size = ftell(in);
    >fseek(in,0,SEEK_SET);
    >
    >then fread size many bytes into memory.
    >
    >Apparently fseek is not guaranteed to work because of 0xFF EOF or other
    >characters, is this true only in text mode or also in binary mode? Is
    >there anyway to get a filesize without having to read bytes on at a time.
    >Is it best to just fread until it fails?

    Works for me, using binary mode files. But there are various pitfalls:

    In text mode, the size you get might be wrong because it might count
    '\r' '\n' sequences in the file where your program sees just '\n'.

    Some types of files may not have a beginning or end (like stdin, or some
    serial device), so don't have a size.

    Some OSs may not store the exact bytesize of a file (for example may only
    store a block size), so the value might be approximate. (And there might be
    other OS things to bear in mind such as use of compression.)

    And whatever file size you get might change if the file is modified (by any
    other process) by the time you use the file size information.

    For more details, see threads on this subject in c.l.c.

    But within those constraints, I've been using code like yours successfully
    for a decade or two.

    --
    Bartc


    • Gordon Burditt

      #3
      Re: Size of file

      >I hear that this isn't always valid:

      There are many, many, many different definitions of "file size",
      (probably more than there are file sizes on a 64-bit machine) and
      you need to decide which definition you want to use if you intend
      calling any result "correct" or "incorrect".

      >FILE *in;
      >long size;
      >in = fopen("foo.bar", "rb");
      >fseek(in,0,SEEK_END);
      >size = ftell(in);
      >fseek(in,0,SEEK_SET);
      >
      >then fread size many bytes into memory.

      In binary mode, SEEK_END need not be meaningfully supported because
      the system may pad the file with trailing 0 bytes. For example, CP/M
      only counts sectors on binary files and rounds the size of the file
      up to the next multiple of 128 bytes, and pads the last sector with
      trailing 0 bytes.

      In text mode, the size returned from ftell need not be meaningful as
      a number. For example, it might be a bitfield of a number of values
      like sector, head, cylinder, track, train, etc. so that subtracting
      two of them does not give anything meaningful.

      (Try, for example, subtracting 09302008 from 10022008, treating
      them as decimal integers rather than dates, and try to make sense
      out of the result that would indicate that they are 2 days apart.
      The same kind of encoding can be done on text file offsets.)

      Byte offsets into a text file are likely to be misleading because
      of the \r\n -> \n translation done by some systems (e.g. Windows).

      >Apparently fseek is not guaranteed to work because of 0xFF EOF or other
      >characters,

      There is no "EOF character". Even on one of those systems which use
      an end marker for text files (Windows), that marker isn't 0xFF.
      Many systems (UNIX & variants) just store a file length (yet another
      definition of "file size") and don't use an end marker.

      EOF is a value that won't *fit* in a char (unless sizeof int ==
      sizeof char) which is why getchar() returns int, not char.
      >is this true only in text mode or also in binary mode?
      You are screwed in both text mode and in binary mode for different
      reasons.
      >Is there
      >anyway to get a filesize
      Do you want *A* filesize (in which case, I pick 0, it's easy, and
      you didn't say it had to be correct, and some files actually do
      have size 0) or do you want a *correct* filesize, in which case you
      have to pick a definition of filesize?
      >without having to read bytes on at a time. Is it
      >best to just fread until it fails?
      If you want to read the file into memory, two definitions
      of file size come to mind:

      1. The number of bytes read from the file in binary mode.
      2. The number of bytes read from the file in text mode.

      Chances are high that these two definitions will give different
      answers for the file size for any given file. Neither of these
      necessarily says anything about how much space the file takes on
      disk. But if you want to read the file into memory, these are
      the right definitions to use (pick the one that uses the same
      file mode as the file mode you're going to use).



      • Barry Schwarz

        #4
        Re: Size of file

        On Fri, 3 Oct 2008 09:24:53 +1000, "MisterE" <MisterE@nimga.com>
        wrote:
        >I hear that this isn't always valid:
        You heard right.
        >
        >FILE *in;
        >long size;
        >in = fopen("foo.bar", "rb");
        You open the file in binary.
        >fseek(in,0,SEEK_END);
        The standard specifically states "A binary stream need not
        meaningfully support fseek calls with a whence value of SEEK_END."
        >size = ftell(in);
        >fseek(in,0,SEEK_SET);
        >
        >then fread size many bytes into memory.
        >
        >Apparently fseek is not guaranteed to work because of 0xFF EOF or other
        I don't know where you came up with this. 0xFF is not a special character
        in a binary file. It could even be a normal printable character since
        the standard does not mandate ASCII or EBCDIC. EOF is not a
        character. It is a macro. It is entirely possible that the value
        used in that macro is not representable as a char.
        >characters, is this true only in text mode or also in binary mode? Is there
        >anyway to get a filesize without having to read bytes on at a time. Is it
        >best to just fread until it fails?
        Depends on how important portability is to you.

        --
        Remove del for email


        • arnuld

          #5
          Re: Size of file

          On Fri, 03 Oct 2008 09:24:53 +1000, MisterE wrote:
          >I hear that this isn't always valid:
          >
          >FILE *in;
          >long size;
          >in = fopen("foo.bar", "rb");
          >fseek(in,0,SEEK_END);
          >size = ftell(in);
          >fseek(in,0,SEEK_SET);
          >
          >then fread size many bytes into memory.

          Most people use fopen and fseek. In my programs I used stat. One thing
          that always made me wonder is that stat reports filesize == 0 if the file
          is opened. Only on a closed file does it report the correct size.




          --

          my email is @ the above blog.
          Gooogle Groups is Blocked. Reason: Excessive Spamming


          • Nate Eldredge

            #6
            Re: Size of file

            arnuld <sunrise@invalid.address> writes:
            >On Fri, 03 Oct 2008 09:24:53 +1000, MisterE wrote:
            >
            >>I hear that this isn't always valid:
            >>
            >>FILE *in;
            >>long size;
            >>in = fopen("foo.bar", "rb");
            >>fseek(in,0,SEEK_END);
            >>size = ftell(in);
            >>fseek(in,0,SEEK_SET);
            >>
            >>then fread size many bytes into memory.
            >
            >Most people use fopen and fseek. In my programs I used stat. One thing
            >that always made me wonder is that stat reports filesize == 0 if the file
            >is opened. Only on closed file it reports the correct size.

            Sorry this is becoming off-topic, but where did you find this
            behavior? Under Unix this would be very strange.

            The only thing I can think of is that you opened the file for writing,
            which ordinarily would truncate it, so that its size would indeed be
            0. But opening for reading should not do this.


            • arnuld

              #7
              Re: Size of file

              On Thu, 02 Oct 2008 23:04:30 -0700, Nate Eldredge wrote:

              >Sorry this is becoming off-topic, but where did you find this
              >behavior? Under Unix this would be very strange.

              well, it happens on my machine all the time.

              >The only thing I can think of is that you opened the file for writing,
              >which ordinarily would truncate it, so that its size would indeed be
              >0. But opening for reading should not do this.

              fopen(file, "a")



              --

              my email is @ the above blog.
              Gooogle Groups is Blocked. Reason: Excessive Spamming


              • Nate Eldredge

                #8
                Re: Size of file

                arnuld <sunrise@invalid.address> writes:
                >On Thu, 02 Oct 2008 23:04:30 -0700, Nate Eldredge wrote:
                >
                >>Sorry this is becoming off-topic, but where did you find this
                >>behavior? Under Unix this would be very strange.
                >
                >well, it happens on my machine all the time.

                What operating system / compiler / standard library?

                >>The only thing I can think of is that you opened the file for writing,
                >>which ordinarily would truncate it, so that its size would indeed be
                >>0. But opening for reading should not do this.
                >
                >fopen(file, "a")

                Peculiar. Can you post a complete example of a program that shows
                this behavior?


                • CBFalconer

                  #9
                  Re: Size of file

                  arnuld wrote:
                  >MisterE wrote:
                  >
                  >>I hear that this isn't always valid:
                  >>
                  >>FILE *in;
                  >>long size;
                  >>in = fopen("foo.bar", "rb");
                  >>fseek(in,0,SEEK_END);
                  >>size = ftell(in);
                  >>fseek(in,0,SEEK_SET);
                  >>
                  >>then fread size many bytes into memory.
                  >
                  >Most people use fopen and fseek. In my programs I used stat. One
                  >thing that always made me wonder is that stat reports filesize
                  >== 0 if the file is opened. Only on closed file it reports the
                  >correct size.

                  stat is not present in standard C. Thus it can do anything, and is
                  off topic here unless you present its actual coding (in standard
                  C).

                  --
                  [mail]: Chuck F (cbfalconer at maineline dot net)
                  [page]: <http://cbfalconer.home.att.net>
                  Try the download section.


                  • Antoninus Twink

                    #10
                    Re: Size of file

                    On 3 Oct 2008 at 7:35, CBFalconer wrote:
                    >arnuld wrote:
                    >>Most people use fopen and fseek. In my programs I used stat. One
                    >>thing that always made me wonder is that stat reports filesize
                    >>== 0 if the file is opened. Only on closed file it reports the
                    >>correct size.

                    I don't know exactly what you mean. Perhaps you're talking about writes
                    that might have been buffered and not yet actually made, which stat()
                    won't detect? For example:


                    #include <stdio.h>

                    #include <sys/types.h>
                    #include <sys/stat.h>
                    #include <unistd.h>

                    int main(void)
                    {
                        FILE *out;
                        struct stat buf;

                        out = fopen("foo", "w");
                        if (out) {
                            fputs("12345", out);
                            /* The data is typically still in the stdio buffer here. */
                            if (stat("foo", &buf) == 0)
                                printf("size: %lu\n", (unsigned long) buf.st_size);
                            fflush(out);
                            /* After fflush() the write has been handed to the system. */
                            if (stat("foo", &buf) == 0)
                                printf("flushed size: %lu\n", (unsigned long) buf.st_size);
                            fclose(out);
                        }
                        return 0;
                    }

                    $ ./a
                    size: 0
                    flushed size: 5

                    >stat is not present in standard C. Thus it can do anything, and is
                    >off topic here unless you present its actual coding (in standard
                    >C).

                    Why don't you just crawl back in your hole and die if you don't have
                    anything useful to contribute?


                    • Chris Ahlstrom

                      #11
                      Re: Size of file

                      After takin' a swig o' grog, CBFalconer belched out
                      this bit o' wisdom:
                      >arnuld wrote:
                      >>
                      >>Most people use fopen and fseek. In my programs I used stat. One
                      >>thing that always made me wonder is that stat reports filesize
                      >>== 0 if the file is opened. Only on closed file it reports the
                      >>correct size.
                      >
                      >stat is not present in standard C. Thus it can do anything, and is
                      >off topic here unless you present its actual coding (in standard
                      >C).

                      It's called _stat() by Microsoft.

                      --
                      If builders built buildings the way programmers wrote programs,
                      then the first woodpecker to come along would destroy civilization.


                      • Keith Thompson

                        #12
                        Re: Size of file

                        CBFalconer <cbfalconer@yahoo.com> writes:
                        >arnuld wrote:
                        [...]
                        >>Most people use fopen and fseek. In my programs I used stat. One
                        >>thing that always made me wonder is that stat reports filesize
                        >>== 0 if the file is opened. Only on closed file it reports the
                        >>correct size.
                        >
                        >stat is not present in standard C. Thus it can do anything, and is
                        >off topic here unless you present its actual coding (in standard
                        >C).

                        Let's assume that arnuld is referring to the "stat" function specified
                        by POSIX; it's theoretically possible that he's talking about
                        something else, but common sense points to that one particular
                        function. Presenting an actual implementation of the POSIX stat() in
                        standard C is not possible; it depends on characteristics of the file
                        system that C does not define. Even if it were possible, posting a
                        complete implementation would be a waste of bandwidth; you don't need
                        to post a function's implementation to discuss what it does.

                        If you want to say it's off-topic, just say it's off-topic (and I
                        agree, it is off-topic, though I don't object to a brief mention).
                        Dragging in absurd, and presumably unserious, suggestions about how it
                        *could* be topical is not at all helpful.

                        --
                        Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                        Nokia
                        "We must do something. This is something. Therefore, we must do this."
                        -- Antony Jay and Jonathan Lynn, "Yes Minister"


                        • CBFalconer

                          #13
                          Re: Size of file

                          Chris Ahlstrom wrote:
                          >CBFalconer belched out this bit o' wisdom:
                          >>arnuld wrote:
                          >>>
                          >>>Most people use fopen and fseek. In my programs I used stat. One
                          >>>thing that always made me wonder is that stat reports filesize
                          >>>== 0 if the file is opened. Only on closed file it reports the
                          >>>correct size.
                          >>
                          >>stat is not present in standard C. Thus it can do anything, and is
                          >>off topic here unless you present its actual coding (in standard
                          >>C).
                          >
                          >It's called _stat() by Microsoft.

                          So what? It is not present in standard C, the subject of this
                          newsgroup. If you want to bring it up on a newsgroup that deals
                          with Microsoft or Posix, that is another matter. Then there is a
                          definition available for it. It still isn't portable.

                          --
                          [mail]: Chuck F (cbfalconer at maineline dot net)
                          [page]: <http://cbfalconer.home.att.net>
                          Try the download section.


                          • CBFalconer

                            #14
                            Re: Size of file

                            Keith Thompson wrote:
                            >CBFalconer <cbfalconer@yahoo.com> writes:
                            >>
                            .... snip ...
                            >>
                            >>stat is not present in standard C. Thus it can do anything,
                            >>and is off topic here unless you present its actual coding (in
                            >>standard C).
                            >
                            .... snip ...
                            >
                            >If you want to say it's off-topic, just say it's off-topic (and
                            >I agree, it is off-topic, though I don't object to a brief
                            >mention). Dragging in absurd, and presumably unserious,
                            >suggestions about how it *could* be topical is not at all helpful.

                            I disagree. There is no reason a user can't write his own stat()
                            function, say as:

                            int stat(char *s) {
                                return !!*s;
                            }

                            I think my response (above) covered the possibilities.

                            --
                            [mail]: Chuck F (cbfalconer at maineline dot net)
                            [page]: <http://cbfalconer.home.att.net>
                            Try the download section.


                            • Keith Thompson

                              #15
                              Re: Size of file

                              CBFalconer <cbfalconer@yahoo.com> writes:
                              >Keith Thompson wrote:
                              >>CBFalconer <cbfalconer@yahoo.com> writes:
                              >>>
                              >... snip ...
                              >>>
                              >>>stat is not present in standard C. Thus it can do anything,
                              >>>and is off topic here unless you present its actual coding (in
                              >>>standard C).
                              >>
                              >... snip ...
                              >>
                              >>If you want to say it's off-topic, just say it's off-topic (and
                              >>I agree, it is off-topic, though I don't object to a brief
                              >>mention). Dragging in absurd, and presumably unserious,
                              >>suggestions about how it *could* be topical is not at all helpful.
                              >
                              >I disagree. There is no reason a user can't write his own stat()
                              >function, say as:
                              >
                              >int stat(char *s) {
                              >    return !!*s;
                              >}
                              >
                              >I think my response (above) covered the possibilities.

                              The previous poster talked about using stat to determine the size of a
                              file.

                              Topicality doesn't preclude using a little common sense. When you
                              talk about Pascal, I generally assume you mean the language, not the
                              philosopher.

                              --
                              Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                              Nokia
                              "We must do something. This is something. Therefore, we must do this."
                              -- Antony Jay and Jonathan Lynn, "Yes Minister"
