storing 1,000,000 records

  • lector

    storing 1,000,000 records

    Do you think it would be better to use a file here instead of a
    linked list or an array? Can the file be organized as an array so that
    one can index into it?
  • jacob navia

    #2
    Re: storing 1,000,000 records

    lector wrote:
    > do you think it will be better to use a file over here instead of a
    > link list or array ?

    If you have enough RAM, use it, since it is thousands of times faster
    than a disk.

    > Can the file be organized as an array so that one
    > can index into it ?

    Yes.

    If the record size (on disk) is X, the 5876th record is at offset
    X*5875 bytes.
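
The offset rule above can be sketched in C roughly as follows. This is only an illustration: `struct record`, `read_record` and `write_record` are hypothetical names that do not appear in the thread, and the record layout is an assumption. The index is zero-based, so the 5876th record is index 5875.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size record; the layout is an assumption for illustration. */
struct record {
    char name[40];
    int  id;
};

/* Store one record at zero-based position `index`; returns 1 on success. */
int write_record(FILE *fp, long index, const struct record *in)
{
    assert(fp != NULL && in != NULL && index >= 0);
    /* Record size X times the index gives the byte offset. */
    if (fseek(fp, index * (long)sizeof *in, SEEK_SET) != 0)
        return 0;
    return fwrite(in, sizeof *in, 1, fp) == 1;
}

/* Load the record at zero-based position `index`; returns 1 on success,
   0 if the seek fails or the file holds no complete record there. */
int read_record(FILE *fp, long index, struct record *out)
{
    assert(fp != NULL && out != NULL && index >= 0);
    if (fseek(fp, index * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof *out, 1, fp) == 1;
}
```

With a file opened in a binary update mode such as "rb+" or "wb+", `read_record(fp, 5875, &r)` would fetch the 5876th record without touching the ones before it.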


    --
    jacob navia
    jacob at jacob point remcomp point fr
    logiciels/informatique


    • lector

      #3
      Re: storing 1,000,000 records

      On Apr 7, 12:32 am, jacob navia <ja...@nospam.com> wrote:
      > Yes.
      >
      > If record size (on disk) is X, the 5876th record is at offset
      > X*5875 bytes.

      Can you please tell me which file function specifically does this?


      • lector

        #4
        Re: storing 1,000,000 records

        Is it this function?

        int fseek ( FILE * stream, long int offset, int origin );

        I don't know if it is sufficient to have offset as a long int.


        • jacob navia

          #5
          Re: storing 1,000,000 records

          lector wrote:
          > On Apr 7, 12:32 am, jacob navia <ja...@nospam.com> wrote:
          >> Yes.
          >>
          >> If record size (on disk) is X, the 5876th record is at offset
          >> X*5875 bytes.
          >
          > Can you please tell me what file function specifically does this ?

          fseek

          --
          jacob navia
          jacob at jacob point remcomp point fr
          logiciels/informatique


          • osmium

            #6
            Re: storing 1,000,000 records

            "lector" wrote:

            > do you think it will be better to use a file over here instead of a
            > link list or array ?

            No way to tell with the little you have said. It is about the same as
            asking: "Is a car better than a truck?"

            > Can the file be organized as an array so that one
            > can index into it ?

            Yes. Here is a link that might be helpful.





            • jacob navia

              #7
              Re: storing 1,000,000 records

              lector wrote:
              > Is it this function
              >
              > int fseek ( FILE * stream, long int offset, int origin );

              Yes.

              > I don't know if it sufficient to have offset as a long int.

              If your file is smaller than 2GB, yes. I am assuming a 32-bit
              system.

              Since 2GB is 2147483648 bytes, you can store 1 million records
              of up to 2147 bytes each, which is not that bad actually.

              If not, just use a compiler/OS that provides 64 bit access.
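
The arithmetic above can be checked with a small sketch. It assumes the 32-bit case discussed in the post, where the largest `long` offset is 2147483647; the helper names `max_record_size` and `offset_fits_long` are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Largest record size in bytes such that `record_count` records
   still fit within a file of at most `max_offset` bytes. */
long long max_record_size(long long max_offset, long long record_count)
{
    assert(max_offset >= 0 && record_count > 0);
    return max_offset / record_count;
}

/* Returns 1 if index * record_size can be passed to fseek as a long
   on the current platform without overflowing, 0 otherwise. */
int offset_fits_long(long long index, long long record_size)
{
    if (index < 0 || record_size <= 0)
        return 0;
    /* Compare by division so the product itself is never computed. */
    return index <= (long long)LONG_MAX / record_size;
}
```

Here `max_record_size(INT32_MAX, 1000000)` evaluates to 2147, matching the figure in the post. On platforms where `long` is 64 bits the limit is far higher, and POSIX systems also offer `fseeko` with an `off_t` offset for large files.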



              --
              jacob navia
              jacob at jacob point remcomp point fr
              logiciels/informatique


              • CBFalconer

                #8
                Re: storing 1,000,000 records

                jacob navia wrote:
                > lector wrote:
                >> do you think it will be better to use a file over here instead
                >> of a link list or array ?
                >
                > If you have enough RAM use it since it is thousands of times
                > faster than a disk.

                If you have a suitable OS it will buffer the file, and the
                performance difference from an internal buffer will be negligible.
                It will also probably be smart enough not to buffer portions of the
                file that you do not use.

                --
                [mail]: Chuck F (cbfalconer at maineline dot net)
                [page]: <http://cbfalconer.home.att.net>
                Try the download section.




                • jacob navia

                  #9
                  Re: storing 1,000,000 records

                  CBFalconer wrote:
                  > jacob navia wrote:
                  >> lector wrote:
                  >>> do you think it will be better to use a file over here instead
                  >>> of a link list or array ?
                  >>
                  >> If you have enough RAM use it since it is thousands of times
                  >> faster than a disk.
                  >
                  > If you have a suitable OS it will buffer the file, and the
                  > performance difference from an internal buffer will be negligible.
                  > It will also probably be smart enough not to buffer portions of the
                  > file that you do not use.

                  Ahh. Yes of course.

                  So, that OS will not access the disk when you write
                  1 million records in a file?

                  Perfect.

                  Just tell me then one example of a suitable OS...


                  In any case you are just *confirming* what I said:

                  RAM is much faster than disk. The only difference is that you
                  rely on the OS to exploit that. I recommended not relying on the
                  OS and doing it yourself.
                  --
                  jacob navia
                  jacob at jacob point remcomp point fr
                  logiciels/informatique


                  • Nick Keighley

                    #10
                    Re: storing 1,000,000 records

                    On 6 Apr, 20:27, lector <hannibal.lecto...@gmail.com> wrote:
                    > do you think it will be better to use a file over here instead of a
                    > link list or array ? Can the file be organized as an array so that one
                    > can index into it ?

                    Consider using a database.

                    --
                    Nick Keighley


                    • user923005

                      #11
                      Re: storing 1,000,000 records

                      On Apr 6, 12:27 pm, lector <hannibal.lecto...@gmail.com> wrote:
                      > do you think it will be better to use a file over here instead of a
                      > link list or array ? Can the file be organized as an array so that one
                      > can index into it ?

                      I guess that you will find the easiest success if you use a database.


                      • lector

                        #12
                        Re: storing 1,000,000 records

                        Do you think it will make things even more efficient if I read and
                        write data in binary and in chunks of bytes? I'm doing this using
                        the fread and fwrite functions, e.g. something like the code below:

                        /*---------- WRITES EMPLOYEE RECORDS TO A BINARY FILE ----------*/
                        #include <stdio.h>
                        #include <stdlib.h>

                        int main(void)
                        {
                            FILE *fp;
                            char another = 'Y';
                            typedef struct emp_struct
                            {
                                char name[40];
                                int age;
                                float bs;
                            } emp;

                            emp e;

                            fp = fopen("EMP.DAT", "wb");

                            if (fp == NULL)
                            {
                                puts("Cannot open file");
                                exit(EXIT_FAILURE);
                            }
                            while (another == 'Y')
                            {
                                printf("\nEnter name, age and basic salary\n");
                                scanf("%s %d %f", e.name, &e.age, &e.bs);
                                fwrite(&e, sizeof(e), 1, fp);

                                printf("Add another record (Y/N)");
                                fflush(stdin);
                                another = getchar();
                            }

                            fclose(fp);
                            return 0;
                        }


                        • user923005

                          #13
                          Re: storing 1,000,000 records

                          On Apr 7, 10:25 pm, lector <hannibal.lecto...@gmail.com> wrote:
                          Do you think it will make things even more efficient if I read and
                          write data in binary and in chunks of bytes ? I'm doing this using
                          fread and fwrite functions. eg. something like below
                          >
                          /*-------------------- WRITES EMPLOYEE RECORDS TO A BINARY
                          FILE-----------------------*/
                          #include <stdio.h>
                          #include <stdlib.h>
                          >
                          int main(void)
                          {
                                  FILE *fp;
                                  char another = 'Y';
                                  typedef struct emp_struct
                                  {
                                           char name[40];
                                           int age;
                                           float bs;
                                   } emp;
                          >
                                  emp e;
                          >
                                  fp = fopen("EMP.DAT" , "wb");
                          >
                                  if(fp == NULL)
                              {
                                          puts("Cannot open file");
                                          exit(EXIT_FAILURE);
                                }
                                  while(another == 'Y')
                              {
                                          printf("\nEnter name, age and basic salary\n");
                                          scanf("%s %d %f", e.name, &e.age, &e.bs);
                                          fwrite(&e, sizeof(e), 1, fp);
                          >
                                          printf("Add another record (Y/N)");
                                          fflush(stdin);
                                          another = getchar();
                                }
                          >
                                  fclose(fp);
                                 return 0;
                          >
                          >
                          >
                          }
                          Yes, binary is faster though not portable.

                          Even better would be a system that allows you to create hashed or
                          B-tree indexes on your table.

                          And can you imagine how nice it would be to have arbitrary search
                          features that can find things like "name BETWEEN 'Johnson' AND
                          'Johnson'" or "bs 29575.00".

                          Let's let our imagination run wild and suppose that we could even do
                          things like collecting average age or sum of basic salary.

                          I guess we're just dreaming now. Too bad there is nothing like that
                          on the planet.
                          ;-)


                          • lector

                            #14
                            Re: storing 1,000,000 records

                            On Apr 8, 10:41 am, user923005 <dcor...@connx.com> wrote:
                            > On Apr 7, 10:25 pm, lector <hannibal.lecto...@gmail.com> wrote:
                            > Yes, binary is faster though not portable.
                            >
                            > Even better would be a system that allows you to create hashed or
                            > B-tree indexes on your table.
                            >
                            > And can you imagine how nice it would be to have arbitrary search
                            > features that can find things like "name BETWEEN 'Johnson' AND
                            > 'Johnson'" or "bs 29575.00".
                            >
                            > Let's let our imagination run wild and suppose that we could even do
                            > things like collecting average age or sum of basic salary.
                            >
                            > I guess we're just dreaming now. Too bad there is nothing like that
                            > on the planet.
                            > ;-)

                            Yes, but then there might be an issue with choosing a hash function.


                            • Barry Schwarz

                              #15
                              Re: storing 1,000,000 records

                              On Mon, 7 Apr 2008 22:25:18 -0700 (PDT), lector
                              <hannibal.lector99@gmail.com> wrote:

                              >Do you think it will make things even more efficient if I read and
                              >write data in binary and in chunks of bytes ? I'm doing this using
                              >fread and fwrite functions. eg. something like below

                              Writing one large chunk as opposed to several small chunks usually
                              means fewer calls to the I/O functions, which usually means less
                              overhead for those calls. This has nothing to do with binary vs. text.
                              If you built a large string containing the text equivalent of your
                              structure members, you would achieve the same efficiency with regard
                              to calling I/O functions without the problems introduced by binary
                              noted below.
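
As a rough illustration of the point about call counts, the sketch below writes the same array of records either with one `fwrite` call or with one call per record. The function names `write_batch` and `write_one_by_one` are hypothetical, and `struct emp` simply mirrors the structure from the quoted program. Because stdio usually buffers output itself, the measurable difference may be small; treat this as a sketch, not a benchmark.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the record from the quoted program. */
struct emp {
    char  name[40];
    int   age;
    float bs;
};

/* One call to fwrite for the whole array; returns the number of records written. */
size_t write_batch(FILE *fp, const struct emp *records, size_t count)
{
    assert(fp != NULL && records != NULL);
    return fwrite(records, sizeof records[0], count, fp);
}

/* One call to fwrite per record, shown for comparison; same result, more calls. */
size_t write_one_by_one(FILE *fp, const struct emp *records, size_t count)
{
    size_t written = 0;
    assert(fp != NULL && records != NULL);
    for (size_t i = 0; i < count; i++) {
        if (fwrite(&records[i], sizeof records[i], 1, fp) != 1)
            break;              /* stop at the first failed write */
        written++;
    }
    return written;
}
```

Both functions produce identical bytes in the file; only the number of library calls differs.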
                              >
                              >/*-------------------- WRITES EMPLOYEE RECORDS TO A BINARY
                              >FILE-----------------------*/
                              >#include <stdio.h>
                              >#include <stdlib.h>
                              >
                              >int main(void)
                              >{
                              > FILE *fp;
                              > char another = 'Y';
                              > typedef struct emp_struct
                              > {
                              > char name[40];
                              > int age;
                              > float bs;
                              > } emp;
                              >
                              > emp e;
                              >
                              > fp = fopen("EMP.DAT" , "wb");
                              >
                              > if(fp == NULL)
                              >{
                              > puts("Cannot open file");
                              > exit(EXIT_FAILURE);
                              >}
                              > while(another == 'Y')
                              >{
                              > printf("\nEnter name, age and basic salary\n");
                              > scanf("%s %d %f", e.name, &e.age, &e.bs);
                              This opens up the possibility of the user entering more than 39
                              characters into name. This will not support a name which contains an
                              embedded blank. You really should check that scanf returns 3.
                              > fwrite(&e, sizeof(e), 1, fp);
                              If you change compilers, or possibly even compiler options, the file
                              may be difficult to process because of different padding in the
                              structure. If you transport the file to a different system, the int
                              and the float may have problems due to endian-ness or representation.
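
One common way around the padding and byte-order problems is to encode each field explicitly into a fixed layout rather than writing the struct as-is. The following is a minimal sketch under stated assumptions: `float` is a 32-bit IEEE 754 value whose byte order matches that of 32-bit integers, the age is non-negative, and the names `emp_encode`, `emp_decode`, `NAME_LEN` and `RECORD_BYTES` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NAME_LEN     40
#define RECORD_BYTES (NAME_LEN + 4 + 4)   /* fixed on-disk size, no padding */

struct emp {
    char  name[NAME_LEN];
    int   age;
    float bs;
};

/* Store a 32-bit value least significant byte first. */
static void put_u32_le(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Load a 32-bit value stored least significant byte first. */
static uint32_t get_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Encode one record into a buffer of exactly RECORD_BYTES bytes. */
void emp_encode(const struct emp *e, unsigned char *out)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));   /* assumption: 32-bit float */
    memcpy(out, e->name, NAME_LEN);
    put_u32_le(out + NAME_LEN, (uint32_t)e->age);
    memcpy(&bits, &e->bs, sizeof bits);          /* reinterpret float as raw bits */
    put_u32_le(out + NAME_LEN + 4, bits);
}

/* Decode one record from a buffer of exactly RECORD_BYTES bytes. */
void emp_decode(const unsigned char *in, struct emp *e)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(e->name, in, NAME_LEN);
    e->age = (int)(int32_t)get_u32_le(in + NAME_LEN);
    bits = get_u32_le(in + NAME_LEN + 4);
    memcpy(&e->bs, &bits, sizeof bits);
}
```

The encoded buffer can then be passed to `fwrite` with a size of `RECORD_BYTES`, so the record size on disk no longer depends on the compiler's padding.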
                              >
                              > printf("Add another record (Y/N)");
                              > fflush(stdin);
                              fflush is not defined for input streams.
                              > another = getchar();
                              What will you do if the user enters 'y'?

                              On most interactive systems, the user will need to press Enter after
                              typing the 'Y'. This will leave a '\n' in the buffer. When you go
                              back to the scanf, this character will be processed immediately and
                              the user will never be able to enter the three values.
                              >}
                              >
                              > fclose(fp);
                              > return 0;
                              >}

                              Remove del for email
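
The input problems listed in this post (unbounded `%s`, the unchecked `scanf` result, `fflush(stdin)`, the lowercase 'y', and the leftover '\n') could be addressed along the following lines. This is a sketch, not the poster's code: `parse_emp` and `read_yes` are hypothetical helpers, and the approach of reading whole lines with `fgets` and parsing them with `sscanf` is one option among several. Names with embedded blanks are still not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

struct emp {
    char  name[40];
    int   age;
    float bs;
};

/* Parse "name age salary" from one input line.
   Returns 1 only if all three fields were converted. */
int parse_emp(const char *line, struct emp *e)
{
    assert(line != NULL && e != NULL);
    /* %39s leaves room for the terminating '\0' in name[40]. */
    return sscanf(line, "%39s %d %f", e->name, &e->age, &e->bs) == 3;
}

/* Read one whole line (including its '\n') from `in` and report
   whether the answer starts with 'y' or 'Y'. */
int read_yes(FILE *in)
{
    char line[16];
    assert(in != NULL);
    if (fgets(line, sizeof line, in) == NULL)
        return 0;               /* end of input or read error counts as "no" */
    return toupper((unsigned char)line[0]) == 'Y';
}
```

In the loop, each record would be read with `fgets` into a line buffer, passed to `parse_emp`, and written only if that returns 1; `read_yes(stdin)` then replaces the `fflush(stdin)` and `getchar()` pair. Since every read consumes the line terminator, no stray '\n' is left for the next prompt.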
