dealing with huge data

  • pereges

    dealing with huge data

    ok so i have written a program in C where I am dealing with huge
    data(millions and lots of iterations involved) and for some reason the
    screen tends to freeze and I get no output every time I execute it.
    However, I have tried to reduce the amount of data and the program
    runs fine.
    What could possibly be done to resolve this ?
  • pereges

    #2
    Re: dealing with huge data

    I forgot to mention this happened while I was trying to print data.

    I have seen it can't work for extremely huge data.


    • Kenneth Brody

      #3
      Re: dealing with huge data

      pereges wrote:
      > ok so i have written a program in C where I am dealing with huge
      > data(millions and lots of iterations involved) and for some reason the
      > screen tends to freeze and I get no output every time I execute it.
      > However, I have tried to reduce the amount of data and the program
      > runs fine.
      > What could possibly be done to resolve this ?

      There's a bug on line 42.

      --
      +-------------------------+--------------------+-----------------------+
      | Kenneth J. Brody        | www.hvcomputer.com | #include              |
      | kenbrody/at\spamcop.net | www.fptech.com     |    <std_disclaimer.h> |
      +-------------------------+--------------------+-----------------------+
      Don't e-mail me at: <mailto:ThisIsASpamTrap@gmail.com>


      • santosh

        #4
        Re: dealing with huge data

        pereges wrote:

        <program "freezing" on "huge data" and millions of iterations>
        > I forgot to mention this happened while I was trying to print data.

        Print where? To a disk file? To a flash drive? To a screen? Some other
        device? To memory? What's the code for the print function[s]? What are
        the data structures involved? Did you try compiler optimisations? Did
        you try implementation specific I/O routines (which are sometimes
        faster than standard C ones)? Did you profile the program?

        > I have seen it can't work for extremely huge data.

        Can't work or works too slowly for your taste?

        Unless you show us your current code and where exactly its performance
        is not meeting your expectations, there's absolutely nothing that can
        be said other than the generic advice to buy faster storage devices and
        faster, more powerful hardware.


        • CBFalconer

          #5
          Re: dealing with huge data

          pereges wrote:
          >
          > ok so i have written a program in C where I am dealing with huge
          > data (millions and lots of iterations involved) and for some
          > reason the screen tends to freeze and I get no output every time
          > I execute it. However, I have tried to reduce the amount of data
          > and the program runs fine.
          >
          > What could possibly be done to resolve this ?

          On the information supplied, I suspect that simply reducing the
          amount of data will fix the problem. I am unable to estimate how
          much it should be reduced.

          --
          [mail]: Chuck F (cbfalconer at maineline dot net)
          [page]: <http://cbfalconer.home.att.net>
          Try the download section.

          ** Posted from http://www.teranews.com **


          • Richard Heathfield

            #6
            Re: dealing with huge data

            CBFalconer said:
            > pereges wrote:
            >>
            >> ok so i have written a program in C where I am dealing with huge
            >> data (millions and lots of iterations involved) and for some
            >> reason the screen tends to freeze and I get no output every time
            >> I execute it. However, I have tried to reduce the amount of data
            >> and the program runs fine.
            >>
            >> What could possibly be done to resolve this ?
            >
            > On the information supplied, I suspect that simply reducing the
            > amount of data will fix the problem. I am unable to estimate how
            > much it should be reduced.

            In a similar vein, it was reported a few years ago that a computer program,
            on being told that 90% of accidents in the home involved either the top
            stair or the bottom stair and being asked what to do to reduce accidents,
            suggested removing the top and bottom stairs.

            C programs regularly have to deal with very large amounts of data, and many
            of them do so with admirable efficiency. The large amount of data, then,
            is *not* the cause of the problem. Rather, it is when large amounts of
            data are being processed that the problem manifests itself. Therefore,
            reducing the amount of data will not only *not* fix the problem, but will
            actually hide it, making it *harder* to fix.

            The proper solution is to find and fix the bug that is causing the problem.
            The way to do /that/ is to reduce, not the amount of *data*, but the
            amount of *code* - until the OP has the smallest compilable program that
            reproduces the problem. It is often the case that, in preparing such a
            program, the author of the code will find the problem. But if not, at
            least he or she now has a minimal program that can be presented for
            analysis by C experts, such as those who regularly haunt the corridors of
            comp.lang.c. I commend this strategy to the OP.

            --
            Richard Heathfield <http://www.cpax.org.uk>
            Email: -http://www. +rjh@
            Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
            "Usenet is a strange place" - dmr 29 July 1999


            • user923005

              #7
              Re: dealing with huge data

              On Apr 23, 5:00 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
              > CBFalconer said:
              >
              >> pereges wrote:
              >>
              >>> ok so i have written a program in C where I am dealing with huge
              >>> data (millions and lots of iterations involved) and for some
              >>> reason the screen tends to freeze and I get no output every time
              >>> I execute it. However, I have tried to reduce the amount of data
              >>> and the program runs fine.
              >>>
              >>> What could possibly be done to resolve this ?
              >>
              >> On the information supplied, I suspect that simply reducing the
              >> amount of data will fix the problem. I am unable to estimate how
              >> much it should be reduced.
              >
              > In a similar vein, it was reported a few years ago that a computer program,
              > on being told that 90% of accidents in the home involved either the top
              > stair or the bottom stair and being asked what to do to reduce accidents,
              > suggested removing the top and bottom stairs.
              >
              > C programs regularly have to deal with very large amounts of data, and many
              > of them do so with admirable efficiency. The large amount of data, then,
              > is *not* the cause of the problem. Rather, it is when large amounts of
              > data are being processed that the problem manifests itself. Therefore,
              > reducing the amount of data will not only *not* fix the problem, but will
              > actually hide it, making it *harder* to fix.
              >
              > The proper solution is to find and fix the bug that is causing the problem.
              > The way to do /that/ is to reduce, not the amount of *data*, but the
              > amount of *code* - until the OP has the smallest compilable program that
              > reproduces the problem. It is often the case that, in preparing such a
              > program, the author of the code will find the problem. But if not, at
              > least he or she now has a minimal program that can be presented for
              > analysis by C experts, such as those who regularly haunt the corridors of
              > comp.lang.c. I commend this strategy to the OP.

              I don't think we can give good advice until the OP actually states
              what his exact problem is.
              This:
              > ok so i have written a program in C where I am dealing with huge
              > data (millions and lots of iterations involved) and for some
              > reason the screen tends to freeze and I get no output every time
              > I execute it. However, I have tried to reduce the amount of data
              > and the program runs fine.
              Does not really tell us anything.

              Millions of records? In what format? What operations are performed
              against the data? What is the actual underlying problem that is being
              solved?

              Probably, there is a good, inexpensive and compact solution and likely
              there are prebuilt tools that will already accomplish the job (or get
              most of the way there).

              "Big data" that "seems to freeze" doesn't mean anything.


              • pereges

                #8
                Re: dealing with huge data

                On Apr 23, 10:25 pm, santosh <santosh....@gmail.com> wrote:
                > pereges wrote:
                >
                > <program "freezing" on "huge data" and millions of iterations>
                >
                >> I forgot to mention this happened while I was trying to print data.
                >
                > Print where? To a disk file? To a flash drive? To a screen? Some other
                > device? To memory? What's the code for the print function[s]? What are
                > the data structures involved? Did you try compiler optimisations? Did
                > you try implementation specific I/O routines (which are sometimes
                > faster than standard C ones)? Did you profile the program?
                >
                >> I have seen it can't work for extremely huge data.
                >
                > Can't work or works too slowly for your taste?
                >
                > Unless you show us your current code and where exactly its performance
                > is not meeting your expectations, there's absolutely nothing that can
                > be said other than the generic advice to buy faster storage devices and
                > faster, more powerful hardware.

                There are ~ 500 lines in the code. If you don't mind reading it I will
                definitely post it.
                I didn't post it for a reason.


                • Gordon Burditt

                  #9
                  Re: dealing with huge data

                  >ok so i have written a program in C where I am dealing with huge
                  >data(millions and lots of iterations involved) and for some reason the
                  >screen tends to freeze and I get no output every time I execute it.
                  >However, I have tried to reduce the amount of data and the program
                  >runs fine.
                  >What could possibly be done to resolve this ?
                  Are you SURE that the screen freezes, and it's not just taking
                  a long time? (When in doubt, let it run over a weekend.)

                  You don't give a very good idea of what your program is doing, but
                  some hints that might apply:

                  Your program almost certainly has at least one bug.

                  Make sure that every call to malloc() is checked, and that you
                  report any calls that run out of memory. Also check if the behavior
                  changes if you change limits on the amount of memory the process
                  can allocate (e.g. 'ulimit').
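
A minimal sketch of the malloc() check described above. The helper name `checked_malloc` and its `what` label are assumptions made for illustration; nothing here comes from the OP's program, which was not posted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate count * size bytes, or report the
   failure on stderr with a label that says which allocation failed. */
static void *checked_malloc(size_t count, size_t size, const char *what)
{
    assert(what != NULL);

    /* Refuse requests where count * size would overflow size_t. */
    if (size != 0 && count > (size_t)-1 / size) {
        fprintf(stderr, "allocation for %s: size overflow\n", what);
        return NULL;
    }

    void *p = malloc(count * size);
    if (p == NULL && count * size != 0) {
        fprintf(stderr, "out of memory: %zu bytes for %s\n",
                count * size, what);
    }
    return p;
}
```

A call such as `double *v = checked_malloc(n, sizeof *v, "vertex array");` still needs its own `if (v == NULL)` test; the helper only guarantees that a failed allocation is reported instead of passing silently.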

                  Use any tools (like 'ps') you might have to see how large the program
                  is and whether it's swapping so heavily that little CPU gets used
                  while most of the time goes to swapping.

                  If it's a multi-process program, you might be deadlocking on
                  allocation of swap/page space.

                  Make sure that you do not use more memory than you allocated (often
                  called "buffer overflow", although this problem is a bit more general
                  than a buffer overflow). This can be difficult to find. If you
                  corrupt the data malloc() uses to keep track of free memory,
                  subsequent calls to malloc() or free() might infinite loop.

                  Add some output statements to the program so you can see how far
                  it gets. Include something at the start of the program, and, say,
                  after you have read all the input but before you begin processing it.
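
The output statements suggested above could look like the sketch below. The helper name `progress` and the stage labels are assumptions for illustration. It writes to stderr and flushes, since buffered stdout output may never reach the screen if the program hangs or crashes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print a progress marker and flush it at once.
   Returns the number of characters written, or a negative value on error. */
static int progress(const char *stage, long done, long total)
{
    assert(stage != NULL);
    int n = fprintf(stderr, "[progress] %s: %ld/%ld\n", stage, done, total);
    fflush(stderr);
    return n;
}
```

Calling `progress("start", 0, 0)` first thing in main() and `progress("input read", count, count)` once reading is finished shows whether the program stalls while reading or while processing.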


                  • arnuld

                    #10
                    Re: dealing with huge data

                    On Thu, 24 Apr 2008 00:00:04 +0000, Richard Heathfield wrote:

                    > In a similar vein, it was reported a few years ago that a computer
                    > program, on being told that 90% of accidents in the home involved either
                    > the top stair or the bottom stair and being asked what to do to reduce
                    > accidents, suggested removing the top and bottom stairs.
                    >
                    > C programs regularly have to deal with very large amounts of data, and
                    > many of them do so with admirable efficiency. The large amount of data,
                    > then, is *not* the cause of the problem. Rather, it is when large
                    > amounts of data are being processed that the problem manifests itself.
                    > Therefore, reducing the amount of data will not only *not* fix the
                    > problem, but will actually hide it, making it *harder* to fix.
                    >
                    > The proper solution is to find and fix the bug that is causing the
                    > problem. The way to do /that/ is to reduce, not the amount of *data*,
                    > but the amount of *code* - until the OP has the smallest compilable
                    > program that reproduces the problem. It is often the case that, in
                    > preparing such a program, the author of the code will find the problem.
                    > But if not, at least he or she now has a minimal program that can be
                    > presented for analysis by C experts, such as those who regularly haunt
                    > the corridors of comp.lang.c. I commend this strategy to the OP.

                    OMG, I am sure this is one of the best pieces of advice on
                    software construction.


                    --

                    my email ID is at the above address


                    • arnuld

                      #11
                      Re: dealing with huge data

                      On Wed, 23 Apr 2008 20:16:25 -0700, pereges wrote:

                      > There are ~ 500 lines in the code. If you don't mind reading it I will
                      > definitely post it.
                      > I didn't post it for a reason.

                      I know that. As Richard Heathfield said, find and post the smallest
                      compilable unit.





                      --

                      my email ID is at the above address


                      • CBFalconer

                        #12
                        Re: dealing with huge data

                        pereges wrote:
                        > santosh <santosh....@gmail.com> wrote:
                        >
                        .... snip ...
                        >
                        >> Unless you show us your current code and where exactly its
                        >> performance is not meeting your expectations, there's absolutely
                        >> nothing that can be said other than the generic advice to buy
                        >> faster storage devices and faster, more powerful hardware.
                        >
                        > There are ~ 500 lines in the code. If you don't mind reading it I
                        > will definitely post it. I didn't post it for a reason.

                        Then you have some work to do. Cut it down to a compilable and
                        runnable program of 100 to 200 lines that has the same fault.
                        After that, if you haven't found the problem in the process,
                        publish the result together with the input data and fault.

                        --
                        [mail]: Chuck F (cbfalconer at maineline dot net)
                        [page]: <http://cbfalconer.home.att.net>
                        Try the download section.


                        ** Posted from http://www.teranews.com **


                        • Bartc

                          #13
                          Re: dealing with huge data

                          "pereges" <Broli00@gmail.com> wrote in message
                          news:9f103a28-39ea-41db-964f-eebe37750c89@h1g2000prh.googlegroups.com...
                          > ok so i have written a program in C where I am dealing with huge
                          > data(millions and lots of iterations involved) and for some reason the
                          > screen tends to freeze and I get no output every time I execute it.
                          > However, I have tried to reduce the amount of data and the program
                          > runs fine.
                          > What could possibly be done to resolve this ?

                          Do you expect the execution time to increase in proportion to the amount of
                          data?

                          What are the timings for N=10 (where N is some measure of the amount of
                          data)? For N=100, 1000, 10K, 1M, etc.? What do you mean by huge anyway, how much
                          data are we talking about?
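
One way to collect such timings is sketched below with the standard clock() function. The `work` function is a stand-in invented for this example (a deliberately quadratic loop) and the sizes are arbitrary; the real processing step would take its place. If the time grows much faster than N, the program may simply be slow rather than frozen.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Stand-in workload (an assumption, not the OP's code): counts the
   pairs (i, j) below n whose parities differ, in O(n * n) steps. */
static unsigned long work(unsigned long n)
{
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < n; j++)
            count += (i ^ j) & 1u;
    return count;
}

/* Run work(n) once and return the CPU time it used, in seconds. */
static double time_run(unsigned long n, unsigned long *result)
{
    assert(result != NULL);
    clock_t start = clock();
    *result = work(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Print N against CPU time for a few increasing sizes. */
static void report_scaling(void)
{
    static const unsigned long sizes[] = { 10, 100, 1000, 2000 };
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        unsigned long r;
        double secs = time_run(sizes[i], &r);
        printf("N=%lu  cpu=%.6f s  (result %lu)\n", sizes[i], secs, r);
    }
}
```

Note that clock() measures processor time, so a program that spends its time waiting on disk or swap will look faster here than it feels at the keyboard.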

                          At what level of N does it stop working? What did you expect the execution
                          time to be? Does the machine make noises like lots of disk activity
                          (assuming you are not dealing with disk i/o anyway)? Sometimes when you
                          exceed machine memory everything gets a lot slower.

                          Can you measure what resources are being used at each point, like memory?

                          Your code is only 500 lines. Can you put print statements in to show what's
                          happening? Not for every iteration, but maybe only when N>X, some limit
                          above which you know it fails. Or after 100ms have passed since the last
                          output, etc.
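
The "only after 100ms since the last output" idea might be sketched as follows. The `throttle` struct and function names are hypothetical, and the interval is measured with clock(), which counts CPU time rather than wall-clock time; that is an approximation chosen to stay within standard C.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical throttle state for rate-limited debug output. */
struct throttle {
    clock_t last;      /* time of the last report, in clock() ticks */
    clock_t interval;  /* minimum number of ticks between reports */
    int started;       /* 0 until the first report has been made */
};

static void throttle_init(struct throttle *t, double seconds)
{
    assert(t != NULL);
    t->last = 0;
    t->interval = (clock_t)(seconds * CLOCKS_PER_SEC);
    t->started = 0;
}

/* Returns 1 if a report is due at time `now` (and records it), else 0. */
static int throttle_due(struct throttle *t, clock_t now)
{
    if (!t->started || now - t->last >= t->interval) {
        t->last = now;
        t->started = 1;
        return 1;
    }
    return 0;
}

/* Print the current iteration, but no more often than the interval. */
static void report_if_due(struct throttle *t, unsigned long iteration)
{
    if (throttle_due(t, clock())) {
        fprintf(stderr, "reached iteration %lu\n", iteration);
        fflush(stderr);
    }
}
```

Calling `report_if_due(&t, i)` inside the main loop, after `throttle_init(&t, 0.1)`, keeps the extra output from swamping the program's own printing.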

                          (You mentioned you are printing to the screen anyway; so maybe you can tell
                          from the output, what point in the execution it has reached and can put in
                          extra debug output.)

                          It sounds like above a certain level of data, some limit or resource is
                          being exceeded, causing it to hang, or perhaps entering an endless loop
                          (those are a little different, I think..).

                          --
                          Bartc




                          • Nick Keighley

                            #14
                            Re: dealing with huge data

                            On 24 Apr, 06:11, gordonb.ec...@burditt.org (Gordon Burditt) wrote:
                            >> ok so i have written a program in C where I am dealing with huge
                            >> data(millions and lots of iterations involved) and for some reason the
                            >> screen tends to freeze and I get no output every time I execute it.
                            >> However, I have tried to reduce the amount of data and the program
                            >> runs fine.
                            >> What could possibly be done to resolve this ?
                            >
                            > Are you SURE that the screen freezes, and it's not just taking
                            > a long time? (When in doubt, let it run over a weekend.)

                            sounds like it's just very slow

                            > You don't give a very good idea of what your program is doing, but
                            > some hints that might apply:
                            >
                            > Your program almost certainly has at least one bug.

                            just on the principle that all programs have at least one bug?

                            > Make sure that every call to malloc() is checked, and that you
                            > report any calls that run out of memory. Also check if the behavior
                            > changes if you change limits on the amount of memory the process
                            > can allocate (e.g. 'ulimit').
                            >
                            > Use any tools (like 'ps') you might have to see how large the program
                            > is and whether it's swapping so heavily that little CPU gets used
                            > while most of the time goes to swapping.
                            >
                            > If it's a multi-process program, you might be deadlocking on
                            > allocation of swap/page space.
                            >
                            > Make sure that you do not use more memory than you allocated (often
                            > called "buffer overflow", although this problem is a bit more general
                            > than a buffer overflow). This can be difficult to find. If you
                            > corrupt the data malloc() uses to keep track of free memory,
                            > subsequent calls to malloc() or free() might infinite loop.
                            >
                            > Add some output statements to the program so you can see how far
                            > it gets. Include something at the start of the program, and, say,
                            > after you have read all the input but before you begin processing it.

                            maybe even consider a profiler


                            --
                            Nick Keighley

                            I'd rather write programs to write programs than write programs


                            • pereges

                              #15
                              Re: dealing with huge data

                              Freeing (using free()) the memory allocated (using malloc()) has
                              certainly improved the performance of my program, and it now gives
                              output for even larger data. But there are still issues. I will post a
                              minimal version of my code later.
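
The actual code was not posted, so the following is only a guess at the general shape of such a fix: a per-iteration scratch buffer that is released before the function returns. The function name `process_chunk` and the work it does are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-iteration step: fills a scratch buffer with
   0..n-1 and returns the sum, or -1 if the allocation fails. */
static long process_chunk(size_t n)
{
    assert(n > 0);

    long *scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;   /* let the caller report the failure */

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        scratch[i] = (long)i;
        sum += scratch[i];
    }

    /* Without this call, every invocation leaks n * sizeof(long) bytes,
       which adds up quickly over millions of iterations. */
    free(scratch);
    return sum;
}
```

With the free() in place the function can be called any number of times without its memory use growing, which matches the improvement reported above.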
