System.Data.DataTable Performance Issue

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Buddy

    System.Data.DataTable Performance Issue

    Hello,

    We are using DataTable to store our data that we retrieve
    from SQL because it provides us with ROW/COLUMN concept.
    Due to DataTable been memory hungary we are finding that
    at least 40MB of memory is been created every second which
    means the GC is concently running. We want to use
    something else that is not memory hungary and will give us
    the ROW/COLUMN concept (ie Multi-dimension Arrays).

    My question is that, is there any other .Net object that
    is light weight which will give me the ROW/COLUMN concept.
    Please also note that we do want to store data in the
    COLUMNS with different datatypes. For example I could have
    2 columns. Column 1 is a string and Column 2 is a double.

    Thanks,
  • Jeff Killingsworth

    #2
    System.Data.Dat aTable Performance Issue

    Hello,

    You could try using a multidimentoial array:

    object[,,] oArr = new object[ 20, 4 ];

    Then you can stick anything you want in any cell, but be
    aware that some boxing and unboxing will occur.

    You could create a typed multidementiona l array:

    object[] rs = new object[4]; // 4 equals the column count
    rs[0] = new int[20]; // 20 equals the row count
    rs[1] = new string[20];
    rs[2] = new double[20];
    rs[3] = new bool[20];

    double tmp = ((double[])rs[2])[15]; // 3rd column of 16th
    row

    or if you are reading from a database you could use a
    DataReader. ( Its forward only and read only )

    Hope this helps.

    Jeff
    [color=blue]
    >-----Original Message-----
    >Hello,
    >
    >We are using DataTable to store our data that we[/color]
    retrieve[color=blue]
    >from SQL because it provides us with ROW/COLUMN concept.
    >Due to DataTable been memory hungary we are finding that
    >at least 40MB of memory is been created every second[/color]
    which[color=blue]
    >means the GC is concently running. We want to use
    >something else that is not memory hungary and will give[/color]
    us[color=blue]
    >the ROW/COLUMN concept (ie Multi-dimension Arrays).
    >
    >My question is that, is there any other .Net object that
    >is light weight which will give me the ROW/COLUMN[/color]
    concept.[color=blue]
    >Please also note that we do want to store data in the
    >COLUMNS with different datatypes. For example I could[/color]
    have[color=blue]
    >2 columns. Column 1 is a string and Column 2 is a double.
    >
    >Thanks,
    >.
    >[/color]

    Comment

    • Buddy

      #3
      System.Data.Dat aTable Performance Issue

      Thanks for the reply,

      But how to I get the number of retrieved when using the
      DataReader when applying SELECT queries.

      Thanks
      [color=blue]
      >-----Original Message-----
      >Hello,
      >
      >You could try using a multidimentoial array:
      >
      >object[,,] oArr = new object[ 20, 4 ];
      >
      >Then you can stick anything you want in any cell, but be
      >aware that some boxing and unboxing will occur.
      >
      >You could create a typed multidementiona l array:
      >
      >object[] rs = new object[4]; // 4 equals the column count
      >rs[0] = new int[20]; // 20 equals the row count
      >rs[1] = new string[20];
      >rs[2] = new double[20];
      >rs[3] = new bool[20];
      >
      >double tmp = ((double[])rs[2])[15]; // 3rd column of[/color]
      16th[color=blue]
      >row
      >
      >or if you are reading from a database you could use a
      >DataReader. ( Its forward only and read only )
      >
      >Hope this helps.
      >
      >Jeff
      >[color=green]
      >>-----Original Message-----
      >>Hello,
      >>
      >>We are using DataTable to store our data that we[/color]
      >retrieve[color=green]
      >>from SQL because it provides us with ROW/COLUMN[/color][/color]
      concept.[color=blue][color=green]
      >>Due to DataTable been memory hungary we are finding[/color][/color]
      that[color=blue][color=green]
      >>at least 40MB of memory is been created every second[/color]
      >which[color=green]
      >>means the GC is concently running. We want to use
      >>something else that is not memory hungary and will give[/color]
      >us[color=green]
      >>the ROW/COLUMN concept (ie Multi-dimension Arrays).
      >>
      >>My question is that, is there any other .Net object[/color][/color]
      that[color=blue][color=green]
      >>is light weight which will give me the ROW/COLUMN[/color]
      >concept.[color=green]
      >>Please also note that we do want to store data in the
      >>COLUMNS with different datatypes. For example I could[/color]
      >have[color=green]
      >>2 columns. Column 1 is a string and Column 2 is a[/color][/color]
      double.[color=blue][color=green]
      >>
      >>Thanks,
      >>.
      >>[/color]
      >.
      >[/color]

      Comment

      • Buddy

        #4
        System.Data.Dat aTable Performance Issue

        Hello,

        I mean number of rows retrieved.
        [color=blue]
        >-----Original Message-----
        >Thanks for the reply,
        >
        >But how to I get the number of retrieved when using the
        >DataReader when applying SELECT queries.
        >
        >Thanks
        >[color=green]
        >>-----Original Message-----
        >>Hello,
        >>
        >>You could try using a multidimentoial array:
        >>
        >>object[,,] oArr = new object[ 20, 4 ];
        >>
        >>Then you can stick anything you want in any cell, but[/color][/color]
        be[color=blue][color=green]
        >>aware that some boxing and unboxing will occur.
        >>
        >>You could create a typed multidementiona l array:
        >>
        >>object[] rs = new object[4]; // 4 equals the column[/color][/color]
        count[color=blue][color=green]
        >>rs[0] = new int[20]; // 20 equals the row count
        >>rs[1] = new string[20];
        >>rs[2] = new double[20];
        >>rs[3] = new bool[20];
        >>
        >>double tmp = ((double[])rs[2])[15]; // 3rd column of[/color]
        >16th[color=green]
        >>row
        >>
        >>or if you are reading from a database you could use a
        >>DataReader. ( Its forward only and read only )
        >>
        >>Hope this helps.
        >>
        >>Jeff
        >>[color=darkred]
        >>>-----Original Message-----
        >>>Hello,
        >>>
        >>>We are using DataTable to store our data that we[/color]
        >>retrieve[color=darkred]
        >>>from SQL because it provides us with ROW/COLUMN[/color][/color]
        >concept.[color=green][color=darkred]
        >>>Due to DataTable been memory hungary we are finding[/color][/color]
        >that[color=green][color=darkred]
        >>>at least 40MB of memory is been created every second[/color]
        >>which[color=darkred]
        >>>means the GC is concently running. We want to use
        >>>something else that is not memory hungary and will[/color][/color][/color]
        give[color=blue][color=green]
        >>us[color=darkred]
        >>>the ROW/COLUMN concept (ie Multi-dimension Arrays).
        >>>
        >>>My question is that, is there any other .Net object[/color][/color]
        >that[color=green][color=darkred]
        >>>is light weight which will give me the ROW/COLUMN[/color]
        >>concept.[color=darkred]
        >>>Please also note that we do want to store data in the
        >>>COLUMNS with different datatypes. For example I could[/color]
        >>have[color=darkred]
        >>>2 columns. Column 1 is a string and Column 2 is a[/color][/color]
        >double.[color=green][color=darkred]
        >>>
        >>>Thanks,
        >>>.
        >>>[/color]
        >>.
        >>[/color]
        >.
        >[/color]

        Comment

        • Buddy

          #5
          RE: System.Data.Dat aTable Performance Issue

          Hello Parker,

          I cannot use the 'select count(*)' SQL query because the
          code has been converted to C#. This means that I would
          have to manually change at least 8000 to 10,000 lines of
          code.

          Thanks,
          [color=blue]
          >-----Original Message-----
          >Hi Buddy,
          >
          >DataReader uses a forward only and read only cursor.
          >
          >The RecordCount is unavailable for this type of cursor[/color]
          until all the[color=blue]
          >records have been read. Both ADO and DAO return -1 for[/color]
          the RecordCount of[color=blue]
          >the equivalent cursor.
          >
          >You can use "select count(*) from ..." to get the record[/color]
          count.[color=blue]
          >
          >--
          >Parker Zhang
          >Microsoft Developer Support
          >
          >This posting is provided "AS IS" with no warranties, and[/color]
          confers no rights.[color=blue]
          >
          >.
          >[/color]

          Comment

          • Parker Zhang [MSFT]

            #6
            RE: System.Data.Dat aTable Performance Issue

            Hello,

            Then, how about counting the records as you go through the reader?

            Microsoft Support is here to help you with Microsoft products. Find how-to articles, videos, and training for Microsoft Copilot, Microsoft 365, Windows 11, Surface, and more.


            Would you please explain a little more why you cannot use "select count(*)"?

            All you have to do is to add a few lines to get the record number. Why do
            you have to change at least 8000 to 10,000 lines?

            --
            Parker Zhang
            Microsoft Developer Support

            This posting is provided "AS IS" with no warranties, and confers no rights.

            Comment

            • Buddy

              #7
              RE: System.Data.Dat aTable Performance Issue

              Hello Parker,

              Let me explain the actual problem. We've had Microsoft
              profile our system and found that there is a lot of GC
              taking place (approx 40MB second of memory is been created
              and destroyed), the reason behind is that we are using
              DataTable to store our data that we get from SQL server.
              The reason we use DataTable is that it was the best way we
              could see of getting the data from SQL by using the
              SqlDataAdapter and then using the Fill method. The problem
              is that the Fill method will fill into a Dataset and not
              into a DataTable (even though under the cover the DataSet
              has a DataTable). So we would then have to copy the data
              from the DataSet into our DataTable, this would result to
              the DataSet object been destroyed each time. In a given
              second we could be doing up to 50-100 separate queries.

              So we started looking at using the DataReader so we could
              then copy the data directly into our DataTable but the
              problem is that when the data is retrieved the number of
              records retrieved is sent to the user and then we can then
              fetch back the data. The user will then decide how many
              records to fetch, for example we might return 100 and the
              user only wants the first 50 records. This means we need
              to send the total number of records before retrieving the
              data. We could do this by caching the data in another
              DataTable but this would mean we are back to square one
              with the GC.

              Please also note that the code has been CONVERTED from
              another language called AM (4-GL language) and we are
              testing the conversion, the SQL queries is also embedded
              in the code so to run an extra query like COUNT(*) is
              almost impossible because the queries is passed to a
              common function which then runs the query, we would have
              to parse the SQL query to perform the COUNT(*) which is
              not the route we would want to take as we would prefer to
              leave the query parsing to the provider engine.
              For example the user will pass in the following query
              SELECT Client, Name, PostCode From Client WHERE Name
              Like 'Smith%'
              Your approach is to do this
              SELECT COUNT(*) From Client WHERE Name Like 'Smith%'
              and then run the actual query
              SELECT Client, Name, PostCode From Client WHERE Name
              This query is simpler but when you have joins then it gets
              tricker which requires us to parse it.


              Thanks,

              [color=blue]
              >-----Original Message-----
              >Hello,
              >
              >Then, how about counting the records as you go through[/color]
              the reader?[color=blue]
              >
              >http://support.microsoft.com/?id=308352
              >
              >Would you please explain a little more why you cannot[/color]
              use "select count(*)"?[color=blue]
              >
              >All you have to do is to add a few lines to get the[/color]
              record number. Why do[color=blue]
              >you have to change at least 8000 to 10,000 lines?
              >
              >--
              >Parker Zhang
              >Microsoft Developer Support
              >
              >This posting is provided "AS IS" with no warranties, and[/color]
              confers no rights.[color=blue]
              >
              >.
              >[/color]

              Comment

              • Parker Zhang [MSFT]

                #8
                RE: System.Data.Dat aTable Performance Issue

                Hi Buddy,
                [color=blue]
                >The reason we use DataTable is that it was the best way we
                >could see of getting the data from SQL by using the
                >SqlDataAdapt er and then using the Fill method. The problem
                >is that the Fill method will fill into a Dataset and not
                >into a DataTable (even though under the cover the DataSet
                >has a DataTable). So we would then have to copy the data
                >from the DataSet into our DataTable, this would result to
                >the DataSet object been destroyed each time. In a given
                >second we could be doing up to 50-100 separate queries.[/color]

                The Fill method will also fill into a DataTable:

                Overloads Public Function Fill( _
                ByVal dataTable As DataTable _
                ) As Integer

                Gain technical skills through documentation and training, earn certifications and connect with the community

                taadapterclassf illtopic2.asp

                Please let me know if you have any questions.

                --
                Parker Zhang
                Microsoft Developer Support

                This posting is provided "AS IS" with no warranties, and confers no rights.

                Comment

                • Buddy

                  #9
                  RE: System.Data.Dat aTable Performance Issue

                  Hello Parker,

                  Thank you, I'll give this a try.

                  Thanks again,[color=blue]
                  >-----Original Message-----
                  >Hi Buddy,
                  >[color=green]
                  >>The reason we use DataTable is that it was the best way[/color][/color]
                  we[color=blue][color=green]
                  >>could see of getting the data from SQL by using the
                  >>SqlDataAdapte r and then using the Fill method. The[/color][/color]
                  problem[color=blue][color=green]
                  >>is that the Fill method will fill into a Dataset and not
                  >>into a DataTable (even though under the cover the[/color][/color]
                  DataSet[color=blue][color=green]
                  >>has a DataTable). So we would then have to copy the data
                  >>from the DataSet into our DataTable, this would result[/color][/color]
                  to[color=blue][color=green]
                  >>the DataSet object been destroyed each time. In a given
                  >>second we could be doing up to 50-100 separate queries.[/color]
                  >
                  >The Fill method will also fill into a DataTable:
                  >
                  >Overloads Public Function Fill( _
                  > ByVal dataTable As DataTable _
                  >) As Integer
                  >
                  >http://msdn.microsoft.com/library/en-[/color]
                  us/cpref/html/frlrfsystemdata commondbda[color=blue]
                  >taadapterclass filltopic2.asp
                  >
                  >Please let me know if you have any questions.
                  >
                  >--
                  >Parker Zhang
                  >Microsoft Developer Support
                  >
                  >This posting is provided "AS IS" with no warranties, and[/color]
                  confers no rights.[color=blue]
                  >
                  >.
                  >[/color]

                  Comment

                  Working...