Hi all,
We're trying to test DB2's performance for an OLAP data warehouse.
We went with a column-store based approach after our initial row-store
load generated over 2 million rows for just a single month's worth of
data.
We're using Alphablox as the cubing engine, but we have found that
queries take several minutes to complete, which is worrying our
developers because it would render the application unusable. OLAP is
supposed to let end users freely adjust their analysis, but they won't
if drilling down on a single dimension takes that long.
We're trialling with 1 month of data, which gives a fact table with:
- 450k rows
- about 200 columns per row (it's an engineering analysis, so there
are lots of parameters)
Currently our testing environment is a VMware VM running Win2k3 with
DB2 DWE 9.5 installed, 3.5GB RAM, and an 80GB disk.
In comparison, the target production environment will have 4 CPU cores
and hopefully 10 disks so we can make more containers for the data and
make use of parallelism. However, would that bring the 5-7 minute
queries down to sub-second responses? What other tuning tasks can we
perform? We also plan on using Query Patroller to identify candidate
MQTs to improve OLAP performance, but that comes later in the
project - right now we need to address the 5 minute query responses.
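For anyone replying: the kind of MQT we have in mind is roughly the
sketch below. All table and column names are placeholders, not our
real schema - it's just to show the shape of the rollup we'd want the
optimizer to route drill-down queries to.

```sql
-- Hypothetical MQT pre-aggregating the wide fact table by month and
-- one dimension key; names are made up for illustration.
CREATE TABLE mqt_monthly_summary AS (
    SELECT d.month_key,
           f.part_key,
           COUNT(*)             AS row_cnt,
           SUM(f.measurement_1) AS sum_meas_1
    FROM   fact_engineering f
    JOIN   dim_date d ON d.date_key = f.date_key
    GROUP  BY d.month_key, f.part_key
)
DATA INITIALLY DEFERRED
REFRESH DEFERRED
ENABLE QUERY OPTIMIZATION;

-- Populate it, then collect statistics so the optimizer considers it.
-- (With REFRESH DEFERRED, queries also need CURRENT REFRESH AGE ANY
-- for the optimizer to reroute to the MQT.)
REFRESH TABLE mqt_monthly_summary;
```

If the Alphablox drill-downs mostly hit a handful of dimension
combinations, pre-aggregating like this should cut the scan from 450k
wide rows to a few thousand summary rows.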
Appreciate any advice :)