SIMPLE but SLOW SELECT query!!!

**MMcCarthy** · Jun 27 '07, 02:17 AM

I don't really understand what you are doing the subquery selects COMPONENT1_ID Where COMPONENT1_ID = 1 so why not just say
WHERE PRIMARY_ACCT_NU M = 1

However, if as I suspect COMPONENT1_ID is not the field that holds the value corresponding to PRIMARY_ACCT_NU M then that is the field you should be Selecting in the subquery.

**Snib** · Jun 30 '07, 02:19 PM

Assuming that the SQL you meant to post was as follows:

SELECT D.STORE_FACILIT Y_ID,
D.PRIMARY_ACCT_ NUM
FROM DS_STORE D
WHERE D.PRIMARY_ACCT_ NUM IN
(SELECT W.PRIMARY_ACCT_NU M
FROM WK_FILTER_CMPNT W
WHERE W.COMPONENT1_ID = 1)
FOR FETCH ONLY

Then there are a few things to consider:

1/ Temporary data created for an "IN" statement does not have an INDEX so the data is searched sequentially from start to end to find if the value being searched for is IN the list. If there is a large amount of data created for the IN statement this can slow down the SQL.

2/ If you "correlate" sub-selects then DB2 will use indexes, if they exist, to speed up the SQL. If the amount of data in the IN statement is large then correlated sub-selects can be faster.

3/ As you are not using any predicates to restrict the data selected on the first table then this table will be accessed via a table space scan. As the second table does have predicates (well one at least!) then this sub-select will be executed first by DB2 to create the data for the IN statement. Once it has this data it will then start the table space scan on the first table and search the data from the IN statement with the field selected - it will execute the search of the data for the IN statement for every row on the first table!

4/ You are not actually returning any of the data from the second table so you could use a correlated sub-select with an EXISTS clause. Something like this:

SELECT D.STORE_FACILIT Y_ID,
D.PRIMARY_ACCT_ NUM
FROM DS_STORE D
WHERE EXISTS
(SELECT
FROM WK_FILTER_CMPNT W
WHERE W.COMPONENT1_ID = 1
AND W.PRIMARY_ACCT_ NUM = D.PRIMARY_ACCT_ NUM IN)
FOR FETCH ONLY

If you then have an index on DS_STORE with PRIMARY_ACCT_NU M as the first field and also an index on WK_FILTER_CMPNT with COMPONENT1_ID as the primary field then the SQL should be executed something like this:

A/ Execute sub-select, using the INDEX on WK_FILTER_CMPNT to find each row with a value of 1 in the COMPONENT1_ID field and return the PRIMARY_ACCT_NU M.

B/ Use this list of PRIMARY_ACCT_NU M values to then access table DS_STORE using the index with the PRIMARY_ACCT_NU M field. Only read the rows from the DS_STORE field were values in these fields match.

So, this will stop the repeated search of the data in the IN statement and should also remove the table space scan of the DS_STORE table - assuming you have the indexes I have mentioned above!

It is possible that the SQL could be executed as a table space scan on the first table and then index access on the second table - it will depend on the indexes and the values populated by RUNSTATS in the catalog.

One last thing, if you find you are getting duplicate rows on the result set then add a GROUP BY to the end of the SQL on D.STORE_FACILIT Y_ID and D.PRIMARY_ACCT_ NUM.

Regards

Snib

SIMPLE but SLOW SELECT query!!!

SIMPLE but SLOW SELECT query!!!

Comment

Comment