[How to] avoid cross product/Cartesian product to improve performance

  • thelightkeeper
    New Member
    • Oct 2007
    • 2

    [How to] avoid cross product/Cartesian product to improve performance

    Hi,

    I have a table containing about 4 million entries, structured like below:

    [Alarm History]
    (
    AlarmID int,
    SetTime datetime
    )

    Now I want to:
    SELECT all the AlarmID values that occurred during Jan 2008 with no such AlarmID during Dec 2007. I used:

    SELECT * FROM [Alarm History]
    WHERE SetTime BETWEEN '1-jan-2008' AND '31-jan-2008'
    AND AlarmID NOT IN

    (
    SELECT AlarmID FROM [Alarm History]
    WHERE SetTime BETWEEN '1-dec-2007' AND '31-DEC-2007'
    )

    My query runs very slowly, since there are more than 20,000 entries during Dec 2007 and 15,000 entries during Jan 2008.

    The "NOT IN" operation is like a Cartesian product: it compares each AlarmID during December 2007 with each AlarmID during January 2008. The result is very slow performance.

    Can anybody help me find an alternative way to do this? Thanks a lot.
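    One alternative worth trying here is NOT EXISTS with a correlated subquery; SQL Server can often plan this as an anti-semi-join rather than a row-by-row comparison. A sketch against the [Alarm History] table defined above:

    ```sql
    -- Sketch: NOT EXISTS version of the same question,
    -- using the [Alarm History] table from the post above.
    SELECT Jan.*
    FROM [Alarm History] Jan
    WHERE Jan.SetTime BETWEEN '1-jan-2008' AND '31-jan-2008'
      AND NOT EXISTS (
          SELECT 1 FROM [Alarm History] Dec
          WHERE Dec.AlarmID = Jan.AlarmID
            AND Dec.SetTime BETWEEN '1-dec-2007' AND '31-dec-2007'
      )
    ```

    Whether this beats NOT IN depends on the optimizer and on indexing, so it is worth checking the execution plan for both.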
  • code green
    Recognized Expert Top Contributor
    • Mar 2007
    • 1726

    #2
    It is probably the sub-query that slows it down.
    There is an equivalent JOIN that looks for NULL instead of NOT IN:
    Code:
    SELECT Jan.* FROM [Alarm History] Jan
    LEFT JOIN [Alarm History] Dec ON (Jan.AlarmID = Dec.AlarmID
    AND Dec.SetTime BETWEEN '1-dec-2007' AND '31-dec-2007')
    WHERE Jan.SetTime BETWEEN '1-jan-2008' AND '31-jan-2008'
    AND Dec.AlarmID IS NULL
    I think


    • Delerna
      Recognized Expert Top Contributor
      • Jan 2008
      • 1134

      #3
      If you want to compare the effectiveness of two different methods of writing a query, then:

      Open Query Analyzer and paste the two queries into the query window.

      Then click the "Display Estimated Execution Plan" button; this will display a graphical representation of how each query will execute.

      I did that for the two queries here, changing table and field names to suit one of my tables.

      Sorry to say that, at least in the check that I did, code green's method was three times slower: 28% (thelightkeeper's method) to 72% (code green's method).
      That may be different when you try it on your own table.

      Also, the execution plan will show you which parts of your query are spending the most time, and therefore where you might be able to improve performance.
      Look for loops and high I/O costs when looking for performance bottlenecks.

      All the above is in relation to MS SQL Server. I guess other database engines have something similar.


      • code green
        Recognized Expert Top Contributor
        • Mar 2007
        • 1726

        #4
        Wow! Three times slower. That surprised me.
        What is really slowing the query down is the date comparison.
        I have tried similar queries in both formats on my online server (1and1).
        They both timed out.
        I was able to get around it because my table used auto-ids.
        I then used SQL variables to get the IDs of the minimum and maximum dates.
        Then did a SELECT comparing IDs rather than DATE.
        Very fast.
        Code:
        -- Get the IDs of the earliest and latest dates
        SELECT @earliest := MIN(AlarmID) FROM [Alarm History]
        WHERE SetTime >= '1-jan-2008';
        SELECT @latest := MAX(AlarmID) FROM [Alarm History]
        WHERE SetTime <= '31-jan-2008';

        -- Use these to filter the main query
        SELECT AlarmID FROM [Alarm History]
        WHERE AlarmID >= @earliest AND AlarmID <= @latest
        AND ....
        I know this is MySQL, but could you adapt this idea to your table?
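        For the SQL Server table in the original question, the same idea might look like the following. This is only a sketch: it assumes AlarmID is an identity column assigned in SetTime order, otherwise the ID range does not correspond to the date range.

        ```sql
        -- Sketch: T-SQL adaptation of the variable trick above.
        -- Assumes AlarmID increases monotonically with SetTime.
        DECLARE @earliest int, @latest int

        SELECT @earliest = MIN(AlarmID) FROM [Alarm History]
        WHERE SetTime >= '1-jan-2008'

        SELECT @latest = MAX(AlarmID) FROM [Alarm History]
        WHERE SetTime <= '31-jan-2008'

        -- Filter the main query on the ID range instead of the dates
        SELECT AlarmID FROM [Alarm History]
        WHERE AlarmID BETWEEN @earliest AND @latest
        ```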


        • Delerna
          Recognized Expert Top Contributor
          • Jan 2008
          • 1134

          #5
          Yeah, it surprised me too, as I also thought that the subquery was the problem, but it seems SQL Server does a pretty good job of executing it. I also tried a method of my own and was beaten 48% to 52%. I have not personally used a "WHERE NOT IN" query myself, but I think I will be taking a closer look at it in the future.

          One thing that may help the speed of this query is indexes. I have seen slow queries that had nothing wrong with the way they were written; the sheer number of records was the cause. Well-thought-out indexes took those queries from minutes to a few seconds.
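          For the table in this thread, such indexes might look like the following. A sketch only: the index names are made up, and the best column order depends on the actual query workload.

          ```sql
          -- Hypothetical indexes for the queries in this thread:
          -- one to seek on the date range, one to probe by AlarmID.
          CREATE INDEX IX_AlarmHistory_SetTime
              ON [Alarm History] (SetTime) INCLUDE (AlarmID)

          CREATE INDEX IX_AlarmHistory_AlarmID
              ON [Alarm History] (AlarmID, SetTime)
          ```

          The first index lets the date filter become a range seek that covers the query; the second supports the join or NOT IN probe on AlarmID.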


          • ozone702
            New Member
            • Jul 2013
            • 1

            #6
            Use a LEFT OUTER JOIN to your "not in" table, GROUP BY some columns in table a and at least one column in table b (the "not in" table), HAVING b.some_column IS NULL.

            This is much more efficient than a NOT IN statement.

            i.e.

            Code:
            select a.col1, a.col2
            from tablea as a
            left outer join tableb as b
              on b.fkey_id = a.id
            group by a.col1, a.col2, b.some_column
            having b.some_column is null
            Last edited by Rabbit; Jul 16 '13, 03:44 PM. Reason: Please use code tags when posting code.

