An optimum query..?

**Dan Guzman** · Apr 3 '08, 02:25 AM

Re: An optimum query..?

Which option is better ? I have subsequent joins to another table

which has about 2 mil more rows and trying to optimize the response
time.

I would have expected both the NOT EXISTS and the LEFT JOIN...WHERE..I S NULL
to yield the same plan and performance. However, the second query you
posted is invalid (has a WHERE in the JOIN clause). Can you post the actual
table DDL and queries?

--
Hope this helps.

Dan Guzman
SQL Server MVP

Dan Guzman Blog

http://weblogs.sqlteam.com/dang/

<dsdevonsomer@g mail.comwrote in message
news:195e21cb-f77d-48ef-b040-bbb4c403750d@m7 3g2000hsh.googl egroups.com...

Hello friends,
I have one simple question. I have two tables. 1 ( Table A ) has about
2.5 million rows and second one ( Table B ) has about 1 million. There
are common ID fields in both tables. I want join them on ID field and
get all rows of Table A which are not in there Table B.
>
When I ran following two queries, I got same result set, but time it
took was very different.
>
Following query took 1:35 minutes
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1
WHERE NOT exists (
SELECT 1 from [Table B] Tbl2 WHERE
Tbl1.UID = Tbl2.UID )
>
vs this one took .45 seconds.
>
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1 LEFT OUTER JOIN [Table B] Tbl2 WHERE ON Tbl1.UID =
Tbl2.UID
WHERE Tbl2.UID IS NULL
>
Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.
>
I appreciate all help from the community.
JB

**dsdevonsomer@gmail.com** · Apr 3 '08, 03:15 AM

Re: An optimum query..?

On Apr 2, 10:21 pm, "Dan Guzman" <guzma...@nospa m-
online.sbcgloba l.netwrote:

Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.

>
I would have expected both the NOT EXISTS and the LEFT JOIN...WHERE..I S NULL
to yield the same plan and performance. However, the second query you
posted is invalid (has a WHERE in the JOIN clause). Can you post the actual
table DDL and queries?
>
--
Hope this helps.
>
Dan Guzman
SQL Server MVPhttp://weblogs.sqlteam .com/dang/
>
<dsdevonso...@g mail.comwrote in message
>
news:195e21cb-f77d-48ef-b040-bbb4c403750d@m7 3g2000hsh.googl egroups.com...
>
>
>

Hello friends,
I have one simple question. I have two tables. 1 ( Table A ) has about
2.5 million rows and second one ( Table B ) has about 1 million. There
are common ID fields in both tables. I want join them on ID field and
get all rows of Table A which are not in there Table B.

>

When I ran following two queries, I got same result set, but time it
took was very different.

>

Following query took 1:35 minutes
SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1
WHERE NOT exists (
SELECT 1 from [Table B] Tbl2 WHERE
Tbl1.UID = Tbl2.UID )

>

vs this one took .45 seconds.

>

SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE
NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1 LEFT OUTER JOIN [Table B] Tbl2 WHERE ON Tbl1.UID =
Tbl2.UID
WHERE Tbl2.UID IS NULL

>

Which option is better ? I have subsequent joins to another table
which has about 2 mil more rows and trying to optimize the response
time.

>

I appreciate all help from the community.
JB- Hide quoted text -

>
- Show quoted text -

Oh,
second query is just without WHERE clause. It was an honest mistake. I
checked on client statistics and execution plan. In Left Outer join
query, Client processing time 67858, vs in NOT EXISTS query Client
processing time 72074 .

Thanks Dan..

**DeanGC** · Apr 3 '08, 12:05 PM

Re: An optimum query..?

On Apr 2, 8:06 pm, dsdevonso...@gm ail.com wrote:

Oh,
second query is just without WHERE clause. It was an honest mistake. I
checked on client statistics and execution plan. In Left Outer join
query, Client processing time 67858, vs in NOT EXISTS query Client
processing time 72074 .

What do the query plans look like? Like Dan, I would expect that these
queries would produce the same plan.

Did you run the queries multiple times? If you ran each one only once,
the difference may be due to caching.

**--CELKO--** · Apr 3 '08, 11:05 PM

Re: An optimum query..?

Here is another alternative, if you have 2005 or 2008. Since you did
not use very good data element names, you might not know about SAN
("Standard Address Number") which is used in many industries for
address data. On Oracle and other DBs, the EXCEPT is quite fast.

SELECT san, last_name, first_name, house_nbr, street_name, city_name,
state_code
FROM Addresses
WHERE san
IN
(SELECT san FROM Addresses
EXCEPT
SELECT san FROM Foobar); -- not a great table name ..

An optimum query..?

An optimum query..?

Comment

Comment

Comment

Comment