Jun 25, 2006

Understanding and Using the NOT operator in SQL Server

Using a NOT operator, be it NOT IN or NOT EXISTS had been a performance bottleneck in most of the queries.

No knowledgeable SQL prgrammer will ever say that he has achieved the best performing query that can ever be written for a given functionality. Because, performance doesn't just depend on the query we write, but goes far beyond that. It includes (but not limited to) the indexes on the table, the data, fragmentation of the data and indexes, server load and server settings.


What query works very well in my server might fail miserably in another server. So,I thought I will give all the possible ways (that I know) in handling a particular functionality - the negation operator.

Here is the scenario. I have a table tbl1 that has a column id. Similary I have a table tbl2 that too had a column id. Now I need to get all the id in tbl1 which does not exist in tbl2. Quite a simple scenario :)


Soution 1:

Select id from tbl1 where id not in (select id from tbl2)

The above solution is writing our requirement in T-SQL. Nothing much.

A slightly different solution

Solution 2:


Select id from tbl1 a where not exists (select 1 from tbl2 b where a.id = b.id)

Well, here I just try to find out those rows where the join on tbl2 for that id does not exist.

Solution 3:

Select a.id from tbl1 a left outer join tbl2 b
on a.id = b.id
where b.id is null


This too doesn't need much explaination. b.id will be null if that id doesn't exist in tbl2.
Well, I like this solution and had performed better for me in quite some places that the first two. Again, it performed better for ME :)


Solution 4 (SQL Server 2005):

select id from tbl1
except
select id from tbl2


The EXCEPT operator in SQL Server 2005 can get me those rows in the result of the first select which does not have a copy in the result of the second select. Personally I wasn't very impressed with the EXCEPT operator for this usage because its more like the UNION operator and the schema of the result sets of both the selects should match.

All the above solutions will work if the id column is non-nullable. But if the id column in tbl2 has null values, then solution 1 will not work. It will not return any rows.
It can be made to work by modifying it this way ( Thanks to Uri Dimant for pointing this to me)

Solution 1 (Modified):


Select id from tbl1 where id not in (select id from tbl2 where tbl1.id = tbl2.id)

Now are we having fool proof methods. Maybe... maybe not.

All the 4 solutions will yield the same result except for one case.
When there is a null value in the id column in both the tables.
In that case, Solutions 1(Modified) ,2,3 will return the null value but solution 4 will not.
EXCEPT operator matches null with null and removes it from the list.

So, we need to understand what we need and use the appropriate method.

5 comments:

Unknown said...

Great tip with left outer join - worked for me on PostgresQL

Anonymous said...

Hi. Very useful bit of info... came in handy! Out of curiousity, how would you cater for selecting the contents of tbl1 that are NOT in tbl2 OR a third tbl3?

Omnibuzz said...

Well it goes the same way. You use an inline view that does a union of the two tables.
Or simply put,
replace tbl2 b

with

(select id from tbl2
union all
sekect id from tbl3) b

life insurance said...

Many of the built-in functions of SQL also violate the encapsulation rule. Why?i dont understand its effect on function

Pravesh Singh said...

Very informative article. Its really helped me lot. Thanks for sharing with us. Check out this link too its also having a nice post related to this post with wonderful explanation...
http://mindstick.com/Articles/1216a153-610b-41f1-b51b-3035c069ffd7/?%E2%80%98SELECT%E2%80%99%20command%20with%20SQL%20operator

Thanks

Post a Comment