Nov 17, 2006

Parameter Sniffing & Stored Procedures Execution Plan

This post is an attempt to explain what parameter sniffing is all about and how it affects the performance of a stored procedure. I would be walking through an example to demonstrate what the effect of parameter sniffing is. I will be showing the query execution plan generated for SQL Server 2005. To my knowledge, there is no change, in how this process works, between the two versions and you should be able to relate it to SQL Server 2000 as well.

According to the white paper,Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 published in the Microsoft Site:

"Parameter sniffing" refers to a process whereby SQL Server's execution environment "sniffs" the current parameter values during compilation or recompilation, and passes it along to the query optimizer so that they can be used to generate potentially faster query execution plans. The word "current" refers to the parameter values present in the statement call that caused a compilation or a recompilation.

Before I go ahead with the example, you need to understand that the query execution plan generated by the query optimizer depends on a lot of factors and parameter sniffing is just one of them. So the execution plans I show here might not be the execution plan you get if you run the same query on your server.

Lets consider the Orders table in the Northwind database.

Say, I create a stored procedure with the definition as:

create procedure GetOrderForCustomers(@CustID varchar(20))
as
begin
select * from orders
where customerid = @CustID
end


Remember, that the query execution plan is not generated when you create the procedure. It gets created and cached the first time you run it.

First lets look at the distribution of the number of orders for each customer

Running the following query in the Northwind database:

select top 100 percent customerid,count(*) as OrderCount from orders
group by customerid
order by count(*)

We get the number of orders placed by each customer (Only partial result shown)



The first and the last customer are the ones we are interested in:

CENTC has OrderCount (min) = 1

SAVEA has OrderCount (max) = 31

Lets say, I want to find all the orders for CustomerID = 'SAVEA' by calling the stored procedure that we created before.

exec GetOrderForCustomers 'SAVEA'

Since this is the first time the stored procedure is called, it will create an optimized query execution plan and execute it.

Now, the stored proc will return me 31 rows. Let's look at the query execution plan.


Two values in the Clustered Index Scan information are of interest to us.

  • Actual Number of Rows 31
  • Estimated Number of Rows 31.0747

How did the optimizer correctly estimate the actual number of rows?

It's because of what we call "parameter sniffing". The optimizer created the plan knowing the fact that it was going to get the information for the customerID 'SAVEA' and hence retrieve 31 rows.

Then how did the optimizer know that 'SAVEA' has 31 orders?

SQL Server internally maintains the statistics and distribution of the values in the columns used for filtering. Which, inturn, is nothing but the information in the result of this query (which we used above)

select top 100 percent customerid,count(*) as OrderCount
from orders
group by customerid
order by count(*)


Then, what is the problem with parameter sniffing, if its helping the query optimizer do better optimization?

Check out the query execution plan, again, for a different input (CustomerID = 'CENTC'):


exec GetOrderForCustomers 'CENTC'


Did you notice that the estimated number of rows is still 31 when the actual number of rows is 1? This plan was optimized for retriving 31 rows (in the first run) and the plan stays in the cache for reuse till server is restarted or the procedure is recompiled or if the proc cache is removed because of any other reason. And so, we need to understand that this plan may or may not work with the same efficiency for retrieving 1 row.

Now, lets see what is the efficient plan (in my server) for retrieving the orders for 'CENTC'. For that I need to clear the current execution plan. Let's drop and recreate the stored procedure so that the plan cache is removed.

drop proc GetOrderForCustomers
go
create procedure GetOrderForCustomers(@CustID varchar(20))
as
begin
select * from orders
where customerid = @CustID
end


Now that the plan cache is cleared, a fresh query execution plan will be generated for the input I give in the following procedure call.

exec GetOrderForCustomers 'CENTC'

Let's look at the query execution plan now.


Now, do you notice that the plan has changed and the estimates also have changed
  • Actual Number of Rows 1
  • Estimated Number of Rows 1.00241

If you call the procedure again with 'SAVEA' as the input, you will see that the plan will be the same and estimated number of rows will still be 1 though the actual rowcount will be 31.

So, either way it all depends on which input I give the first time I call the stored procedure.

But, It's impossible to keep track of when the plan was created and what was the input it used. In that case, we have an option here to disable the dependency on the input parameter.

We can create local variables and assign the input parameter to the local variables and use the local variables in the query. And in that case, since we didn't use the procedural parameter directly in the query, it will generate a generic query execution plan.

The same stored procedure can be written this way to avoid parameter sniffing:

create procedure GetOrderForCustomersWithoutPS(@CustID varchar(20))
as
begin
declare @LocCustID varchar(20)
set @LocCustID = @CustID


select * from orders
where customerid = @LocCustID
end


Now, when you run the stored procedure with the input as 'SAVEA'

exec GetOrderForCustomersWithoutPS 'SAVEA'

This is the query execution plan that I get.
Here you see that the actual number of rows is 31 whereas the estimated number of rows is not 31, but its close to 9. Its clear in this case that the query optimizer did not use the parameter value for generating the query execution plan. Then, from where did it get the value 9?
You may guess, just like I did. We saw the distribution of the number of orders for each customer. Now, lets find the average number of orders per customer.This query finds the average of ordercount of all customers we got from the previous query:
select avg(OrderCount) from
(select top 100 percent customerid,count(*) as OrderCount
from orders
group by customerid
order by count(*)) b

Or it can be simply written this way:
select 1.0*count(*)/count(distinct customerid) from orders
You will see that the output is:

Well, this is the estimated rowcount that we saw in the execution plan.If you look at it, for the current data, 67 out of 89 distinct customers have ordercount between 4 (9-5) and 14 (9+5) orders. So, execution plan generated should work good for the majority of the customers. Hence, disabling parameter sniffing is a good choice in this case.
A few points to note:
  • Parameter sniffing can be enabled or disabled at the stored procedure level.
  • We need not use local variables (to disable parameter sniffing) if the amount of data you will retrieve from the table is evenly distributed for all values of the filtered column (example, search by primary key or unique key)
  • Parameter sniffing can be disabled (by using local variables) if we see a bell curve distribution in the number of rows retrieved for the filtered column.

70 comments:

Anonymous said...

Very interesting article.

One wonders though why this doesn't happen more often, as most queries tend to return resultsets of vastly different sizes. Yet the parameter sniffing problem seems to only occur in certain types of (complex) queries.

In any event, thanks for the solid info which allowed us to rapidly solve a performance problem in one of our production system.

Henri

LearnItFirst.com said...

Very nice write-up - thanks :)

Anonymous said...

Thank-you very much!

This fixed a performance problem in a recursive function, where the plan seemed to get progressively worse the deeper the recursion went, and RECOMPILE didn't work.

Phew!

Andrew Taplin said...

Thanks, this was really driving me to distraction.

I simply couldn't work out why a stored procedure would take an hour and a half and the same code in a batch completed in a minute.

It really made that much difference.

Keep up the good work.

Wang Wan Hsun said...

Oh~my god !!
you are my god because you slove my problem.

Mashiku said...

A fews days ago I helped my friend resolve a query of similar nature. In a batch the query took about a minute. When he turned it into a stored procedure it run for about 9hours and going...I was able to re-write the query using a derived table instead of the normal join and it run fine.

Here is the nature of his query: It the takes a string of comma separated values. Parses the strings and puts into a temporary table. Then uses the temporary table to join with some other tables.

Your post makes a lot of sense!

Thanks, Kuyi.

Keyu said...

Ultimate

Mahavir Shah said...

Really good information with full of details.

thank you very much...
keep going with good articles.

Unknown said...

Fantastic little trick that sped up my stored procedure from 20 minutes to 2 seconds. Thanks!

Unknown said...

Now, I'm a bit confused as to when it's best to use Local variables and when not to.

I've always followed the recommendation from Microsoft to avoid the use of Local variables to achieve high performance stored procedures based on the following article:
http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx

and this one:
http://www.sqlprof.com/blogs/sqlserver/archive/2008/03/29/avoid-using-local-variables-in-your-select-statements.aspx

Any comments suggestions would be welcome.

Thanks,
Phil

Omnibuzz said...

Phil,
If you are looking for a straight yes or no.... :)

What the technet poster suggests is that using an OPTION RECOMPILE is better than using a local variable, which is fine as long as coming up with a new execution plan with the current input parameters doesn't add too much of an overhead.

RECOMPILE option is definitely something you can consider which I have missed in my post.

But I still stand by my suggestions for the specific cases I have stated.
-Omni

Unknown said...

Great article. Crazy how something you wrote so long ago is still so relevant. I'm smarter for having read it. I will now subscribe to your blog.

Alfonso said...

You are the man!

Rajesh said...

This is very helpful. Thanks a million.

zyx said...

This is nice information... thanks a lot!

Anonymous said...

NICE, thank you very much, I got a store procedure, that takes 14:17 minutes, and with the change, it now takes 30 seconds!!!

McGryn said...

Very educative piece...A developer at work pointed me to this blog and I'm smarter now because I read it. The little things never cease to matter...Good job on this.

Anonymous said...

Awesome ! Thank you for a great article and information.

Saved my proverbial "arse"

Thank you

Unknown said...

That'd too good ..

Clear explanation ... Even i amnew to SQL.. i can understand ... please Continue ... writing such articles .

Anonymous said...

Thanks for detailing "parameter sniffing". It help me debug a slow stored procedure, which had an XML input parameter.

http://www.oaktreesw.net

Dan Pretselman said...

Very helpfull article!

PingBack:
http://madebysql.blogspot.com/2009/07/execution-time-difference-between-sp.html

bb said...

brilliant thanks. this and http://elegantcode.com/2008/05/17/sql-parameter-sniffing-and-what-to-do-about-it/ this really helped me nail a long-term query performance problem

Faraz said...

Very informative and very well written.

Thanks

Unknown said...

Just wanted to say thanks!

Pete said...

Brilliant article. Thanks. And it's quite well written so that the problem and the solution is clearly understandable.

It is very rare that I learn that have a widespread effect on how I write SQL code, but I just did today.

Anonymous said...

Thank you so much. Truly un-expected behavior.

Guru said...

Great post. Covers complete detail.

Anonymous said...

Thanks a lot, it solved mysterious too long execution of sp

Anonymous said...

I have been asked Question during Microsoft interview like this "Stored Proc runs perfactly fine and few days later same Proc takes 15-20 Minutes to execute.What will be the potential problem here?

Thank you in advance

Anonymous said...

Thanks for this article. It's a really interesting read and saved the day when one of my procs slowed down!

Anonymous said...

Great Article.
Thanks

Anonymous said...

Thanks alot dude... keep up the work!!

Anonymous said...

Great Job and thanks very much! I notice this problem some months ago but it's your wonderful solution which help me solving this problem completely! (I was using WITH RECOMPILE procedure hint to solve this problem but it did not work well on large SQL procedures...)
Really great and wonderful!

Wilfred van Dijk said...

SQL Server 2008 SP1 Cumulative Update 7 and SQL Server 2005 SP3 Cumulative Update 9 introduce trace flag 4136 that can be used to disable the "parameter sniffing" process, see http://support.microsoft.com/kb/980653/

Anonymous said...

Thanks for the very clear explanation - it helped me get the execution time down from 20 minutes to 12 seconds!

Anonymous said...

Fantastic explanation. Its a shame I spent 4 hours tearing my hair out before I found this article. ;)

Anonymous said...

Great explanation! Thanks! That really helped me with a recent problem I had

Anonymous said...

Thanx dude. Our team sat for hours trying to figure out why a SP takes 5min to run while the inner script took 8sec on it's own.
Applied the local vars solution and bam, 8sec.

Anonymous said...

nice One !

Anonymous said...

Hmm.
This seems to give a lot of meaning, but is it something I didnt get here.
I have this SP that runs 15 seconds in SQLserver management studio, and when I run the same thru Reporting services it takes approx 5 minutes.
I've change the sp, so that it uses locally defined variables in the where statements.

Jostein
Is it still a reason why Reporting services should take 100 times longer than the actual run of the SP ? (And yes: I did try with defining the sp with recompile, seems to make no difference.

Anonymous said...

Hmm - is it something I didn't properly get in this ? - I seem to have same cause of problem, but cannot fix it by declaring variables and using them inside my stored procedure.

My SP has approx 10 joined views (select.... ) as xxx inner join ....)
It runs in 5 seconds from sql server management studio, it takes 5 minutes or so when I run the same thing from Reporting services.

I even tried with Optimise for... and recompile.
Any other reason it could be that the same procedure takes forever when running thru reporting services ?
Jostein

Anonymous said...

Many thanks for the clear explanation

Yasir Bamarni said...

Great article
I have a similar problem where a simple stored procedure times out (takes more than 30 seconds when executed from a Delphi client), but running the procedure from management studio takes 1 second.
As soon as I alter the procedure, it starts working fast again.
My question is, why the procedure (with the same parameters) runs fast in management studio? If it was parameter sniffing, wouldnt be affecting every client?

thanks again for the great article.

Anonymous said...

Ok I have seen loads of posts of using option (recompile) and using local variables in a stored procedure. What about if you are not in control of the generation of the query, as when using Dynamics CRM. It's queries are passed into the sp_executesql stored proc with a literal string of the query & parameters. This gives the same parameters sniffing problem. If i capture the SQL query being produced using profiler & then re-run in management studio, i see the use of a old plan & the query taking minutes. If i amend the query string and add a option (recompile) or actually just adding a space in the string - the query plan is re-evaluated and the results are returned within a second. Trouble is the CRM platform is generating the query so I cant intercept. Is there a way to fix this at a server level???

Syed Shah Lateef Ahmed said...

Very good article. Nicely explained.

Anonymous said...

good article. my sproc went from 34 seconds down to 1.5. thanks, james

Anonymous said...

Just to answer some questions about differences in performance from different clients, each connection that is determined to be different (even a slightly different connection string from the same application) keeps its own execution plans. This is why the same stored procedure may run fast in query analyzer while being slow in an application.

Anonymous said...

great information, my sp improved from 79s to 1.5s, you saved me tons of time and painful diagnosing.

Anonymous said...

This article and recommendation/solution saved us a ton of time! Thank you for writing and posting it.

Anonymous said...

Thx a million, my improvement : from 150 s to 2 s.

C-G said...

This solution worked for an SQL 2008 R2 environment.

Amazing information! thanks for the post.

RingoSchplingo said...

Have to say a very clear and well paced explanation of this issue, bravo!

gfongm said...

Please Help....

If I Run this:

SELECT --A.ItemNmbr, A.Dex_Row_Id, A.CreationDate, A.UpdateDate, --B.ItemNmbr, B.Dex_Row_Id, B.CreatdDt, B.ModifDt
count(A.ItemNmbr)
FROM Database1.dbo.Item as A, Database2.dbo.IV00101 B with(nolock)
WHERE A.Dex_Row_Id = B.Dex_Row_Id And A.ItemNmbr <> B.ItemNmbr And
(A.CreationDate = B.CreatdDt Or (A.CreationDate = '2008-01-01 00:00:00.000' And B.CreatdDt = '1900-01-01 00:00:00.000'))

then
time = 0


but is I run this:


IF (SELECT count(A.ItemNmbr)
FROM Database1.dbo.Item as A, Database2.dbo.IV00101 B
WHERE A.Dex_Row_Id = B.Dex_Row_Id And A.ItemNmbr <> B.ItemNmbr And
(A.CreationDate = B.CreatdDt Or (A.CreationDate = '2008-01-01 00:00:00.000' And B.CreatdDt = '1900-01-01 00:00:00.000'))
) > 0

BEGIN
PRINT 'UPDATED'--@strMsg
END

It takes 22 secs...

Could you help me to know why???

Thank you,

Glenn

Ramir said...

Very informative article. This saved me. Thanks a lot.

Dougy Z said...

Amazing! thanks man

shyam said...

Thank you for the post, it's really helping me alot...

Anonymous said...

Man Woow from 7 minutes to 1 second,
its even faster than normal query specially if you considered using temp table to hold the set off all possible values instead of directly filtering data from the source query then filter
the results from the temp table.

Fantastic Information Thanks.

Ian Fry said...

I knew there must be an answer! Thanks.

FYI - coming from an SSRS report issue...
The query worked in SQLServer
Running the query from the query preview window in SSRS was slow - and that is the clue, because it eliminates the considerations that formatting, expression formula etc are causing the slowness.

Unknown said...

nice information,thanx a lot

Anonymous said...

Excellent post,earlier my SP was taking 1+ minutes where now it is taking less than 1 sec. Thanks a ton.....

Anonymous said...

Thanks a lot - worked a treat. 7mins -> 1 sec.

Anonymous said...

Thanks, this solved my problem but the explanation doesn't seem to apply to my case because:

1) Using WITH RECOMPILE option makes no difference.

2) Calling the procedure with the same parameters as when it was compiled makes no difference.

Both these *should* give a similar performance improvment as the local variable solution. My conclusion is there is something goign wrong with the optimiser when it attempts "parameter sniffing" (in some cases).

Anonymous said...

Life saving post.
Stuck in this issue for last two days, found not a single clue. Thanks

Unknown said...

very good explanation of para sniffing

Anonymous said...

very informative article with good examples

Anonymous said...

The never ending SP just ran in 1 minute. You are the best!!!!!!!!!

Unknown said...

This post is STILL appropriate 8 years later and in SQL 2012. I had tinkered with setting ANSI_NULLS ON since part of my update query which ran in milliseconds in Mgt Studio but took 45 seconds inside the stored proc, but quickly eliminated that as having NO effect in either environment. Switching to local variables immediately allowed the proc to perform the update in milliseconds. Thanks SO much for this, and I'm SO glad it is still linked from within this post: http://stackoverflow.com/questions/440944/sql-server-query-fast-but-slow-from-procedure

Anonymous said...

This is still an issue with SQL Server 2012 (and probably will be with 2014) so a nice solution to a time-consuming problem. Thanks for the info!

Hery Senjaya said...

Good example and detail explanation (+ snapshots), thx for ur article

Anonymous said...

Excellent article. Thank you very very much.

Post a Comment