Finding the worst running query in a stored procedure

September 10, 2015 by Kenneth Fisher

The other day I was asked to tune a stored procedure. Not exactly an uncommon task, but I worked something out in the process. I typically want to start by determining what the slowest part of the SP is. Normally I use sys.dm_exec_query_stats and run a query something like this:

SELECT *
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE st.objectid = object_id('Test.dbo.Sneezy');

There is a bit of a problem with this though. sys.dm_exec_sql_text is slow, particularly when running against a large cache. I have one system where this can take 5-10 minutes to run. So this time I decided to use sys.dm_exec_procedure_stats instead. It’s fast and has an object_id column.

SELECT * FROM sys.dm_exec_procedure_stats 
WHERE object_id = object_id('Test.dbo.Sneezy');

It doesn’t, however, break the code down. It’s just the stats for the procedure as a whole. So for my first pass I grabbed the sql_handle (or plan_handle) out and then ran this query:

SELECT * FROM sys.dm_exec_query_stats 
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE sql_handle = 0x0300060016B804056B1D2101FCA4000001000000000000000000000000000000000000000000000000000000

And I’m good to go! Each of the queries in the SP broken down. I still need more information though.

Well to make a long story, well, not quite as long, here is what I ended up with.

SELECT CAST(qp.query_plan AS XML) AS XML_Plan,
	SUBSTRING(st.text,qs.statement_start_offset/2+1,
            ((CASE WHEN qs.statement_end_offset = -1 THEN DATALENGTH(st.text)
                ELSE qs.statement_end_offset END) - qs.statement_start_offset)/2 + 1)  AS SqlText,
	qs.*
FROM sys.dm_exec_query_stats qs
JOIN sys.dm_exec_procedure_stats ps
	ON qs.sql_handle = ps.sql_handle
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, 
		statement_start_offset, statement_end_offset) qp
WHERE PS.object_id = object_id('Test.dbo.Sneezy');

I ended up with the query plan and text for each individual query within the SP and the associated stats. For example min/max/total duration/cpu time, number of executions, etc. All of which really helps you pin down the parts of the SP you really need to work on.

Category: Microsoft SQL Server, SQLServerPedia Syndication, System Functions and Stored Procedures, T-SQL | Tags: blogging, DMV, language sql, microsoft sql server, sql statements, system functions, T-SQL

14 thoughts on “Finding the worst running query in a stored procedure”

BPOTW 2015-09-18 | SQL Notes From The Underground says:

September 21, 2015 at 2:55 AM

[…] A T-SQL Hack for Mahalanobis Distance Work of art – the intra query parallelism deadlock graphs Finding the worst running query in a stored procedure The #TSQL2sday #70 roundup Why is my tSQLt test not […]

Commenter with question says:

September 22, 2015 at 8:07 AM

Thanks for this! I got the following error when I tried one procedure:

Msg 6335, Level 16, State 101, Line 1
XML datatype instance has too many levels of nested nodes. Maximum allowed depth is 128 levels.

I know this probably is itself an indicator that the procedure is in dire need of improvement, but I’m wondering if you can explain why this error happens and what it is telling me about the design problem in the procedure code.

Thanks!

Kenneth Fisher says:

September 22, 2015 at 9:43 AM

Unfortunately I don’t have a good answer for most of your issue. I can tell you that if you remove the cast statement in the first line that will get rid of your error but it also means you won’t be able to easily see the visual query plan. You will at least be able to see the queries. I’ll see if I can get someone else to look at your comment and see if they have any better answers for you.

Reply
- Matan Yungman says:
  
  September 22, 2015 at 10:27 AM
  
  The XML data type supports up to 128 nest levels, and that’s why you get this error.
  It’s one of the very rare cases where working with sys.dm_exec_text_query_plan doesn’t work.
  
  I think there is a way to load the textual form to SQL Sentry Llan Explorer, but not sure..
  
  Reply
- Matan Yungman says:
  
  September 22, 2015 at 10:45 AM
  
  In fact, as weird as it may sound, that’s the reason this DMV was introduced in the first place – to allow viewing execution plans that are bigger than the nesting limit in a textual way.
  You can hear about it in Bob Beauchemin ‘s session at around 21:00 in this link:
  https://channel9.msdn.com/Events/TechEd/Europe/2009/DAT405
  
  Reply
- Alex Friedman says:
  
  September 24, 2015 at 5:51 AM
  
  Matan, nice reference!
  
  Just to add another tidbit: if you’re using Adam Machanic’s excellent sp_whoisactive with the @get_plans parameter, it faces the same issue and actually catches error 6335 to rework the xml to still display it somehow (although without the ability to open the plan directly, you still have to save into a file or paste into Plan Explorer).
  
  Reply

Ray Herring says:

September 23, 2015 at 3:33 PM

Thanks Kenneth.
I have a similar query I have worked on for weeks and had just about given up on getting the query plan for the individual queries. Your concise query showed me exactly where my problem was.

Kenneth Fisher says:

September 23, 2015 at 3:53 PM

Awesome! Glad I could help!

Reply

Alex Friedman says:

September 24, 2015 at 5:44 AM

Very useful, thanks!
In my case I had to do a small adjustment to the join to get the blazing fast performance:

FROM sys.dm_exec_procedure_stats ps
INNER LOOP JOIN sys.dm_exec_query_stats qs

With the order change + the LOOP hint it works pretty much instantly, which really is much better than the long running queries on dm_exec_query_stats I was also using.

Alex Friedman says:

September 24, 2015 at 6:34 AM

And another small fix, joining also by plan_handle to avoid duplicate rows when there are several plans for the same sql_handle:

FROM sys.dm_exec_procedure_stats ps
INNER LOOP JOIN sys.dm_exec_query_stats qs
ON qs.sql_handle = ps.sql_handle
AND qs.plan_handle = ps.plan_handle

Reply

gserdijn says:

December 21, 2015 at 1:35 AM

Great post. Shows why the cached_time in sys.dm_exec_procedure_stats can give the wrong indication of a stable plan. I got a question which puzzles me.

When I executed the query for a stored procedure which timed out today, the query gave back 7 rows. The problematic part, subplan 8 with a unwanted parallel execution, was missing from the resultset.

After adding OPTION (MAXDOP 1) to the stored procedure, the query in your blog post resulted in 8 rows.

Is it possible that something like rebuilding indexers and updating statistics could invalidate the subplan? So that it was missing from the output?

Kenneth Fisher says:

December 21, 2015 at 9:51 AM

There are any number of reasons a plan/subplan might be removed the cache. Updating stats is definitely among them. In fact I’ve read that most of the time when you update stats and get a better plan it isn’t because the stats changed. It’s because the old plan was invalidated and a new one created.

On a side note remember to be careful with MAXDOP = 1. If you are just doing it to remove CXPACKET waits I wouldn’t. Usually CXPACKET gets a bad rap and very well may not be your problem at all. Take a look at sys.dm_os_waiting_tasks when your query is running to see the individual threads. The CXPACKET waits are the threads that are done and waiting on the rest. What you want to look at is those other threads. What are they waiting on?

Reply
- gserdijn says:
  
  December 22, 2015 at 1:05 AM
  
  Thank you for your feedback. It’s always good to learn more. 🙂
  
  Regarding your side note: In this case the parallel plan used clustered index scans on quite a few tables each containing millions of rows, whereas the ideal plan would involve clustered index seeks.
  
  Parallelism is something which can be very useful, but for me it is a huge warning sign that either the query is poorly written and/or that indices are missing. It should trigger thorough investigation, and the query in your post will be used quite a lot I think.
  
  Reply
  - Kenneth Fisher says:
    
    December 22, 2015 at 8:16 AM
    
    Glad I could help! If you have some time Adam Mechanic did some great sessions on parallelism https://www.youtube.com/playlist?list=PLxjd3Ep95CRo4VCWcWq1YIIAv5OcCtpoR

BPOTW 2015-09-18 | SQL Notes From The Underground says:

September 21, 2015 at 2:55 AM

[…] A T-SQL Hack for Mahalanobis Distance Work of art – the intra query parallelism deadlock graphs Finding the worst running query in a stored procedure The #TSQL2sday #70 roundup Why is my tSQLt test not […]

Commenter with question says:

September 22, 2015 at 8:07 AM

Thanks for this! I got the following error when I tried one procedure:

Msg 6335, Level 16, State 101, Line 1
XML datatype instance has too many levels of nested nodes. Maximum allowed depth is 128 levels.

I know this probably is itself an indicator that the procedure is in dire need of improvement, but I’m wondering if you can explain why this error happens and what it is telling me about the design problem in the procedure code.

Thanks!

- Kenneth Fisher says:
  
  September 22, 2015 at 9:43 AM
  
  Unfortunately I don’t have a good answer for most of your issue. I can tell you that if you remove the cast statement in the first line that will get rid of your error but it also means you won’t be able to easily see the visual query plan. You will at least be able to see the queries. I’ll see if I can get someone else to look at your comment and see if they have any better answers for you.
  
  - Matan Yungman says:
    
    September 22, 2015 at 10:27 AM
    
    The XML data type supports up to 128 nest levels, and that’s why you get this error.
    It’s one of the very rare cases where working with sys.dm_exec_text_query_plan doesn’t work.
    
    I think there is a way to load the textual form to SQL Sentry Llan Explorer, but not sure..
    
  - Matan Yungman says:
    
    September 22, 2015 at 10:45 AM
    
    In fact, as weird as it may sound, that’s the reason this DMV was introduced in the first place – to allow viewing execution plans that are bigger than the nesting limit in a textual way.
    You can hear about it in Bob Beauchemin ‘s session at around 21:00 in this link:
    https://channel9.msdn.com/Events/TechEd/Europe/2009/DAT405
    
  - Alex Friedman says:
    
    September 24, 2015 at 5:51 AM
    
    Matan, nice reference!
    
    Just to add another tidbit: if you’re using Adam Machanic’s excellent sp_whoisactive with the @get_plans parameter, it faces the same issue and actually catches error 6335 to rework the xml to still display it somehow (although without the ability to open the plan directly, you still have to save into a file or paste into Plan Explorer).
    
Ray Herring says:

September 23, 2015 at 3:33 PM

Thanks Kenneth.
I have a similar query I have worked on for weeks and had just about given up on getting the query plan for the individual queries. Your concise query showed me exactly where my problem was.

- Kenneth Fisher says:
  
  September 23, 2015 at 3:53 PM
  
  Awesome! Glad I could help!
  
Alex Friedman says:

September 24, 2015 at 5:44 AM

Very useful, thanks!
In my case I had to do a small adjustment to the join to get the blazing fast performance:

FROM sys.dm_exec_procedure_stats ps
INNER LOOP JOIN sys.dm_exec_query_stats qs

With the order change + the LOOP hint it works pretty much instantly, which really is much better than the long running queries on dm_exec_query_stats I was also using.

- Alex Friedman says:
  
  September 24, 2015 at 6:34 AM
  
  And another small fix, joining also by plan_handle to avoid duplicate rows when there are several plans for the same sql_handle:
  
  FROM sys.dm_exec_procedure_stats ps
  INNER LOOP JOIN sys.dm_exec_query_stats qs
  ON qs.sql_handle = ps.sql_handle
  AND qs.plan_handle = ps.plan_handle
  
gserdijn says:

December 21, 2015 at 1:35 AM

Great post. Shows why the cached_time in sys.dm_exec_procedure_stats can give the wrong indication of a stable plan. I got a question which puzzles me.

When I executed the query for a stored procedure which timed out today, the query gave back 7 rows. The problematic part, subplan 8 with a unwanted parallel execution, was missing from the resultset.

After adding OPTION (MAXDOP 1) to the stored procedure, the query in your blog post resulted in 8 rows.

Is it possible that something like rebuilding indexers and updating statistics could invalidate the subplan? So that it was missing from the output?

- Kenneth Fisher says:
  
  December 21, 2015 at 9:51 AM
  
  There are any number of reasons a plan/subplan might be removed the cache. Updating stats is definitely among them. In fact I’ve read that most of the time when you update stats and get a better plan it isn’t because the stats changed. It’s because the old plan was invalidated and a new one created.
  
  On a side note remember to be careful with MAXDOP = 1. If you are just doing it to remove CXPACKET waits I wouldn’t. Usually CXPACKET gets a bad rap and very well may not be your problem at all. Take a look at sys.dm_os_waiting_tasks when your query is running to see the individual threads. The CXPACKET waits are the threads that are done and waiting on the rest. What you want to look at is those other threads. What are they waiting on?
  
  - gserdijn says:
    
    December 22, 2015 at 1:05 AM
    
    Thank you for your feedback. It’s always good to learn more. 🙂
    
    Regarding your side note: In this case the parallel plan used clustered index scans on quite a few tables each containing millions of rows, whereas the ideal plan would involve clustered index seeks.
    
    Parallelism is something which can be very useful, but for me it is a huge warning sign that either the query is poorly written and/or that indices are missing. It should trigger thorough investigation, and the query in your post will be used quite a lot I think.
    
    - Kenneth Fisher says:
      
      December 22, 2015 at 8:16 AM
      
      Glad I could help! If you have some time Adam Mechanic did some great sessions on parallelism https://www.youtube.com/playlist?list=PLxjd3Ep95CRo4VCWcWq1YIIAv5OcCtpoR

SQL Studies

Finding the worst running query in a stored procedure

14 thoughts on “Finding the worst running query in a stored procedure”

Leave a reply to Alex Friedman Cancel reply

Follow me via RSS

Follow me via Email

Follow me on Twitter

Archives

Categories

Meta

SQL Studies

Finding the worst running query in a stored procedure

Share this:

14 thoughts on “Finding the worst running query in a stored procedure”

Leave a reply to Alex Friedman Cancel reply

Follow me via RSS

Follow me via Email

Follow me on Twitter

Archives

Categories

Meta