A better way to find missing indexes

13

November 11, 2013 by Kenneth Fisher

I recently attended SQL Saturday #255 and specifically the performance tuning pre-con put on by Grant Fritchey. It was a great seminar and well worth the time (a very long day including travel) and money (not much). One of the many interesting discussions was on the missing index DMVs. It seems that these DMVs, while very helpful, are not cleared out on a regular basis. In fact no one in the room knew when/how/or even if these DMVs are cleared. This has at least two effects. First that when you create an index the entry(s) in the DMVs don’t clear and second entry(s) in the DMVs could be from queries that ran months ago and are no longer needed.

So what’s the solution? Well one is to pull the missing indexes directly out of the query plan. Now as it happens Grant has a great query to do just that. Unfortunately it wasn’t exactly what I needed. I needed a query that would list query and index side by side along with various helpful statistics.

So I started playing and not only got the query I wanted but found out something rather interesting. It turns out that when you view a query plan it shows only the first missing index in the plan. So if you look at the query plan for a batch with multiple queries and they have multiple “suggested” indexes you will only see the first one. I may have to go see if there is a connect entry for that one.

In case anyone finds it helpful here is the query I worked up. I’m afraid I’m not all that great with XML and xquery so if someone has a better/faster way of handling this I would love to hear it.

WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan'
AS sp)

SELECT DB_NAME(CAST(pa.value AS INT)) QueryDatabase
	,s.sql_handle
	,OBJECT_SCHEMA_NAME(st.objectid, CAST(pa.value AS INT)) AS ObjectSchemaName
	,OBJECT_NAME(st.objectid, CAST(pa.value AS INT)) AS ObjectName
	,SUBSTRING(st.text,s.statement_start_offset/2+1,
			((CASE WHEN s.statement_end_offset = -1 THEN DATALENGTH(st.text)
				ELSE s.statement_end_offset END) - s.statement_start_offset)/2 + 1)  AS SqlText
	,s.total_elapsed_time
	,s.last_execution_time
	,s.execution_count
	,s.total_logical_writes
	,s.total_logical_reads
	,s.min_elapsed_time
	,s.max_elapsed_time
	-- query_hash is useful for grouping similar queries with different parameters
	--,s.query_hash
	--,cast (p.query_plan as varchar(max)) query_plan
	,p.query_plan
	,mi.MissingIndex.value(N'(./@Database)[1]', 'NVARCHAR(256)') AS TableDatabase
	,mi.MissingIndex.value(N'(./@Table)[1]', 'NVARCHAR(256)') AS TableName
	,mi.MissingIndex.value(N'(./@Schema)[1]', 'NVARCHAR(256)') AS TableSchema
	,mi.MissingIndex.value(N'(./@Impact)[1]', 'DECIMAL(6,4)') AS ProjectedImpact
	,ic.IndexColumns
	,inc.IncludedColumns
FROM (	-- Uncomment the TOP & ORDER BY clauses to restrict the data and
		-- reduce the query run time.
		SELECT --TOP 200
		s.sql_handle
		,s.plan_handle
		,s.total_elapsed_time
		,s.last_execution_time
		,s.execution_count
		,s.total_logical_writes
		,s.total_logical_reads
		,s.min_elapsed_time
		,s.max_elapsed_time
		,s.statement_start_offset
		,s.statement_end_offset
		--,s.query_hash
	FROM sys.dm_exec_query_stats s
	-- ORDER BY s.total_elapsed_time DESC
	) AS s
CROSS APPLY sys.dm_exec_text_query_plan(s.plan_handle,statement_start_offset,statement_end_offset) AS pp
CROSS APPLY (SELECT CAST(pp.query_plan AS XML) AS query_plan ) AS p
CROSS APPLY p.query_plan.nodes('/sp:ShowPlanXML/sp:BatchSequence/sp:Batch/sp:Statements/sp:StmtSimple/sp:QueryPlan/sp:MissingIndexes/sp:MissingIndexGroup/sp:MissingIndex')
				AS mi (MissingIndex)
CROSS APPLY (SELECT STUFF((SELECT ', ' + ColumnGroupColumn.value('./@Name', 'NVARCHAR(256)')
			FROM mi.MissingIndex.nodes('./sp:ColumnGroup')
				AS t1 (ColumnGroup)
			CROSS APPLY t1.ColumnGroup.nodes('./sp:Column') AS t2 (ColumnGroupColumn)
			WHERE t1.ColumnGroup.value('./@Usage', 'NVARCHAR(256)') <> 'INCLUDE'
			FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'), 1, 2, '') AS IndexColumns ) AS ic
CROSS APPLY (SELECT STUFF((SELECT ', ' + ColumnGroupColumn.value('./@Name', 'NVARCHAR(256)')
			FROM mi.MissingIndex.nodes('./sp:ColumnGroup')
				AS t1 (ColumnGroup)
			CROSS APPLY t1.ColumnGroup.nodes('./sp:Column') AS t2 (ColumnGroupColumn)
			WHERE t1.ColumnGroup.value('./@Usage', 'NVARCHAR(256)') = 'INCLUDE'
			FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)'), 1, 2, '') AS IncludedColumns ) AS inc
CROSS APPLY sys.dm_exec_plan_attributes(s.plan_handle) pa
CROSS APPLY sys.dm_exec_sql_text (s.sql_handle) st
WHERE pp.query_plan LIKE '%MissingIndexes%'
  AND pa.attribute = 'dbid'

13 thoughts on “A better way to find missing indexes

  1. lonnyniederstadt says:

    Regarding the missing index DMVs, from http://technet.microsoft.com/en-us/library/ms345407.aspx:
    ‘Missing index information is kept only until SQL Server is restarted. Database administrators should periodically make backup copies of the missing index information if they want to keep it after server recycling.’

  2. lonnyniederstadt says:

    Creating a point-in-time copy of sys.dm_db_missing_index_group_stats information (each group is actually a single index) can also be helpful – eg before and after a huge workload of many queries where the goal is to optimize the whole workload.

    • It would allow you to find any additional indexes suggested by your workload, but that also assumes you either a) recycle your system, or b) haven’t run the workload before. Otherwise you would end up with the same (or similar data). At least that is the way I would read it.

      • lonnyniederstadt says:

        The secret is to PIT sys.dm_db_missing_index_group_stats so you get stuff like user_seeks, user_scans, last_user_seek, last_user_scan that can be diff’d if a given index was in the data previously.

  3. Greg Moore says:

    Very Useful. Thanks. Confirms a few indexes I want to add to our database.

  4. lonnyniederstadt says:

    Aaron Bertrand mentioned on twitter that this is the duplicated ‘missing index’ info bug. http://connect.microsoft.com/SQLServer/feedback/details/518467/

  5. Greg Moore says:

    BTW, if I can add, I added at the top
    declare @newline char(2);
    select @newline=char(13)+char(10);
    declare @tab char(1);
    select @tab=char(9);

    then wrapped the SQLText and varchar(max) version of the query_plan with some replace statements.

    This then let me cut and paste the results cleanly into Excel so I could review the results and save them and share them.

    Already used this to convince one team to add an index and will be working on adding a second index to a slow overnight process.

    Wonderful tool. But I’ll remind newcomers who come across it of the caveat that this is still no replacement for understanding the impact of adding or removing indexes. Just because one thing says an index would be “good” doesn’t mean it’s true. Test everything and understand the impact!

    • Glad you like it :). And absolutely, anyone who is looking at the suggested indexes needs to review not just the suggested indexes, but the indexes that currently exist on the table, impact to the whole system and even “Do I really need to improve that query”. Please please please don’t just add indexes because “The plan suggested them” or something else like that. These are a starting point, a suggestion, nothing more.

  6. Yakov Shlafman says:

    Thanks for the script.
    I think there is a syntax error, an extra dot, in this snippet
    ../@Impact)[1]
    with extra dot, I got an error:
    Msg 8115, Level 16, State 8, Line 8
    Arithmetic overflow error converting nvarchar to data type numeric.
    Completion time: 2023-03-18T13:43:53.8090386-04:00
    without a dot, it works fine
    Thanks

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 6,756 other subscribers

Follow me on Twitter

Archives

ToadWorld Pro of the Month November 2013