When were the statistics last updated and how many rows were sampled?

1

September 4, 2019 by Kenneth Fisher

TL;DR; Final query is at the bottom.

Every now and again (particularly when someone is having performance problems) I’ll get the question “When were the statistics last updated?” On top of that, I recently found out that our UPDATE STATISTICS job uses the RESAMPLE option that tells the command to update the statistics based on it’s most recent sample rate. Which of course led me to wonder What was the previous sample rate?

In the end, I did a little research and found this post on the PERSIST_SAMPLE_PERCENT option with a great query that is almost exactly what I needed. So I’m going to use it as a starting point. I recommend reading through the post as it’s quite good and has some great information about the option PERSIST_SAMPLE_PERCENT. Here’s the original query:

SELECT ss.stats_id, ss.name, filter_definition, last_updated, rows, 
    rows_sampled, steps, unfiltered_rows, modification_counter, persisted_sample_percent,
    (rows_sampled * 100)/rows AS sample_percent
FROM sys.stats ss
INNER JOIN sys.stats_columns sc 
    ON ss.stats_id = sc.stats_id AND ss.object_id = sc.object_id
INNER JOIN sys.all_columns ac 
    ON ac.column_id = sc.column_id AND ac.object_id = sc.object_id
CROSS APPLY sys.dm_db_stats_properties(ss.object_id, ss.stats_id) shr
WHERE ss.[object_id] = OBJECT_ID('[Sales].[SalesOrderHeaderBulk]') 
    AND ac.name = 'OrderDate';

The first change I want to make is to get rid of the existing WHERE clause. I want to run this on all statistics, not just one. This also means I can get rid of the reference to sys.all_columns and sys.stats_columns. Next, I’m going to make all of the column references two-part names to be sure I’m not missing a reference to one of the tables I removed and because I think it’s a good best practice. Then, since I’m not pulling for a single statistic anymore I’d best pull the table name into the query (by referencing sys.objects), and since I don’t care about system objects I’m going to restrict this to user tables only (using is_ms_shipped). Last but not least I’m going to clean up the column order a little bit (given that my point here is when was it last updated and how many rows have been/will be sampled) and put an order by.

SELECT o.name AS object_name, ss.stats_id, ss.name as stat_name, 
	ss.filter_definition, shr.last_updated, 
	shr.persisted_sample_percent, 
	(shr.rows_sampled * 100)/shr.rows AS sample_percent,
	shr.rows, shr.rows_sampled, 
    shr.steps, shr.unfiltered_rows, shr.modification_counter 
FROM sys.stats ss
INNER JOIN sys.objects o 
    ON o.object_id = ss.object_id
CROSS APPLY sys.dm_db_stats_properties(ss.object_id, ss.stats_id) shr
WHERE o.is_ms_shipped = 0
ORDER BY o.name, ss.stats_id;

You can get the definitions for the columns from last_updated on from the BOL entry for sys.dm_db_stats_properties and I’ve added this query to my github repository.

One thought on “When were the statistics last updated and how many rows were sampled?

  1. […] When were the statistics last updated and how many rows were sampled? […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 3,755 other subscribers

Follow me on Twitter

ToadWorld Pro of the Month November 2013
%d bloggers like this: