Partitioning on a nullable column

3

May 3, 2017 by Kenneth Fisher


tl;dr; If you are SWITCHing data into a table and the partitioning column is nullable you will need to add AND ColName IS NOT NULL to the constraint of the table that holds the data you are SWITCHing in.


Partitioning is a really handy, if rather complex tool. It can be used for a lot of odd tasks but one of its primary uses is to move data in and out of a table quickly. We do that using a SWITCH command.

I’ve recently been working on a project doing just that. It’s a monthly archive of 200+ tables. I’m looping through all of the eligible tables and dynamically setting up a table to swap the archive data into. Then I move those tables holding the archive data into an archive database and swap the data back into the archive tables. Unfortunately, I don’t control the creation of the tables so I’m having to cover a rather large range of possibilities. Including the possibility that the partitioning column will allow NULLs. Now I’m not one of those people who thinks NULLs are completely evil (Thomas LaRock (b/t) for example) but they sure do complicate things. And with something as complicated as partitioning adding more is a royal pain.

So to start with how does partitioning handle a NULL? If you look in the BOL for the CREATE PARTITION FUNCTION you’ll see the following:

Any rows whose partitioning column has null values are placed in the left-most partition unless NULL is specified as a boundary value and RIGHT is indicated. In this case, the left-most partition is an empty partition, and NULL values are placed in the following partition.

So basically NULLs are going to end up in the left most partition(#1) unless you specifically make a partition for NULL and are using a RIGHT partition. So let’s start with a quick example of where NULL values are going to end up in a partitioned table (a simple version).

First we set up our test environment

-- Set up the partition function and scheme
CREATE PARTITION FUNCTION PartWithNullsPF (int)
AS RANGE RIGHT FOR VALUES (1,2,3);
GO
CREATE PARTITION SCHEME PartWithNullsPS
AS PARTITION PartWithNullsPF
ALL TO ( [PRIMARY] );
GO
-- Create the table 
CREATE TABLE PartWithNullsTab (Col1 INT NULL, Col2 varchar(10) NULL)
ON PartWithNullsPS (Col1);
GO
-- Insert some test values across the full range of 
-- possibilities. All of the named values and just 
-- outside of the upper and lower boundries and 
-- a null value of course.
INSERT INTO PartWithNullsTab 
		VALUES (0,'abc'), (1, 'def'),
				(2, 'nop'), (3, 'qrs'),
				(4, 'tuv'), (NULL, 'wxy');
GO

Now take a look at the results

SELECT $Partition.PartWithNullsPF(Col1) AS Partition, *
FROM PartWithNullsTab
ORDER BY Partition, Col1;
GO

You can see that the NULL value ended up in partition one as expected. Like I said, this was a pretty simple example and I’m not displaying the exception but you can consider trying that out your homework.

Now for the SWITCH itself. Here is an example of the code I started with to switch the data back into the archive database.

-- Remove existing data
TRUNCATE TABLE PartWithNullsTab;
GO
-- Create swap table
CREATE TABLE PartWithNullsTab_Part (Col1 INT NULL, Col2 varchar(10) NULL,
	CONSTRAINT ck_PartWithNullsTab_Part CHECK (Col1>=3) );
GO
-- Load swap table with data
INSERT INTO PartWithNullsTab_Part VALUES (3,'ABC'), (4,'DEF');
GO
-- Swap in data
ALTER TABLE PartWithNullsTab_Part SWITCH
	TO PartWithNullsTab PARTITION 4;
GO

Msg 4972, Level 16, State 1, Line 48
ALTER TABLE SWITCH statement failed. Check constraints or partition function of source table ‘Test.dbo.PartWithNullsTab_Part’ allows values that are not allowed by check constraints or partition function on target table ‘Test.dbo.PartWithNullsTab’.

This worked great when the column wasn’t nullable but now I get an error. I spent hours and hours trying to figure this one out. Finally I found this answer on StackExchange by the great Paul White (b/t). In it he reminded me how NULL works. My constraint checks if Col1>=3 and only restricts a value if it returns a False. But if Col1 is NULL then Col1>=3 evaluates to NULL and NULL is not False allowing the value into the table. So basically our swap table could easily have a NULL in it but that isn’t allowed in that partition. End result, all we need to do is:

-- Remove old swap table
DROP TABLE PartWithNullsTab_Part;
GO
-- Create swap table
CREATE TABLE PartWithNullsTab_Part (Col1 INT NULL, Col2 varchar(10) NULL,
	CONSTRAINT ck_PartWithNullsTab_Part CHECK (Col1>=3 AND Col1 IS NOT NULL) );
GO
-- Load swap table with data
INSERT INTO PartWithNullsTab_Part VALUES (3,'ABC'), (4,'DEF');
GO
-- Swap in data
ALTER TABLE PartWithNullsTab_Part SWITCH
	TO PartWithNullsTab PARTITION 4;
GO

See the IS NOT NULL? That makes sure that there are no NULLs in the table being swapped in. The possible values for both sides are the same so the SWITCH worked!

So how about swapping the data in in the first place? When I first started writing this post that was what I started with.

-- Reload the data 
INSERT INTO PartWithNullsTab 
		VALUES (0,'abc'), (1, 'def'),
				(2, 'nop'), (3, 'qrs'),
				(4, 'tuv'), (NULL, 'wxy');
GO
-- Drop and re-create the swap table without the IS NOT NULL condition
-- Remove old swap table
DROP TABLE PartWithNullsTab_Part;
GO
CREATE TABLE PartWithNullsTab_Part (Col1 INT NULL, Col2 varchar(10) NULL,
	CONSTRAINT ck_PartWithNullsTab_Part CHECK (Col1>=3) );
GO
-- Swap out the data
ALTER TABLE PartWithNullsTab SWITCH PARTITION 4
	TO PartWithNullsTab_Part;
GO

And this time it worked fine! I have to tell you this was even more frustrating than when it didn’t work in the first place! And one more time I spent hours trying to figure out what was wrong and ended up reaching out for help. This time from Hugo Kornelis (t), although to be fair I was grumbling a bit about NULLs and he offered to help me out (don’t you love this community?). Anyway, he ended up pointing out that when you swap data into a table the point of the constraint is to make sure that the data moving in is valid. So in the case of swapping the data out it looked like this:

PartWithNullsTab(3,4) -> ParWithNullsTab_Part(3,4,NULL)

So obviously that worked. All values of (3,4) will go into (3,4,NULL).

But when I was trying to put the data back in:

ParWithNullsTab_Part(3,4,NULL) -> PartWithNullsTab(3,4)

And it makes complete sense that this one doesn’t work. The “value” NULL obviously won’t go in that other dataset.

So there you go. The possible values of the partition to be swapped have to be a subset (or the full set) of the possible values in the partition being swapped into. Clear as mud? 🙂

3 thoughts on “Partitioning on a nullable column

  1. This sounds like a headache

  2. […] my last post on partitioning I used the $Partition command in passing. I’ve been thinking it deserves a bit more […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 1,664 other followers

Follow me on Twitter

ToadWorld Pro of the Month November 2013
%d bloggers like this: