August 5, 2015 by Kenneth Fisher
As you may have noticed if you read my blog I learn a lot from trolling various forums. In this particular case I was looking at dba.stackexchange.com and came across this great code pattern. Now the answer comes from Andriy M but in the comments they give credit to Paul White (b/t). So what is this amazing pattern?
Well have you ever needed to match two values? Specifically where you need to match the values even if they are NULLs? Something like this.
CREATE TABLE Lookup1 (Code char(2), Description varchar(500)); INSERT INTO Lookup1 VALUES (NULL, 'Unknown'), ('AA','Division1'), ('BB','Division2'), ('CC','Division3'); GO CREATE TABLE SourceTable ( Id_SourceTable int NOT NULL IDENTITY(1,1) CONSTRAINT pk_SourceTable PRIMARY KEY, Code char(2), MoreColumns varchar(50), MoreColumns2 varchar(50) ); GO CREATE INDEX ix_SourceTable_Code ON SourceTable(Code); GO INSERT INTO SourceTable VALUES (NULL,'One','One'), ('AA','Two','One'), ('BB',NULL,'One'), ('CC','Four','One'), (NULL,'One','Two'), ('AA','Two',NULL), ('BB','Three','Two'), ('CC','Four','Two'), (NULL,'One','Three'), ('AA','Two','Three'), ('BB','Three',NULL), ('CC','Four','Three'), (NULL,'One','Four'), ('AA',NULL,NULL), ('BB','Three','Four'), ('CC','Four','Four'); GO SELECT SourceTable.*, Lookup1.Description FROM SourceTable JOIN Lookup1 ON SourceTable.Code = Lookup1.Code OR (SourceTable.Code IS NULL AND Lookup1.Code IS NULL) GO
This isn’t the best data model in the world since it’s using a NULL as an actual value but it will do as an example.
Some people try code like this:
SELECT SourceTable.*, Lookup1.Description FROM SourceTable JOIN Lookup1 ON ISNULL(SourceTable.Code,'NULL') = ISNULL(Lookup1.Code,'NULL') GO
But I wouldn’t. SQL won’t use any indexes and that’s going to slow everything down.
The new interesting version looks like this:
SELECT SourceTable.*, Lookup1.Description FROM SourceTable JOIN Lookup1 ON EXISTS (SELECT SourceTable.Code INTERSECT SELECT Lookup1.Code) GO
In terms of speed/query plan it’s basically the same as the first query. It just looks better (IMHO). It get’s even better when you start dealing with multiple values.
DECLARE @Code char(2) = 'AA'; DECLARE @MoreColumns varchar(50) = NULL; SELECT * FROM SourceTable WHERE EXISTS (SELECT Code, MoreColumns INTERSECT SELECT @Code, @MoreColumns); -- VS SELECT * FROM SourceTable WHERE (Code = @Code OR (Code IS NULL AND @Code IS NULL)) AND (MoreColumns = @MoreColumns OR (MoreColumns IS NULL AND @MoreColumns IS NULL)); GO
Still the same query plan but much easier to read and maintain. Pretty cool hu? I’ll be curious to see if this improves/worsens performance under more complex circumstances. It’s probably not a huge game changer but still rather interesting. And still matches the current best pattern so it’s worth keeping in your toolbox.
Note: You can use EXCEPT to to do not equals.