October 31, 2016 by Kenneth Fisher
DBA: Help me! Help me! It’s Halloween! I have a problem! Oh won’t you please help me!
Ok, so that’s really got nothing to do with the Halloween problem. But it is Halloween today so it seemed like a good time to bring the subject up. Of course if we are going to talk about the Halloween problem it would probably be nice to know why it’s called that, and what the heck, it would be nice to know what it actually is.
Why is it called The Halloween Problem?
Simply enough it was discovered on Halloween. A bit anti-climatic maybe, but true.
What is The Halloween Problem?
This is a bit more complicated. Let’s say you are trying to give a 10% raise to everyone who makes less than $25k.
UPDATE TableName SET YearlySalary = YearlySalary * 1.1 -- 10% raise WHERE YearlySalary < 25000;
Couple of quick notes here. This is a common example because this in fact the problem that exposed the issue. Also, while UPDATEs are probably the easiest way to explain what’s going on, it can affect any type of write.
So back to our update statement. There are several ways this could be implemented. I’m going to use pseudo T-SQL to demonstrate a couple and explain each.
-- Collect the rows to be updated SELECT KeyVal INTO #Temp FROM TableName WHERE YearlySalary < 25000; -- Perform the update on the rows selected UPDATE TableName SET YearlySalary = YearlySalary * 1.1 -- 10% raise JOIN #Temp ON TableName.KeyVal = #Temp.KeyVal
So this obviously has the down side of having to hit each row to be updated twice. That might be the best performing plan, but it might not. The other option is to hit each row one at a time.
DECLARE @KeyVal SELECT TOP 1 @KeyVal = KeyVal FROM TableName WHERE YearlySalary < 25000; WHILE @[clustered index columns or row identifier] IS NOT NULL BEGIN UPDATE TableName SET YearlySalary = YearlySalary * 1.1 -- 10% raise WHERE KeyVal = @KeyVal SELECT TOP 1 @KeyVal = KeyVal FROM TableName WHERE YearlySalary < 25000; END
In this particular case we only have to hit each row once, but the more suspicious minded of you might have noticed a slight issue. The while loop will only end when everyone has a salary higher than $25k. If someone started at $10k they will get raise after raise until they are actually at $25,937.42. And while that’s great for them, it’s probably not the actual intent.
Now I do want to point out that this probably not how anyone would actually perform the update. This is just my interpretation via pseudo-code.
So how do we fix this? Basically by switching form a WHILE loop to a CURSOR. Ie Pull the data, making a list of each of the rows to be updated, and updating each one. I know this sounds pretty similar to the first method but it isn’t. Quite. This would be more like using a cursor if I understand it correctly. The query is still only run through once, each row being updated during that single run through.
DECLARE GiveRaise CURSOR FOR SELECT KeyVal FROM TableName WHERE Salary < 25000; DECLARE @KeyVal int; OPEN GiveRaise; FETCH NEXT FROM GiveRaise INTO @KeyVal; WHILE (@@fetch_status <> -1) BEGIN IF (@@fetch_status <> -2) BEGIN UPDATE TableName SET Salary = Salary * 1.1 WHERE TableName.KeyVal = @KeyVal; END; FETCH NEXT FROM GiveRaise INTO @KeyVal; END; CLOSE GiveRaise; DEALLOCATE GiveRaise; GO
Of course all of this is handled behind the scenes and we don’t have to worry about it. Unless of course you happen to be writing your own database system.