T-SQL FizzBuzz
17May 17, 2017 by Kenneth Fisher
I was looking around for something to write about this evening and came across one of Russ Thomas’ (b/t) old monthly challenges (Feb 2016).
The challenge is two-fold. A FizzBuzz problem is a common programming interview challenge that asks a coder to print the numbers from 1 to 100. Every time a number is divisible by 3, print fizz instead, when it’s divisible by 5, print buzz, and when it’s divisible by both, print fizzbuzz. The real challenge, however, is to do it in as few lines of code as possible and in our case… TSQL… it should also be set-based – (temp tables are ok).
Now normally FizzBuzz is done with a loop, but as Russ said, we are using T-SQL so batch code is always the goal. That said, what table should I query to get the numbers 1-100? Well, I decided I’d just do something simple here and use a system view that’s more than 100 rows, the ROW_NUMBER function, and restrict it using TOP.
SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY object_id) Id FROM sys.all_columns
A numbers table would be easier (and probably faster) but I decided that since it would need to be created (not already part of SQL) it would go against the rules or at least the time spent creating it would count against my performance.
Next I needed to determine what should be printed out. Well modulo and a case statement is prefect for that. A simple case statement checking for Id % 3 = 0, Id % 5 = 0 and Id % 15 = 0 and return fizz, buzz or fizzbuzz respectively. Then an ELSE clause at the end to return the number itself.
CASE WHEN Id % 15 = 0 THEN 'fizzbuzz' WHEN Id % 5 = 0 THEN 'buzz' WHEN Id % 3 = 0 THEN 'fizz' ELSE CAST(Id AS VARCHAR(5)) END
I’m close but now I need to figure out how to display the output. I decided that just running the query didn’t really count. That would be too easy right? The next option is to loop through the result set printing each row out, which again goes against the whole batch thing. Which leads me to a simple piece of code that creates a delimited list. In this case, I’m delimiting by a carriage return so each row goes on its own line.
Now I had a query that created a string that I just needed to print out. I couldn’t just run the thing (same reason as above) so I tried doing a PRINT (query) but that doesn’t work (no subqueries on a print, sorry). So next I decided I needed to put the results of the query (a single string anyway) into a variable, then print the variable. Rather than declaring a variable and setting it in a separate line I combined both steps into a single command, then one more command to print the variable out.
DECLARE @FizzBuzz varchar(max) = ( STUFF( (SELECT ' ' + CASE WHEN Id % 15 = 0 THEN 'fizzbuzz' WHEN Id % 5 = 0 THEN 'buzz' WHEN Id % 3 = 0 THEN 'fizz' ELSE CAST(Id AS VARCHAR(5)) END FROM (SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY object_id) Id FROM sys.all_columns) FizzBuzz ORDER BY Id FOR XML PATH, TYPE ).value(N'.[1]',N'nvarchar(max)'),1,1,N'') ); PRINT @FizzBuzz;
How about performance you might ask? Running this against 100 rows it was near instant (<5ms). So I tried again against a few more rows. At 9500 rows it was still under 5 ms. One of the downsides of this method is needing a table to run against with sufficient rows to cover the number of lines required and 9500 was about the limit to this table. So I joined sys.all_columns against itself to go higher. When I got up to 100k rows I finally got a reasonable amount of time. ~150ms.
The other down side is that I can’t print the whole thing at once. The PRINT command can only print 8000 characters at a time. This means that if I’m going to work with more than ~2000 rows (rough average of 4 characters a row) at a time I’m going to have to use multiple print statements with substrings.
Do you have a t-sql fizzbuzz solution? Is it better than mine? I’d love to see it.
Hey,
you switched fizz and buzz around 🙂 fizz should be divisible 3 and buzz by 5
Anyway, my attempt, you wanted it printed? hit CTRL + T first 🙂
You know somehow I knew I was going to get fizz and buzz backwards 🙂 Fixed. Thanks for pointing it out.
Great solution. Even at 2000 rows (close to the max for spt_values) I’m getting 0-10ms.
I saw this for the first time in a demonstration about six months ago. The demonstrator did not show a set based approach. We discussed the possibility of a set approach. I came up with the query below. The range maximum is not limited by the number of records in the source tables used to generate the number series.
The catch is if your max number is greater than 100, then you must increase the maximum recursion.
Nice. I like the recursion method. It’s a bit slower but still very nice.
You should never use recursion for simple counting. Even for just 10 rows, it’s horribly expensive. Please see the following article for much more detail on why it should never be using for counting…
http://www.sqlservercentral.com/articles/T-SQL/74118/
I totally agree, but in a thought process exercise, it is a valid solution. If a person I was interviewing provided this solution, I would expect him/her to understand the pros and cons of this solution. Recursion is a cool concept that I would very strongly hesitate before ever considering executing in a production environment due to its poor performance.
I do like the Itzik-Style Cross-Join. I come up with the query below using this concept.
I have to agree to disagree. The only time you should use a recursive counting query is to show how bad they are for performance and resource usage. You didn’t do that. You posted it as an alternative set-based solution (it’s not actually set-based) and it’s not a valid solution because of the problems associated with it. My concern is that someone, especially a neophyte to T-SQL, will read such posts and think that it’s a valid solution especially since it DOES look cool without understanding that even a well written WHILE loop will beat it for performance.
As for interviews, if I were interviewing someone for a Junior DB related or front-end Developer position, then, yes, I’d consider it as having some knowledge of rCTEs and I’d explain the reasons why it’s a bad thing to use them where RBAR counting is concerned but, if it’s for a Senior database related position, I wouldn’t ask them about the pros and cons about it because if they actually knew such pros and cons, then they wouldn’t have written it as a solution in the first place. 😉
Shifting gears a bit, I absolutely agree that Itzek’s cascading CTE method is one of the better ways to pull this off, especially since it produces no logical reads. I also find that the FizzBuzz problem can reveal a whole lot more about the applicant’s though processes than expected if the interviewer is on the ball.
For example, if a Junior or front-end Developer interviewee wrote that solution, that would definitely cause high points in their favor. If an applicant for a Senior DB related position wrote the solution that way, then I’d have to ask them about why they didn’t consider bullet-proofing their code for future scalability whether it’s for up-scaling or down-scaling or making the code a bit more flexible for starting and ending value.
A lot of folks will say that such requirements weren’t defined in the original problem definition and I agree with that statement but one of the attributes that I’m looking for in applicants for Senior positions is the realization that such considerations are what actually separate Juniors from Seniors and that’s a major part of what the interview is all about.
I first saw this from Brent Ozar. Back in ’09.
https://www.brentozar.com/archive/2009/01/top-10-interview-questions-to-ask-senior-dbas/
You know I vaugly remember reading that myself. Thanks for the reminder 🙂
Hi Robert Eder,
I have use recursions before but after saw your query and to be honest this is my first time I saw this kind of query which is very interesting for me.
I am facing exactly same problem. Ex-workers had implemented CTE on table with more than 300k rows and now end users are complaining the system is very slow as obviously takes more time. It would be great if you are able to suggest what is the best alternative solution to look into fix this problem.
Awaiting for your reply.
Thanks
CTEs in and of themselves are fine (depending on when you use them) it’s the recursion that will kill you. We’d probably need more information on the query though before making any helpful suggestions. You could throw it up on sqlfiddle, github or any number of other places and we could take a look at it for you.
My recommendation would be to post it on SQLServerCentral.com.
I’d written my solution prior to seeing the post by denisgobo, and taking into account the comments by Jeff Moden regarding scalability (in either direction), here’s a solution that doesn’t use loops or recursive CTEs, and is good for the number of rows in sys.messages:
(Any similarity to code living or dead is unintentional)
I’d enhance your inner SELECT (this one with the FOR XML) with the PRINT command and just EXECUTE (@FizzBuzz).
Or – if you want to prevent to execute several thousand PRINTs, use a
CASE WHEN (Id - 1) % 1900 THEN '''' + CHAR(10) + CHAR(13) + 'PRINT ''' ELSE CHAR(10) + CHAR(13) END
. This would add the PRINT only to the lines 1, 1901, 3801 …[…] a recent post by Kenneth Fisher ( blog | twitter ) out about T-SQL FizzBuzz, I’m going to create two tables, both of which will have incrementing column names i.e. […]
[…] end of the month it would be fun to do another T-SQL Puzzle. In this case a slight upgrade of the FizzBuzz puzzle using T-SQL. So here are the […]
I can be achieves via recursive CTE. I know recursions at not great for performance ..
with CTE(Rownumber, FizzOrBuzz) as
(
select 0, ‘FizzBuzz’
union all
select
c.Rownumber +1,
case
when (c.Rownumber +1) % 15 = 0 then ‘FizzBuzz’
when (c.Rownumber +1) % 3 = 0 then ‘Fizz’
when (c.Rownumber +1) % 5 = 0 then ‘Buzz’
else cast(c.Rownumber + 1 as varchar(8))
end
from
CTE C
where
c.Rownumber < 100
)
select
c.*
from
CTE C
option (MaxRecursion 0)