Yes, Virginia, you can have too much white space.

22

July 16, 2014 by Kenneth Fisher

When I was in college and taking programming classes (back in the days when “a T-Rex ate my 5.25 inch floppy” was a reasonable excuse) we were taught that since the compiler didn’t really see white space we should use it to format our code. And that is absolutely correct. White space is a wonderful way to format your code. However it can, and frequently is, taken to extremes. I’ve heard “There is no such thing as too much white space” and I disagree. Strongly.

I’m going to show you an example using one of the queries that will output from my sp_dbpermissions stored procedure. It’s medium-long so it should make a good example.

SELECT 'master' AS DBName, DBPrincipals.name AS DBPrincipal, SrvPrincipals.name AS SrvPrincipal, DBPrincipals.sid, 
	DBPrincipals.type, DBPrincipals.type_desc, DBPrincipals.default_schema_name, DBPrincipals.create_date, 
	DBPrincipals.modify_date, DBPrincipals.is_fixed_role, Authorizations.name AS Role_Authorization, 
	CASE WHEN DBPrincipals.is_fixed_role = 0 THEN 
			'DROP ' + CASE DBPrincipals.[type] WHEN 'C' THEN NULL 
		WHEN 'K' THEN NULL 
		WHEN 'R' THEN 'ROLE' 
		WHEN 'A' THEN 'APPLICATION ROLE' 
		ELSE 'USER' END + 
	' '+QUOTENAME(DBPrincipals.name COLLATE SQL_Latin1_General_CP1_CI_AS) + ';' ELSE NULL END AS Drop_Script, 
	CASE WHEN DBPrincipals.is_fixed_role = 0 THEN 
			'CREATE ' + CASE DBPrincipals.[type] WHEN 'C' THEN NULL 
		WHEN 'K' THEN NULL 
		WHEN 'R' THEN 'ROLE' 
		WHEN 'A' THEN 'APPLICATION ROLE' 
		ELSE 'USER' END + 
	' '+QUOTENAME(DBPrincipals.name COLLATE SQL_Latin1_General_CP1_CI_AS) END + 
	CASE WHEN DBPrincipals.[type] = 'R' THEN 
			ISNULL(' AUTHORIZATION '+QUOTENAME(Authorizations.name COLLATE SQL_Latin1_General_CP1_CI_AS),'') 
		WHEN DBPrincipals.[type] = 'A' THEN ''  
		WHEN DBPrincipals.[type] NOT IN ('C','K') THEN 
			ISNULL(' FOR LOGIN ' + 
				QUOTENAME(SrvPrincipals.name COLLATE SQL_Latin1_General_CP1_CI_AS),' WITHOUT LOGIN') +  
			ISNULL(' WITH DEFAULT_SCHEMA =  '+
				QUOTENAME(DBPrincipals.default_schema_name COLLATE SQL_Latin1_General_CP1_CI_AS),'') 
		ELSE '' 
	END + ';' +  
	CASE WHEN DBPrincipals.[type] NOT IN ('C','K','R','A') 
			AND SrvPrincipals.name IS NULL 
			AND DBPrincipals.sid IS NOT NULL 
			AND DBPrincipals.sid NOT IN (0x00, 0x01)  
		THEN ' -- Possible missing server principal'  
		ELSE '' END 
		AS Create_Script 
FROM sys.database_principals DBPrincipals 
LEFT OUTER JOIN sys.database_principals Authorizations 
	ON DBPrincipals.owning_principal_id = Authorizations.principal_id 
LEFT OUTER JOIN sys.server_principals SrvPrincipals 
	ON DBPrincipals.sid = SrvPrincipals.sid 
	AND DBPrincipals.sid NOT IN (0x00, 0x01) 
WHERE 1=1 
	AND SrvPrincipals.name LIKE '%Kenneth%'
	AND EXISTS (SELECT 1 
				FROM sys.all_objects [Objects] 
				INNER JOIN sys.database_permissions Permission 
					ON Permission.major_id = [Objects].object_id 
				WHERE Permission.major_id = [Objects].object_id 
					AND Permission.grantee_principal_id = DBPrincipals.principal_id 
					AND [Objects].name LIKE '%MyObject%')
ORDER BY DBPrincipals.name

Here it is formatted with a bit more white space. Note it’s still a fairly consistent format and just so you know this format is one I’ve seen before on more than one occasion.

SELECT 
	'master' AS DBName, 
	DBPrincipals.name AS DBPrincipal, 
	SrvPrincipals.name AS SrvPrincipal, 
	DBPrincipals.sid, 
	DBPrincipals.type, 
	DBPrincipals.type_desc, 
	DBPrincipals.default_schema_name, 
	DBPrincipals.create_date, 
	DBPrincipals.modify_date, 
	DBPrincipals.is_fixed_role, 
	Authorizations.name AS Role_Authorization, 
	CASE 
		WHEN DBPrincipals.is_fixed_role = 0 
			THEN 
			   'DROP ' + 
				CASE DBPrincipals.[type] 
					WHEN 'C' 
						THEN NULL 
					WHEN 'K' 
						THEN NULL 
					WHEN 'R' 
						THEN 'ROLE' 
					WHEN 'A' 
						THEN 'APPLICATION ROLE' 
					ELSE 'USER' 
				END + 
				' ' + 
				QUOTENAME(DBPrincipals.name 
					COLLATE SQL_Latin1_General_CP1_CI_AS) + 
				';' 
		ELSE NULL 
	END AS Drop_Script, 
	CASE 
		WHEN DBPrincipals.is_fixed_role = 0 
			THEN 
			   'CREATE ' + 
				CASE DBPrincipals.[type] 
					WHEN 'C' 
						THEN NULL 
					WHEN 'K' 
						THEN NULL 
					WHEN 'R' 
						THEN 'ROLE' 
					WHEN 'A' 
						THEN 'APPLICATION ROLE' 
					ELSE 'USER' 
				END + 
				' ' + 
				QUOTENAME(DBPrincipals.name 
					COLLATE SQL_Latin1_General_CP1_CI_AS) 
	END + 
    CASE 
		WHEN DBPrincipals.[type] = 'R' 
			THEN 
				ISNULL(' AUTHORIZATION '+
							QUOTENAME(Authorizations.name 
									COLLATE SQL_Latin1_General_CP1_CI_AS),
							'') 
		WHEN DBPrincipals.[type] = 'A' 
			THEN ''  
		WHEN DBPrincipals.[type] NOT IN ('C','K') 
			THEN 
				ISNULL(' FOR LOGIN ' + 
						QUOTENAME(SrvPrincipals.name 
							COLLATE SQL_Latin1_General_CP1_CI_AS),
						' WITHOUT LOGIN') +  
				ISNULL(' WITH DEFAULT_SCHEMA =  '+
						QUOTENAME(DBPrincipals.default_schema_name 
							COLLATE SQL_Latin1_General_CP1_CI_AS),
						'') 
           ELSE '' 
	END + 
	';' +  
    CASE WHEN DBPrincipals.[type] NOT IN ('C','K','R','A') 
			AND SrvPrincipals.name IS NULL 
			AND DBPrincipals.sid IS NOT NULL 
			AND DBPrincipals.sid NOT IN (0x00, 0x01)  
		THEN ' -- Possible missing server principal'  
		ELSE '' 
	END 
		AS Create_Script 

FROM 
		sys.database_principals DBPrincipals 

LEFT OUTER JOIN 
		sys.database_principals Authorizations 
	ON 
		DBPrincipals.owning_principal_id = 
				Authorizations.principal_id 

LEFT OUTER JOIN 
		sys.server_principals SrvPrincipals 
   ON 
		DBPrincipals.sid = 
				SrvPrincipals.sid 
		AND DBPrincipals.sid NOT IN (0x00, 0x01) 

WHERE 
  1=1 
  AND SrvPrincipals.name LIKE '%Kenneth%'
  AND EXISTS 
			(SELECT 
					1 

               FROM 
					sys.all_objects [Objects] 

               INNER JOIN 
					sys.database_permissions Permission 
                   ON 
					Permission.major_id = 
						[Objects].object_id 

               WHERE 
					Permission.major_id = 
						[Objects].object_id 
                 AND Permission.grantee_principal_id = 
						DBPrincipals.principal_id 
                 AND [Objects].name LIKE '%MyObject%')

ORDER BY 
	DBPrincipals.name 

This format still looks nice and clean and it’s certainly readable. Unfortunately if I’m looking at it the first time I’m going to start getting lost about half way down. The whole point of white space is to make code easier to read not to spread it out so thin that it’s hard to keep track of.

This was an example using a moderate sized query of around 50 lines. The longer format is 125 lines. I’ve seen pieces of code like this that run into the multiple hundreds of lines and longer. Use white space, just use it responsibly.

22 thoughts on “Yes, Virginia, you can have too much white space.

  1. I don’t think either example really uses an optimal amount of whitespace. My personal preference is somewhere between amount used in the author’s two examples. To the first example I might add some whitespace around some of the nested case logic. Adding few line breaks would make the list of selected columns much more readable.

    • Perfectly reasonable. And the first one is just my preference it certainly doesn’t mean it’s perfect for everyone. My point in general is that you can have way to much white space. I’ve seen cases that are far far worse than my second example. I just couldn’t bring myself to make even longer than it is.

  2. Jason says:

    Thanks for the article, it’s good to see multiple perspectives on formatting (a very passionate topic for most).

    What I gleaned for this article is that white space is a preference thing.
    For me white space is visual indexing. It allows me to see sections and “pick apart” quickly.

    In your article you stated [referring to the second code example], “if I’m looking at it the first time I’m going to start getting lost about half way down. The whole point of white space is to make code easier to read not to spread it out so thin that it’s hard to keep track of.”

    For me its opposite, when I read the first one I wasn’t able to quickly break apart the sections and determine what the code is doing. I would have had to read every line (probably apply white space while reading) to get the gist of it, whereas in the second was easy to see each section for me.

    So, I liked the second one better than the first.
    I’d bet it’s near a 50/50 split on which amount of white space is preferred over the other.

    • Thanks for the excellent comment! And you are absolutely right. Formatting is always a personal preference and it just happens that a pet-peeve of mine is lots of unnecessary white space. I probably should have included the 20 odd blank lines both before and after the SP that I frequently see.

      If the second one is easier for you to read then by all means format it that way. Just remember that if it ever becomes my turn to maintain it I’m probably going to re-format it. We can even take turns if you like 🙂

  3. Dave says:

    If you use source control, don’t be re-formatting stuff unless you absolutely have to. If you need to find out what actually changed between V1 and V2, you really don’t want to see every line altered due to ‘it not being how I like it’. Re-formatting changes hide actual changes.

    • Good point. Although I consider readability high up on the list of “important”. Really that just emphasizes the importance of using decent (doesn’t have to be perfect) formatting right up front.

      • D. Smith says:

        My personal opinion is — it needs to be discussed and agreed upon by the development team, and a “happy medium” reached. If that’s done it should eliminate or reduce the battle of the white space.

        • Absolutely. I’m really not for or against most formatting as long as it’s fairly consistent and readable. I personally dislike really spread out formatting with (in my opinion) far to much white space but it’s certainly better than nothing.

  4. D. Smith says:

    I absolutely and completely agree. Like most everyone that would read this article, I’ve seen code that had so many tabs inserted, I had to scroll horizontally quite a bit to read it. And I thought, “Come on… really?”

  5. Steve says:

    Thanks for the great article. I also think that the “best” format is somewhere between the two examples. I can’t stand having multiple columns per line in the SELECT part of the query, as shown in the first example; I much prefer having them on individual lines. However, the second example goes too far in the CASE…WHEN sections, and having blank lines inside the query that do nothing. Both queries could benefit from shorter table aliases. I like that all of the columns in the SELECT section identify the source table, but using long aliases makes it harder to read. Formatting of JOIN criteria is tricky. Putting the ON part on the same line as the JOIN makes for lines that are easy to read, but require a lot of horizontal scrolling to understand. However, putting them on their own line directly underneath the JOIN line makes them hard to read as well. I’ve gone back and forth on this in my own queries over the years, using both styles depending upon the complexity of the join criteria. I also usually convert tabs to spaces so that tabs aren’t converted to 8 spaces when the query is displayed, which is usually too wide.

    Finally, I agree with Dave, that you shouldn’t reformat a query just to make it “pretty” and then check it into source control with no logical changes, because that makes searching old versions to determine which change “broke” the query very difficult (although I will admit that I have been called out by other people for doing this in the past). You can reformat a COPY of the query to make it easier to understand, but if you are only making a minor correction (> to >=, for example), make it to the original version and then check that in, WITH AN APPROPRIATE COMMENT.

  6. David McKinney. says:

    I’m tempted to say that this doesn’t matter anymore. We don’t need to write guidelines to establish our policy etc.. Why? Because most of us can and do reformat the sql when we need to read it according to our own preferences with one click. I use a Red-gate tool but there are of course many adequate alternatives. (Although it’s true that whitespace is sometimes excluded from reformatting.)

    • There are a number of reformatting tools available. However not all shops use them. Mine for example for all it’s a huge shop (which may be the problem) doesn’t have a reformatting tool. If I want to make something readable I do it by hand. Like most tools (even those available in SSMS) I doubt they will ever see universal usage. You never know though. I’ve certainly been wrong before.

  7. Merrill Nelson says:

    For me, I am somewhere in between 1 and 2. I would breakout the 1st one more, maybe each field on its own line and each statement and sub-statement such as Case When Then Else and End on its own line as well. I would not however use blank lines between line or double space everything, that just makes it much harder to read for me so for the 2nd one I would cut back on the white space quite a bit. It is a matter of preference but it is also a matter of standards and working in a group where everyone is on or about the same standards and can easily read, interpret and if necessary edit your code without much difficulty. With me it is simply the way I write the code now, I don’t write it as a paragraph and then format it later. I need to format it as I’m writing it so that as it starts to get lengthy, it already is very clear what I was doing in each part and where I need to go to finish it. We do have to read it as we are writing it to determine if what we wrote so far makes sense and to know what we are going to write to finish it. I do know people on both extremes of this though and I’m not sure how either of them get anything done, but they do so it must make sense to them.

  8. doryllis says:

    I would tend to white space a little more conservatively. With the table names on the same lines as their join statements and without splitting When/Thens with tabs (or in the case of the ones you are using they are so short one line would be fine for each when/then.

    Ultimately, the whitespace argument will always be personal and hotly debated. The same way people argue over comments. I always side on the “Assume the next person after you is coding from “Sam’s SQL Server in 24 hours”” because you won the lottery” perspective. This means that the comments and the white space are based on what you are writing, not as hard and fast as, “Always indent and double space between X & Y.”

    Ultimately the coder’s call.

    • I have to admit I don’t adhere to a hard and fast formatting rule. My general rule of thumb (like you said) is to make it as readable as possible because the next guy down the line might be new to this stuff. My one hard and fast rule is that it be as clear as possible.

      I do believe in consistency but not to the point where I make this particular piece of code harder to read than it needs to be. I also like to put in-line comments in my longer queries to be clear what a given section is doing.

    • D. Smith says:

      Hey! We should be on the same team… we’re discussing it, and we may reach that “middle ground” yet 🙂

      • doryllis says:

        What I have found in my teams is that you find the middle ground and stick to it for about as long as the team is stable working on the same thing. Any changes in personnel or project roles tends to lead to a renegotiation.

  9. J. Murphy says:

    My pet peeve is when developers do not indent their code, especially on long queries/procedures with multiple sub-queries and CTEs and IF-THEN-ELSE constructs. I inevitably have to do the indenting myself to figure out what the existing logic is so that I know where to put in changes that don’t break everything.

  10. Steve says:

    My format is a little easier for me to read (duh). I like to keep my case when else end lined up rather that use the case line for a when sometimes and a variable others. I also like to keep my from, joins and where etc. lined up with the on statements in the same line if possible. Makes it a little easier for me to figure out what goes with what line.

    Good article, got me thinking about why I do it the way I do. Not sure how this formatting is going to make it in a comment box.

    SELECT ‘master’ AS DBName, DBPrincipals.name AS DBPrincipal, SrvPrincipals.name AS SrvPrincipal, DBPrincipals.sid,
    DBPrincipals.type, DBPrincipals.type_desc, DBPrincipals.default_schema_name, DBPrincipals.create_date,
    DBPrincipals.modify_date, DBPrincipals.is_fixed_role, Authorizations.name AS Role_Authorization,
    CASE
    WHEN DBPrincipals.is_fixed_role = 0 THEN ‘DROP ‘ +
    CASE DBPrincipals.[type] WHEN ‘C’ THEN NULL
    WHEN ‘K’ THEN NULL
    WHEN ‘R’ THEN ‘ROLE’
    WHEN ‘A’ THEN ‘APPLICATION ROLE’
    ELSE ‘USER’
    END + ‘ ‘+QUOTENAME(DBPrincipals.name COLLATE SQL_Latin1_General_CP1_CI_AS) + ‘;’
    ELSE NULL
    END AS Drop_Script,
    CASE
    WHEN DBPrincipals.is_fixed_role = 0 THEN ‘CREATE ‘ +
    CASE DBPrincipals.[type]
    WHEN ‘C’ THEN NULL
    WHEN ‘K’ THEN NULL
    WHEN ‘R’ THEN ‘ROLE’
    WHEN ‘A’ THEN ‘APPLICATION ROLE’
    ELSE ‘USER’
    END + ‘ ‘+QUOTENAME(DBPrincipals.name COLLATE SQL_Latin1_General_CP1_CI_AS)
    END +
    CASE
    WHEN DBPrincipals.[type] = ‘R’ THEN ISNULL(‘ AUTHORIZATION ‘+QUOTENAME(Authorizations.name COLLATE SQL_Latin1_General_CP1_CI_AS),”)
    WHEN DBPrincipals.[type] = ‘A’ THEN ”
    WHEN DBPrincipals.[type] NOT IN (‘C’,’K’) THEN
    ISNULL(‘ FOR LOGIN ‘ +
    QUOTENAME(SrvPrincipals.name COLLATE SQL_Latin1_General_CP1_CI_AS),’ WITHOUT LOGIN’) +
    ISNULL(‘ WITH DEFAULT_SCHEMA = ‘+
    QUOTENAME(DBPrincipals.default_schema_name COLLATE SQL_Latin1_General_CP1_CI_AS),”)
    ELSE ”
    END + ‘;’ +
    CASE
    WHEN DBPrincipals.[type] NOT IN (‘C’,’K’,’R’,’A’)
    AND SrvPrincipals.name IS NULL
    AND DBPrincipals.sid IS NOT NULL
    AND DBPrincipals.sid NOT IN (0x00, 0x01)
    THEN ‘ — Possible missing server principal’
    ELSE ”
    END
    AS Create_Script
    FROM sys.database_principals DBPrincipals
    LEFT OUTER JOIN sys.database_principals Authorizations ON DBPrincipals.owning_principal_id = Authorizations.principal_id
    LEFT OUTER JOIN sys.server_principals SrvPrincipals ON DBPrincipals.sid = SrvPrincipals.sid AND DBPrincipals.sid NOT IN (0x00, 0x01)
    WHERE 1=1
    AND SrvPrincipals.name LIKE ‘%Kenneth%’
    AND EXISTS (SELECT 1
    FROM sys.all_objects [Objects]
    INNER JOIN sys.database_permissions Permission ON Permission.major_id = [Objects].object_id
    WHERE Permission.major_id = [Objects].object_id
    AND Permission.grantee_principal_id = DBPrincipals.principal_id
    AND [Objects].name LIKE ‘%MyObject%’)
    ORDER BY DBPrincipals.name

  11. Jeff Steward says:

    If you use source control, develop a standard and stick to that for check ins. There are a free sql source code formatting tools in addition to paid versions, so that shouldn’t be an excuse to not use them.

    I work with a great deal of SQL written to various standards and I find it extremely helpful to be able to quickly format the code to my standard so that I can quickly pick out patterns and understand the code more easily.

    One could also argue that with modern formatting tools, the need for a consistent format at source check in isn’t absolutely required, since you could manually diff the files in your preferred format. I don’t advocate that approach though, since I prefer to be able to use my source control diff tools to get a quick look at what really changed between versions.

  12. […] languages. In Python for example formatting actually affects how the code runs. In T-SQL, however, white space doesn’t do anything. In addition, statements like the IF..ELSE statement, and the WHILE statement only affect the […]

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 6,758 other subscribers

Follow me on Twitter

Archives

ToadWorld Pro of the Month November 2013