ORDER BY the numbers
17June 17, 2013 by Kenneth Fisher
Have you ever needed to order by a calculated column? You might have written it something like this:
SELECT LoginID, YEAR(HireDate) AS HireYear, MONTH(HireDate) AS HireMonth FROM HumanResources.Employee ORDER BY YEAR(HireDate), MONTH(HireDate), LoginID
Did you know you don’t have to put the calculations in the ORDER BY? You don’t even have to put column names.
SELECT LoginID, YEAR(HireDate) AS HireYear, MONTH(HireDate) AS HireMonth FROM HumanResources.Employee ORDER BY 2, 3, 1
Each number represents a position in the field list. 1=’LoginID’, 2=’YEAR(HireDate)’ etc.
BOL says
“A sort column can be specified as a name or column alias, or a nonnegative integer representing the position of the name or alias in the select list. An integer cannot be specified when the order_by_expression appears in a ranking function.”
When I copied this out I noticed you can’t use an integer with the ORDER BY in a ranking function, which is rather a shame. Even so, if you are sorting on calculated columns using integers in your order by can make life a whole lot easier.
Also I should note that when I check execution plans and timing using the integer vs typing the column/calculation out there is absolutely no difference.
Thanks fir that tip. I nva realised you could do that at all.
It’s one of those very basic things that you just don’t see very often.
My understanding is this: Using the functions YEAR and MONTH can interfere with query optimization by SQL Server in using any indexes on HireDate.
Also, using numbers in the ORDER BY column is bad practice, as changes in a query, especially longer queries, could invalidate the validity of using integers to refer to columns.
I would think using ORDER BY HireDate, LoginID would be best, as this allows query optimization to use any index on HireDate.
II can see what you are saying, but if in fact you only want to order by the year and the month then you only have limited options. Any function used is going to affect indexing, unless of course your index happens to match. For example in this case if I had an index on “YEAR(HireDate), MONTH(HireDate), LoginID” then it would be used.
As for changes to the query, yes, you would have to pay attention to the ORDER BY if you change the order of the columns in your field list. But in changes to a complicated query that would be one of the simpler changes to make. What if I’m changing a complicated calculation that happens to be one I’m ordering by? In that case if I’m using integers in the order by I don’t have to change them at all. If I’m using the actual calculation I could end up with a bug that is going to be very difficult to track down.
I think you both have valid points. I never knew about the use of the position in the ORDER BY and I think it could be very useful especially for quick exploratory queries. As for persistent code, I think using the alias names (HireYear and HireMonth) is the best solution. You don’t have to maintain your calculation in multiple places and you don’t have to worry about preserving the order of the columns in the select clause.
You can also use CTE to specify and manage calculations at one place and ORDER BY in final SELECT contains names not numeric referencies
Since you have given you columns an alias name, why wouldn’t you want to order by that name rather than an integer?
ORDER BY HireYear, HireMonth, LoginID
In that way you avoid the issue of later inserting a column that could break your order by integers and it is self documenting where using integers is not.
Another option is to just use the aliases that are already in the query. If other fields are added to the query later, the ORDER BY will still work correctly with the aliases even if the position of the fields changes in the SELECT.
SELECT
LoginID
, YEAR(HireDate) AS HireYear
, MONTH(HireDate) AS HireMonth
FROM
HumanResources.Employee
ORDER BY
HireYear
, HireMonth
, LoginID;
Helpful information and useful discussion; at least we have an option which could be used at times. Might be worth comparing advantages, disadvantages to aliases.
Since it makes no difference, please copy/paste the expression from the select clause to the order by clause. 1) it’s clearer to read 2) it will survive the addition of a field to the select list
Considering how much time code spends in the “maintenance” mode of its life cycle, we should be optimizing for source code for the maintenance developer if performance is the same.
I use this very thing when teaching SQL practices. This reinforces the point that the ORDER BY happens near the end of query processing. Using the numbers helps me more than it hurts because it makes me double check my work. “Lets see. I’m sorting by the 4th column then the 6th and THEN the 1st column. Is that right or should I reorder my column list?”
Yes an alias will work, and yes it is easier to read in some ways. Column order is nice at times also. However one thing you should remember. If you get a query where someone else used integers it might be nice to know what exactly it was doing :). That is at least part of why I point out somewhat unusual ways of doing things. I’ve seen them and had to spend time looking them up.
This (ordinal value in ORDER BY) is very bad practice for production code. It was removed from the 1999 ANSI SQL standard and it will probably be deprecated from Transact SQL eventually. See this: http://sqlmag.com/t-sql/deprecated-t-sql-features
While I’m not sure on first pass that I agree with everything in that link it looks very interesting and I’ll have to read it more carefully when I have time later.
Of the 3 options available for the ORDER BY clause, using the column’s alias is easily my preferred & recommended option for production code:
1) the ORDER BY clause has more meaning when read using aliases (compared to using numbers)
2) for calculated SELECT columns used for ordering you don’t have two places (SELECT & ORDER BY clauses) where the calculation needs to be maintained & kept in sync (compared to using the calculation
3) the intended order will continue to “hold true” when column order or new columns are added to the SELECT clause (or a calculation is modified in the SELECT) in the ORDER BY clause)
Using numbers can be handy for “qik&dirty” queries – IMO “ORDER BY numbers” is in the same usage category as “SELECT * …”
APPLY works well for this. Without a table reference, APPLY roughly equates to ‘Calculate’:
SELECT
LoginID,
x.HireYear,
x.HireMonth
FROM HumanResources.Employee
CROSS APPLY ( — = ‘CALCULATE’
SELECT
YEAR(HireDate) AS HireYear,
MONTH(HireDate) AS HireMonth
) x
ORDER BY x.HireYear, x.HireMonth, LoginID
[…] sorts are pretty easy. In the ORDER BY clause you can list out the columns by name or by position in the field list and determine if you want them sorted ascending or descending. But how about going beyond […]