Small identity columns
5October 9, 2017 by Kenneth Fisher
We frequently talk about dealing with outgrowing INT identity columns. What we don’t talk about all that often is small tables. Where we don’t even need an INT. An IDENTITY column can be any of the following data types tinyint, smallint, int, bigint, decimal(p,0), or numeric(p,0). INT and BIGINT we talk about all the time, so how about the others? Let’s start with some information about them, shall we?
Data Type | Range | Size |
TINYINT | 0 to 255 | 1 Byte |
SMALLINT | -2^15 (-32,768) to 2^15-1 (32,767) | 2 Bytes |
DECIMAL/NUMERIC(1-9,0) | See below | 5 Bytes |
DECIMAL/NUMERIC(10-19,0) | See below | 9 Bytes |
DECIMAL/NUMERIC(20-28,0) | See below | 13 Bytes |
DECIMAL/NUMERIC(29-38,0) | See below | 17 Bytes |
The number of values available with a NUMERIC or DECIMAL is based on the precision. The precision is the number of columns available in the number. So when p = 2 (NUMERIC(2,0)) then the values are -99 to 99. When p = 7 (NUMERIC(7,0)) then the values are -9,999,999 to 9,999,999. Honestly, the size to number of values for NUMERIC and DECIMALs isn’t really worth it for non-decimal values. So let’s concentrate on TINYINT and SMALLINT.
So, we CAN use TINYINT or SMALLINT for an identity column. But should we?
As an example let’s look at the SalesHeader table from AdventureWorks2014. To be fair, this is a big table, and AdventureWorks is fairly well designed. I don’t expect much, but let’s see.
CREATE TABLE [Sales].[SalesOrderHeader]( [SalesOrderID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL, [RevisionNumber] [tinyint] NOT NULL, [OrderDate] [datetime] NOT NULL, [DueDate] [datetime] NOT NULL, [ShipDate] [datetime] NULL, [Status] [tinyint] NOT NULL, [OnlineOrderFlag] [dbo].[Flag] NOT NULL, [SalesOrderNumber] AS (isnull(N'SO'+CONVERT([nvarchar](23),[SalesOrderID]),N'*** ERROR ***')), [PurchaseOrderNumber] [dbo].[OrderNumber] NULL, [AccountNumber] [dbo].[AccountNumber] NULL, [CustomerID] [int] NOT NULL, [SalesPersonID] [int] NULL, [TerritoryID] [int] NULL, [BillToAddressID] [int] NOT NULL, [ShipToAddressID] [int] NOT NULL, [ShipMethodID] [int] NOT NULL, [CreditCardID] [int] NULL, [CreditCardApprovalCode] [varchar](15) NULL, [CurrencyRateID] [int] NULL, [SubTotal] [money] NOT NULL, [TaxAmt] [money] NOT NULL, [Freight] [money] NOT NULL, [TotalDue] AS (isnull(([SubTotal]+[TaxAmt])+[Freight],(0))), [Comment] [nvarchar](128) NULL, [rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL, [ModifiedDate] [datetime] NOT NULL)
For a total of 520 bytes per row. Note: This is just the total of the column sizes. There are several other things that come into play when calculating the actual required space for a given row.
The columns that are related to another table with an Identity column are:
- SalesOrderID (INT)
- ShipToAddressId (INT)
- CreditCardID (INT)
- CurrencyRateID (INT)
- CustomerID (INT)
- SalesPersonID (INT)
- ShipMethodID (INT)
- TerritoryID (INT)
Just looking briefly, we probably shouldn’t change most of them, but how about ShipMethodID. Do we really need an INT? How many shipping methods are you going to use?!? I’m going to guess that 256 shipping methods is plenty. So TINYINT. Which saves us 3 bytes. Next TerritoryID. How many territories do we have? Well, Adventureworks2014 only has 10. However, I’m keeping my fingers crossed that the company will grow, and territory changes could happen so let’s use a SMALLINT here. Saving us 2 bytes. And depending on your business TINYINT may work, it just depends. Still, just those two columns gives us a savings of 5 bytes per row. That’s only 1% of the total size of the row but this was also a fairly quick example with no business knowledge. And again, it’s a large fairly well-designed table.
You do need to be careful here. If you use too small a value you are going to have to take time to fix it. Although increasing a SMALLINT or TINYINT to an INT isn’t going to take nearly as long as INT to BIGINT (far fewer max rows to be changed). It’s still a downtime and a pain. Personally, I think it’s well worth thinking about though.
Don’t forget that the size of indexes changes too. Also keep in mind that when you adjust column sizes you have to turn the indexes off and later rebuild them.
Very true. And columns related to identity columns to tend to be indexed more frequently. Also that you not only have to adjust your main table but every table that references the identity column.
Thank you, Kenneth
“I’m going to guess that 512 shipping methods is plenty. So TINYINT.”
Um, did I miss something? Isn’t TINYINT 0 – 255?
Oops. Typo 🙂 Fixing it now.