Guid vs Identity columns (Ints)

January 13, 2014 by Kenneth Fisher

I came across an interesting question on SE last week. Guid vs INT – Which is better as a primary key? In addition to the quite good accepted answer I thought I would throw in my own take.

Size
- GUIDs are 16 bytes and hold more values you then could ever use.
- With an identity column you can choose a data type dependent on your need.
  - tinyint 1 byte 0-255
  - smallint 2 bytes -2^15 (-32,768) to 2^15-1 (32,767)
  - int 4 bytes -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)
  - bigint 8 bytes -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
Remember that the size of your column affects not just how much space the table takes up but how many pages (both index and data) need to be read to perform a given operation. Bigger the column the less you can fit in a page, the more pages need to be read, the slower your queries. Even if by a very small amount.
Uniqueness
- GUIDs are considered universally unique. This isn’t exactly true but if you look here and here you will see that it’s really close enough to true to work with.
- Identity columns are only as unique as you make them. If you put a unique constraint on the column (or PRIMARY KEY) then you will be at least guaranteed that you have a unique value in that table. But only in the table itself, when you compare to other tables, databases etc there is no uniqueness
You have to decide here how unique you need your column to be and if it’s worth the space it’s going to take up.
Portability
- Because they are universally unique GUIDs are completely portable. You can move the values from place to place with no difficulty.
- Identity columns are not really portable. Anyone who has tried to merge two tables with identity columns, between prod and test for example, knows what a pain this is.

If you are never going to merge data with another table/location then you are probably ok with an identity column. If on the other hand you expect to need to merge data from multiple tables/locations then you should probably think about GUIDs.

Ease of use
- GUIDs require the use of NEWSEQUENTIALID() or NEWID() either as a default, part of the insert, a trigger etc
- Identity columns are created and then you actually have to avoid them to make them work properly.
Personally I find identity columns MUCH easier but on the other hand I use them far more often than GUIDs so I have a lot more experience there. They do say you tend to go with what you know.

And last but not least here is BOLs take on each of them.

As with many design considerations this is an important decision. When deciding to use a GUID vs an integer Identity column you should balance the portability of a GUID vs it’s additional space required and the smaller size of the identity column vs the major pain that moving a row with an identity column from one table to another can be. The fact that integers are easier to work with when debugging is true but somewhat insignificant if you truly need a GUID. And to be honest I’ll bet you get used to it. Integers “looking” better when displayed to the end user is a factor but somewhat less important when compared to other considerations.

You, your co-workers and your replacement (there will always be one) will have to live with your decision so at least think about it before you decide. One of the things I dislike most is the “Always do it this way” mentality.

Category: Microsoft SQL Server, SQLServerPedia Syndication, T-SQL | Tags: code language, language sql, microsoft sql server, sql statements, T-SQL

11 thoughts on “Guid vs Identity columns (Ints)”

Daniel Mallott says:

January 13, 2014 at 10:17 AM

One other major difference:

Identity columns are generated sequentially and make an excellent candidate for the clustered index on the table. GUIDs, on the other hand, should never be clustered due to the method in which they are generated.

Kenneth Fisher says:

January 13, 2014 at 10:44 AM

Unless of course you use NEWSEQUENTIALID. Then the GUIDs are in order (except after a system restart) and can be clustered without too much of a problem. Also while identity columns are usually generated sequentially you can get exceptions although those are usually one offs.

Reply

Ron Kyle says:

January 20, 2014 at 9:37 AM

I worked with a database full of unnecessary GUIDs for a long time and never got used to it. They created a lot of extra work over the years. They have their place, but usually integer is the way to go.

Jens Lemke says:

January 20, 2014 at 10:45 AM

I had to deal with the same question when implementing a datawarehouse with different data sources and the need for a unique ID across data sources => Thus the only option was GUID. The snag was that SSAS doesn’t like GUIDs at all. A computed column did the trick. Formula is: (CONVERT([bigint],CONVERT([varbinary](128),[myGUID]))*(-1)) The resulting bigint is only half the size of a GUID and by far better for use in SSAS Cubes. A test with 4.5 Million records produced only unique “newIDs “.

peter roothans says:

January 20, 2014 at 10:53 AM

Regarding portability I sometimes add a small char column next to the identity column (combined primary key). So, the codes 2475 US and 2475 FR can refer completely different objects and can be easily merged into one table for reporting … or in case databases of one of more branch offices need to be centralized. Of course this implies extra storage space, 2 conditions in join statements, extra column in related tables … but it is easier to read/remember than a GUID.

Matthew says:

January 20, 2014 at 5:56 PM

I commonly work on systems with tons of remote databases where records are added, then replicated to a central database. I haven’t came up with a good way to use identity columns while ensuring uniqueness. It’s much easier to deal with the index fragmentation than devising a scheme with identity column

Kenneth Fisher says:

January 20, 2014 at 9:19 PM

Well as Peter said adding an additional char(2) column can provide the uniqueness you want and still avoid fragmentation. Another option is to seed the identity column with large gaps for the different locations (1mil for one 2 mil for another etc). Guids are certainl a good choice here though.

Reply

Jerome says:

January 28, 2015 at 6:23 AM

I think it all depends on what you’re trying to achieve. If you’re doing a lot of replication, use a Sequential GUID rather than a GUID. Fragmentation is a lot less.

If all your records sit in one database and performance is important, just stick with int.

http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/guid-vs-int-debate.aspx

Kenneth Fisher says:

January 28, 2015 at 10:30 AM

I’m not really sure replication has much to do with it. There are forms of replication where I might take guid vs int into account but even then it’s only one factor. Performance and fragmentation are certainly other factors but again they are factors and there are reasons to override them at times. Really it all comes down to making an informed decision and not just saying “We’ve always done it that way.”

Reply

anilkumarup says:

June 30, 2015 at 6:28 AM

Reblogged this on nuWave Development Team.

jeffmoden says:

August 14, 2021 at 9:35 PM

I know this post is pretty old but there are couple of things people really need to know about GUIDs and Identity columns and index maintenance. To summarize, index fragmentation is a myth perpetuated by a more than 20 year of “Best Practice” that’s actually a WORST PRACTICE continued by not knowing your insert/update patterns with it comes to identity columns.

I have a “summary” video on the subject. There’s a whole lot more to it. I “just” hit the 3 highpoints I mentioned above in this one.

Seriously… take the time to watch it until the end. It will blow your mind because almost everything they say about Random GUIDs and “Best Practice” index maintenance is wrong and always has been.

Here’s the link.

SQL Studies

Guid vs Identity columns (Ints)

11 thoughts on “Guid vs Identity columns (Ints)”

Leave a comment Cancel reply

Follow me via RSS

Follow me via Email

Follow me on Twitter

Archives

Categories

Meta

SQL Studies

Guid vs Identity columns (Ints)

Share this:

11 thoughts on “Guid vs Identity columns (Ints)”

Leave a comment Cancel reply

Follow me via RSS

Follow me via Email

Follow me on Twitter

Archives

Categories

Meta