October 22, 2014 by Kenneth Fisher
I thought of something rather interesting the other day while answering a question asking for help with some table structures.
In a similar situation to the questioner let’s say we have a Baseball team. There are teams and players. A player can be some combination of coach, assistant coach and/or player. A player can only be on one team. There can only be one coach per team, only one assistant coach per team, but multiple players. Each player has one email per position they hold. So if they have more than one position then they have to have more than one email.
Starting with a structure like this:
CREATE SCHEMA Baseball CREATE TABLE Team ( TeamId INT NOT NULL IDENTITY(1,1) PRIMARY KEY, TeamName varchar(50) NOT NULL ) CREATE TABLE Players ( PlayerId INT NOT NULL IDENTITY(1,1) PRIMARY KEY, PlayerName varchar(50) NOT NULL, TeamId INT CONSTRAINT fk_Players_TeamId FOREIGN KEY REFERENCES Team(TeamId) ) CREATE TABLE Position ( PositionId INT NOT NULL IDENTITY(1,1) PRIMARY KEY, PositionName varchar(50) NOT NULL ) CREATE TABLE PlayerPosition ( PlayerId INT NOT NULL CONSTRAINT fk_PlayerPosition_PlayerId FOREIGN KEY REFERENCES Players(PlayerId), PositionId INT NOT NULL CONSTRAINT fk_PlayerPosition_PositionId FOREIGN KEY REFERENCES Position(PositionId), EmailAddress varchar(500) NOT NULL, CONSTRAINT pk_PlayerPosition PRIMARY KEY (PlayerId, PositionId) ); GO CREATE UNIQUE INDEX ix_PlayerPosition_UniqueEmail ON Baseball.PlayerPosition(EmailAddress); GO INSERT INTO Baseball.Position VALUES ('Coach'),('Assistant Coach'),('Player'); INSERT INTO Baseball.Team VALUES ('The Oreos'),('The Blueberry Muffins'), ('The Red Velvet Cakes'),('The Cinnamon Red Hots'); INSERT INTO Baseball.Players VALUES ('Bob Smith',1), ('Joe Smith',1), ('Jane Smith',1), ('Alice Smith',1), ('Bob Baker',2), ('Joe Baker',2), ('Jane Baker',2), ('Alice Baker',2), ('Bob Allen',3), ('Joe Allen',3), ('Jane Allen',3), ('Alice Allen',3), ('Bob White',4), ('Joe White',4), ('Jane White',4), ('Alice White',4); INSERT INTO Baseball.PlayerPosition VALUES (1,1,'Bob.Smith.Coach@Oreos.com'), (3,2,'Jane.Smith.Assistant.Coach@Oreos.com'), (1,3,'Bob.Smith.Player@Oreos.com'), (2,3,'Joe.Smith.Player@Oreos.com'), (3,3,'Jane.Smith.Player@Oreos.com'), (4,3,'Alice.Smith.Player@Oreos.com'), (8,1,'Alice.Baker.Coach@Muffin.com'), (5,2,'Bob.Baker.Assistant.Coach@Muffin.com'), (5,3,'Bob.Baker.Player@Muffin.com'), (6,3,'Joe.Baker.Player@Muffin.com'), (7,3,'Jane.Baker.Player@Muffin.com'), (8,3,'Alice.Baker.Player@Muffin.com'), (10,1,'Joe.Allen.Coach@Velvet.com'), (12,2,'Alice.Allen.Assistant.Coach@Velvet.com'), (9,3,'Bob.Allen.Player@Velvet.com'), (10,3,'Joe.Allen.Player@Velvet.com'), (11,3,'Jane.Allen.Player@Velvet.com'), (12,3,'Alice.Allen.Player@Velvet.com'), (15,1,'Jane.White.Coach@RedHot.com'), (14,2,'Joe.White.Assistant.Coach@RedHot.com'), (13,3,'Bob.White.Player@RedHot.com'), (14,3,'Joe.White.Player@RedHot.com'), (15,3,'Jane.White.Player@RedHot.com'), (16,3,'Alice.White.Player@RedHot.com'); GO
This gives us team, player, and position tables and a cross reference table between player and position (all in the BaseBall schema). By putting the TeamId in the Player table you ensure that a Player can only be on one team. The EmailAddress in the PlayerPosition table is NOT NULL which enforces the fact that a player/position combination must have an email address and the unique index forces them to all be different. The cross reference table between player and position allows a player to hold multiple positions but the primary key on PlayerId, PositionId makes sure that each player can only hold a given position once.
So far this is fairly basic. But how do we enforce the fact that a team can only have one coach and one assistant coach? One option is to create CoachId & AsstCoachId columns in the Team table with foreign keys back to the player table. This however complicates certain types of queries and what happens if there are a dozen positions that need to be unique? Not to mention the complication of adding a new unique position. So what’s a better solution?
First let’s add TeamId to the PlayerPosition table. It logically needs to be NOT NULL so we will have to create it NULLable, update it and then modify it to be NOT NULL.
ALTER TABLE Baseball.PlayerPosition ADD TeamId INT; GO UPDATE Baseball.PlayerPosition SET TeamId = Baseball.Players.TeamId FROM Baseball.PlayerPosition JOIN Baseball.Players ON Baseball.PlayerPosition.PlayerId = Baseball.Players.PlayerId; GO ALTER TABLE Baseball.PlayerPosition ALTER COLUMN TeamId INT NOT NULL; GO
Next for completeness sake let’s add in the appropriate foreign keys.
ALTER TABLE Baseball.PlayerPosition ADD CONSTRAINT fk_PlayerPosition_TeamId FOREIGN KEY (TeamId) REFERENCES Baseball.Team(TeamId); CREATE UNIQUE INDEX ix_Player_PlayerId_TeamId ON Baseball.Players(PlayerId, TeamId); ALTER TABLE Baseball.PlayerPosition DROP CONSTRAINT fk_PlayerPosition_PlayerId; ALTER TABLE Baseball.PlayerPosition ADD CONSTRAINT fk_PlayerPosition_PlayerId_TeamId FOREIGN KEY (PlayerId, TeamId) REFERENCES Baseball.Players(PlayerId, TeamId); GO
Here comes the fun part. By creating a filtered unique index on TeamId, PositionId we can enforce the idea that there can only be one of each of the non-player positions per team.
CREATE UNIQUE INDEX ix_PlayerPosition_TeamId_PositionId_Filtered ON Baseball.PlayerPosition(TeamId,PositionId) WHERE PositionId <> 3;
In case you wanted proof that this will do what we expect remember that we already have multiple players per team and one each of coach and assistant coach. So we really only need one extra test of trying to insert an additional restricted position and make sure it fails.
INSERT INTO Baseball.PlayerPosition VALUES (11,1,'Jane.Allen.Coach@Velvet.com',3)
Msg 2601, Level 14, State 1, Line 96
Cannot insert duplicate key row in object ‘Baseball.PlayerPosition’ with unique index ‘ix_PlayerPosition_TeamId_PositionId_Filtered’. The duplicate key value is (3, 1).
The statement has been terminated.
Now I realize this is nothing earth shattering but it’s a very nice solution to this type of problem. And let’s face it, if you are a data geek it’s just plain fun!
EDIT: (Before it’s posted even)
Aaron Bertrand (b/t) answered a forum question with an excellent use of this technique. In it he put a WHERE column IS NOT NULL on a UNIQUE INDEX in order to allow multiple NULL values in the UNIQUE column. This fits in better with the ANSI standard version of NULL.