Creating a list of random numbers3
March 22, 2017 by Kenneth Fisher
I was working on a blog post this weekend that required a list of random numbers. Now, this isn’t exactly an uncommon task, but it’s not as easy as it would seem. The random number function RAND() has a few inherent flaws. And from what I can remember the random functions from most languages have the same issue.
First a few quotes from BOL about RAND()
Returns a pseudo-random float value from 0 through 1, exclusive.
If a seed is not specified, the SQL Server Database Engine assigns a seed value at random. For a specified seed value, the result returned is always the same.
If you don’t specify the seed it gets selected at random. But that’s only once per query run, not once per row of the output.
Repetitive calls of RAND() with the same seed value return the same results.
Since the seed is the same, each row of output will get you the same “random” value.
SELECT RAND() FROM sys.objects
Every time you run this you will get a different result, but every row is going to be the same. So you could try explicitly passing in a seed value. (Remember that the seed is an INT so be careful that your values don’t go into the BIGINT range.)
SELECT RAND(object_id) FROM sys.objects
This works better. You at least get unique values. Unfortunately when the seeds are close together in value the RAND values are going to be pretty close together too. Also RAND(seed) always returns the same value. So every time you run this you’re going to get the same result set. Not really all that random if you ask me.
So what’s a good solution? Well, I have to admit I had to look this up myself but I found a great suggestion here.
SELECT ABS(CHECKSUM(NEWID()) % 1000) FROM sys.objects
- NEWID(): This function returns a, probably still note completely random, unique identifier (GUID).
- CHECKSUM(): This returns a checksum (INT) value of the value or values passed to it. Since we are passing in a fairly random value we are getting a fairly random value back.
- % (modulo) 1000: This returns the remainder of our random value divided by 1000. Basically this creates a range of values. In this case -999 to 999. If you want a range of -5 to 5 then use % 6. Simple enough. It’s important to remember that the maximum remainder is going to be one less than the value. Hence a range of 0-5 requires % 6. If you want 1-5 then use (% 5) + 1
- ABS(): Returns the absolute value of the whole thing. You only need to do this if you don’t want negative values.
And there you go. A list of actually (mostly) random values!
It’s funny but I was talking with various people about making mistakes on a blog and owning up to them and then today someone pointed out I’ve made a mistake on this post. I’ve corrected the post above. You’ll see the correction in the modulo portion.
Category: Microsoft SQL Server, SQLServerPedia Syndication, T-SQL | Tags: microsoft sql server, T-SQL
3 thoughts on “Creating a list of random numbers”
Leave a Reply Cancel reply
This site uses Akismet to reduce spam. Learn how your comment data is processed.
When I’m testing newly written or modified stored procedures, I first create a table with 60’000 valid entries. I use modulo to skip through the source table to get the data. The primary key of the table is a simple, sequential int generated by row_number() from 1-60’000.
The parameters in question are chosen by taking the relevant entries in the table where the primary key equals the product of the seconds and milliseconds (+1 to avoid selecting zero) of the execution time. This gives me a number anywhere between 1 and 58’942 — (59×999)+1.
To be sure, the execution of the SP is slightly longer, while it must calculate and retrieve the parameters from the table, but it is constant for all tests and it is not too big a burden.
In short, I almost always get different sets of valid parameters for my SP and although they are not chosen truly randomly, they are random enough for me. I just need data that is representative of the source table in question, which is usually different on each execution.
[…] just a random set of values, but a random number of them. To start with I’ll be using TOP and (ABS(CHECKSUM(NEWID()) % 5)) to generate a random number number of rows. I’m also using CROSS APPLY because that will call […]