Pattern searches beyond ‘%string%’.

4

June 3, 2013 by Kenneth Fisher

I was asked today to take a table with a social security column and put all 0s if the column had any letters in it. This is the typical social security column that actually has more rows with ‘XXX-XX-XXXX’ than it does valid SSNs. There are even cases that look like ‘1234-AB-56TS’ etc.

One option, if I was working in SQL 2012 (which I am very much looking forward to), is to use the new functions TRY_CAST and TRY_CONVERT like so.

SELECT CASE WHEN TRY_CONVERT(int, REPLACE(SSN_Col,'-','')) IS NULL 
    THEN '000000000'
    ELSE SSN_Col END AS Tested_SSN

However I’m not working in SQL 2012 so my best bet is a pattern match. Going on the basis that there are some people who haven’t worked with pattern matching beyond LIKE ‘%TestString%’ I’m going to explain each pattern as I go along. For those that are more familiar with pattern matching please feel free to skim these bits.

I’ve seen patterns for SSNs like this:

SSN_Col LIKE 
    '[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]'

By putting []s around one or more values you can test a single character for the pattern within the []s. So [a19c] tests to see if a single character is an a, 1, 9 or c. To test against a range you use a dash. So [0-9] tests a single character for a number from 0 to 9.

Now this has some obvious advantages over the TRY_CAST/CONVERT method. It’s a little bit more complicated but you get to put in an exact pattern. 123-45-6789. Numbers where you want them, dashes where you want them. However in my particular case my data isn’t all that good so I’m not sure if I have 1, 2 or no dashes.

So I can’t use as exact a pattern, but I could do this.

REPLACE(SSN_Col,'-','') LIKE 
    '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'

Now it doesn’t matter if I have dashes or not, I’ve stripped them out. I’m now testing that I have exactly 9 numbers. Absolutely perfect for testing against an SSN with uncertain delimiters. Unfortunately because of the REPLACE around the column any indexes on the SSN_Col table can’t be used.

So what’s another option? What I ended up with is going the negative approach. I want to find any character that ISN’T in the list. In pattern matching a ^ is used for just that. First we construct a pattern to find a given character. This is fairly basic and I think most DBAs have done something along these lines.

SSN_Col LIKE '%A%'

Now that is done we add the test for the existence of a number.

SSN_Col LIKE '%[0-9]%'

And last but least we test for the existence of any character that is NOT a number.

SSN_Col LIKE '%[^0-9]%'

And if you feel like allowing some symbols, for example the dash from above and maybe an @ sign then do this.

SSN_Col LIKE '%[^0-9-@]%'

In case I just confused anyone, the second dash is fine because it isn’t being used as a range. You can also add in an escape character if you want.

SSN_Col LIKE '%[^0-9\-@]%' ESCAPE '\' 

The ESCAPE clause of LIKE lets us set the escape character for this string.

As you can see pattern matching can get fairly complex. These are just a few samples of what pattern searching can do. To see the rest of the pattern matching options check out the BOL entry for Like (http://msdn.microsoft.com/en-us/library/ms179859.aspx) and Pattern Matching in Search Conditions(http://msdn.microsoft.com/en-us/library/ms187489(v=sql.105).aspx).

4 thoughts on “Pattern searches beyond ‘%string%’.

  1. sqlpathy says:

    Very nicely drafted. Thanks, Lokesh

  2. Andy Hayes says:

    This is a very nice demonstration of some pattern matching techniques Kenneth, thanks for sharing.

  3. Raju Angani says:

    Very well demonstrated, *****

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 3,753 other subscribers

Follow me on Twitter

ToadWorld Pro of the Month November 2013
%d bloggers like this: