Dataphor SQL RAC (Relational Application Companion)

A site of hope for those looking for a true relational database system

Saturday, December 20, 2008

Extracting numbers from a string

This example is based on the question asked in the thread:

Dec 18, 2008
Pulling a number out of an nvarchar field

The OP was interested in pulling out the 1st occurrence of a number in a
string. The string has numbers and letters. So if the string is 'XY40A3' 
we want the number 40 which is the 1st of two numbers in the string.

This is very easy to do in Dataphor. Just as a table of numbers comes in
handy for solving many different problems so does a list of letters. 
The idea here is to treat each letter in the string as a 'delimiter'. We
then split the string using the list of letters as the delimiters so
what results are the number(s) in the string.

We can store the list (the letters of the alphabet) in a temporary table
for easy access.

create session table AlphaTable
 { AlphaList:list(String),
   key{ }
             { {'A','B','C','D','E','F','G','H','I','J','K','L','M','N',
                'O','P','Q','R','S','T','U','V','W','X','Y','Z'} AlphaList}

For example if we split a string, transform it to a table  and remove any
blanks we'll just have numbers left. If we order by sequence (Index) it will
show the numbers as they occur from left to right. 

select ToTable('XY40A3'.Split(AlphaTable[].AlphaList),'StrNum','Index' )
                where StrNum>' ' 
                  order by {Index};                  

StrNum Index 
------ ----- 
40     2     
3      3     
Not only would it be easy to get the 1st number but we can get the
occurrence of any number easily. The 1st occurrence is just a special
case of the general problem of getting the Nth occurrence in a string.

By using ToTable(ToList(cursor by Index (that follows the order of
numbers from left to right in the string) we can create a consecutive rank
from 1 to N over the table of numbers that will allow direct access to the 
Nth number (if it exists).
                         ToTable('XY40A3RAD853'.Split(AlphaTable[].AlphaList),'StrNum','Index' )
                         where StrNum>' ' 
                         {ToInteger(StrNum) Num,Index}
                         order by {Index}
         {Num,Index,sequence+1 NthIndex} ;
Num Index NthIndex 
--- ----- -------- 
3   3     2        
40  2     1        
853 6     3               
Here is the an operator for the Nth occurrence that takes into account
lower case letters and returns a -1 if the Nth occurrence doesn't exist.
create session operator NthNum(AStr:String,N:Integer):Integer
var T1:= 
                         ToTable(Upper(AStr).Split(AlphaTable[].AlphaList),'StrNum','Index' )
                         where StrNum>' ' 
                         {ToInteger(StrNum) Num,Index}
                         order by {Index}
         {Num,Index,sequence+1 NthIndex};
result:=IfNil((T1 adorn{key{NthIndex}})[N].Num,-1);

select NthNum('SF346fs47sGs759 ',1); //returns 346
select NthNum('SF346fs47sGs759 ',2); //returns 37
select NthNum('SF346fs47sGs759 ',3); //returns 759  
select NthNum('SF346fs47sGs759 ',4); //returns -1

Here a table of strings in stored in an Sql Server database from which
we can extract the 1st occurrence of a number.

create table FooStrings
 datacol:String nil,
 row{1 keycol, 'XYZ40AB' datacol},
 row{2, 'WX32A'},
 row{3, '27 blah'},
 row{4, 'A87BNC30'},
 row{5, 'XY40A3'},
 row{6, 'TWFD'},
 row{7, 'XYA53GH5JGV934'},
 row{8, '7'},
 row{9, nil}
select FooStrings add{NthNum(IfNil(datacol,' '),1) MyNumber}
                   with {IgnoreUnsupported = 'true'}
                    order by {keycol};
keycol datacol        MyNumber 
------ -------------- -------- 
1      XYZ40AB        40       
2      WX32A          32       
3      27 blah        27       
4      A87BNC30       87       
5      XY40A3         40       
6      TWFD           -1       
7      XYA53GH5JGV934 53       
8      7              7        
9      <No Value>     -1   

Think how easy the operator can be modified if it's desired to return
the minimun or maximum number etc.

If you would like to see a coherent and easy to follow all t-sql solution
that doesn't cobble together string functions into spaghetti code see:
RAC - Are you Coordinated?

But I think you'll agree the much preferred solution is in Dataphor 


Anonymous said...

Іt's very effortless to find out any topic on net as compared to textbooks, as I found this post at this website.

Have a look at my page -
Also visit my web page Suggested Reading

Anonymous said...

Ѕimρly ωant to ѕаy уour агticle iѕ aѕ surprіѕing.

The сleаrness in your ρost iѕ sіmρly sрeсtаculаr
and і can аssume yоu aгe an
еxрeгt on this ѕubject.
Well ωith уour permіssion let mе to grаb уour
feed to keep up tο date with forthcoming ρoѕt.
Thanks a milliоn and please continue the enjoуablе woгk.

Have a lοoκ at my blog - relieve pain with a tens unit

Anonymous said...

This eхcellent websitе tгuly has аll
the іnfο I wanted concеrnіng thiѕ ѕubјect and
didn't know who to ask.

My weblog -

Anonymous said...

Its like you read my mind! You appear to know so much about this, like you wrote the book in it or something.
I think that you can do with some pics to drive the message home
a bit, but instead of that, this is fantastic blog.
A great read. I will certainly be back.

Look into my webpage: website ()

About Me

My photo
Phoenix, Arizona, United States