Dataphor SQL RAC (Relational Application Companion)


A site of hope for those looking for a true relational database system

Saturday, August 12, 2006

Dataphor - a Banana Split

A common problem in sql is to split a delimited string without
using cursors or any type of procedural code. Talk about how
to have fun!:) And to spice it up, how about multiple delimiters.
Ok here's an example of splitting a string over rows with multiple
delimiters, colon(:), comma(,) and slash(/). The column 'STRING'
is what we want to tear apart over 'ID'. You will love this!

// Keywords are in bold.
select 
table
 {
 row {1 ID, 'AA: BB, CC/ DD, EE/ FF: GG' STRING},
 row {2,'KK, JJ: II, HH, AA: BB'},
 row {3,'LL : MM , NN / PP'},
 row {4,'QQQ / RRRR'},
 key {ID}
 }
  times 
   ToTable({0,1,2,3,4,5,6,7,8,9,10},'DIGIT')     
    where STRING.Split( { ':',',','/' } ).Count()>DIGIT
     add {STRING.Split( { ':',',','/' } ).Count() cnt,
          STRING.Split( { ':',',','/' } )[DIGIT].Trim() strpart}
          {ID,strpart,DIGIT+1 index,cnt} 
        order by {ID,index }

ID strpart index cnt
-- ------- ----- ---
1  AA      1     7
1  BB      2     7
1  CC      3     7
1  DD      4     7
1  EE      5     7
1  FF      6     7
1  GG      7     7
2  KK      1     6
2  JJ      2     6
2  II      3     6
2  HH      4     6
2  AA      5     6
2  BB      6     6
3  LL      1     4
3  MM      2     4
3  NN      3     4
3  PP      4     4
4  QQQ     1     2
4  RRRR    2     2

Lets go over this. I promise it won't take long, it's that simple.
First we'll get over the clutter of the virtual tables and create
some permanent ones. First we'll create table 'examplesplit'.

create table examplesplit
from
 table
 {
 row {1 ID, 'AA: BB, CC/ DD, EE/ FF: GG' STRING },
 row {2,'KK, JJ: II, HH, AA: BB'},
 row {3,'LL : MM , NN / PP'},
 row {4,'QQQ / RRRR'},
 key {ID}
 }
  adorn  STRING tags {Storage.Length = "100"}}

select examplesplit
ID STRING                 
-- --------------------------
1  AA: BB, CC/ DD, EE/ FF: GG
2  KK, JJ: II, HH, AA: BB 
3  LL : MM , NN / PP      
4  QQQ / RRRR

Don't worry about adorn, I'll come back to that in due time.
Now create a table of digits called 'numbers' with a column 'DIGIT'.

create table numbers
from
 ToTable({0,1,2,3,4,5,6,7,8,9,10},'DIGIT')

select numbers
DIGIT
-----
0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10

Now here we go with 2 versions of the new query to grap hold of
of you regardless of your inclination:)

select 
 examplesplit times numbers    
    where STRING.Split( { ':', ',','/' } ).Count()>DIGIT
     add {STRING.Split( { ':', ',','/' } ).Count() cnt,
          STRING.Split( { ':', ',','/' } )[DIGIT].Trim() strpart}
          {ID,strpart,DIGIT+1 index,cnt} 
       order by {ID,index}
select 
 examplesplit times numbers    
    where Split(STRING,{ ':',',','/' } ).Count()>DIGIT
     add {Split(STRING,{ ':',',','/' } ).Count() cnt,
          Split(STRING,{ ':',',','/' } )[DIGIT].Trim() strpart}
          {ID,strpart,DIGIT+1 index,cnt} 
       order by {ID,index}

ID strpart index cnt
-- ------- ----- ---
1  AA      1     7
1  BB      2     7
1  CC      3     7
1  DD      4     7
1  EE      5     7
1  FF      6     7
1  GG      7     7
etc.

The 'STRING.Split' is for those that are inclined to believe in
UFO's (objects) and their modus operandi of abductions (methods).
The 'Split(STRING' is for those that have a history of renditions
at the hands of sql.

Now 'examplesplit times numbers' is like sql's 'cross join'. Now
comes the kewl part. The Spit thingie splits a string so easily
because it consumes (as in eats up) the string as a 'list'. This
is a BIG deal. When is a door not a door? When it's a jar:) When is
a 'string' more than itself, when it's a 'list':) So with knowledge
of the delimiters Split is operating on a 'list'. If you have a list
you should know how many items are in it. And you do with Count():

STRING.Split( { ':',',','/' } ).Count()
Split(STRING,{ ':',',','/' } ).Count()

Lists are 0 based. So for ID 1 there are 7 strings in the list.
The elements in the list are therefore 0-6. So the where restriction
forms rows for each ID that have DIGIT less than the Count(). And
DIGIT can be used as an index into the list to return a particular
string, ie. 'strpart'. So to the table formed by times and where, for
each row we add the Count() as column 'cnt' and a particular string 
as column 'strpart' by using DIGIT in either of the expressions:

STRING.Split( { ':',',','/' } )[DIGIT].Trim() strpart
Split(STRING,{ ':',',','/' } )[DIGIT].Trim() strpart

You will see that this [] thingie, called an indexer, can be used not
only on lists but also tables whatever form they take. This construct
is crazy kewl and I'll show just how kewl and powerful it is as we
work our way thru Dataphor. Trust me:)

The expression surrounded by the swigglies:

{ID,strpart,DIGIT+1 index,cnt}

is used to return the columns you want and also to define new columns
which is what 'index' is.

By the way did it register that this is exactly the opposite of sql!
In sql you state what you want returned before you even tell it what
to process. Here we process stuff and after we're finished we can
pick and choose what we want. In sql it's like going to a resturant
and asking for the check before you've eaten let alone even ordered!
Waitress:'Welcome to Alice's Restaurant sir'. Sqler:'Hi, check please'.
Waitress thinking:'what a nitwit':) This is a big deal in writing
queries as you will see.

And did you notice that there is no 'from' in 'select'! It's not there
because you don't need it! It's redundant. Why do you have to write
'from' and the table. Just stating the table name should be sufficient.
Writing both is like paying for the same piece of real estate twice.
Come on, this isn't brain surgury it's just common sense and logical.
And Dataphor is filled with both:)

As for order by...well guess:)

I hope your enthused about Dataphors banana split compared to the
splitting headache in sql:)

Stay tuned:)

No comments:

About Me

My photo
Phoenix, Arizona, United States