Dataphor SQL RAC (Relational Application Companion)


A site of hope for those looking for a true relational database system

Sunday, June 03, 2007

Dataphor - String differences operator

The following is a string operator based on a suggestion in the thread:

microsoft.public.sqlserver.programming
Friday, June 01, 2007 7:53 AM
Compare String
http://tinyurl.com/38e7zg

It is written in D4 using Dataphor from:
www.alphora.com

(Go to operator Cstr).

Given a comma delimited string in the form of:
' string1 , string2 '
the the operator returns a row with the following columns:
removed - the difference between string1 and string2 (string1 diff-> string2). 
          It is the strings that remain in string1 after removing any string
          also found in string2.
added -   is the same as removed but in the opposite direction, ie. the
          difference between string2 diff-> string1.
IsRem -   a boolean. True if there is a difference in the direction of removal,
          false otherwise.
IsAdd -   boolean, same as IsRem but from the direction of added.
IsSame -  a boolean, true if there are no differences from either direction
          and the order of individual string parts are the same in string1
          and string2. If the string parts in each string are equal and they
          occur in the same position they are the same otherwise they are not. 
          This scalar is based on an observation by Gert-Jan Strik in the
          thread.
          
In the event that there is no difference the operator returns the string
for the appropriate direction. If there's no difference for removed string1
is returned cleanly formatted, ie a blank between each string part. The
same for added. Whether or not a difference exists can be determined by
checking IsRem and IsAdd. The same formatting extends for differences.
Removed and added are always formatted with a blank between each string
part.

Note that it is assumed that a blank is the delimiter within string1
and string2. This can easily be changed as the 'Split' operator can
take a list of delimiters. Also note how nicely the D4 'without' construct
lends itself to obtaining the difference in either direction between the
strings. Lastly, instead of passing in a delimited string a list could
have been used. In this case it does not appear to make much difference
which one is used.

The operator returns a row type. No declarative programming is used. Only
dedicated list, string and relational operators are used. While a bit long
winded compared to a dedicated D4 declarative approach, it illustrates the
relative ease with which the expressive power of the language can be used.

***********************************************************

Examples:

1.
 This is the example from the thread. It shows multiple differences in
 each direction.
 select Cstr('The red fox is big and fat. The,The green frog is slimey. The g');
 This returns the following row:

 removed              added                IsRem IsAdd IsSame 
 -------------------- -------------------- ----- ----- ------ 
 red fox big and fat. green frog slimey. g True  True  False  

2.
 A single difference in each direction.
 select Cstr(' B1  C1 A1 , D1  B1  A1');

 removed added IsRem IsAdd IsSame 
 ------- ----- ----- ----- ------ 
 C1      D1    True  True  False  

3.
 select Cstr(' white  red blue  black  , red yellow  pink '); 

 removed          added       IsRem IsAdd IsSame 
 ---------------- ----------- ----- ----- ------ 
 white blue black yellow pink True  True  False  

4.
 select Cstr(' white  red blue  black  , white  red blue  black pink '); 

 Since there is no difference from removal (string1) string1 is returned
 and IsRem is False. There is a single difference from added and IsAdd
 is True.

 removed              added IsRem IsAdd IsSame 
 -------------------- ----- ----- ----- ------ 
 white red blue black pink  False True  False  

5.
 select Cstr(' white  red blue  black pink  , white  red blue  black  '); 

 Same as the last example but the difference is reversed.

 removed added                IsRem IsAdd IsSame 
 ------- -------------------- ----- ----- ------ 
 pink    white red blue black True  False False  

6.
 select Cstr(' white  red blue  black pink  , white  red blue  black  pink'); 

 There is no difference in either direction and string1 and string2
 are returned. Since the string parts (colors) are the same and occur in
 the same order IsSame is True.

 removed                   added                     IsRem IsAdd IsSame 
 ------------------------- ------------------------- ----- ----- ------ 
 white red blue black pink white red blue black pink False False True   

7.
 select Cstr(' white  red blue  black pink  , black  pink blue  white  red'); 

 The colors in each string match but occur in a different order. Therefore
 IsSame is False

 removed                   added                     IsRem IsAdd IsSame 
 ------------------------- ------------------------- ----- ----- ------ 
 white red blue black pink black pink blue white red False False False  

8.
 This returns only the scalar removed from the row.

 select Cstr(' B1  C1 A1 , D1  B1  A1').removed ;

 C1

9.
 This returns it as the column Rem.

 select TableDee add {Cstr(' B1  C1 A1 , D1  B1  A1').removed Rem} ;

 Rem 
 --- 
 C1  

10.
 Assigning the result to a variable. Note that the argument (operand) to
 the operator can be constructed within the call.

 var Rem:String;
 var Str1:=' working with variables in Dataphor ';
 var Str2:=' working with variables';
 var Str:=Cstr(Str1+','+Str2);
 if Str.IsRem then
 Rem:=Str.removed
  else
    Rem:=' ';
 select Rem;
 
 in Dataphor
 
11.
 The differences can be used to add a column to the result showing
 the string parts in common.
 
 var Str1:='BLACK WHITE PURPLE GREEN ORANGE YELLOW';
 var Str2:='WHITE BLACK PINK NAVY PURPLE YELLOW GOLD BEIGE';
 select 
  Cstr(Str1+','+Str2)
    add
       {
        Concat({Str,Del}
         from
         (
          (ToTable((Str1+','+Str2).Split({',',' '}),'Str')
          {Str.Trim() Str} where Str>' ')
            without
             ToTable( ((Cstr(Str1+','+Str2) 
               {removed+' '+added Str}).Str).Split({' '}),'Str')              
          ) add {' ' Del}
             order by {Str})
        commonstr
        }
   {removed,added,commonstr,IsRem,IsAdd};     

removed      added                commonstr                 IsRem IsAdd 
------------ -------------------- ------------------------- ----- ----- 
GREEN ORANGE PINK NAVY GOLD BEIGE BLACK PURPLE WHITE YELLOW True  True  

12.
 Define a table variable to hold multiple rows of the operator.

 var TableStr:=
 table
       {
        Cstr(' A1 b1 C1 d1 , A1 B1 E1 F2 ') add{1 RowID},
        Cstr(' A B C , ') add{2 RowID},
        Cstr('AA A AA  a ,   AA   a')add{3 RowID},
        Cstr(' , ') add {4 RowID},
        Cstr(' 10 8 7 6 , 10   8  7 6 ') add{5 RowID},
        Cstr(' z  1  44_A , 1 44_A z ') add{6 RowID}
       } 
        {RowID,removed,IsRem,added,IsAdd,IsSame} ;   
 select TableStr order by {RowID};
        
RowID removed  IsRem added    IsAdd IsSame     
----- -------- ----- -------- ----- ---------- 
1     b1 C1 d1 True  B1 E1 F2 True  False      
2     A B C    False          False False      
3     A        True  AA a     False False      
4              False          False <No Value> 
5     10 8 7 6 False 10 8 7 6 False True       
6     z 1 44_A False 1 44_A z False False      

13.
 Accessing data in MS Sql Server 2005.
 
 Create some data in the pubs database and access it with Dataphor.
 
 create table Tstrings
  (RID int primary key,Astr varchar(150))
 insert Tstrings values(1,'AA BB CC DD TT, YY NN AA BB R1')
 insert Tstrings values(2,'FF1  4564 33AAA T1 55, 33AAA FF1 33AA Y4564 ')
 insert Tstrings values(3,'100 81 1 353 30 A64,1 81 82 R34 353 A353')
 insert Tstrings values(4,'The red fox is big and fat. The,The green frog is slimey. The g')
 insert Tstrings values(5,'This  is a fun  test, This  is a fun  test  ')
 insert Tstrings values(6,'This  is a fun  test, This test is a fun ')
 insert Tstrings values(7,'Does this stuff make sense? -:), ')
 insert Tstrings values(8,'I hope you,I hope you explore Dataphor ')

Reconcile(Name("Pubs"));//Make the data available to Dataphor.

select
 Tstrings 
  add
     {
      Cstr(Astr).removed removed,Cstr(Astr).added added,
      Cstr(Astr).IsRem IsRem,Cstr(Astr).IsAdd IsAdd,
      Cstr(Astr).IsSame IsSame
      } 
      with {IgnoreUnsupported='true'}
      remove{Astr}; //So the row doesn't get too long-:)
      
RID removed                         added                IsRem IsAdd IsSame 
--- ------------------------------- -------------------- ----- ----- ------ 
1   CC DD TT                        YY NN R1             True  True  False  
2   4564 T1 55                      33AA Y4564           True  True  False  
3   100 30 A64                      82 R34 A353          True  True  False  
4   red fox big and fat.            green frog slimey. g True  True  False  
5   This is a fun test              This is a fun test   False False True   
6   This is a fun test              This test is a fun   False False False  
7   Does this stuff make sense? -:)                      False False False  
8   I hope you                      explore Dataphor     False True  False  

End of examples.

***********************************************************

In sql (MS Sql Server) a procedure can have multiple output parameters.
In D4 there is no such concept as multiple output parameter. Multiple
output parameters in D4 means either multiple operators each returning
a scalar value, returning a row or returning a table with a single row.

create operator Cstr(Str:String):
 row{removed:String,IsRem:Boolean,IsSame:Boolean,added:String,IsAdd:Boolean}
begin
result:=row of typeof(result){};
var ListStr:=Str.Split({','});
//Check that input operand Str has a single comma, raise error and exit if false.
var StrCnt:=ListStr.Count();
if StrCnt<>2 then
begin
raise Error
("Input Str: "+ Unicode({39})+ Trim(Str) + Unicode({39}) +
  " has " + ToString(StrCnt-1) + " comma's, 
  the operator expects a single comma delimited string ");
exit;
end;
var Idummy:Integer;
if ListStr.Count()=1 then
Idummy:=ListStr.Add(' ');
//Form row (all scalar values) and modify row values when appropriate.
result:=
row
{
//removed
 (
 IfNil(Concat ({Str,Del}
  from
 (
  (
   (ToTable(ListStr[0].Split({' '}),'Str') {Str.Trim() Str} where Str>' ' )     
    without
     (ToTable(ListStr[1].Split({' '}),'Str') {Str.Trim() Str} where Str>' ') 
   )
   join
    ((ToTable(ListStr[0].Split({' '}),'Str') {Str.Trim() Str,sequence seq} where Str>' ')   
      add{' ' Del})
  )
      order by {seq} 
  ),ListStr[0])
  ) removed,
//added
 (
  IfNil(Concat ({Str,Del}
  from
  (
   (
    (ToTable(ListStr[1].Split({' '}),'Str') {Str.Trim() Str} where Str>' ')     
    without
     (ToTable(ListStr[0].Split({' '}),'Str') {Str.Trim() Str} where Str>' ') 
   )
   join
    ((ToTable(ListStr[1].Split({' '}),'Str') {Str.Trim() Str,sequence seq} where Str>' ')   
      add{' ' Del})
  )
      order by {seq} 
  ),ListStr[1])
  ) added, true IsRem, true IsAdd,  false IsSame
};  
//Check removed.
if (result.removed=ListStr[0]) or
 (
   (
    ToTable((result.removed).Split({' '}),'Str')
     {Str.Trim() Str} where Str>' '
   )  
         =
   (
     ToTable(ListStr[0].Split({' '}),'Str')
      {Str.Trim() Str} where Str>' '
   )   
 )  
then
update result 
 set 
    {
     IsRem:=false,
     removed:=IfNil(Concat( {Str,Del}
               from 
                  (
                   ToTable(ListStr[0].Split({' '}),'Str')
                     where Str>' ' add{' ' Del}
                  )
                    order by {sequence}
                    ),' ')          
    };
//Check added.
if (result.added=ListStr[1]) or
 (
   (
    ToTable((result.added).Split({' '}),'Str')
     {Str.Trim() Str} where Str>' '
   )  
         =
   (
     ToTable(ListStr[1].Split({' '}),'Str')
      {Str.Trim() Str} where Str>' '
   )   
 )  
then
update result 
 set 
    {
     IsAdd:=false,
     added:=IfNil(Concat( {Str,Del}
               from 
                  (
                   ToTable(ListStr[1].Split({' '}),'Str')
                     where Str>' ' add{' ' Del}
                  )
                    order by {sequence}
                    ),' ')          
    };
//If IsRem is false, are the 2 strings the same. 
if not(result.IsRem) then
if
(
(ToTable(ToList(cursor(ToTable(ListStr[0].Split({' '}),'Str')
                 {Str.Trim() Str,sequence seq} 
                    where Str>''
                     order by {seq}))) remove {seq})   
                   =
(ToTable(ToList(cursor(ToTable(ListStr[1].Split({' '}),'Str')
                 {Str.Trim() Str,sequence seq} 
                    where Str>''
                     order by {seq}))) remove {seq})
) then
update result
       set {IsSame:=true};  
//We cannot logically say 2 blank strings are the same, can we?       
if Length(Replace(Replace(Str,',',''),' ',''))=0 then
 update result
       set {IsSame:=nil};
        
end;

No comments:

About Me

My photo
Phoenix, Arizona, United States