Beyond Sql: July 2008

Wednesday, July 16, 2008

Demoralization by trigger

In an interesting blog by Conor Cunningham, 'The Trouble with Triggers', he says:
"The problem with this area is that there is a great temptation to think
 about databases procedurally, as you would with a programming language
 like C++ or C#.  You can write code that looks like a procedural function
 call and have it get called for each insert into table!  Before you know
 it, non-database programmers are checking in code to your production sysem.
 Suddenly, your application grinds to a halt because a trigger plan has no
 index or is poorly designed.
 Databases are great tools for working on SETS of rows."

In another related blog, 'Triggers...Evil?', there is this comment with an 
insightful Freudian slip:

James Luetkehoelter said:
>Some sorts of demoralization lend themselves to triggers...

I would say this hits the nail on the head. I could understand a developer
getting a case of depression triggered by Conors article. Triggers were
'implemented' to work efficiently on tables (sets) not on rows. The principle
that's operating here is that how something was implemented to be most
effective is the basis for what's best in application development. Are you
kidding me, has everyone gone nuts? ☺ Because triggers don't consider a row
as a primary concept 'functional' programmers, application developers, must
'unlearn' their database contrarian views. This is Celkos 'them' vs. 'us' 
nonsense. Never mind that the real subject is application development and
possibly a theory that would best serve it, the basis for key concepts is 
what a bunch of programmers did for a MS product manager. Talk about the
tail wagging the dog ☺ Not only is the absence of a 'row' type or at least
concept antithetical to a relational dbms but it's central to application
development. Perhaps to even the score MS decided developers should learn
entities and unlearn sql entirely, LINQ. Or perhaps we'll get a hole new
science of application development based on what works fastest ☺

Sunday, July 06, 2008

An opposite constraint

This article discusses an issue raised in the thread:

microsoft.public.sqlserver.programming
Sunday, June 29, 2008 
'Storing a collection of lines'
http://tinyurl.com/56dpd4

Give two columns in a table suppose you want to eliminate the opposite
data where there's no mathematical or logical relationship between
the columns. For example, consider a trip between two cities. It's 
equally likely a trip could start and end in either direction. If the 
table already has:

column A  column B
--------  --------
NEW YORK  CHICAGO

We want to prevent the opposite from being entered:

column A  column B
--------  --------
CHICAGO   NEW YORK  

If the table has:

column A  column B
--------  --------
CHICAGO   NEW YORK  

We want to prevent the opposite from being entered:

column A  column B
--------  --------
NEW YORK  CHICAGO

Because there's no relationship between the columns an sql check
constraint can't be used. But in Dataphor a simple transition
constraint can be used. The Opposite constraint simply checks if 
the opposite data for columns A and B already exists in the table. 
If it does the current row is rejected. The constraint works the
same way for a single insert as it does for inserting multiple rows
(a table). 
For example:

create session table MyTable
{ A:String,B:String,C:String,key{A,B} };

alter table MyTable
{
  create transition constraint Opposite
   on insert //The current (row) values for columns A and B are accessed 
             //by prefixing each with 'new', ie. new.A, new.B . 
    not exists (MyTable {A X,B Y} {Y A,X B} where A=new.A and B=new.B)  
 tags        //A custom error message can be written using the current 
             //row values (new.A, new.B).
 { DAE.Message =
   "'For A: ' + new.A + ' and B: '+ new.B + ' there is a opposite, A: ' + new.B + ' and B: ' + new.A " 
 }   
};

These insert succeed:

insert row{'NEW YORK' A,'CHICAGO' B,'1c' C} into MyTable;
insert row{'CALIFORNIA' B,'TEXAS' A,'1d' C} into MyTable;

Inserting an opposite will fail and the custom error message will be raised:

insert row{'NEW YORK' B,'CHICAGO' A,'1e' C} into MyTable;
"For A: CHICAGO and B: NEW YORK there is a opposite, A: NEW YORK and B: CHICAGO"

insert row{'CALIFORNIA' A,'TEXAS' B,'1d' C} into MyTable;
"For A: CALIFORNIA and B: TEXAS there is a opposite, A: TEXAS and B: CALIFORNIA"

Given that the table contains the data "NEW YORK" (A) and "CHICAGO" (B), 
inserting the following rows as a table will fail:

insert table{
             row{'DENVER' A,'BOSTON' B,'1c' C},
             row{'RENO' B,'MIAMI' A,'1d' C},
             row{'CHICAGO' A,'NEW YORK' B,'1e' C} 
            } into MyTable;
       
"For A: CHICAGO and B: NEW YORK there is a opposite, A: NEW YORK and B: CHICAGO. 

Note that the primary key constraint will eliminate the same A and B cities from
being entered twice but entering the opposite cities does not violate it. That's
what the Opposite constraint is for. The Opposite constraint is much simpler than
an sql solution using triggers.

A one-one requirement constraint

The issue of a one-one requirement, a new order inserted must be
accompanied by a detail row (insert), was raised in the thread:

microsoft.public.sqlserver.programming
Thursday, June 26, 2008 
'update joined tables'
http://tinyurl.com/69hh5c

While this is a non-trivial exercise in sql it's quite easy in the
relational system Dataphor. For example:

create session table SOrders
{
 order_nbr:Integer,
 A:Integer,
 key{order_nbr}
}; 
 
create session table SOrderDetails
 {
  order_nbr:Integer,
  sku:Integer,
  B:Integer,
  C:String,
  key{order_nbr,sku},
  reference Details_SOrders{ order_nbr } references SOrders { order_nbr }
 };

This constraint makes a detail row a requirement for a new order. The user
defined message will be displayed anytime the constraint is violated.

alter table SOrders
  {
 create transition constraint MustHaveOrderDetails
  on insert 
    exists(SOrderDetails where order_nbr=new.order_nbr)  
   tags
 {
   DAE.Message =
          "'A detail row with order#: ' + ToString(new.order_nbr) + ' is required'" 
        }    
  };   
 
Inserting just a new order will violate the constraint:
insert row{1 order_nbr, 14 A} into SOrders; 
Violation of constraint "MustHaveOrderDetails", "A detail row with order#: 1 is required."

Inserting data can easily be done via a view.

create session view VSOrderDetails
                SOrders join SOrderDetails;//A natural join on order_nbr.

The MustHaveOrderDetails constraint works for the insertion of a single
row or a table.

insert row{1 order_nbr, 1 sku, 14 A, 5 B, 'Joe' C} into VSOrderDetails;     

delete SOrders;

Insert into the tables thru the view.

insert table
            {
             row{1 order_nbr, 1 sku, 14 A, 5 B, 'Joe' C},
             row{1,2,9,23,'Steve'},
             row{2,3,34,2,'Larry'}
            } into VSOrderDetails;   

select SOrders;

order_nbr A  
--------- -- 
1         14 
2         34 

select SOrderDetails;            

order_nbr sku B  C     
--------- --- -- ----- 
1         1   5  Joe   
1         2   23 Steve 
2         3   2  Larry 

This rather simple example shows off the much higher level of abstraction
offered to developers with a relational system like Dataphor over sql. In
sql only one table may be updated with a view. In Dataphor there is no
concept of updating multiple tables. There is only the idea of updating a
view/expression which is a table. It's the system responsibility to resolve
a view to its base tables given the constraints in the schema. The user need
only realize a table is being used for data modification and it's irrelevant
how that table was constructed as well as the number of tables in its definition.

Friday, July 04, 2008

Justifying an Sql Cop Out

This is my response to Joe Celko in the thread:

microsoft.public.sqlserver.programming
Thursday, June 26, 2008 
'update joined tables'
http://tinyurl.com/69hh5c

Joe invokes the 'NP complete' argument as the basis for why sql can't
handle complicated modeling issues. The specific modeling issue here
concerns a schema where there are multiple paths between tables due to
cascading updates. My view is the NP complete argument is essentially a
cop out. The same schema with multiple cascading updates that is rejected
by sql server is perfectly acceptable in Dataphor (along with associated
views involving multiple tables that can resolve data to the base tables).
(I suggest reading thru the whole thread, it's interesting ☺)

My response in the thread:

What do we call a man who demurs to an obstinate woman on every decision? We 
call him a whimp, a guy who doesn't think it's worth the effort to present 
his counter point because he's convinced it's impossible to change her 
mind. The bottom line is, if it's so hard to change her mind why even 
bother. And an easy out for taking any responsibility for control of the 
outcome. Here you're using the idea of an NP complete problem as a crutch 
for the mess sql has left developers in. Woman may be an NP complete problem 
for man but it doesn't follow that given a relationship all men are whimps. 
Just because there isn't a universal and instantaneous quick fix technique 
to change a woman's mind in any situation doesn't bum out all men. Some guys 
are inventive and creative and come up with techniques that will work at 
least in some situations. Isn't that what lying, pleading and begging are 
for?:) NP complete system problems are brick walls not because there's no 
way to solve them but because there's no good quick fix from a systems point 
of view. But that doesn't mean that dbms should whimp out on them. You're using 
NP to suggest an all or nothing game. Since no efficient computational 
scheme exists to cover all situations where referential integrity involving 
cascading updates comes into play then sql is going take its ball and just 
go home. This is at most nonsense, at the very least no sense. You're setting 
poor sql up as the victim here and whining about Petri Nets! The perfect 
solution doesn't exist, so what. Sql is/was just being damn lazy. All they 
had to do is talk to a bunch of guys who have been married for twenty years 
to get a clue:) Like they couldn't use some heuristics or approximations? 
Nope they just whimped out. Same with views. Sql is the victim of a yet to 
be found super quick universal solution to updatability. This is simply an sql 
crutch for abandoning the idea totally! They couldn't be creative and cover 
some percentage of possible expressions that could be resolved to base tables? 
Apparently not. So sql gets away with only being able to update a single table
and sticks a huge inconsistency between a table and view/expression in the face
of developers.

Here's an sql server example that has multiple cascading update paths:

-- Updating order_nbr in TableA cascades to order_nbr in TableB and TableC.
create table TableA
(
 order_nbr Integer,
 A         Integer,
 primary key(order_nbr)
) ;
 
 --Updating sku in TableB cascades to sku in TableC.
 create table TableB
(
 order_nbr Integer,
 sku       Integer,
 B         Integer
 primary key(order_nbr,sku),
CONSTRAINT  Details_TableB foreign key (order_nbr) REFERENCES TableA (order_nbr)
ON UPDATE CASCADE
) ;
  
 -- TableC has a dependency/path on TableA and TableB.
 create table TableC
(
 order_nbr Integer,
 sku       Integer,
 ID        Integer,
 C         char(1),
primary key (order_nbr,sku,ID),
CONSTRAINT Details_TableC1 FOREIGN KEY (order_nbr)     REFERENCES TableA (order_nbr)
ON UPDATE CASCADE,
CONSTRAINT Details_TableC2 FOREIGN KEY (order_nbr,sku) REFERENCES TableB (order_nbr,sku)
ON UPDATE CASCADE
 ) ;

Now is this a non-trival problem for a dbms. Yes it is. Is it insolvable in 
terms of solution and efficieny? Of course not. But when you try to create 
TableC you're told by sql server that it can't do it:

"Introducing FOREIGN KEY constraint 'Details_TableC2' on table 'TableC' may 
 cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or 
 ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints."

Game over. Anything other than the most trivial path is outside the scope of 
the 'system' to handle. Users are forever told the sql game is 'what' not 
'how'. But when it's convenient and expedient you turn the idea around and 
say 'how' can this be done when there's no known (NP complete) solution. 
Nice sophistry!:) Now this same DDL, with some modification, is perfectly 
fine in Dataphor using sql server to persist the tables. And using views, 
such as TableA join TableB, is also fine. The views will resolve data to the 
tables and the update cascades work as expected. Do you really think that 
this problem is NP complete for the Sql Server team but not for Dataphor? 
Give me a break:)
Users should understand that because sql doesn't try to solve complex 
modeling issues and doesn't attempt to resolve views/expressions with 
multiple tables that this doesn't mean these things can't be solved. Dbms 
can be built so these things are handled by the 'system' and the user isn't
left twisting in the wind making a mess in the front end. Will something
like Dataphor be able to handle every conceivable combination of cascades
between tables along with other types of constraints? Of course not, but it
will handle a lot of tricky schemes that sql can't. Is Dataphor capable of
resolving (updatability) every view/expression to the base tables? Of course
not, but it will handle a large chunk of them. And in each case these things
are transparent to the user. It would be nice to have a little honesty about
this. Instead of creating straw men and red herrings just tell it like it is. 
The framework/foundation of sql is just not able to handle these issues. This
dye was cast a long time ago (and yes sql was lazy on a lot of fronts). And 
there's little hope of sql addressing these things in the future. This
certainly doesn't make sql irrelevant. It's a 'right tool for the job' world. 
I think sql folks can handle it. After all, not all sql users are whimps:)

Wednesday, July 16, 2008

Demoralization by trigger

Sunday, July 06, 2008

An opposite constraint

A one-one requirement constraint

Friday, July 04, 2008

Justifying an Sql Cop Out

Beyond Sql

Blog Archive

Links

About Me