Dataphor SQL RAC (Relational Application Companion)


A site of hope for those looking for a true relational database system

Friday, July 04, 2008

Justifying an Sql Cop Out

This is my response to Joe Celko in the thread:

microsoft.public.sqlserver.programming
Thursday, June 26, 2008 
'update joined tables'
http://tinyurl.com/69hh5c

Joe invokes the 'NP complete' argument as the basis for why sql can't
handle complicated modeling issues. The specific modeling issue here
concerns a schema where there are multiple paths between tables due to
cascading updates. My view is the NP complete argument is essentially a
cop out. The same schema with multiple cascading updates that is rejected
by sql server is perfectly acceptable in Dataphor (along with associated
views involving multiple tables that can resolve data to the base tables).
(I suggest reading thru the whole thread, it's interesting ☺)

My response in the thread:

What do we call a man who demurs to an obstinate woman on every decision? We 
call him a whimp, a guy who doesn't think it's worth the effort to present 
his counter point because he's convinced it's impossible to change her 
mind. The bottom line is, if it's so hard to change her mind why even 
bother. And an easy out for taking any responsibility for control of the 
outcome. Here you're using the idea of an NP complete problem as a crutch 
for the mess sql has left developers in. Woman may be an NP complete problem 
for man but it doesn't follow that given a relationship all men are whimps. 
Just because there isn't a universal and instantaneous quick fix technique 
to change a woman's mind in any situation doesn't bum out all men. Some guys 
are inventive and creative and come up with techniques that will work at 
least in some situations. Isn't that what lying, pleading and begging are 
for?:) NP complete system problems are brick walls not because there's no 
way to solve them but because there's no good quick fix from a systems point 
of view. But that doesn't mean that dbms should whimp out on them. You're using 
NP to suggest an all or nothing game. Since no efficient computational 
scheme exists to cover all situations where referential integrity involving 
cascading updates comes into play then sql is going take its ball and just 
go home. This is at most nonsense, at the very least no sense. You're setting 
poor sql up as the victim here and whining about Petri Nets! The perfect 
solution doesn't exist, so what. Sql is/was just being damn lazy. All they 
had to do is talk to a bunch of guys who have been married for twenty years 
to get a clue:) Like they couldn't use some heuristics or approximations? 
Nope they just whimped out. Same with views. Sql is the victim of a yet to 
be found super quick universal solution to updatability. This is simply an sql 
crutch for abandoning the idea totally! They couldn't be creative and cover 
some percentage of possible expressions that could be resolved to base tables? 
Apparently not. So sql gets away with only being able to update a single table
and sticks a huge inconsistency between a table and view/expression in the face
of developers.

Here's an sql server example that has multiple cascading update paths:

-- Updating order_nbr in TableA cascades to order_nbr in TableB and TableC.
create table TableA
(
 order_nbr Integer,
 A         Integer,
 primary key(order_nbr)
) ;
 
 --Updating sku in TableB cascades to sku in TableC.
 create table TableB
(
 order_nbr Integer,
 sku       Integer,
 B         Integer
 primary key(order_nbr,sku),
CONSTRAINT  Details_TableB foreign key (order_nbr) REFERENCES TableA (order_nbr)
ON UPDATE CASCADE
) ;
  
 -- TableC has a dependency/path on TableA and TableB.
 create table TableC
(
 order_nbr Integer,
 sku       Integer,
 ID        Integer,
 C         char(1),
primary key (order_nbr,sku,ID),
CONSTRAINT Details_TableC1 FOREIGN KEY (order_nbr)     REFERENCES TableA (order_nbr)
ON UPDATE CASCADE,
CONSTRAINT Details_TableC2 FOREIGN KEY (order_nbr,sku) REFERENCES TableB (order_nbr,sku)
ON UPDATE CASCADE
 ) ;

Now is this a non-trival problem for a dbms. Yes it is. Is it insolvable in 
terms of solution and efficieny? Of course not. But when you try to create 
TableC you're told by sql server that it can't do it:

"Introducing FOREIGN KEY constraint 'Details_TableC2' on table 'TableC' may 
 cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or 
 ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints."

Game over. Anything other than the most trivial path is outside the scope of 
the 'system' to handle. Users are forever told the sql game is 'what' not 
'how'. But when it's convenient and expedient you turn the idea around and 
say 'how' can this be done when there's no known (NP complete) solution. 
Nice sophistry!:) Now this same DDL, with some modification, is perfectly 
fine in Dataphor using sql server to persist the tables. And using views, 
such as TableA join TableB, is also fine. The views will resolve data to the 
tables and the update cascades work as expected. Do you really think that 
this problem is NP complete for the Sql Server team but not for Dataphor? 
Give me a break:)
Users should understand that because sql doesn't try to solve complex 
modeling issues and doesn't attempt to resolve views/expressions with 
multiple tables that this doesn't mean these things can't be solved. Dbms 
can be built so these things are handled by the 'system' and the user isn't
left twisting in the wind making a mess in the front end. Will something
like Dataphor be able to handle every conceivable combination of cascades
between tables along with other types of constraints? Of course not, but it
will handle a lot of tricky schemes that sql can't. Is Dataphor capable of
resolving (updatability) every view/expression to the base tables? Of course
not, but it will handle a large chunk of them. And in each case these things
are transparent to the user. It would be nice to have a little honesty about
this. Instead of creating straw men and red herrings just tell it like it is. 
The framework/foundation of sql is just not able to handle these issues. This
dye was cast a long time ago (and yes sql was lazy on a lot of fronts). And 
there's little hope of sql addressing these things in the future. This
certainly doesn't make sql irrelevant. It's a 'right tool for the job' world. 
I think sql folks can handle it. After all, not all sql users are whimps:)

No comments:

About Me

My photo
Phoenix, Arizona, United States