Dataphor SQL RAC (Relational Application Companion)


A site of hope for those looking for a true relational database system

Monday, July 02, 2007

Sql - History repeats itself simulating lists

As explained here, before sql-99 analytic functions a rank was expressed
in the form of a subquery:

(select aggregate from <foo> where predicate) as rank

where Count was generally used as the aggregate and the predicate was an 
inequality comparison (table2.Y<=table1.Y). As the definition of the rank
was refined the predicate became progressively more complex. If duplicate
values existed and unique ranks desired then another column had to be used
to resolve ties. An expanded predicate could look like:

where (table2.Y<table1.Y) OR (table1.Y=table2.Y AND table2.RowID<=table1.RowID)

Many experts were so uncomfortable with the nature of computing ranks
that they shifted the intent from a rank to a 'line number'. With this
slight of hand they were able to argue that it was a 'format' feature 
and therefore on the conceptual level of 'reporting'. And such things are
always done on the client. 
With the introduction of sql-99 ranking functions and the benefit of clear
hindsight the whole issue of how ranks were previously computed revealed
itself. Compared to the simplicity and efficiency of ranking functions
the sql queries were seen for the odious things they were. What was thought
to be 'cleaver' sql was really a query worthy of Dr. Frankenstein. And any
calls for removal to the client disappeared. So one would think sql users
would have learned an important lesson - the simulation of a construct
by a bunch of unrelated constructs is a poor substitute for the real
thing. But it seems sqls span of attention is short and history is repeating
itself. We can add the idea of simulating a 'list' to the misbegotten ranks.

Just as an aggregate and predicate were used for ranks the 'substring' 
function and a table of numbers are being used to formulate a 'list'.
Just like the rank predicate, the substring expression becomes more 
complex as the nature of the list becomes more refined. There are many
examples of list simulation such as here. But all sql list simulations
are fundamentally flawed. The principle way for sql to represent something
is by a 'table'. So a table is used to represent a list. But a 'table'
is a different animal than a 'list'. Sql can logically represent a rank
because a rank is a number and sql has a type number. But sql has no
type for a list. The same holds for representing a list by any type of
string. A string, no matter delimited or non-delimited is not a list.
One would think that the function 'substring' would lead to questioning
the sql ways to create a list. It is, after all, substring, ie. a function
designed explicitly for a 'string'! With a list sql is doomed before it
even starts to simulate it. The logical inconsistencies of lists in sql
are unfortunately hidden much like ranking functions were hidden. What
has filled this vacuum is so much non-sense. It is truly unfortunate
that the only thing that can reveal the extent of the logical confusion,
not to mention the code mess, is to work with a list type such as in
the D4 language of Dataphor.

No comments:

About Me

My photo
Phoenix, Arizona, United States