No where can the essence of a business problem or model be better shown than thru the means of constraints. Constraints encapsulate the logic of a model and govern the ebb and flow of an application. Constraints in Dataphor, based on the relational model, offer these benefits: * Emphasize data independence * The ability to reference global objects * The use of the declarative method in application development The relational model allows and fosters the clear separation of how a problem is modeled versus physical implementation details. This is the relational concept of data independence. The solution of a business problem should not be encumbered by anything that detracts from the ability to clearly and precisely model it. This is the essence of the concept of the logical model in relational theory. Yet data independence is sorely lacking in sql where the line of distinction between the logical model and performance considerations are too often blurred. The most blatant example is the conceptual entanglement of keys and indexes. A key, which is a constraint, is a logical consideration. An index is a physical consideration. By allowing indexes based on performance considerations to directly enter the logic of the model the clarity of expression that is needed becomes undermined or even lost. The price for a gain in performance is paid for by the ambiguity introduced into the model. Constraints in dataphor can be clearly expressed independent of performance based concepts, ie. indexes. In order to define constraints that need to reference global objects like tables, sql needs to rely on mechanisms like triggers. All to often this leads to problems with declared referential integrity constraints and adds an additional layer of complexity to the problem. Unlike simple column constraints or indexes, sql does not allow a table constraint to be simply and clearly stated in a 'create table' or 'alter table' statement. Dataphor allows this simple but powerfully expressive capability. In sql the idea of metadata is independent of the logical constructs used to derive data. In other words, in sql you can declare an index but you can't declare any additional information about the index. Dataphor allows metadata, information about the object, to be declared along with the object. This additional information is used in a declarative way to make application development easier. Dataphors derivation engine thus absolves the developer of an entire level of work. For example, by declaring metadata about a constaint the meaning of the constraint can be clarified while at the same time saving the developer the work to do this. An interesting example of the confluence of constraints (the logical model) and performance can be found in the article 'Indexes on Computed Columns: Speed Up Queries, Add Business Rules' which can be viewed here. Very cleaver solutions to constraints are proposed in the form of indexes on computed columns. Compare this approach to simply, clearly and explicitly defining a constraint with metadata in dataphor leaving performance optimization as the last step of the application. For example, one problem the article discusses is insuring that VPN tokens are unique in the table while allowing any number of null VPN token numbers. I'll be using dataphor with MS sql server 2005 sp1 as the data repository (backend). Here is the table definition defined in dataphor: (All dataphor keywords are in bold) create table TESTC { EMPLOYEEID : Integer, LOCATIONID : String, DEPTID : String { default 'ONE' }, VPNTOKEN : Integer { nil } , key {EMPLOYEEID} } The primary key is EMPLOYEEID. The 'Integer { nil }' is the dataphor way of saying that VPNTOKEN can take a null (no value) as the data is stored in sql server. A table constrain can now be declared that captures the logic of uniqueness for non null VPNTOKENs and allows unlimited nulls for it. The constraint can be declared using the dataphor alter table. alter table TESTC { create constraint DISTINCTVPNWITHNULLS exists( TESTC group add {Count(VPNTOKEN) CNT,Count(distinct VPNTOKEN) DISTVPNCNT} where CNT=DISTVPNCNT ) tags { DAE.Message = "'VPNTOKEN ' + ToString(VPNTOKEN) + ' is not unique'" } } The 'Count(VPNTOKEN)' and 'Count(distinct VPNTOKEN)' work the same as they do in sql. Both count constructs eliminate null values. So the where predicate is comparing the count of non nulls to the distinct count of non nulls. The group statement has no by statement so the comparison is over the whole table. The logic of the constraint says if where is true then all the non nulls must be unique. If where is false an attempt was made to enter an existing non null VPNTOKEN and thus violiates the uniqueness of VPTOKEN in the table and the constraint. The attempt to enter such a VPNTOKEN will be rejected. Since where is independent of nulls, all null values for VPNTOKEN will satsify the constraint (it evaluates to true) and will be accepted. The only way a null VPNTOKEN will be rejected is if the row being inserted violates the uniqueness of the key of the table which is EMPLOYEEDID. This is the meaning of specifying a key. Only unique values of EMPLOYEEID are acceptable. The tag keyword indicates metadata is supplied with the constraint definition. Dataphor will use the message when the constraint is violated as opposed to a cryptic message returned by sql server when a key/index is violated. Lets insert some data with unique VPNTOKENs insert table { row { 1 EMPLOYEEID,'A' LOCATIONID,'ONE' DEPTID, 1 VPNTOKEN }, row { 2,'A','ONE',2}, row { 3,'B','TWO',3} } into TESTC; Insert some rows with null VPNTOKEN. insert table { row { 4 EMPLOYEEID,'C' LOCATIONID,'ONE' DEPTID}, row { 5 EMPLOYEEID,'C' LOCATIONID,'THREE' DEPTID} } into TESTC This is the data in the table. select TESTC EMPLOYEEID LOCATIONID DEPTID VPNTOKEN ---------- ---------- ------ ---------- 1 A ONE 1 2 A ONE 2 3 B TWO 3 4 C ONE <No Value> 5 C THREE <No Value> Now in the derived form that dataphor has available for every table and view we can try to enter a row such as: row { 7 EMPLOYEEID,'C' LOCATIONID,'THREE' DEPTID,2 VPNTOKEN} Dataphor uses the tag metadata expression to clearly show the user the problem with the specific data that violated the constraint: I submit this is a significant improvement over the type of message returned by sql server which tells you what object (ie. key) caused the violation but gives no information about the 'data' which caused it. The developer is left with this task on the frontend. Any table constraint should be a straightforward short process to represent. As another example, this constraint allows unique VPNTOKENs but at most 1 null value. alter table TESTC { create constraint DISTINCTVPNWITHONENULL exists( TESTC group add { Count() TOTALCNT, Count(VPNTOKEN) CNT, Count(distinct VPNTOKEN) DISTVPNCNT } where (TOTALCNT-CNT < 2) and (CNT=DISTVPNCNT) ) tags { DAE.Message = "'VPNTOKEN ' + IfNil(ToString(VPNTOKEN), 'NULL') + ' is not unique'" } } The where statement uses '(TOTALCNT-CNT < 2)' to check that there is at most 1 null value (there can be 0 or 1) and uses '(CNT=DISTVPNCNT)' to insure that the non null values are unique. With a violation, the message will display a VPNTOKEN value that already exists in the table including 'NULL'. Dataphor by default will display an informative message for any key that is violated. It is not necessary to explicitly state any message. On sql server a primary key violation displays only the object of the violation. For example: 'Cannot insert duplicate key row in object 'dbo.testc' with unique index 'UIDX_TESTC_EMPLOYEEID'. The statement has been terminated.' While dataphor will display the specific data that caused the violation: In the future I'll talk about other types of constraints in dataphor such as column and transition row constraints. Bye for now.
Dataphor SQL RAC (Relational Application Companion)
A site of hope for those looking for a true relational database system
- a one-one requirement constraint with dataphor (1)
- anatomy of sql server part I - what is a stored procedure (1)
- anatomy of sql server part II - the unit test as part of the database (1)
- anatomy of sql server part III - what does deferred name resolution really mean (1)
- censoring sql posts (1)
- creating an opposite constraint in dataphor (1)
- dataphor (2)
- Dataphor (7)
- dataphor # 13 a table as a parameter (1)
- dataphor - download and start working with it (1)
- dataphor - fixed sized word segments (1)
- dataphor # 10 sql mythology (1)
- dataphor # 11 string differences (1)
- dataphor # 12 trimming a string (1)
- dataphor # 14 sql the meaning of Update..From (1)
- dataphor # 15 views with substance (1)
- dataphor # 16 inclusive vs exclusive solutions (1)
- dataphor # 17 a visual look at ranking queries (1)
- dataphor # 18 data scrubbing using lists (1)
- dataphor # 19 create intervals over strings (1)
- dataphor # 20 browsing an sql window (1)
- dataphor # 21 an example of relational division (1)
- dataphor # 22 reusable procedures (1)
- dataphor # 23 repley to Michel (1)
- dataphor # 24 basics of the table type (1)
- dataphor # 25 extending the dense rank function (1)
- dataphor # 26 query a hierarchy with explode (1)
- dataphor # 27 combine strings with Split and Concat (1)
- dataphor # 28 constants and variables or sql and D4 (1)
- dataphor # 29 another example of relational division (1)
- dataphor #1 introduction (1)
- dataphor #2 splitting strings (1)
- dataphor #3 string concatenation (1)
- dataphor #4 comment (1)
- dataphor #5 comment (1)
- dataphor #6 formal definition (1)
- dataphor #7 sql: table this (1)
- dataphor #8 list to table (1)
- dataphor #9 table constraints (1)
- dataphor creating lists in a query (1)
- extracting numbers from a string with dataphor (1)
- jeff modens dynamic crosstabs for sql server (1)
- linq to sql the what and why (1)
- linq to sql as a window of opportunity to sql users (1)
- linq to sql should be important to sql users (1)
- linq to sql vs. older 4GL attempts (1)
- listing missing table item (1)
- Multiple cascade paths to the same table (1)
- RAC (4)
- RAC #1 comment (1)
- RAC #2 example (1)
- RAC #3 finding the Nth number in a string (1)
- RAC #4 Sql Server 2005 ranking functions vs. Rac ranking (1)
- sorting a delimited string by its numerical string parts (1)
- sql an example of extreme implicit conversions (1)
- sql can't handle complicated cascading updates (1)
- sql CTE should be a variable not a value (1)
- sql dense rank for identifying consecutive runs (1)
- sql is there really a table variable (1)
- sql ranking functions explained by relational types (1)
- sql server triggers are best set based (1)
- sql the idea of using substring to simulate lists (1)
- sql the undefined trigger in Sql Server (1)
- sql vs relational on tables (1)
- sql what the sql CTE covers up (1)
- types and procedures (1)
Sunday, September 10, 2006
Dataphor - Are You Constrained?
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment