This is the first in a series of articles examining some basic concepts in Sql Server. What is a stored procedure? This example uses Sql Server 2005 and the Northwind sample database. The following is a very simple stored procedure that returns some aggregates from the Orders table for a given EmployeeID value supplied as an input parameter: CREATE PROCEDURE dbo.GroupByShipCountry @Employee Integer AS SELECT ShipCountry,Count(*) Cnt,Min(Freight) MinFrt,Max(Freight) MaxFrt FROM Orders WHERE EmployeeID=@Employee GROUP BY ShipCountry After executing the above script to create the procedure we ask a most simple question: what is it? What exactly is Sql Server aware of regarding this procedure? We can take as the starting point the definition of a generic procedure. Bol states: 'A stored procedure is a saved collection of Transact-SQL statements ... that can take and return user-supplied parameters.' We can simply ask is there any indication that the server is aware of anything that rises above the level of the t-sql statements defining the procedure. Should we expect to find such thing? We do know that the server recognizes the object Orders in sys.objects as a USER_TABLE. And if we execute the query: SELECT * FROM Northwind.INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME='Orders' we know the server recognizes the columns and their datatypes for the Orders table. We know the server recognizes that the procedure depends on three columns from the Orders table: SELECT object_name(referenced_major_id) Dependent,COLUMN_NAME FROM sys.sql_dependencies A CROSS APPLY (SELECT B.COLUMN_NAME FROM Northwind.INFORMATION_SCHEMA.COLUMNS B WHERE B.TABLE_NAME=object_name(A.referenced_major_id) and A.referenced_minor_id=B.ORDINAL_POSITION) C WHERE object_id=object_id('dbo.GroupByShipCountry') Dependent COLUMN_NAME --------- --------------- Orders EmployeeID Orders Freight Orders ShipCountry We know the procedure works if we execute it: Execute dbo.GroupByShipCountry 8 It retrieves, in the precise wording of bol, 'rows and columns from the Orders table' which are collectively termed a 'result set'. (We will return to the server Orders table and result set term in later articles). ShipCountry Cnt MinFrt MaxFrt --------------- ----------- --------------------- --------------------- Argentina 3 .3300 217.8600 Austria 5 25.2200 353.0700 Brazil 9 5.3200 110.8700 . Venezuela 9 4.4100 141.0600 When we look for specific information about the procedure itself with the sp_help system procedure we see that @Employee is listed as a parameter. The information from sys.objects confirms that the procedure is recognized and it indeed is a definition of a stored procedure: SELECT name,object_id,type,type_desc FROM sys.objects WHERE object_id = OBJECT_ID('dbo.GroupByShipCountry') AND type='P' name object_id type type_desc -------------------- ----------- ---- ---------------------- GroupByShipCountry 1551396646 P SQL_STORED_PROCEDURE In summary there does not exist any evidence that the server is aware of any information not contained in the collection of t-sql statements that define it. In fact the server does not exhibit any information about three of the four columns (Cnt, MinFrt, MaxFrt) that are displayed when it is executed. This leaves us in a very unsteady state. At the very most the server is performing an operation which is not exposed to us to interact with, at the very least it is performing an operation that it is itself unaware of. This situation makes it impossible to look to the server for any characterization of the procedure towards answering the question of what this procedure really is. It is up to us to infer what the procedure is. With the consequence that we can not rely on the server to either characterize what the procedure really is nor depend on the server to guarantee the integrity of the operation. This void in the server must be filled. The unit test Is it fair to say that the failure of the server necessitates a unit test? We can more clearly see the answer to this question if we can clearly see just what is the nature of a unit test that clarifies just what this procedure is. An insightful perspective on the unit test is offered in the excellent article: 'Using unit and functional tests in the development process' by Jeff Canna www.ibm.com/developerworks/library/j-test.html The author writes: 'Many times the development of a system is likened to the building of a house. Unit testing is analogous to a building inspector visiting a house's construction site. He is focused on the various internal systems of the house, the foundation, framing, electrical, plumbing, and so on. He ensures (tests) that the parts of the house will work correctly and safely, that is, meet the building code.' He goes on to say: 'Unit tests tell a developer that the code is doing things right; functional tests tell a developer that the code is doing the right things.' The author is clearly expressing the idea that a unit test is something the goes toward confirming something that is of a very basic nature. Something that is the basis on which any other functional tests are based on. So what sort of unit can we use to confirm the 'the code is doing things right'? And will it answer the question of what the procedure really is? The sort of unit test we are looking for is approached by the insight offered in another excellent article: 'Close those Loopholes - Testing Stored Procedures' by Alex Kuznetsov and Alex Styler www.simple-talk.com/content/print.aspx?article=426 With reference to a procedure very similar to dbo.GroupByShipCountry the authors write: 'Suppose you need to write a unit test for that stored procedure. You might want to verify that your result set has correct structure (correct column names and types)...' And further: 'When you set up your unit test, your result set (or result sets, if your stored procedure returns several ones) is saved into an XML file. All the information necessary for further comparison is saved column names, column types, and the data...When you run your unit test, the result set is compared against the saved XML file. Everything is compared - column names, column types, and all the data.' The idea of verifying that the 'result set has correct structure (correct column names and types).' What the authors refer to as 'correct structure' is what we are really searching for. A test for the correct structure is exactly what the server lacks. And what is this structure that is based on pairs of columns and their datatypes? For it is the columns of this structure that are displayed by executing the procedure: ShipCountry Cnt MinFrt MaxFrt ----------- --- ------ ------ which are based on the structure of column/datatype: ShipCountry:varchar, Cnt:Integer, MinFrt:Money, MaxFrt:Money It is these pairs that define a specific type, a specific type of table. And this simple concept allows us to finally answer the question of what the stored procedure really is: a table. The unit test is a test confirming the specific type of the table returned. What the server calls a 'result set' (and not a 'table') is based on the fact that there does not exist the concept of a type for a table. There does not exist a way for the server to distinguish one table from another. It is the non-existence of this concept in the server that necessitates this most basic unit (test) being made the responsibility of the user. This is the huge hole in sql that must be filled by the user, a role that the database should perform is lost and with it the integrity that it should offer. In part II we show and describe a database system that takes the definition of a stored procedure beyond the statements that define it. Basic Anatomy of Sql Server, part I What is a stored procedure? Basic Anatomy of Sql Server, part II The unit test as part of the database. Basic Anatomy of Sql Server, part III What does deferred name resolution really mean? Some related articles of interest: All tables are typed variables A table as a parameter Reusable procedures
Dataphor SQL RAC (Relational Application Companion)
A site of hope for those looking for a true relational database system
- a one-one requirement constraint with dataphor (1)
- anatomy of sql server part I - what is a stored procedure (1)
- anatomy of sql server part II - the unit test as part of the database (1)
- anatomy of sql server part III - what does deferred name resolution really mean (1)
- censoring sql posts (1)
- creating an opposite constraint in dataphor (1)
- dataphor (2)
- Dataphor (7)
- dataphor # 13 a table as a parameter (1)
- dataphor - download and start working with it (1)
- dataphor - fixed sized word segments (1)
- dataphor # 10 sql mythology (1)
- dataphor # 11 string differences (1)
- dataphor # 12 trimming a string (1)
- dataphor # 14 sql the meaning of Update..From (1)
- dataphor # 15 views with substance (1)
- dataphor # 16 inclusive vs exclusive solutions (1)
- dataphor # 17 a visual look at ranking queries (1)
- dataphor # 18 data scrubbing using lists (1)
- dataphor # 19 create intervals over strings (1)
- dataphor # 20 browsing an sql window (1)
- dataphor # 21 an example of relational division (1)
- dataphor # 22 reusable procedures (1)
- dataphor # 23 repley to Michel (1)
- dataphor # 24 basics of the table type (1)
- dataphor # 25 extending the dense rank function (1)
- dataphor # 26 query a hierarchy with explode (1)
- dataphor # 27 combine strings with Split and Concat (1)
- dataphor # 28 constants and variables or sql and D4 (1)
- dataphor # 29 another example of relational division (1)
- dataphor #1 introduction (1)
- dataphor #2 splitting strings (1)
- dataphor #3 string concatenation (1)
- dataphor #4 comment (1)
- dataphor #5 comment (1)
- dataphor #6 formal definition (1)
- dataphor #7 sql: table this (1)
- dataphor #8 list to table (1)
- dataphor #9 table constraints (1)
- dataphor creating lists in a query (1)
- extracting numbers from a string with dataphor (1)
- jeff modens dynamic crosstabs for sql server (1)
- linq to sql the what and why (1)
- linq to sql as a window of opportunity to sql users (1)
- linq to sql should be important to sql users (1)
- linq to sql vs. older 4GL attempts (1)
- listing missing table item (1)
- Multiple cascade paths to the same table (1)
- RAC (4)
- RAC #1 comment (1)
- RAC #2 example (1)
- RAC #3 finding the Nth number in a string (1)
- RAC #4 Sql Server 2005 ranking functions vs. Rac ranking (1)
- sorting a delimited string by its numerical string parts (1)
- sql an example of extreme implicit conversions (1)
- sql can't handle complicated cascading updates (1)
- sql CTE should be a variable not a value (1)
- sql dense rank for identifying consecutive runs (1)
- sql is there really a table variable (1)
- sql ranking functions explained by relational types (1)
- sql server triggers are best set based (1)
- sql the idea of using substring to simulate lists (1)
- sql the undefined trigger in Sql Server (1)
- sql vs relational on tables (1)
- sql what the sql CTE covers up (1)
- types and procedures (1)
Monday, November 05, 2007
Basic Anatomy of Sql Server, part I
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment