Dataphor SQL RAC (Relational Application Companion)
A site of hope for those looking for a true relational database system

Thursday, January 22, 2009
Multiple cascade paths to the same table

This example is based on the thread:

microsoft.public.sqlserver.programming
Jan 21, 2009
'JET Referential Integrity superior than SQL'
http://tinyurl.com/d58ubl

Sql Server can't deal with more than one cascade path to the same table. Some nonsense about cycles. I guess if two paths are involved they call it a bi-cycle ☺ It is simply a poor system for relationally modeling a business problem. Simply replace it with Dataphor and use sql server to store the data. Sql server doesn't screw up data storage ☺

Here's how simple this example is with Dataphor (and sql server as the backend). The tables will be persisted in sql server but Dataphor will (mercifully) perform all the integrity checks.

create table users
{
 id:Integer,
 username:String tags{Storage.Length='50'},
 key{id},
 key{username}
};

create table table2
{
 id:Integer,
 user1:String tags{Storage.Length='50'},
 user2:String tags{Storage.Length='50'},
 user3:String tags{Storage.Length='50'},
 key{id},
 reference user1username {user1} references users {username} update cascade delete cascade,
 reference user2username {user2} references users {username} update cascade delete cascade,
 reference user3username {user3} references users {username} update cascade delete cascade
};

users:= table{
          row{1 id,'john' username},
          row{2,'steve'},
          row{3,'judy'},
          row{4,'larry'},
          row{5,'bill'},
          row{6,'rita'}};

This row would violate the user3username constraint:

insert row{12 id,'john' user1,'rita' user2, 'stan' user3} into table2;
//Error: The table users does not have a row with username "stan".

table2:=table{
          row{10 id,'john' user1,'steve' user2, 'judy' user3},
          row{11,'larry','rita', 'bill'},
          row{13,'larry','steve', 'steve'},
          row{14,'judy','steve', 'john'},
          row{15,'bill','judy','rita'},
          row{16,'steve','bill', 'judy'},
          row{17,'john','bill','john'}};

select table2;

id user1 user2 user3
-- ----- ----- -----
10 john  steve judy
11 larry rita  bill
13 larry steve steve
14 judy  steve john
15 bill  judy  rita
16 steve bill  judy
17 john  bill  john

Check the cascading updates. In table2 all entries of 'john' are replaced with 'paul'.

update users set {username:='paul'} where username='john';

select table2;

id user1 user2 user3
-- ----- ----- -----
10 paul  steve judy
11 larry rita  bill
13 larry steve steve
14 judy  steve paul
15 bill  judy  rita
16 steve bill  judy
17 paul  bill  paul

Check the cascading deletes. Any row in table2 with an entry of 'steve' is deleted.

delete users where username='steve';

select table2;

id user1 user2 user3
-- ----- ----- -----
11 larry rita  bill
15 bill  judy  rita
17 paul  bill  paul

Dataphor is open source, get a copy here:
http://databaseconsultinggroup.com/downloads/
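For reference, here is a minimal T-SQL sketch of the limitation the post is reacting to (the table and constraint names are mine, not from the thread). SQL Server accepts the first cascading reference but rejects the CREATE TABLE once a second cascade path back to users is declared, raising error 1785 (worded roughly as "...may cause cycles or multiple cascade paths..."), which is why the cascading logic usually ends up hand-written in triggers or application code.

CREATE TABLE users
(
  id       int         NOT NULL PRIMARY KEY,
  username varchar(50) NOT NULL UNIQUE
);

-- Fails: only one of the three cascading paths to users is allowed.
CREATE TABLE table2
(
  id    int         NOT NULL PRIMARY KEY,
  user1 varchar(50) NOT NULL,
  user2 varchar(50) NOT NULL,
  user3 varchar(50) NOT NULL,
  CONSTRAINT fk_user1 FOREIGN KEY (user1) REFERENCES users (username)
    ON UPDATE CASCADE ON DELETE CASCADE,
  CONSTRAINT fk_user2 FOREIGN KEY (user2) REFERENCES users (username)
    ON UPDATE CASCADE ON DELETE CASCADE,   -- Msg 1785: multiple cascade paths
  CONSTRAINT fk_user3 FOREIGN KEY (user3) REFERENCES users (username)
    ON UPDATE CASCADE ON DELETE CASCADE
);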
Saturday, December 20, 2008
Extracting numbers from a string
This example is based on the question asked in the thread:

microsoft.public.sqlserver.programming
Dec 18, 2008
Pulling a number out of an nvarchar field
http://tinyurl.com/3quyap

The OP was interested in pulling out the 1st occurrence of a number in a string. The string has numbers and letters. So if the string is 'XY40A3' we want the number 40, which is the 1st of two numbers in the string. This is very easy to do in Dataphor. Just as a table of numbers comes in handy for solving many different problems, so does a list of letters. The idea here is to treat each letter in the string as a 'delimiter'. We then split the string using the list of letters as the delimiters, so what results are the number(s) in the string. We can store the list (the letters of the alphabet) in a temporary table for easy access.

create session table AlphaTable
{
 AlphaList:list(String),
 key{ }
};

AlphaTable:=
 table
 {
  row
  {
   {'A','B','C','D','E','F','G','H','I','J','K','L','M','N',
    'O','P','Q','R','S','T','U','V','W','X','Y','Z'} AlphaList
  }
 };

For example, if we split a string, transform it to a table and remove any blanks, we'll just have numbers left. If we order by sequence (Index) it will show the numbers as they occur from left to right.

select ToTable('XY40A3'.Split(AlphaTable[].AlphaList),'StrNum','Index')
        where StrNum>' '
         order by {Index};

StrNum Index
------ -----
40     2
3      3

Not only would it be easy to get the 1st number, but we can get the occurrence of any number easily. The 1st occurrence is just a special case of the general problem of getting the Nth occurrence in a string. By using ToTable(ToList(cursor(...))) with the cursor ordered by Index (which follows the order of the numbers from left to right in the string), we can create a consecutive rank from 1 to N over the table of numbers that will allow direct access to the Nth number (if it exists).

select
 ToTable
  (
   ToList
    (
     cursor
      (
       (
        ToTable('XY40A3RAD853'.Split(AlphaTable[].AlphaList),'StrNum','Index')
         where StrNum>' '
          {ToInteger(StrNum) Num,Index}
       )
        order by {Index}
      )
    )
  ) {Num,Index,sequence+1 NthIndex};

Num Index NthIndex
--- ----- --------
3   3     2
40  2     1
853 6     3

Here is an operator for the Nth occurrence that takes into account lower case letters and returns -1 if the Nth occurrence doesn't exist.

create session operator NthNum(AStr:String,N:Integer):Integer
begin
 var T1:=
  ToTable
   (
    ToList
     (
      cursor
       (
        (
         ToTable(Upper(AStr).Split(AlphaTable[].AlphaList),'StrNum','Index')
          where StrNum>' '
           {ToInteger(StrNum) Num,Index}
        )
         order by {Index}
       )
     )
   ) {Num,Index,sequence+1 NthIndex};
 result:=IfNil((T1 adorn{key{NthIndex}})[N].Num,-1);
end;

select NthNum('SF346fs47sGs759 ',1); //returns 346
select NthNum('SF346fs47sGs759 ',2); //returns 47
select NthNum('SF346fs47sGs759 ',3); //returns 759
select NthNum('SF346fs47sGs759 ',4); //returns -1

Here a table of strings is stored in an Sql Server database from which we can extract the 1st occurrence of a number.
create table FooStrings
{
 keycol:Integer,
 datacol:String nil,
 key{keycol}
};

FooStrings:=
 table
 {
  row{1 keycol, 'XYZ40AB' datacol},
  row{2, 'WX32A'},
  row{3, '27 blah'},
  row{4, 'A87BNC30'},
  row{5, 'XY40A3'},
  row{6, 'TWFD'},
  row{7, 'XYA53GH5JGV934'},
  row{8, '7'},
  row{9, nil}
 };

select FooStrings
        add{NthNum(IfNil(datacol,' '),1) MyNumber}
         with {IgnoreUnsupported = 'true'}
          order by {keycol};

keycol datacol        MyNumber
------ -------------- --------
1      XYZ40AB        40
2      WX32A          32
3      27 blah        27
4      A87BNC30       87
5      XY40A3         40
6      TWFD           -1
7      XYA53GH5JGV934 53
8      7              7
9      <No Value>     -1

Think how easily the operator can be modified if it's desired to return the minimum or maximum number, etc. If you would like to see a coherent and easy to follow all t-sql solution that doesn't cobble string functions together into spaghetti code, see: RAC - Are you Coordinated? But I think you'll agree the much preferred solution is in Dataphor ☺
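For comparison, the first number can also be pulled out in plain T-SQL with PATINDEX. This is only a hedged sketch of one common approach (the FooStrings names match the example above; the digit-run logic is mine and, unlike NthNum, it only handles the first occurrence and returns NULL rather than -1 when there is no digit):

-- Extract the first run of digits from datacol.
SELECT keycol,
       datacol,
       CASE
         WHEN datacol IS NULL OR PATINDEX('%[0-9]%', datacol) = 0 THEN NULL
         ELSE CAST(LEFT(stripped,
                        COALESCE(NULLIF(PATINDEX('%[^0-9]%', stripped), 0) - 1,
                                 LEN(stripped))) AS int)
       END AS MyNumber
FROM   FooStrings
       CROSS APPLY (SELECT SUBSTRING(datacol,
                                     PATINDEX('%[0-9]%', datacol),
                                     LEN(datacol)) AS stripped) AS s
ORDER BY keycol;

Generalizing this to the Nth occurrence is where the T-SQL starts to turn into the string-function spaghetti mentioned above, which is the point of the RAC and Dataphor versions.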
Saturday, December 06, 2008
Sql server dynamic crosstabs by Jeff Moden
This is the RAC version of Jeff Moden's fine article on dynamic crosstabs for sql server at:

Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
2008/12/03
www.sqlservercentral.com/articles/cross+tab/65048/

The data is generated by the method outlined in the article. The table was populated with 1 million rows.

-- Basic crosstab. 2 secs in QA for S2005.
Exec Rac
@transform='Sum(SomeMoney) as SumMony',
@rows='Someletters2',
@pvtcol='(left(datename(mm,DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)),3)+~ ~+ cast(year(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)) as char(4))) as mthyr',
@pvtsort='month(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0))', -- Sort pivot expression by an integer.
@from='##JBMTest',
@WHERE='SomeDate>=~Jan 1 2008 12:00AM~ AND SomeDate<~Jul 1 2008 12:00AM~',
@rowtotalsposition='end',@racheck='y',@shell='n'

Someletters2 Funct   Jan 2008  Feb 2008  Mar 2008  Apr 2008  May 2008  Jun 2008  Totals
------------ ------- --------- --------- --------- --------- --------- --------- ----------
AA           SumMony 685.67    763.64    656.13    575.93    879.13    192.13    3752.63
AB           SumMony 927.06    928.98    280.20    632.43    560.99    785.50    4115.16
AC           SumMony 791.09    555.18    916.71    273.23    187.48    508.31    3232.00
AD           SumMony 250.04    341.58    426.53    645.56    670.13    422.86    2756.70
AE           SumMony 809.14    487.21    295.33    625.92    716.12    527.19    3460.91
.
.
ZY           SumMony 776.32    682.98    326.17    677.69    546.87    926.54    3936.57
ZZ           SumMony 532.75    500.92    277.93    636.40    607.05    553.43    3108.48
Totals       SumMony 433997.25 383211.70 411913.12 411878.29 425431.07 409809.47 2476240.90

Here some additional bells and whistles are thrown in :)

-- Executed in 26 secs in QA for S2005.
Exec Rac
-- The same transform repeated twice for different purposes.
@transform='Sum(SomeMoney) as SumMony & Sum(SomeMoney) as [% row]',
@rows='Someletters2',
@pvtcol='(left(datename(mm,DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)),3)+~ ~+ cast(year(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0)) as char(4))) as mthyr',
@pvtsort='month(DATEADD(mm,DATEDIFF(mm,0,SomeDate),0))',
@from='##JBMTest',
@WHERE='SomeDate>=~Jan 1 2008 12:00AM~ AND SomeDate<~Jul 1 2008 12:00AM~',
@rowtotalsposition='end',@racheck='y',@pformat='_pvtcols_',@shell='n',@translabel='Summary',
-- Display the min and max sum for each Someletters2 along with the pivot (date) it occurred in.
-- The min and max are displayed in separate rows. Default is same row for all rowfunctions.
@rowfunctions='min(SumMony) & max(SumMony)',@rowfunctionslabel='Min/Max',@displayrowfunctions='m',
-- Running sum of pivot columns for each row from left to right. The pivot sum is followed by the
-- run in each cell.
@colruns='SumMony',
-- The percentage of the pivot sum/[row total] displayed in a separate row.
@cpercents='[% row] %only' -- a different transform alias to force a separate row.
                           -- We could display in same row as sum (and column runs).
Someletters2 Summary Min/Max Jan 2008 Feb 2008 Mar 2008 Apr 2008 May 2008 Jun 2008 Totals ------------ ------- ----------------------- ------------------- ------------------- -------------------- -------------------- -------------------- -------------------- ---------- AA SumMony min(192.13,Jun 2008) 685.67/685.67 763.64/1449.31 656.13/2105.44 575.93/2681.37 879.13/3560.50 192.13/3752.63 3752.63 max(879.13,May 2008) % row 18.3% 20.3% 17.5% 15.3% 23.4% 5.1% - AB SumMony min(280.20,Mar 2008) 927.06/927.06 928.98/1856.04 280.20/2136.24 632.43/2768.67 560.99/3329.66 785.50/4115.16 4115.16 max(928.98,Feb 2008) % row 22.5% 22.6% 6.8% 15.4% 13.6% 19.1% - AC SumMony min(187.48,May 2008) 791.09/791.09 555.18/1346.27 916.71/2262.98 273.23/2536.21 187.48/2723.69 508.31/3232.00 3232.00 max(916.71,Mar 2008) % row 24.5% 17.2% 28.4% 8.5% 5.8% 15.7% - AD SumMony min(250.04,Jan 2008) 250.04/250.04 341.58/591.62 426.53/1018.15 645.56/1663.71 670.13/2333.84 422.86/2756.70 2756.70 max(670.13,May 2008) % row 9.1% 12.4% 15.5% 23.4% 24.3% 15.3% - AE SumMony min(295.33,Mar 2008) 809.14/809.14 487.21/1296.35 295.33/1591.68 625.92/2217.60 716.12/2933.72 527.19/3460.91 3460.91 max(809.14,Jan 2008) % row 23.4% 14.1% 8.5% 18.1% 20.7% 15.2% - AF SumMony min(406.49,May 2008) 788.30/788.30 415.40/1203.70 605.56/1809.26 613.81/2423.07 406.49/2829.56 520.40/3349.96 3349.96 max(788.30,Jan 2008) . . ZY SumMony min(326.17,Mar 2008) 776.32/776.32 682.98/1459.30 326.17/1785.47 677.69/2463.16 546.87/3010.03 926.54/3936.57 3936.57 max(926.54,Jun 2008) % row 19.7% 17.3% 8.3% 17.2% 13.9% 23.5% - ZZ SumMony min(277.93,Mar 2008) 532.75/532.75 500.92/1033.67 277.93/1311.60 636.40/1948.00 607.05/2555.05 553.43/3108.48 3108.48 max(636.40,Apr 2008) % row 17.1% 16.1% 8.9% 20.5% 19.5% 17.8% - Totals SumMony min(383211.70,Feb 2008) 433997.25/433997.25 383211.70/817208.95 411913.12/1229122.07 411878.29/1641000.36 425431.07/2066431.43 409809.47/2476240.90 2476240.90 max(433997.25,Jan 2008) % row 17.5% 15.5% 16.6% 16.6% 17.2% 16.5% -
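For readers who have not seen the underlying technique, the core of Jeff Moden's article (and of the statement RAC ends up generating) is the classic SUM(CASE...) cross tab. Here is a minimal hand-written sketch against the article's ##JBMTest table (column names as used above; the hard-coded three-month window is mine, and the dynamic versions build the month list from the data instead):

-- Static cross tab; the dynamic versions assemble this statement as a string
-- from the distinct pivot values and EXEC it.
SELECT   Someletters2,
         SUM(CASE WHEN MONTH(SomeDate) = 1 THEN SomeMoney ELSE 0 END) AS [Jan 2008],
         SUM(CASE WHEN MONTH(SomeDate) = 2 THEN SomeMoney ELSE 0 END) AS [Feb 2008],
         SUM(CASE WHEN MONTH(SomeDate) = 3 THEN SomeMoney ELSE 0 END) AS [Mar 2008],
         SUM(SomeMoney)                                               AS Totals
FROM     ##JBMTest
WHERE    SomeDate >= '20080101' AND SomeDate < '20080401'
GROUP BY Someletters2
ORDER BY Someletters2;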
Tuesday, November 25, 2008
Linq to Sql as an opportunity to sql users
Instead of a dead end, Linq presents opportunities for sql experts. My comment originally appeared in the www.sqlblog.com post:

'Continuing Industry Consolidation and Other Trends.'
1/16/08 by Kevin Kline
http://tinyurl.com/5opxlz

I think the Sun acquisition of MySql was a checkmate move. It was Oracle that attempted to slow down MySql with their acquisition of the InnoDB transaction engine. Now they have been paid back in spades. Now it's Sun vs. MS in a new competition for the minds and hearts of developers. The broad view is LINQ/net and Sql Server vs. java and MySql. This is not about databases per se but a new war based on the object/entity model inspired by LINQ. I don't see IBM and Oracle in this war. They will have to be content to battle amongst themselves in the old database wars. (I'll come back to LINQ:)

As for 'Checking out the Competition', of course I too applaud Linchi. But honestly, we're only 'now' recognizing the advantages of eyes wide open?:) This attitude should be a given for a professional. Perhaps better late than never, perhaps an airing of cheerleading vs. reality checking:) For most ansi features that MS has pre-announced, all one has to do to study them is read about them in Oracle or DB2 documentation (ie. OVER). And as Linchi pointed out, it often goes the opposite direction for other types of features. This attitude contributes to the great divides that are common in the industry.

And now we come to the 'dead-end routes' like LINQ. I take the opposite view you do. There's a compelling case to be made (I've commented on this matter on this site) that if there is a deadweight, MS sees sql as it. LINQ is not just a piece of new technology; it's not just a local variable, it's a global one. LINQ is both an affirmation of the runtime environment for 'all' application development using an object model and a rejection of the sql interface. MS can live with the physical file structure (the idea of relational data, rows and columns) but they don't want to live with the sql interface for application development. MS explains this global move in terms of solving the historic impedance mismatch between two completely different environments. And they have picked their winner and, at the same time, the loser. The rows and columns abstraction now ends at physical storage. The object abstraction and LINQ will take up everything else.

Sql server is now something quite different than it used to be. Almost all developmental changes in the server will be based on servicing objects, and quite probably at the expense of features associated with a furtherance of the relational model. Look at all the work on partitioned views in S2008. This lies at the heart of how LINQ will translate entity updates. LINQ is still in its infancy. I would expect it to appear to many just like sql did when it was initially introduced in the early eighties. It will take time to get the matured version.

What is truly ironic is I see no real argument in the sql community that LINQ represents a great opportunity for sql developers. MS is inventing a declarative language in the spirit of sql. Don't people see an opportunity to jump in and at least influence the design of the language? Or get involved in the LINQ translation/optimizations to sql? Over time, as MS integrates LINQ deeper into the server (returning entities), I can assure you the current translations will change:) Sql was most certainly not an implementation of the relational model. So sql folks shouldn't get hung up over this.
The relational model would require the same strongly typed runtime as net, but MS is certainly not going there. But they are going to a place where sql skills can be used. And now Sun is going to go along with them. It's actually a small world if your eyes are open:)
Linq to Sql should be important to sql users
My plea for the sql server community to look closely at Linq. My comment originally appeared in the www.sqlblog.com post:

'PASS presentation: What would you like to hear about?'
2/23/08 by James Luetkehoelter
http://tinyurl.com/5lmj4m

You're eccentric. Such people tend to be really bright. So that's kewl. You're passionate, so you're highly motivated. That's very good. If you're 'over the top' that means you're not afraid of taking risks! That's best of all. If all this is true you can take on the burden of tackling LINQ and the entity framework.

Now I'm not talking about the 'how' of it. I'm sure there will be plenty of people presenting point and click slides. What I am talking about is the 'why' and 'what' of it. LINQ/EFM dwarfs everything else in terms of importance in S2008. It's a reflection of a movement in a different direction from what everyone is used to. It's also a reflection of a change in attitude. When I look for sql folks tackling the subject, what do I find? Frighteningly little or nothing!

Now let me say, if you're willing to make the case that sql folks can simply ignore the subject, have at it:) If you even want to make the case that it simply represents an isolated piece of technology, go for it. Some sql folk are waxing nostalgic about past 4GL/xbase languages when discussing LINQ. So it may be that many sql folks think it's quite the same idea and wonder why its structure is different and possibly inferior to what was. Well, LINQ is different, it's comparing apples and oranges, and it can't possibly be the same. But how can folks really get a grasp of such a comparison if they don't really understand the 'what' of LINQ?

Trying to explain 'what' it is isn't easy, especially to those sql folks who are not familiar with net or any other contemporary strongly typed object oriented runtime environment. I think MS is finding out that it's a challenge. Even trickier is to explain the 'why' of it. The motivation for it. Surely it didn't come out of thin air:) And the 'why' involves taking risks. You may frighten or alienate people:) The 'why' cuts to the heart of what most folks believe in. LINQ is a departure. It will have significant implications for what development is and will be like in the future. It will take a very unique person to put all these ideas together so they'll be really understood.

Interestingly, all the answers about LINQ are right on the MS website. All that's required is to dig around and back thru the years to find them. With over eight years of development this shouldn't be surprising:) But how many sql folks have bothered to do this homework? From the looks of it, very, very few. Presenting concepts is much harder than presenting code. It takes a very special kind of guy to connect the dots here:) I'd be happy to share with you an extensive MS list of articles/notes thru the years about the subject.
Linq to Sql vs. older 4GL attempts
Some thoughts on comparing Linq to older reporting languages. My comment originally appeared in the www.sqlblog.com post:

'LINQ - Lessons that could have been learned from languages like Progress'
2/25/08 by Greg Low
http://tinyurl.com/56powf

>it's still very much an add-on rather than being fully baked into the language.

I remain somewhat perplexed by just what you mean by this. By definition LINQ is 'burned in'. This is the achievement that MS has accomplished. To imply that it's not fully baked in is like saying a woman is somewhat pregnant. It is or it isn't, you are or you are not:) Either a table/entity or projection can be represented as a variable or it cannot. That all these things 'can' be a variable of some type referenced in the language is the whole point of it, no? Your use of 'add-on' seems to imply that LINQ is something external to the language, much like an external 4GL like Progress. I don't think this could be further from the truth.

In your interview with Pablo Castro he referred to Progress as an example of a language having 'direct line of sight from language to database'. Wasn't he struggling here to convey to sql folks the fundamentally different natures of the two? To bring the 4GL idea into contemporary languages one shouldn't expect they are going to be similar. And you seem to be implying that LINQ is not as 'natural' as Progress. How could it be? If you first have to define a query variable (table) you certainly can't expect to start your variable with a Select/Find. You define and then perform an action, right? In t-sql, 'Select @MyNumber' only makes sense if you first declare @MyNumber. Is LINQ any different? And in the sense that 'declare @MyNumber int' is burned into t-sql, isn't 'var customers = from c in db.Customers' burned into C#?

I do think sql users should listen to your interview with Pablo. It is proving difficult for MS folks to convey just what they have done to those outside of net. What is worse, misunderstanding it or ignoring it?:) Shouldn't sql folks fully understand why MS thinks it's such an improvement over sql? So far I think either they don't or simply don't care.
Linq to Sql: the 'what' and 'why' of it
The following comments concern 'what' Linq (to Sql)/EF is and the 'why' of it — what motivated MS to develop it. What does MS mean by 'beyond relational'? I also explore in what ways Linq, sql and the relational model are related to each other. How these technologies are connected to each other is quite a fascinating picture:)

My following 5 comments originally appeared on www.sqlblog.com in the post:

'Beyond Relational ???'
10/29/07 by Paul Nielsen
http://tinyurl.com/686z6h

Comment #1

There is the association of relational to mathematics (set theory). So people criticize sql based on this point of view. Sql allows duplicate rows, doesn't require a table to have a key, has dependencies based on ordinal position, is a poorly designed language etc. etc. These things really are critical, but the real problem is the prevailing idea that relational is just a question of mathematics. If it's just mathematics then allowing duplicate rows is perceived as 'close enough'. All the objections from the set theory point of view are not perceived as compelling enough to really question the validity of sql.

IMO the real holes of sql have nothing to do with mathematics. Rather it's the foundation, the computer science if you will, that set theory and relational algebra are embedded in. This point of view is unfortunately not prevalent in IT. What the hell do I mean by the computer science of the relational model? Well first, the set theory that relational depends on is not some special kind of set theory. There is only one set theory. In the same way there is only one computer science; there is no special kind of computer science. But sql has invented such a special computer science and this is the biggest flaw. What am I talking about?:)

Consider this, here is a table variable:

DECLARE @MyTableVar table(
    EmpID int NOT NULL primary key,
    OldVacationHours int,
    NewVacationHours int,
    ModifiedDate datetime);

Here is a server table:

create table MyTable(
    EmpID int NOT NULL primary key,
    OldVacationHours int,
    NewVacationHours int,
    ModifiedDate datetime);

Here's the key question. If @MyTableVar really is a variable then what is MyTable? In other words, @MyTableVar is to variable as MyTable is to ?????. If MyTable is persisted in the database, what is it persisted as? What computer science term describes it? Well whatever the hell it is (a constant?) it certainly isn't a variable. And if it isn't a variable then end of ballgame, end of relational model.

And what of @MyTableVar? BOL says 'A table variable behaves like a local variable.' and at the same time says 'Assignment operation between table variables is not supported.'. When is a door not a door?..when it's ajar:) Who the hell ever heard of a variable that doesn't support assignment? Who ever heard of a variable that doesn't support comparison? No one. Whatever @MyTableVar really is, it sure as hell ain't a variable.

In a relational db I should be able to assign the table @MyTableVar, all its rows, to MyTable:

MyTable=@MyTableVar

And I should be able to compare them:

if MyTable=@MyTableVar then
 print 'All rows in MyTable are in @MyTableVar and all rows in @MyTableVar are in MyTable'
else
 print 'Nope they're not equal'

A relational db demands a table be a variable just like an integer variable. Sql simply does not support basic computer science for tables. Whatever a table is in sql, it doesn't have a 'type', because computer science is computer science and a variable must be typed. The only way sql can recognize a table is by its name, not its type.
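As a concrete illustration of that last point, here is a small T-SQL sketch (my own, not from the original comment) of what you actually have to write: assignment and comparison of table variables are rejected outright, so the 'comparison' has to be simulated with EXCEPT.

DECLARE @MyTableVar table (EmpID int NOT NULL PRIMARY KEY);
DECLARE @OtherVar   table (EmpID int NOT NULL PRIMARY KEY);

-- SET @OtherVar = @MyTableVar;      -- not legal T-SQL: tables aren't assignable
INSERT INTO @OtherVar SELECT EmpID FROM @MyTableVar;   -- row-by-row copy instead

-- IF @OtherVar = @MyTableVar ...    -- not legal T-SQL: tables aren't comparable
IF NOT EXISTS (SELECT EmpID FROM @MyTableVar EXCEPT SELECT EmpID FROM @OtherVar)
   AND NOT EXISTS (SELECT EmpID FROM @OtherVar EXCEPT SELECT EmpID FROM @MyTableVar)
    PRINT 'Same rows'
ELSE
    PRINT 'Not equal'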
This is why sql doesn't support relational division and why dynamic sql must be used so much. A table as a variable is a completely different animal than a table in sql. This is why the expressive power of a relational db is orders of magnitude greater than an sql db. Sql views and constraints are redefined relationally. Date's work 'Databases, Types and the Relational Model: The Third Manifesto' (2007) is about the central importance of variables of a particular type (a table being one of many types) in a relational db — what a table as a variable means and its significance. It is really a basic computer science book. Ripping out the mathematics of relational theory (at least trying to copy it), ie. the syntax to join and union tables, without the computer science of relational has done all the damage.

MS can't change sql server because they are caught in a crazy computer science. The difference in computer science between sql and net is the impedance mismatch they're trying to address. But I'm afraid they still don't get the idea of a table as a variable. This is different than a table as a class. The anonymous type in Linq is distinctly different than a table type. So MS is doing the same thing IBM did forty years ago with the sql System-R. The damage is the difference between a pickup game in a playground and organized sports. You can draw up plays in the dirt but they don't quite work the same as those run in a stadium. We're still doing application development in the playground. Sometimes it works, sometimes it doesn't, but we're not basing it on the science of any model. Sql is not a model of anything, it's an invention all its own. Close enough is only for horseshoes:) Maybe my blog will make more sense now:)

Comment #2

Wherever you see the word 'relational' just substitute 'fog'. As in fog of war:)

> But when you have guys like Don Chamberlin (co-inventor of SQL and
> co-designer of XQuery) on your staff, I guess you can afford to
> boast your XML prowess.

He is revered in the sql world and reviled in the relational one. He was a lead designer of System-R, the prototype of all sql database systems. Those guys created a query language based on Codd's description of basic relational operators like projection, union and join. But they did NOT implement the relational model Codd described. They just ripped out these constructs without regard for their meaningfulness in the entire relational model. So what you have today is nothing like the relational model as it was envisioned. (IT successfully marginalizes the huge difference and those that point it out:)

And now comes 'beyond relational'. What does this phrase really mean to MS? They are more than willing to tell us. Aside from Jim Gray's article/presentation, everyone should read the articles on this site, the 'Comega language':

http://research.microsoft.com/Comega/

Especially this article:

'Unifying Tables, Objects and Documents'
http://tinyurl.com/yq7c4f

Here you'll find history repeating itself. MS, just like IBM did with System-R, has extracted relational operators out of the relational model and put them in an imperative object environment without any regard to relational theory. The great irony is that the extensions MS added to net to realize projections of columns and xml within net are the foundation for a true relational model! But the compiler-generated anonymous type of Linq, while a variable, is a different beast than the explicit variable that is a table type in the relational model.
It's the relational variable that supports assignment and comparison, as opposed to the Linq variable, which is nowhere near as smart:) But each supports a 'variable', which is a major step up from sql. Had MS any idea of the friggin true relational model they would make a different kind of history. Talk about dumbing down. Talk about being of only academic interest. Talk about relational fog. (I should add that Alphora (Dataphor) recognized the ability of the object imperative environment to support the D relational language and implemented it. And it works:)

Here is what Anders Hejlsberg, MS VS guru and now the head of database technology, has to say about the disconnect:

InfoWorld Interview of Microsoft Distinguished Engineer Anders Hejlsberg
'Microsoft's Hejlsberg touts .Net, C-Omega technologies'
June 10, 2005
http://www.infoworld.com/article/05/06/10/HNhejlsberg_1.html

"So what we're looking at is really trying to much more deeply integrate the capabilities of query languages and data into the C# programming language. And I don't specifically mean SQL, and I emphatically don't mean just take SQL and slap it into C# and have SQL in there. But rather try to understand what is it expressively that you can do in SQL and add those same capabilities to C#."

Anders Hejlsberg is Microsoft's version of Don Chamberlin at IBM. So what they have done is replace one flawed implementation of sql with another. And this is how they achieve efficiency in application development. Now that is unfriggin believable:) Well, there are no free lunches. And I wait to be enlightened on just how this environment will replace the concept of the logical relational model in solving business problems. I would say the real meaning of beyond relational is sideways.

Comment #3

Consider the MS whitepaper:

'Microsoft SQL Server 2008 and Microsoft Data Platform Development'
http://www.microsoft.com/sql/techinfo/whitepapers/sql_2008_dp.mspx

Does anyone find it the least bit odd that an sql server technical article is all about VS, LINQ and the entity framework? At the expense of the logical relational model and the sql language. What MS means by 'beyond relational' is 'forget relational already':) Looking at sql server as if it were somehow an embodiment of relational theory is every bit as much a form of dumbing down as some silly utterance by some poor nitwit at MS. There never was and never will be any 'intent' by MS to offer a 'relational' database. Sql server's only intent now is to be responsive to its biggest customer, visual studio. And that team is as knowledgeable in relational databases as the server team. Not. Why does the community still view sql server thru an imaginary lens? Did you ever hear of somewhat pregnant? If you open the dumbing down door, be prepared to greet all those who come thru:)

Comment #4

There is no longer a great divide, a debate, an impedance mismatch. MS has issued their own Emancipation Proclamation. And as a result they no longer support the relational model as it is known to developers today.

'A Call to Arms'
by Jim Gray, Microsoft
Mark Compton, Consultant
April 2005
http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=293

This paper is an invitation to embrace a new model. It's just as much 'A Farewell to Arms', an emancipation from the relational model which they are leaving behind. What does sql server look like in this new model?

'Interview with David Campbell'
General Manager of Strategy, Infrastructure and Architecture of Microsoft SQL Server.
May 14, 2007 http://tinyurl.com/6maseb Campbell says: "I believe the next major advance in Information Technology will come from addressing the gap between people and information." That gap is the relational logical model itself. Campbell continues: "The focus will move to the data itself rather than on the machinery used to manipulate it. We'll be less concerned with the plumbing and more concerned with data quality, data protection, and information production." "Most of the data services provided by SQL Server will be driven from a common data model. Whether you're creating a report, building an information cube, or integrating data from another system, you will be able to start from a common model of the key data entities such as 'customer', 'order', or 'prospect'." "Finally, fewer and fewer people will miss, (or remember), the 'open databases' sp_configure option..." The class replaces the table as the basic unit of work. VS replaces QA/SSMS as the interface for application development. There is no concept of relational anything in this object world. Sql constructs are independent of application development. The language of the relational model is replaced with the language of entities. There is no concept of a dba. MS is no longer in the database wars as we know it. They are trading 3rd place in that world for 1st place in another. And they now have the freedom to talk about this new world. It just sounds silly to those who have not left the old one. Ironically some were hoping for a new sub-language to further AD. Perhaps the lesson here is to be careful of what you wish for. I too was hoping they'd enter a new world but not the one they have chosen. Comment #5 > should we be concerned staying in the DB world long with the fear > that we become obsolete one day? Although I'm not an expert I can understand where you're coming from. It would be nice to get a clear and concise answer to where MS is going and what you should do about it. But there is no Oracle when it comes to MS. There is no one position paper, no one person that clearly lays out their five year plan and what it means to you. The experts here have enormous importance and influence in the db community. But they also have an enormous investment. How far can they be reasonably expected to go without putting themselves in an awkward position should they take a position that is not currently in line with company thinking? In the end it's a question of connecting the dots. You get a dot here a dot there. You have to do your homework. Study what they say and write and study what they offer. Sql server pros shouldn't neglect what's going on in VS and it's impact. If you study the company and the various technologies enough you should be able to draw your own picture. Think of it as the MS X-files:)
Sunday, November 23, 2008
Listing Missing Table Item
The following example is based on the post:

microsoft.public.sqlserver.programming
Friday, November 21, 2008 11:56 AM
"T-SQL Comparing a String of Items to a Table of Items; Listing Missing Table Items"
http://tinyurl.com/5dvc6o

Here's what the future of application development will hopefully be like using a relational system. To program relationally you not only have to think in "terms of sets" (the declarative thing:) but you have to think "in terms of type" (the relation thing). Sql essentially doesn't concern itself with variables and types. They are essentially viewed as foreign thingies found in net. But in a relational system they are every bit as significant as understanding how to write an outer join. The following example illustrates the relational approach using the Dataphor system (which can play nicely with Sql Server ☺).

create session table Products
{
 product_key:Integer,
 part_no: String,
 options: list(String), //options is defined as a list type, not a string.
 price: Money,
 key{product_key}
};

All tables, like numbers, are variables with an explicit 'table' type: the column names and their datatypes. And like integers they can be 'assigned' values. Tables are composed of 'rows'. For each row, options is input as a 'list' of type string ({'<string>','<string>','<string>'..}).

Products:=
 table
 {
  row{11 product_key, 'ASP-20J' part_no, {'Wiz Bang', 'Hopla Enhancer'} options, $10.00 price},
  row{12, 'ASP-20R', {'Widget Holder','Wiz Bang', 'Hopla Enhancer'}, $12.00}
 };

create session table Options
{
 option_key: Integer,
 product_key: Integer,
 option: String,
 key{option_key},
 reference OptiontoProduct {product_key} references Products {product_key}
};

Options:=
 table
 {
  row{5 option_key, 11 product_key, 'Wiz Bang' option},
  row{6, 11, 'Hopla Enhancer'},
  row{7, 12, 'Wiz Bang'},
  row{8, 12, 'Hopla Enhancer'}
 };

Here are easy ways to find the missing product option from the Options table.

Using row in table. First a table of numbers (Numbers), with a single column N (from 0-10K), is used to create a row for each element in the options list, bringing along the other Products columns. A row is constructed from this table to see if it's not in Options.

select
 (
  Products
   times //Like an sql cross join.
  Numbers
   where N<options.Count() //create a row for each element in the list.
    {product_key,options[N] option}
 )
  where not
   (
    row{product_key product_key,option option}
     in
    (Options {product_key,option})
   );

Using relational division. Because tables (table expressions) are variables, one table can be tested to see if it's contained in another.

select
 (
  Products
   times
  Numbers
   where N<options.Count()
    {product_key,options[N] option}
 )
  where not
   (
    table{row{product_key product_key,option option}}
     <=
    (Options {product_key,option})
   );

Using a left join. Test whether the table on the right has a matching row.

select
 (
  Products
   times
  Numbers
   where N<options.Count()
    {product_key,options[N] option}
 )
  left join Options include rowexists //A special Boolean column for whether there is a match.
   where not rowexists
    {product_key,option};

Instead of inputting a list directly, a delimited string can be converted to a list when input.
Products:=
 table
 {
  row{11 product_key, 'ASP-20J' part_no, ('Wiz Bang, Hopla Enhancer '.Split()) options, $10.00 price},
  row{12, 'ASP-20R', (' Widget Holder ,Wiz Bang, Hopla Enhancer'.Split()), $12.00}
 };

Options:=
 table
 {
  row{5 option_key, 11 product_key, 'Wiz Bang' option},
  row{6, 11, 'Hopla Enhancer'},
  row{7, 12, 'Wiz Bang'},
  row{8, 12, 'Hopla Enhancer'}
 };

The queries are the same as above except for trimming each element of the list.

select
 (
  Products
   times
  Numbers
   where N<options.Count()
    {product_key,options[N].Trim() option}
 )
  where not
   (
    row{product_key product_key,option option}
     in
    (Options {product_key,option})
   );

select
 (
  Products
   times
  Numbers
   where N<options.Count()
    {product_key,options[N].Trim() option}
 )
  where not
   (
    table{row{product_key product_key,option option}}
     <=
    (Options {product_key,option})
   );

select
 (
  Products
   times
  Numbers
   where N<options.Count()
    {product_key,options[N].Trim() option}
 )
  left join Options include rowexists
   where not rowexists
    {product_key,option};

All queries produce a result of:

product_key option
----------- -------------
12          Widget Holder

Types eliminate violations of normal forms much like education eliminates ignorance ☺

Dataphor is a RAD tool: R(elational) A(ccelerator) D(evelopment). Visit dataphor at: www.dataphor.org
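For comparison, once the options string has been split into rows on the SQL Server side (say into a helper table or derived table ProductOptions(product_key, option_name) — that name and the pre-split step are my own assumption, not part of the example above), the same 'missing options' check is an ordinary anti-join in T-SQL:

-- Hedged sketch: report (product_key, option) pairs present in the split-out
-- product options but missing from the Options table.
SELECT po.product_key, po.option_name
FROM   ProductOptions AS po              -- hypothetical pre-split rows
WHERE  NOT EXISTS (SELECT *
                   FROM   Options AS o
                   WHERE  o.product_key = po.product_key
                     AND  o.[option]    = po.option_name);

The part T-SQL has no clean equivalent for is the list-typed options column itself; the splitting is exactly the work the Numbers/Split step does above.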
Friday, November 14, 2008
S(ecure) Q(uery) L(anguage)?
Concerning the thread: microsoft.public.sqlserver.programming Nov 10, 2008 "cascading deletes" http://tinyurl.com/6nwjmd For some reason I couldn't get my reply to post thru OE (it got thru via google though). Perhaps there's an MS filter for metaphors ☺ In any event any mature adult should be able to handle it. So here's my reply with a touch of creative writing ☺ 'Most women believe men think with their tool. And it's just as true in IT. Users model business problems in terms of the abilities of their db. The idea that modeling exists independent of ones db is a myth. It's not a question of seepage but of flooding. Modeling business problems in terms of the available sql server constructs is messy precisely because their immature and superficial to the task. The result is you turn away from the db and towards the do-it-myself model. You roll around in your own layer because you can't get layered by the db. It's ridiculous to write a join procedurally but when it comes to modeling it's perfectly acceptable to roll your own. Because the model equivalent of the join is so lacking and messy. The genie isn't going back in the sql server bottle. It's simply to far gone. That's why I advocate Dataphor. There the genie is in the join as well as the modeling. Use Dataphor and put your tool back where your head and shoulders are. You can still use sql server. But you aren't going to get tooled by it :)' www.dataphor.org www.beyondsql.blogspot.com Geoff Schaller wrote: > Andre. > I vote with Hugo here. We manage everything from code, not from TSQL in > SSMS or via some other mechanism so we generally have to code everything > (and that is not as difficult or as expansive as it sounds). Whilst > cascading referential integrity is "nice" from a simplicity point of > view, we've found that the act of deleting something (say an invoice) is > almost never a simple thing. There is reversal of stock levels, > rebalancing totals and if others are running reports when you thought > you wanted to do the delete, it gets messy. > The other thing is that we quite often have to delete the child entries > individually or prevent the parent from being deleted because a child or > two cannot be. Writing all that logic into a trigger and enforcing the > rollback is quite complex. I find code an easier way to manage and > maintain this. To add insult to injury my reply to a post on SQLServerCentral was hacked (edited). SQLServerCentral 11/10/2008 'Arrays in Stroed Prcoedure' http://tinyurl.com/5th5n4 My reply, as shown there under the name rog pike, was edited to read: 'An array is a 'type', something the archaic computer science of sql knows nothing about. You have to move to a 'real' relational system to find a list/array type. You'll find such adult computer science in Dataphor.' My orginal reply was a follows: 'Arrays are in sql server in the same sense as having sex by yourself which may account for the shortsightedness of so many sql mavens. An array is a 'type', something the archaic computer science of sql knows nothing about. You have to move to a 'real' relational system to find a list/array type. You'll find such adult computer science in Dataphor.' Is the site for mature adults or for the whole family? ☺ Just how much protection does sql and its users need? Is this a security or, better yet, an insecurity problem? ☺ Finally, I'll repeat here what I posted in the above thread: 'Apparently someone complained/reported something I wrote as being objectionable. 
They got their wish as it was magically extracted from the text. What was yanked, besides my chain, was a metaphor, albeit a vivid one, to drive a salient point home. Now I write for adults, I don't do child-speak very well. Nor do I have a predilection to only write drone-on-speak. So, if I can, I won't hesitate to use an adult metaphor to amplify a point in an industry that is usually tone deaf. God forbid IT encourage ability in something other than code or pixels. So if you are an adult, with a surname other than anonymous, please explain just what you found R or X rated. Mature adults usually confront conflicts thru the front door not the back one.'
Monday, September 08, 2008
Sorting a delimited string numerically
Sorting a delimited string by its numerical string parts. This is a possible solution to the problem presented in the article:

'T-SQL Sorting Challenge'
By: Itzik Ben-Gan
http://www.sqlmag.com/Article/ArticleID/100156/100156.html

"You are given a table called t1 with a character string column called val. Each string in the val column holds a dot separated list of integers. Your task is to write a T-SQL solution that sorts the strings based on the integer values constituting the string segments. Note that the number of integers in each string may vary, and is only limited by the column type VARCHAR(500). Extra points if your solution will also support negative integers."

So the problem is how to construct a sort expression that represents the positive and negative integers of the ids. This solution uses Dataphor with the data being stored in Sql Server 2005.

Sample data

The data is populated thru dataphor and persisted in the sql server northwind db. The table t1 in the article is table IZ here. The article uses an sql identity for the primary key id; here it is explicitly declared.

create table IZ
{
 id:Integer,
 val:String tags {Storage.Length='75'},
 key{id}
};

//Positive value strings.
insert row{1 id,'100' val} into IZ;
insert row{2 id,'7.4.250' val} into IZ;
insert row{3 id,'22.40.5.60.4.100.300.478.19710212' val} into IZ;
insert row{4 id,'22.40.5.60.4.99.300.478.19710212' val} into IZ;
insert row{5 id,'22.40.5.60.4.99.300.478.9999999' val} into IZ;
insert row{6 id,'10.30.40.50.20.30.40' val} into IZ;
insert row{7 id,'7.4.250' val} into IZ;

//Add negative values.
insert row{8 id,'-1' val} into IZ;
insert row{9 id,'-2' val} into IZ;
insert row{10 id,'-11' val} into IZ;
insert row{11 id,'-22' val} into IZ;
insert row{12 id,'-123' val} into IZ;
insert row{13 id,'-321' val} into IZ;
insert row{14 id,'22.40.5.60.4.-100.300.478.19710212' val} into IZ;
insert row{15 id,'22.40.5.60.4.-99.300.478.19710212' val} into IZ;

Go directly to dataphor solution
Go directly to a solution using dataphor and sql server t-sql
Go directly to a solution using the Rac utility on sql server 2000
Go directly to a solution using the Rac utility on sql server 2005

(Rac is a system of stored procedures and functions for sql server designed to simplify solving various data manipulation problems including dynamic crosstabs, complex running sums and ranking, string manipulations, etc.)
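Before stepping through the Dataphor logic, here is a hedged all-T-SQL sketch of just the splitting step that the walkthrough below performs with Split and a Numbers table. It assumes a helper table Numbers(N) of integers from 1 upward already exists (a common utility, not part of the sample data), and it produces one row per id, segment position and integer value — the shape everything later builds on:

-- Split IZ.val on '.' into (id, Index, StrNum) rows.
SELECT i.id,
       ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY n.N)             AS [Index],
       CAST(SUBSTRING(i.val + '.', n.N,
                      CHARINDEX('.', i.val + '.', n.N) - n.N) AS int) AS StrNum
FROM   IZ AS i
JOIN   Numbers AS n
       ON  n.N <= LEN(i.val)
       AND SUBSTRING('.' + i.val, n.N, 1) = '.'
ORDER BY i.id, [Index];

From here the article's pure T-SQL route would be RANK() OVER (ORDER BY [Index], StrNum) plus string concatenation, which is essentially what the pass thru solution further down does from the Dataphor side.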
Stepping thru the logic of the solution

Use the dataphor Split operator to split the val string for each id into individual strings, starting at Index 1 and going to the number of parts delimited by the period ('.'). Note that the string is converted to an integer, so we're dealing with numbers and not strings.

select
 (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt})
  times //times is equivalent to an sql CROSS JOIN.
 (Numbers where N<10) //A table with a single column N, an integer from 0 to 800.
  where N<StrListCnt
   {id,N+1 Index,ToInteger(StrList[N]) StrNum}
    order by {id,Index};

id Index StrNum
-- ----- --------
1  1     100      <- id 1 has only a single value.
2  1     7        <- id 2 has 3 values.
2  2     4
2  3     250
3  1     22       <- id 3 has 9 values.
3  2     40
3  3     5
3  4     60
3  5     4
3  6     100
3  7     300
3  8     478
3  9     19710212
4  1     22
.
.
.

Now let's look at the same data but within each Index, and within each Index ordered by the string as a number (StrNum). Remember all ordering is ascending.

select
 (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt})
  times
 (Numbers where N<10)
  where N<StrListCnt
   {id,N+1 Index,ToInteger(StrList[N]) StrNum}
    order by {Index,StrNum};

id Index StrNum
-- ----- --------
13 1     -321     <- id 13 has lowest overall number for 1st string part (Index 1).
12 1     -123
11 1     -22
10 1     -11
9  1     -2
8  1     -1
2  1     7
7  1     7
6  1     10
3  1     22       <- ids 3,4,5,14,15 have the same value (22) for Index 1.
4  1     22
5  1     22
14 1     22
15 1     22
1  1     100      <- id 1 has highest overall number for 1st string part (Index 1).
.
.
.
3  8     478      <- Index 8 has the same string value for the 5 ids that have an 8th string part.
4  8     478
5  8     478
14 8     478
15 8     478
5  9     9999999
3  9     19710212
4  9     19710212
14 9     19710212
15 9     19710212

The strings as integers are nicely sorted within each Index over the ids. How can we represent the same ordering within each Index independent of the positive and negative numbers (and the strings that indicate positive and negative numbers)? What about with a ranking? So let's rank the numbers within each Index. The combination of the dataphor ToTable, ToList and cursor operators will generate a rank (a column named sequence) that follows the specified cursor ordering. We order the cursor by Index,StrNum to get an ascending rank within each Index based on the StrNum values. The rank is column RowNum.

select
 ToTable(
  ToList(
   cursor(
    (
     (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt})
      times
     (Numbers where N<10)
      where N<StrListCnt
       {id,N+1 Index,ToInteger(StrList[N]) StrNum}
    )
     order by {Index,StrNum})))
  {Index,id,StrNum,sequence+1 RowNum}
   order by {Index,StrNum};

Index id StrNum   RowNum
----- -- -------- ------
1     13 -321     1      <- id 13 has the lowest value (-321) so it gets the lowest rank (1) within Index 1.
1     12 -123     2
1     11 -22      3
1     10 -11      4
1     9  -2       5
1     8  -1       6
1     2  7        7      <- ids 2 and 7 get a different rank for the duplicate values of 7 :(
1     7  7        8
1     6  10       9
1     3  22       10     <- all 5 ids get a different rank for the duplicate values of 22 :(
1     4  22       11
1     5  22       12
1     14 22       13
1     15 22       14
1     1  100      15     <- id 1 has the highest value (100) so id 1 gets the highest rank (15) within Index 1.
.
.
.
8     3  478      56
8     4  478      57
8     5  478      58
8     14 478      59
8     15 478      60
9     5  9999999  61
9     3  19710212 62     <- all 4 ids get a different rank for the duplicate values of 19710212 :(
9     4  19710212 63
9     14 19710212 64
9     15 19710212 65

( Note that this rank is equivalent to the sql ranking function ROW_NUMBER().
The rank could be obtained in sql using:

ROW_NUMBER()OVER(ORDER BY [Index],StrNum) AS RowNum

But the rank we want is really based on the sql RANK() function, which accounts for duplicates/ties (of StrNum) by giving them the same rank. Therefore it's necessary in dataphor to use a join to append the correct ranks to the table. In sql the join isn't necessary; RANK() can be used directly on the table, ie.:

RANK()OVER(ORDER BY [Index],StrNum) AS Rank

(See the t-sql solution.) For more on dataphor and sql ranking see:

'The Sql ranking OVERture'
http://beyondsql.blogspot.com/2008/04/sql-ranking-overture.html )

If you haven't guessed it by now ☺ the idea is to create a string for each id, based on the ranks, which will be used to order the ids. But we have a problem because for duplicate values of a number we're getting different ranks. We want the 'same' rank for duplicate values since the same integer cannot be used to distinguish among the ids. We can remedy the different ranks for duplicate values by simply choosing the minimum rank (RowNum) for the value and assigning this rank to all ids. Also note that the ranks continue to ascend over the Indexes. This is ok because any numbers representing the ranks are ok if they correctly maintain the ordering of the integer string values within the Index.

select
 ToTable(
  ToList(
   cursor(
    (
     (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt})
      times
     (Numbers where N<10)
      where N<StrListCnt
       {id,N+1 Index,ToInteger(StrList[N]) StrNum}
    )
     order by {Index,StrNum})))
  {Index,id,StrNum,sequence+1 RowNum}
   group by {Index,StrNum} add{Min(RowNum) Rank}
    order by {Index,StrNum};

Index StrNum   Rank
----- -------- ----
1     -321     1
1     -123     2
1     -22      3
1     -11      4
1     -2       5
1     -1       6
1     7        7    <- a rank of 7 can be assigned to the two ids (2,7) with a value of 7 for Index 1.
1     10       9
1     22       10   <- a rank of 10 can be assigned to all ids with a value of 22 for Index 1.
1     100      15
.
.
.
8     478      56
9     9999999  61
9     19710212 62   <- a rank of 62 can be assigned to all ids with a value of 19710212 for Index 9.

What we now have is a table of unique Index/StrNum combinations with a unique rank for each combination. It's only necessary to join this table to the table of split strings (IZ) for all ids, by Index and StrNum, to properly assign the correct(ed) ranks. (As mentioned above, this is the same rank that would be obtained using the sql RANK() function and ordering by Index,StrNum. And note that using the sql RANK() would eliminate the need to do a join in dataphor. Imagine dataphor with native sql-like ranking operations ☺ )

Because the objective is to create a string to sort the ids, we can't just use the numeric rank; we have to modify it for string ordering. Given two ranks of 7 and 11, if they are strings, '7' and '11', an ascending sort would have '11' come before '7':

'11'
'7'

This is the very problem the article is addressing! So we have to modify the strings to have '7' come before '11' ☺. We can modify the '7' by left padding it with '0'. So when we sort ascending we'll have a correct representation of the true numeric order of the values:

'07'
'11'

How much to pad a rank, i.e. how many '0's to insert, is determined by the string length of the maximum rank generated. Because the ranks in dataphor keep ascending regardless of the cursor ordering, the maximum rank (ignoring duplicates) is the count of rows in the table. You could even make an educated guess based on the amount of data and use that ☺.
Left padding the string rank (RankStr) based on a maximum length of 2 we now have all the data to finally construct a sorting column for the ids. var UnpackedStrings:= //This variable holds all the split data and will be used in the select query. ToTable(ToList(cursor( ( (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt}) times (Numbers where N<10) where N<StrListCnt {id,N+1 Index,ToInteger(StrList[N]) StrNum} ) order by {Index,StrNum}))) {Index,id,StrNum,sequence+1 RowNum}; var LengthofMaxRank:=2; select (UnpackedStrings {id,Index,StrNum}) join //Join the unique ranks to all split data. This is a natural join (on Index/StrNum). //Create a left padded string (RankStr) from the numeric rank. (UnpackedStrings group by {Index,StrNum} add{Min(RowNum) Rank}) {id,Index,StrNum,Rank,PadLeft(ToString(Rank),LengthofMaxRank,'0') RankStr} order by {Index,Rank}; id Index StrNum Rank RankStr -- ----- -------- ---- ------- 13 1 -321 1 01 <- ranks 1-9 are left padded in the rank string (RankStr). 12 1 -123 2 02 11 1 -22 3 03 10 1 -11 4 04 9 1 -2 5 05 8 1 -1 6 06 2 1 7 7 07 7 1 7 7 07 6 1 10 9 09 3 1 22 10 10 4 1 22 10 10 5 1 22 10 10 14 1 22 10 10 15 1 22 10 10 1 1 100 15 15 . . . . . 3 8 478 56 56 4 8 478 56 56 5 8 478 56 56 14 8 478 56 56 15 8 478 56 56 5 9 9999999 61 61 3 9 19710212 62 62 4 9 19710212 62 62 14 9 19710212 62 62 15 9 19710212 62 62 The id sort column can now be formed by concatenating, using the Concat operator, the RankStr within each id in the order of the rank (either Rank, Index or RankStr). This is easy to see by ordering the above data (table) by id,Rank. The ascending order of Index, Rank and RankStr all reflect where an id lies in value (RankStr) relative to the other ids. The sort expression will be column SortStr. id Index StrNum Rank RankStr -- ----- -------- ---- ------- 1 1 100 15 15 2 1 7 7 07 2 2 4 16 16 2 3 250 30 30 3 1 22 10 10 3 2 40 19 19 3 3 5 24 24 3 4 60 33 33 3 5 4 38 38 3 6 100 49 49 3 7 300 51 51 3 8 478 56 56 3 9 19710212 62 62 The complete dataphor solution var UnpackedStrings:= //Variable that holds all split data and ranks the split strings //numerically within each Index ordered by the numeric string value. ToTable(ToList(cursor( ( (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt}) times (Numbers where N<10) where N<StrListCnt {id,N+1 Index,ToInteger(StrList[N]) StrNum} ) order by {Index,StrNum}))) {Index,id,StrNum,sequence+1 RowNum}; var LengthofMaxRank:= Length(Count(UnpackedStrings).ToString()); //A string length used to left pad the rank strings for //a correct string sort (will be 2 here). select IZ join //Natural join of input table IZ to sort column expression by id. ( //Join unique ranks to all split strings. ( (UnpackedStrings {id,Index,StrNum}) join // Adjust ranks to be unique for each Index/StrNum, duplicate values should get same rank. (UnpackedStrings group by {Index,StrNum} add{Min(RowNum) Rank}) //Left pad rank string with '0' (RankStr) for sorting string correctly. Add empty //string ('') to be used as a delimiter in concatenating the ranks. {id,Index,StrNum,Rank,PadLeft(ToString(Rank),LengthofMaxRank,'0') RankStr,'' Del} ) adorn {key{id,Rank}} //Sql uses physical hints, in dataphor you use logical ones. //Form the sorting expression SortStr to sort ids by concatenating the string ranks //in the order of any of the ranking columns or Index. 
    group by {id} add{Concat(RankStr,Del order by {id,Rank}) SortStr}
 )
   order by {SortStr};
//The object of the exercise, sort the ids by SortStr to get the correct
//numerical order of val.

id val                                SortStr
-- ---------------------------------- ------------------
13 -321                               01
12 -123                               02
11 -22                                03
10 -11                                04
9  -2                                 05
8  -1                                 06
2  7.4.250                            071630
7  7.4.250                            071630
6  10.30.40.50.20.30.40               09182932434650
14 22.40.5.60.4.-100.300.478.19710212 101924333844515662
15 22.40.5.60.4.-99.300.478.19710212  101924333845515662
5  22.40.5.60.4.99.300.478.9999999    101924333847515661
4  22.40.5.60.4.99.300.478.19710212   101924333847515662
3  22.40.5.60.4.100.300.478.19710212  101924333849515662
1  100                                15

Solution using the sql server RANK() function with a pass thru query

The solution can be made more compact using the sql RANK() function, since dataphor doesn't have a direct equivalent ranking operation. Since sql server can only access persisted tables (and views) and not dataphor expressions (we can't just stick any dataphor table expression in a pass thru query), we'll create a persisted table to hold all the split data.

create table IZSqlRanks //The table will be created in sql server.
{
 id:Integer,
 Index:Integer,
 StrNum:Integer,
 key{id,Index}
};

Do the splitting in dataphor (because it knows the difference between a string and a list ☺ ) and then 'assign' the resulting table to the persisted table IZSqlRanks. Relational databases support this kind of assignment for all variables including tables. And all tables in dataphor are variables. (To my sql friends this makes all the difference in the world ☺ )

IZSqlRanks:=
 (IZ add{val.Split({'.'}) StrList,val.Split({'.'}).Count() StrListCnt})
   times
 (Numbers where N<10)
   where N<StrListCnt
    {id,N+1 Index,ToInteger(StrList[N]) StrNum};

//Use a t-sql passthru query to take advantage of the sql RANK() function. The
//resulting table will be treated like any other table (expression) in dataphor.
//Left pad the string rank (RankStr) for sorting purposes. We're using a total
//string length of 2 here so single digit ranks will be padded with a leading '0'.

select
 IZ
   join
 (
  SQLQuery('SELECT id,[Index],StrNum,RANK()OVER(ORDER BY [Index],StrNum) AS Rank FROM IZSqlRanks')
    //Use the sql result as if it was a native dataphor expression.
    {id,Index,StrNum,Rank,PadLeft(ToString(Rank),2,'0') RankStr,'' Del}
      adorn {key{id,Rank}} //Metadata (a key) pertaining to the table expression.
                           //This key will be efficiently used by the Concat operation.
      //Concatenate the rank strings for each id to be used as the sort order for ids.
      group by {id} add{Concat(RankStr,Del order by {id,Rank}) SortStr}
 )
   order by {SortStr};

id val                                SortStr
-- ---------------------------------- ------------------
13 -321                               01
12 -123                               02
11 -22                                03
10 -11                                04
9  -2                                 05
8  -1                                 06
2  7.4.250                            071630
7  7.4.250                            071630
6  10.30.40.50.20.30.40               09182932434650
14 22.40.5.60.4.-100.300.478.19710212 101924333844515662
15 22.40.5.60.4.-99.300.478.19710212  101924333845515662
5  22.40.5.60.4.99.300.478.9999999    101924333847515661
4  22.40.5.60.4.99.300.478.19710212   101924333847515662
3  22.40.5.60.4.100.300.478.19710212  101924333849515662
1  100                                15

Solving the problem on sql server 2000 with the Rac utility

The Rac solution follows the same logic as the dataphor and sql methods. The 1st Rac execute creates the Index column, from 1 to N, for every string part (delimited integer) an id has. The 2nd Rac execute creates a rank over the ids based on the integers (string parts) within each Index. The rank Rac generates is equivalent to the sql DENSE_RANK() function.
This rank, like RANK(), gives duplicate values the same rank but, unlike RANK(), does not take the duplicate ranks into account when generating the next rank. RANK() skips ahead based on ties while DENSE_RANK() numbers the ranks consecutively. Both types of ranks give the same correct sort result for the ids. The 3rd Rac execute concatenates the left padded rank strings for each id and returns the ids sorted by them, correctly :) Note that Rac is called recursively twice.

Exec Rac
@split='[position]',     -- Rac splits each val string from left to right based on a period ('.').
@rows='id & [position]', -- Rac keeps the position each new string part starts in.
@pvtcol='val',           -- the target column of the split operation.
@from='IZ',
-- use a counter to generate the Index column (from 1-N) that indicates the individual string part in a val.
@separator='.',@rowcounters='id{[Index]}',@counterdatatype='int',
@defaults1='y',@rowbreak='n',@racheck='y',@shell='n',
@select='
SELECT 1*id AS id,[Index],CAST([1] AS INT) AS StrNum
INTO #J1
FROM rac

Exec Rac
@transform=~_dummy_~,
@rows=~[Index] & StrNum & id~,
@pvtcol=~Report Mode~,
@from=~#J1~,
@defaults1=~y~,@rowbreak=~n~,@racheck=~y~,@shell=~n~,
/* use a Rac counter to rank the integer string parts within each Index (column Rank).*/
@rowindicators=~StrNum{Rank}_overtable_~,@counterdatatype=~int~,
@select=
~if object_id(~~##J2~~) is not null drop table ##J2
SELECT 1*id AS id,1*StrNum AS StrNum,1*[Index] AS [Index],1*Rank AS Rank,
/* left pad single digit string ranks for proper character sorting.*/
REPLICATE(~~0~~,2-DATALENGTH(CAST(Rank AS VARCHAR(2)))) + CAST(Rank AS VARCHAR(2)) AS RankStr
INTO ##J2
FROM rac~

/* concatenate the string ranks (RankStr) within each id into column SortStr.*/
Exec Rac
@transform=~Max(RankStr) as RankStr~,
@rows=~id~,
@pvtcol=~Rank~,
@from=~##J2~,
@defaults1=~y~,@racheck=~y~,@shell=~n~,@cutpvt=~y~,
@concatenate=~RankStr~,@separator=~~,@stringname=~SortStr~,
/* return the ids sorted by the concatenated string ranks.*/
@select=~SELECT IZ.id,val,SortStr
         FROM IZ JOIN rac ON IZ.id=rac.id
         ORDER BY SortStr
         DROP TABLE ##J2~'

id          val                                  SortStr
----------- ------------------------------------ ------------------
13          -321                                 01
12          -123                                 02
11          -22                                  03
10          -11                                  04
9           -2                                   05
8           -1                                   06
7           7.4.250                              071116
2           7.4.250                              071116
6           10.30.40.50.20.30.40                 08121517202326
14          22.40.5.60.4.-100.300.478.19710212   091314181921272830
15          22.40.5.60.4.-99.300.478.19710212    091314181922272830
5           22.40.5.60.4.99.300.478.9999999      091314181924272829
4           22.40.5.60.4.99.300.478.19710212     091314181924272830
3           22.40.5.60.4.100.300.478.19710212    091314181925272830
1           100                                  10

More on Rac @ www.rac4sql.net

Solving the problem on sql server 2005 with the Rac utility

The Rac solution in S2k5 is similar to the one on S2K (go there for more info). But it's simpler since the split string parts (integers) can be ranked directly using a dense rank function not available in S2k. This eliminates a recursive call to Rac, so here it's executed twice instead of three times (as in S2k).

Exec Rac
@split='[position]',
@rows='id & [position]',
@pvtcol='val',
@from='IZ',
@separator='.',@rowcounters='id{[Index]}',@counterdatatype='int',
@defaults1='y',@rowbreak='n',@racheck='y',@shell='n',
-- the DENSE_RANK() function, available in S2k5, is used to rank the integer string parts
-- and the resulting rank is left padded.
@select='
SELECT id,[Index],StrNum,Rank,
       REPLICATE(~0~,2-DATALENGTH(CAST(Rank AS VARCHAR(2)))) + CAST(Rank AS VARCHAR(2)) AS RankStr
INTO #J1
FROM (SELECT 1*id AS id,[Index],CAST([1] AS INT) AS StrNum,
             DENSE_RANK()OVER(ORDER BY [Index],CAST([1] AS INT)) AS Rank
      FROM rac
     ) AS A

Exec Rac
@transform=~Max(RankStr) as RankStr~,
@rows=~id~,
@pvtcol=~Rank~,
@from=~#J1~,
@defaults1=~y~,@racheck=~y~,@shell=~n~,@cutpvt=~y~,
@concatenate=~RankStr~,@separator=~~,@stringname=~SortStr~,
@select=~SELECT IZ.id,val,SortStr
         FROM IZ JOIN rac ON IZ.id=rac.id
         ORDER BY SortStr~'

id          val                                  SortStr
----------- ------------------------------------ ------------------
13          -321                                 01
12          -123                                 02
11          -22                                  03
10          -11                                  04
9           -2                                   05
8           -1                                   06
7           7.4.250                              071116
2           7.4.250                              071116
6           10.30.40.50.20.30.40                 08121517202326
14          22.40.5.60.4.-100.300.478.19710212   091314181921272830
15          22.40.5.60.4.-99.300.478.19710212    091314181922272830
5           22.40.5.60.4.99.300.478.9999999      091314181924272829
4           22.40.5.60.4.99.300.478.19710212     091314181924272830
3           22.40.5.60.4.100.300.478.19710212    091314181925272830
1           100                                  10

More on Rac @ www.rac4sql.net
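For comparison only, and not part of the original dataphor or Rac solutions, here is a plain t-sql (sql server 2005) sketch of the same split, dense rank, left pad and concatenate idea. It uses a numbers CTE for the split and FOR XML PATH('') for the concatenation; the names Nums, Parts and Ranked are my own, IZ(id,val) is the article's input table, and the pad width of 2 mirrors the examples above.

-- Sketch only: split each val on '.', dense rank the parts within each position,
-- left pad the rank to 2 characters and concatenate per id into a sort key.
WITH Nums AS
( SELECT 1 AS N
  UNION ALL
  SELECT N + 1 FROM Nums WHERE N < 100          -- enough numbers for the longest val
),
Parts AS
( SELECT i.id,
         ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY n.N) AS [Index],
         CAST(SUBSTRING(i.val, n.N, CHARINDEX('.', i.val + '.', n.N) - n.N) AS int) AS StrNum
  FROM IZ AS i
  JOIN Nums AS n
    ON n.N <= LEN(i.val)
   AND SUBSTRING('.' + i.val, n.N, 1) = '.'     -- n.N marks the start of a string part
),
Ranked AS
( SELECT id, [Index], StrNum,
         DENSE_RANK() OVER (ORDER BY [Index], StrNum) AS [Rank]
  FROM Parts
)
SELECT iz.id, iz.val,
       (SELECT RIGHT('00' + CAST(r.[Rank] AS varchar(10)), 2)   -- left pad to width 2
        FROM Ranked AS r
        WHERE r.id = iz.id
        ORDER BY r.[Index]
        FOR XML PATH('')) AS SortStr                            -- concatenate padded ranks per id
FROM IZ AS iz
ORDER BY SortStr
OPTION (MAXRECURSION 100);

Since it uses the same dense rank ordered by [Index],StrNum, this should sort the ids in the same order as the Rac S2k5 result; widen the pad if the number of distinct Index/StrNum combinations ever exceeds 99.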