This problem was raised in the thread: microsoft.public.sqlserver.programming Monday, October 15, 2007 'Parse and Merge two fields' http://tinyurl.com/2eosa3 The idea is to take two pipe ('|') delimited strings and created one string where each item in the first string is matched with the item in the second string having the same ordinal position. The two columns are People and Position. So given: People:=' Sam | Jane | Gene' Position:= 'Accounting | Finance | Marketing' we want: People_Position:= 'Sam|Accounting|Jane|Finance|Gene|Accounting' Of course in sql this is usually a mess since sql does not support a list type and any of the operations that can transform a delimited string to a list and visa versa. In D4 we simply split the delimited string to a list of items and then create a table from the list. The table will contain the ordinal position as an integer value of each item in the list. So a row has the item (string) value and sequence value of that item in the string as columns. So we just concatenate the columns of each string together (separated by a pipe) and then concatenate these strings over the rows with the Concat operator ordering the concatenated string by the sequence value (over rows). Note that there is no concept of just unique item in a list. A list can have any number of duplicate items. When the list is transformed to a table the sequence number guarantees that same value items will be in the table. In other words the sequence is a key of the table. select ToTable('A|A|B|B|C'.Split({'|'})); value sequence ----- -------- A 0 A 1 B 2 B 3 C 4 And with column names of your choice: select ToTable('A|A|B|B|C'.Split({'|'}),'Str','Seq'); Str Seq --- --- A 0 A 1 B 2 B 3 C 4 (Browse this site for other articles on string operations for more info). Here is some sample data. The data is stored in a Sql Server 2005 database. All queries are written with the D4 relational language of Dataphor. create table JR { id:Integer, People:String tags{Storage.Length='100'}, Position:String tags{Storage.Length='100'}, key{id} }; insert table { row{1 id,'joe|sam|pete|mike' People,'Manager|Team Lead|Worker Ant|Worker Ant' Position}, row{2,'A| C| T ',' JobA|JobC |JobT '}, row{3,'A| C ',' JobA | JobB |JobC '}, //Ummatched strings (lists) row{4,'Bush | Cheney | Rice ' ,' Worser|Worst|Worse '}, row{5,'Z |Y |X ' ,' Backwards Z|Backwards Y |Backwards X '} } into JR; select JR; id People Position -- --------------------- --------------------------------------- 1 joe|sam|pete|mike Manager|Team Lead|Worker Ant|Worker Ant 2 A| C| T JobA|JobC |JobT 3 A| C JobA | JobB |JobC 4 Bush | Cheney | Rice Worser|Worst|Worse 5 Z |Y |X Backwards Z|Backwards Y |Backwards X Also note that we can access the table with a pass-thru query using t-sql: select SQLQuery("SELECT id,People,Position FROM JR"); id People Position -- --------------------- --------------------------------------- 1 joe|sam|pete|mike Manager|Team Lead|Worker Ant|Worker Ant 2 A| C| T JobA|JobC |JobCT 3 A| C JobA | JobB |JobC 4 Bush | Cheney | Rice Worser|Worst|Worse 5 Z |Y |X Backwards Z|Backwards Y |Backwards X Note that the only restriction used here is that both strings have the same number of items. select JR //The table JR in Sql Server 2005. //SQLQuery("SELECT id,People,Position FROM JR") Or we could use this t-sql query . where //The same 'where' statement as in sql. //Check that each string has the same number of items. People.Split({'|'}).Count()=Position.Split({'|'}).Count() with {IgnoreUnsupported = 'true'} //For each row in the table we 'add' a column (People_Position). //The People_Position column is the '|' delimited string of //People and Position item by item. add { //From a table whose rows represents each item in People and Position, //for each row form a string of the person and position separated by //a '|' and concatenate that string over the rows into a single string. Concat({People_Position,Del} from ( //Use Split to create a 'list' of items from the delimited string. //Then create a table from list with ToTable. Column 'Str1' //is the string value of the item from the list. Column 'Seq' //is the item number from the list. So now a table represents //the original input column People. In other words, the rows of //the table are the delimited items in the string from left to right. (ToTable(People.Split({'|'}),'Str1','Seq'){Str1.Trim() Str1,Seq} ) join //A natural join on Seq which is an integer from 0-># of items //in the string. Seq preserves the sequence from left to //to right of the items in the string. (ToTable(Position.Split({'|'}),'Str2','Seq'){Str2.Trim() Str2,Seq} ) {Seq,Str1,Str2,'|' Del,(Str1+'|'+Str2) People_Position} ) order by {Seq}) //The concatenation over rows is ordered by the //Seq column. This is the same order as the items //in the original input columns, People/Position. //Therefore the original order is preserved in //the result (People_Position). People_Position //This is name of the new column added to each row of JR. } {id,People_Position} //These are the only two columns we've chosen //to display from the query. order by {id}; //Order by the values if the id column. id People_Position -- --------------------------------------------------------- 1 joe|Manager|sam|Team Lead|pete|Worker Ant|mike|Worker Ant 2 A|JobA|C|JobC|T|JobT 4 Bush|Worser|Cheney|Worst|Rice|Worse 5 Z|Backwards Z|Y|Backwards Y|X|Backwards X Instead of eliminating rows where the item counts don't agree we can insert the string 'Unmatched' for those id values. select //Instead of table JR we access the table in sql server with a t-sql query. //In D4 the query does not return an sql 'resultset' but instead is treated //as a table 'variable' which is of a nature fundamentally different than sql. SQLQuery("SELECT id,People,Position FROM JR") //Store the restriction on count as a boolean value (T/F) in column TestCnt. add{People.Split({'|'}).Count()=Position.Split({'|'}).Count() TestCnt} with {IgnoreUnsupported = 'true'} add { //If the TestCnt is true concatenate the rows, else return a nil (null) value. //A case statement just like the sql case statement could also be used. if TestCnt then Concat({People_Position,Del} from ( (ToTable(People.Split({'|'}),'Str1','Seq'){Str1.Trim() Str1,Seq} ) join (ToTable(Position.Split({'|'}),'Str2','Seq'){Str2.Trim() Str2,Seq} ) {Seq,Str1,Str2,'|' Del,(Str1+'|'+Str2) People_Position} ) order by {Seq}) else nil as String People_Position } //Insert the unmatched string for a nil value of the People_Position column. {id,IfNil(People_Position,'*** Unmatched Strings ***') Field} order by {id}; id Field -- --------------------------------------------------------- 1 joe|Manager|sam|Team Lead|pete|Worker Ant|mike|Worker Ant 2 A|JobA|C|JobC|T|JobT 3 *** Unmatched Strings *** 4 Bush|Worser|Cheney|Worst|Rice|Worse 5 Z|Backwards Z|Y|Backwards Y|X|Backwards X D4 allows all the expressive power you need to easily test for any criteria. Here we add some additional rows to table JR that should be eliminated. insert table { //Digits in Postion. row{6 id,'joe|sam|pete|mike' People,'Manager|Team Lead|Worker Ant1|Worker Ant2' Position}, //Digits in People and Position row{7 id,'joe|sam|pete1|mike' People,'Manager|Team Lead|Worker Ant1|Worker Ant' Position} } into JR; select JR; id People Position -- --------------------- ----------------------------------------- 1 joe|sam|pete|mike Manager|Team Lead|Worker Ant|Worker Ant 2 A| C| T JobA|JobC |JobT 3 A| C JobA | JobB |JobC 4 Bush | Cheney | Rice Worser|Worst|Worse 5 Z |Y |X Backwards Z|Backwards Y |Backwards X 6 joe|sam|pete|mike Manager|Team Lead|Worker Ant1|Worker Ant2 7 joe|sam|pete1|mike Manager|Team Lead|Worker Ant1|Worker Ant We add some additional restrictions to the where statement to eliminate rows with strings that contain any non letters. select JR where ( ( People.Split({'|'}).Count()=Position.Split({'|'}).Count() ) //We also test that each item is only letters (no digits or other junk). //This can be done any numbers of ways. Here we just compare the count //of all items to the count of items eliminated by characters other than //letters. and ( People.Split({'|'}).Count() = Count(ToTable(People.Split({'|'}),'F1Str','F1Seq') //Test that items are made up of letters only. To eliminate //blanks contaiminating the test we remove blank character before //testing. where IsLetter(Replace(Trim(F1Str),' ',''))) ) and ( Position.Split({'|'}).Count() = Count(ToTable(Position.Split({'|'}),'F2Str','F2Seq') where IsLetter(Replace(Trim(F2Str),' ',''))) ) ) with {IgnoreUnsupported = 'true'} add { Concat({People_Position,Del} from ( (ToTable(People.Split({'|'}),'Str1','Seq'){Str1.Trim() Str1,Seq} ) join (ToTable(Position.Split({'|'}),'Str2','Seq'){Str2.Trim() Str2,Seq} ) {Seq,Str1,Str2,'|' Del,(Str1+'|'+Str2) People_Position} ) order by {Seq}) People_Position } {id,People_Position} order by {id}; id People_Position -- --------------------------------------------------------- 1 joe|Manager|sam|Team Lead|pete|Worker Ant|mike|Worker Ant 2 A|JobA|C|JobC|T|JobT 4 Bush|Worser|Cheney|Worst|Rice|Worse 5 Z|Backwards Z|Y|Backwards Y|X|Backwards X Thanks for stopping by ☺ .
Dataphor SQL RAC (Relational Application Companion)
A site of hope for those looking for a true relational database system
- a one-one requirement constraint with dataphor (1)
- anatomy of sql server part I - what is a stored procedure (1)
- anatomy of sql server part II - the unit test as part of the database (1)
- anatomy of sql server part III - what does deferred name resolution really mean (1)
- censoring sql posts (1)
- creating an opposite constraint in dataphor (1)
- dataphor (2)
- Dataphor (7)
- dataphor # 13 a table as a parameter (1)
- dataphor - download and start working with it (1)
- dataphor - fixed sized word segments (1)
- dataphor # 10 sql mythology (1)
- dataphor # 11 string differences (1)
- dataphor # 12 trimming a string (1)
- dataphor # 14 sql the meaning of Update..From (1)
- dataphor # 15 views with substance (1)
- dataphor # 16 inclusive vs exclusive solutions (1)
- dataphor # 17 a visual look at ranking queries (1)
- dataphor # 18 data scrubbing using lists (1)
- dataphor # 19 create intervals over strings (1)
- dataphor # 20 browsing an sql window (1)
- dataphor # 21 an example of relational division (1)
- dataphor # 22 reusable procedures (1)
- dataphor # 23 repley to Michel (1)
- dataphor # 24 basics of the table type (1)
- dataphor # 25 extending the dense rank function (1)
- dataphor # 26 query a hierarchy with explode (1)
- dataphor # 27 combine strings with Split and Concat (1)
- dataphor # 28 constants and variables or sql and D4 (1)
- dataphor # 29 another example of relational division (1)
- dataphor #1 introduction (1)
- dataphor #2 splitting strings (1)
- dataphor #3 string concatenation (1)
- dataphor #4 comment (1)
- dataphor #5 comment (1)
- dataphor #6 formal definition (1)
- dataphor #7 sql: table this (1)
- dataphor #8 list to table (1)
- dataphor #9 table constraints (1)
- dataphor creating lists in a query (1)
- extracting numbers from a string with dataphor (1)
- jeff modens dynamic crosstabs for sql server (1)
- linq to sql the what and why (1)
- linq to sql as a window of opportunity to sql users (1)
- linq to sql should be important to sql users (1)
- linq to sql vs. older 4GL attempts (1)
- listing missing table item (1)
- Multiple cascade paths to the same table (1)
- RAC (4)
- RAC #1 comment (1)
- RAC #2 example (1)
- RAC #3 finding the Nth number in a string (1)
- RAC #4 Sql Server 2005 ranking functions vs. Rac ranking (1)
- sorting a delimited string by its numerical string parts (1)
- sql an example of extreme implicit conversions (1)
- sql can't handle complicated cascading updates (1)
- sql CTE should be a variable not a value (1)
- sql dense rank for identifying consecutive runs (1)
- sql is there really a table variable (1)
- sql ranking functions explained by relational types (1)
- sql server triggers are best set based (1)
- sql the idea of using substring to simulate lists (1)
- sql the undefined trigger in Sql Server (1)
- sql vs relational on tables (1)
- sql what the sql CTE covers up (1)
- types and procedures (1)
Wednesday, October 17, 2007
Dataphor - Merging strings together
Subscribe to:
Post Comments (Atom)
1 comment:
It will not work in fact, that is exactly what I think.
Post a Comment