Oracle

Tuesday, September 20, 2011

Strings to Rows 1

Union All Method

Now, here is a solution that uses parsing logic from String to Columns: Separated Values - 10g+.

set null "(null)"

set pagesize 45

variable d varchar2(1)

execute :d := ','

break on key skip 1 duplicates

select

  key,

  1 as position,

  rtrim( regexp_substr( c || :d, '.*?' || :d, 1, 1 ), :d ) as val

from   t

UNION ALL

select

  key,

  2 as position,

  rtrim( regexp_substr( c || :d, '.*?' || :d, 1, 2 ), :d ) as val

from   t

where  regexp_substr( c || :d, '.*?' || :d, 1, 2 ) is not null

UNION ALL

select

  key,

  3 as position,

  rtrim( regexp_substr( c || :d, '.*?' || :d, 1, 3 ), :d ) as val

from   t

where  regexp_substr( c || :d, '.*?' || :d, 1, 3 ) is not null

order by

  key, position, val ;

KEY   POSITION VAL

--- ---------- ----------

r01          1 v1

r02          1 v1

r02          2 (null)

r03          1 v1

r03          2 (null)

r03          3 (null)

r04          1 (null)

r04          2 v2

r04          3 (null)

r05          1 (null)

r05          2 (null)

r05          3 v3

r06          1 v1

r06          2 v2

r07          1 v1

r07          2 v2

r07          3 (null)

r08          1 v1

r08          2 v2

r08          3 v3

r09          1 (null)

r10          1 (null)

r10          2 (null)

r11          1 (null)

r11          2 (null)

r11          3 (null)

When strings contain a large number of values we need a more flexible and efficient technique than the method presented in UNION ALL Method. One strategy involves using a parsing technique like one of those described in SQL Techniques Tutorials: String to Columns with an integer series generator like the ones in SQL Techniques Tutorials: Integer Series Generators. While many different combinations of the two types of techniques are possible, we will only examine a couple of these combinations here.
Before we see the first solution though, let's have a look at the test data we will use.

select * from t order by key ;

KEY C          NOTE

--- ---------- -------------------------

r01 v1         One value

r02 v1,        Two values, one empty

r03 v1,,       Three values, two empty

r04 ,v2,       Three values, two empty

r05 ,,v3       Three values, two empty

r06 v1,v2      Two values

r07 v1,v2,     Three values, one empty

r08 v1,v2,v3   Three values, none empty

r09 (null)     One empty value

r10 ,          Two values, two empty

r11 ,,         Three values, all empty

Now let's examine one possible solution.

set null "(null)"

set pagesize 45

break on key skip 1 duplicates

variable d varchar2(1)

execute :d := ','

select

  ta.key ,

  i.column_value as position ,

  substr

  ( ta.c2,

    instr( ta.c2, :d, 1, i.column_value  ) + 1,

    instr( ta.c2, :d, 1, i.column_value + 1 )

      - ( instr( ta.c2, :d, 1, i.column_value ) + 1 )

  as val

from

  ( select

      key ,

      :d || c || :d as c2 ,

      length( c || :d ) - nvl( length( replace( c, :d, null ) ), 0 ) as value_count

    from t

  ) ta

      inner join table( integer_varray_type( 1,2,3,4,5 ) ) i

      on ( i.column_value <= ta.value_count )

order by

  key, position

KEY   POSITION VAL

--- ---------- ----------

r01          1 v1

r02          1 v1

r02          2 (null)

r03          1 v1

r03          2 (null)

r03          3 (null)

r04          1 (null)

r04          2 v2

r04          3 (null)

r05          1 (null)

r05          2 (null)

r05          3 v3

r06          1 v1

r06          2 v2

r07          1 v1

r07          2 v2

r07          3 (null)

r08          1 v1

r08          2 v2

r08          3 v3

r09          1 (null)

r10          1 (null)

r10          2 (null)

r11          1 (null)

r11          2 (null)

r11          3 (null)

How It Works

To understand how the query works consider its individual components.

Component 1

First we look at values derived from T.

clear breaks

select

  key ,

c ,

  LENGTH( C || :D ) - NVL( LENGTH( REPLACE( C, :D, NULL ) ), 0 ) AS VALUE_COUNT

from  t

order by 1

KEY C          VALUE_COUNT

--- ---------- -----------

r01 v1                   1

r02 v1,                  2

r03 v1,,                 3

r04 ,v2,                 3

r05 ,,v3                 3

r06 v1,v2                2

r07 v1,v2,               3

r08 v1,v2,v3             3

r09 (null)               1

r10 ,                    2

r11 ,,                   3

In this query VALUE_COUNT is simply the total number of delimited values in the string, determined by deriving the total number of delimiter characters in the string plus one. If your delimiter is more than one character long you will need to adjust this logic to return a correct count.

Component 2

The next component is a simple integer series generator.

clear breaks

select * from table( integer_varray_type( 1,2,3,4,5 ) ) ;

COLUMN_VALUE

------------

It contains a number of rows equal to the maximum number we expect to find in VALUE_COUNT. In our example we expect no more than 5 delimited values per string.
This technique is described in more detail at Integer Series Generators: Type Constructor Expression Method. It requires a custom collection type (created in the Setup code for this section) defined like this.

desc integer_varray_type

 integer_varray_type VARRAY(10) OF NUMBER(38)

If we did not have access to such a type or could not create our own there are other integer series generation techniques to choose from. See the feature graph at the end of SQL Techniques Tutorials: Integer Series Generators for a list of alternatives.

Components 1 + 2

Next we combine components 1 and 2 to form a single query that returns one row for each delimited value in each STRING.

break on key skip 1 duplicates

column c2 format a12

select

  ta.key, ta.c2, ta.value_count, i.column_value

from

  ( select

      key ,

      :d || c || :d as c2 ,

      LENGTH( C || :D ) - NVL( LENGTH( REPLACE( C, :D, NULL ) ), 0 ) AS VALUE_COUNT

    from t

  ) ta

    inner join TABLE( INTEGER_VARRAY_TYPE( 1,2,3,4,5 ) ) I

      on ( i.column_value <= ta.value_count )

order by

  1, 2, 3, 4

KEY C2           VALUE_COUNT COLUMN_VALUE

--- ------------ ----------- ------------

r01 ,v1,                   1            1

r02 ,v1,,                  2            1

r02 ,v1,,                  2            2

r03 ,v1,,,                 3            1

r03 ,v1,,,                 3            2

r03 ,v1,,,                 3            3

r04 ,,v2,,                 3            1

r04 ,,v2,,                 3            2

r04 ,,v2,,                 3            3

r05 ,,,v3,                 3            1

r05 ,,,v3,                 3            2

r05 ,,,v3,                 3            3

r06 ,v1,v2,                2            1

r06 ,v1,v2,                2            2

r07 ,v1,v2,,               3            1

r07 ,v1,v2,,               3            2

r07 ,v1,v2,,               3            3

r08 ,v1,v2,v3,             3            1

r08 ,v1,v2,v3,             3            2

r08 ,v1,v2,v3,             3            3

r09 ,,                     1            1

r10 ,,,                    2            1

r10 ,,,                    2            2

r11 ,,,,                   3            1

r11 ,,,,                   3            2

r11 ,,,,                   3            3

Component 3

To extract individual values from each string we use the parsing logic described in the String to Columns: Separated Values tutorial. Here is an example of how the logic extracts the first value from a delimited string.

column first_val format a9

select

c2,

  instr( c2, :d, 1, 1 ) position_of_first_delim ,

  instr( c2, :d, 1, 2 ) position_of_second_delim ,

  substr

  ( c2,

    instr( c2, :d, 1, 1 ) + 1,        -- == position of 1st character in value

    instr( c2, :d, 1, 2 )

      - ( instr( c2, :d, 1, 1 ) + 1 ) -- == length of 1st value

  as first_val

from

  ( select :d || c || :d as c2

    from   t

    where  key = 'r08'

C2           POSITION_OF_FIRST_DELIM POSITION_OF_SECOND_DELIM FIRST_VAL

------------ ----------------------- ------------------------ ---------

,v1,v2,v3,                         1                        4 v1

Final Solution

Adding the parsing logic from Component 3 to the query from "Components 1 + 2" yields the final solution.

select

  ta.key ,

  i.column_value as position ,

  substr

  ( ta.c2,

    instr( ta.c2, :d, 1, i.column_value  ) + 1,

    instr( ta.c2, :d, 1, i.column_value + 1 )

      - ( instr( ta.c2, :d, 1, i.column_value ) + 1 )

  as val

from

  ( select

      key ,

      :d || c || :d as c2 ,

      length( c || :d ) - nvl( length( replace( c, :d, null ) ), 0 ) as value_count

    from t

  ) ta

    inner join table( integer_varray_type( 1,2,3,4,5 ) ) i

      on ( i.column_value <= ta.value_count )

order by

  key, position

KEY   POSITION VAL

--- ---------- ----------

r01          1 v1

r02          1 v1

r02          2 (null)

r03          1 v1

r03          2 (null)

r03          3 (null)

r04          1 (null)

r04          2 v2

r04          3 (null)

r05          1 (null)

r05          2 (null)

r05          3 v3

r06          1 v1

r06          2 v2

r07          1 v1

r07          2 v2

r07          3 (null)

r08          1 v1

r08          2 v2

r08          3 v3

r09          1 (null)

r10          1 (null)

r10          2 (null)

r11          1 (null)

r11          2 (null)

r11          3 (null)

Strings to Rows

String to Rows

This section presents various techniques for transforming a string of delimited values like this one

----------

v1,v2,v3

into a number of rows with values appearing in a single column, like this.

POSITION VAL

---------- ----------

1 v1

2 v2

3 v3

We will use strings of comma separated values (CSV) in most of our examples, but the techniques can be easily modified for fixed format strings, pipe delimited strings, tab delimited strings, or quoted values by subtituting the appropriate parsing logic from SQL Techniques Tutorials: String to Columns for the CSV parsing logic in the examples.

Caveat

Storing more than one value in a delimted string in a single column (aka a "repeating group") often leads to performance and maintenance problems. Such designs break the first rule of Database Normalization, First Normal Form, giving you a non-normal table. Operations that are trivial with normalized tables become unecessarily complex with non-normal tables (see the solution to this requirement for an example of this complexity). If you are designing a new system avoid storing data in delimited strings whenever possible.

The techniques in this section are intended for cases where delimited strings are encountered outside the database or for those of you who must use them for reasons beyond your control.

String to Rows

Number to Words

Numbers to Words

This tutorial demonstrates how to transform numbers like 1, 12, and 123 into English words like "ONE", "TWELVE", and "ONE HUNDRED TWENTY-THREE". This transformation is a common requirement for cheque printing applications.

For Numbers Between 0 and 9,999,999

For numbers between 0 and 9,999,999 use the following technique which leverages the FF datetime formatting element (introduced in Oracle 9i) and the SP datetime format suffix (see Dates and Times: Format Suffixes ).

set linesize 100
set recsep   off

column numbers format a12
column words   format a80 word_wrapped

set null '(null)'

select
  to_char(i,'999,999,999') numbers ,
  to_char( to_timestamp( lpad(i,9,'0'), 'FF9' ), 'FFSP' ) as words
from n2w
where i between 0 and 9999999
order by i ;

NUMBERS      WORDS
------------ --------------------------------------------------------------------------------
           0 ZERO
           1 ONE
          12 TWELVE
         123 ONE HUNDRED TWENTY-THREE
       1,234 ONE THOUSAND TWO HUNDRED THIRTY-FOUR
      12,345 TWELVE THOUSAND THREE HUNDRED FORTY-FIVE
     123,456 ONE HUNDRED TWENTY-THREE THOUSAND FOUR HUNDRED FIFTY-SIX
   1,234,567 ONE MILLION TWO HUNDRED THIRTY-FOUR THOUSAND FIVE HUNDRED SIXTY-SEVEN
   1,721,057 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FIFTY-SEVEN
   1,721,058 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FIFTY-EIGHT
   1,721,300 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND THREE HUNDRED
   1,721,423 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FOUR HUNDRED TWENTY-THREE
   1,721,424 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FOUR HUNDRED TWENTY-FOUR
   5,373,484 FIVE MILLION THREE HUNDRED SEVENTY-THREE THOUSAND FOUR HUNDRED EIGHTY-FOUR
   5,373,485 FIVE MILLION THREE HUNDRED SEVENTY-THREE THOUSAND FOUR HUNDRED EIGHTY-FIVE
   7,777,777 SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND SEVEN HUNDRED SEVENTY-SEVEN
   9,999,999 NINE MILLION NINE HUNDRED NINETY-NINE THOUSAND NINE HUNDRED NINETY-NINE

This technique also works for many values larger than 9,999,999, but those whose output strings are greater than 78 characters long will trigger an error (as tested in Oracle 10g XE).
For example, the number 777,777,007 produces a 78 character string without a problem

select to_char( to_timestamp( lpad(i,9,'0'), 'FF9' ), 'FFSP' ) as words
from n2w
where i = 777777007 ;

WORDS
--------------------------------------------------------------------------------
SEVEN HUNDRED SEVENTY-SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND SEVEN

but 777,777,011, which should produce a 79 character string, triggers an error.

select to_char( to_timestamp( lpad(i,9,'0'), 'FF9' ), 'FFSP' ) as words
from n2w
where i = 777777011 ;
select to_char( to_timestamp( lpad(i,9,'0'), 'FF9' ), 'FFSP' ) as words
                *
ERROR at line 1:
ORA-01877: string is too long for internal buffer

To work around this problem we can use CAST (see Laurent Schneider's blog post on return size of to_char for more info.).

select CAST( to_char( to_timestamp( lpad(i,9,'0'), 'FF9' ), 'FFSP' ) AS VARCHAR2(100) ) as words
from n2w
where i = 777777011 ;

WORDS
--------------------------------------------------------------------------------
SEVEN HUNDRED SEVENTY-SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND ELEVEN

For Numbers Between -999,999,999 and 999,999,999

To transform nulls and negative numbers as well as positive number we can use the following technique.

select
  to_char(i,'999,999,999') numbers ,
  case sign( i )
    when -1 then 'NEGATIVE '
    else null
    end ||
  case
    when i is null then
      'NOT ASSIGNED'
    else
      cast
      (
        to_char
        ( to_timestamp( lpad(abs(i),9,'0'), 'FF9' ), 'FFSP' )
        as varchar2(100)
      )
  end as words
from
  n2w
order by i
;

NUMBERS      WORDS
------------ --------------------------------------------------------------------------------
-123,456,789 NEGATIVE ONE HUNDRED TWENTY-THREE MILLION FOUR HUNDRED FIFTY-SIX THOUSAND SEVEN
             HUNDRED EIGHTY-NINE
          -1 NEGATIVE ONE
           0 ZERO
           1 ONE
          12 TWELVE
         123 ONE HUNDRED TWENTY-THREE
       1,234 ONE THOUSAND TWO HUNDRED THIRTY-FOUR
      12,345 TWELVE THOUSAND THREE HUNDRED FORTY-FIVE
     123,456 ONE HUNDRED TWENTY-THREE THOUSAND FOUR HUNDRED FIFTY-SIX
   1,234,567 ONE MILLION TWO HUNDRED THIRTY-FOUR THOUSAND FIVE HUNDRED SIXTY-SEVEN
   1,721,057 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FIFTY-SEVEN
   1,721,058 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FIFTY-EIGHT
   1,721,300 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND THREE HUNDRED
   1,721,423 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FOUR HUNDRED TWENTY-THREE
   1,721,424 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FOUR HUNDRED TWENTY-FOUR
   5,373,484 FIVE MILLION THREE HUNDRED SEVENTY-THREE THOUSAND FOUR HUNDRED EIGHTY-FOUR
   5,373,485 FIVE MILLION THREE HUNDRED SEVENTY-THREE THOUSAND FOUR HUNDRED EIGHTY-FIVE
   7,777,777 SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND SEVEN HUNDRED SEVENTY-SEVEN
   9,999,999 NINE MILLION NINE HUNDRED NINETY-NINE THOUSAND NINE HUNDRED NINETY-NINE
  12,345,678 TWELVE MILLION THREE HUNDRED FORTY-FIVE THOUSAND SIX HUNDRED SEVENTY-EIGHT
  20,000,000 TWENTY MILLION
  77,777,777 SEVENTY-SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND SEVEN HUNDRED
             SEVENTY-SEVEN
 123,456,789 ONE HUNDRED TWENTY-THREE MILLION FOUR HUNDRED FIFTY-SIX THOUSAND SEVEN HUNDRED
             EIGHTY-NINE
 200,000,000 TWO HUNDRED  MILLION
 777,777,007 SEVEN HUNDRED SEVENTY-SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND SEVEN
 777,777,011 SEVEN HUNDRED SEVENTY-SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND ELEVEN
 777,777,777 SEVEN HUNDRED SEVENTY-SEVEN MILLION SEVEN HUNDRED SEVENTY-SEVEN THOUSAND SEVEN
             HUNDRED SEVENTY-SEVEN
(null)       NOT ASSIGNED

Gotchas

Extra Spaces

FFSP generates extra spaces between some words, like the value

TWO HUNDRED  MILLION

in the results of the preceeding query. If this is a problem the result can be corrected with a little extra logic. Just use REPLACE on the transformed value.

select
  replace
  ( cast
    ( to_char( to_timestamp( lpad(i,9,'0'), 'FF9' ), 'FFSP' )
      as varchar2(100)
    )
  , '  '
  , ' '
  ) as words
from n2w
where i = 200000000 ;

WORDS
--------------------------------------------------------------------------------
TWO HUNDRED MILLION

JSP Approach

Before I published my FFSP technique here in June 2007 the de facto solution for transforming numbers into words was the JSP approach, which operates on Julian calendar dates. Beware of the JSP approach. It has some serious limitations that are not immediately obvious when you first use it. This is how it typically works

select
  to_char(i,'999,999,999') numbers ,
  to_char( to_date( i, 'J' ), 'JSP' ) as words
from n2w
where test_group = 'A' ;

NUMBERS      WORDS
------------ --------------------------------------------------------------------------------
(null)       (null)
           1 ONE
          12 TWELVE
         123 ONE HUNDRED TWENTY-THREE
       1,234 ONE THOUSAND TWO HUNDRED THIRTY-FOUR
      12,345 TWELVE THOUSAND THREE HUNDRED FORTY-FIVE
     123,456 ONE HUNDRED TWENTY-THREE THOUSAND FOUR HUNDRED FIFTY-SIX
   1,234,567 ONE MILLION TWO HUNDRED THIRTY-FOUR THOUSAND FIVE HUNDRED SIXTY-SEVEN
   1,721,057 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FIFTY-SEVEN
   1,721,424 ONE MILLION SEVEN HUNDRED TWENTY-ONE THOUSAND FOUR HUNDRED TWENTY-FOUR
   5,373,484 FIVE MILLION THREE HUNDRED SEVENTY-THREE THOUSAND FOUR HUNDRED EIGHTY-FOUR

and these are the limitations. First, the technique does not work with the number zero.

select to_char( to_date( 0, 'J' ), 'JSP' ) as words
from dual ;
select to_char( to_date( 0, 'J' ), 'JSP' ) as words
                         *
ERROR at line 1:
ORA-01854: julian date must be between 1 and 5373484

Second, it does not support numbers greater than 5,373,484, which is the date 9999-12-31 in Julian days.

select to_char( to_date( 5373484 + 1, 'J' ), 'JSP' ) as words
from dual ;
select to_char( to_date( 5373484 + 1, 'J' ), 'JSP' ) as words
                                 *
ERROR at line 1:
ORA-01854: julian date must be between 1 and 5373484

Third, since the Julian calendar does not have a year zero in it the technique fails for numbers between 1,721,058 (which would be 0000-01-01) and 1,721,423 (which would be 0000-12-31).

select to_char( to_date( 1721058, 'J' ), 'JSP' ) as words
from dual ;
select to_char( to_date( 1721058, 'J' ), 'JSP' ) as words
                         *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

select to_char( to_date( 1721423, 'J' ), 'JSP' ) as words
from dual ;
select to_char( to_date( 1721423, 'J' ), 'JSP' ) as words
                         *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

Columns to Rows

UNPIVOT (11g)

Transforming columns into rows is easily accomplished with the SELECT command's UNPIVOT clause, introduced in Oracle 11g. (For database versions earlier than 11g see the remaining topics in this section for alternative techniques.)

Consider this data table.

select key, c1, c2, c3 from t order by key ;

KEY C1 C2 C3

--- ----- ----- -----

r1 v1 v2 v3

r2 v1 v2

r3 v1 v3

r4 v1

r5 v2 v3

r6 v2

r7 v3

We can convert each row of C1, C2, and C3 values into three rows of values under a single column as follows.

break on key skip 1 duplicates

select key, source, val

from

t UNPIVOT INCLUDE NULLS

( VAL FOR( SOURCE ) IN

( C1 AS 'C1',

C2 AS 'C2',

C3 AS 'C3'

)

order by key, source ;

KEY SOURCE VAL

--- ------ -----

r1 C1 v1

r1 C2 v2

r1 C3 v3

r2 C1 v1

r2 C2 v2

r2 C3

r3 C1 v1

r3 C2

r3 C3 v3

r4 C1 v1

r4 C2

r4 C3

r5 C1

r5 C2 v2

r5 C3 v3

r6 C1

r6 C2 v2

r6 C3

r7 C1

r7 C2

r7 C3 v3

r8 C1

r8 C2

r8 C3

If we do not need the rows with NULL values from C1, C2, or C3 we can use an EXCLUDE NULLS clause to filter them out.

select key, source, val

from

t unpivot EXCLUDE NULLS

( val for( source ) in

( c1 as 'C1',

c2 as 'C2',

c3 as 'C3'

)

order by key, source ;

KEY SOURCE VAL

--- ------ -----

r1 C1 v1

r1 C2 v2

r1 C3 v3

r2 C1 v1

r2 C2 v2

r3 C1 v1

r3 C3 v3

r4 C1 v1

r5 C2 v2

r5 C3 v3

r6 C2 v2

r7 C3 v3

If no INCLUDE/EXCLUDE NULLS clause is specified the default behaviour is EXCLUDE NULLS.

select key, source, val

from

t UNPIVOT

( val for( source ) in

( c1 as 'C1',

c2 as 'C2',

c3 as 'C3'

)

order by key, source ;

KEY SOURCE VAL

--- ------ -----

r1 C1 v1

r1 C2 v2

r1 C3 v3

r2 C1 v1

r2 C2 v2

r3 C1 v1

r3 C3 v3

r4 C1 v1

r5 C2 v2

r5 C3 v3

r6 C2 v2

r7 C3 v3

Gotchas

If the value column datatypes are not all in the same group, e.g. numeric, character, date, an error will be raised.

select key, source, val

from

t unpivot

( val for( source ) in

( n1 as 'N1', -- N1 is a NUMBER column

c2 as 'C2', -- C2 is a VARCHAR2 column

c3 as 'C3' -- C3 is a VARCHAR2 column

)

order by key, source ;

c2 as 'C2', -- C2 is a VARCHAR2 column

ERROR at line 6:

ORA-01790: expression must have same datatype as

corresponding expression

UNION ALL Method

In Oracle 10g or earlier one technique for transforming columns into rows uses the UNION ALL set operator. (For Oracle 11g see UNPIVOT (11g).)
Consider this data table.

select key, c1, c2, c3 from t order by key ;

KEY C1    C2    C3

--- ----- ----- -----

r1  v1    v2    v3

r2  v1    v2

r3  v1          v3

r4  v1

r5        v2    v3

r6        v2

r7              v3

r8

We can convert each row of C1, C2, and C3 values into three rows of values under a single column as follows.

break on key skip 1 duplicates

select key, 'C1' as source, c1 as val from t union all

select key, 'C2' as source, c2 as val from t union all

select key, 'C3' as source, c3 as val from t

order by key, source ;

KEY SOURCE VAL

--- ------ -----

r1  C1     v1

r1  C2     v2

r1  C3     v3

r2  C1     v1

r2  C2     v2

r2  C3

r3  C1     v1

r3  C2

r3  C3     v3

r4  C1     v1

r4  C2

r4  C3

r5  C1

r5  C2     v2

r5  C3     v3

r6  C1

r6  C2     v2

r6  C3

r7  C1

r7  C2

r7  C3     v3

r8  C1

r8  C2

r8  C3

If we do not need the rows with NULL values from C1, C2, or C3 we can use WHERE clauses to filter them out.

select key, 'C1' as source, c1 as val from t where c1 is not null union all

select key, 'C2' as source, c2 as val from t where c2 is not null union all

select key, 'C3' as source, c3 as val from t where c3 is not null

order by key, source ;

KEY SOURCE VAL

--- ------ -----

r1  C1     v1

r1  C2     v2

r1  C3     v3

r2  C1     v1

r2  C2     v2

r3  C1     v1

r3  C3     v3

r4  C1     v1

r5  C2     v2

r5  C3     v3

r6  C2     v2

r7  C3     v3

MODEL Method

In Oracle 10g another technique for transforming columns into rows uses the SELECT command's MODEL clause (not available prior to 10g). If you are unfamiliar with the MODEL clause review the SQL Features Tutorials: MODEL Clause tutorial before proceeding. (For Oracle 11g see UNPIVOT (11g). For versions prior to 10g see UNION ALL Method or Collection Method)
Consider this data table.

select key, c1, c2, c3 from t order by key ;

KEY C1    C2    C3

--- ----- ----- -----

r1  v1    v2    v3

r2  v1    v2

r3  v1          v3

r4  v1

r5        v2    v3

r6        v2

r7              v3

r8

We can convert each row of C1, C2, and C3 values into three rows of values under a single column as follows.

break on key duplicates skip 1

select key, source, val

from   t

model

  return updated rows

  partition by ( key )

  dimension by ( 0 as i )

  measures     ( 'xx' as source, 'xxxxx' as val, c1, c2, c3 )

  rules upsert all

    source[ 1 ] = 'C1' ,

    source[ 2 ] = 'C2' ,

    source[ 3 ] = 'C3' ,

    val[ 1 ] = c1[ 0 ] ,

    val[ 2 ] = c2[ 0 ] ,

    val[ 3 ] = c3[ 0 ]

order by key, source ;

KEY SOURCE VAL

--- ------ -----

r1  C1     v1

r1  C2     v2

r1  C3     v3

r2  C1     v1

r2  C2     v2

r2  C3

r3  C1     v1

r3  C2

r3  C3     v3

r4  C1     v1

r4  C2

r4  C3

r5  C1

r5  C2     v2

r5  C3     v3

r6  C1

r6  C2     v2

r6  C3

r7  C1

r7  C2

r7  C3     v3

r8  C1

r8  C2

r8  C3

Understanding How it Works

To understand how the solution works we will start with a simple MODEL query and build towards the final solution one step at a time.

clear breaks

column I format 99

select key, I, SOURCE, VAL, c1, c2, c3

from   t

where  key = 'r1'

model

  partition by ( key )

  dimension by ( 0 AS I )

  measures     ( 'xx' AS SOURCE, 'xxxxx' AS VAL, c1, c2, c3 )

  rules        ()

KEY   I SOURCE VAL   C1    C2    C3

--- --- ------ ----- ----- ----- -----

r1    0 xx     xxxxx v1    v2    v3

select key, i, source, val, c1, c2, c3

from   t

where  key = 'r1'

model

  partition by ( key )

  dimension by ( 0 as i )

  measures     ( 'xx' as source, 'xxxxx' as val, c1, c2, c3 )

  RULES UPSERT ALL

    SOURCE[ 1 ] = 'C1' ,

    SOURCE[ 2 ] = 'C2' ,

    SOURCE[ 3 ] = 'C3'

ORDER BY I ;

KEY   I SOURCE VAL   C1    C2    C3

--- --- ------ ----- ----- ----- -----

r1    0 xx     xxxxx v1    v2    v3

r1    1 C1

r1    2 C2

r1    3 C3

select key, i, source, val, c1, c2, c3

from   t

where  key = 'r1'

model

  partition by ( key )

  dimension by ( 0 as i )

  measures     ( 'xx' as source, 'xxxxx' as val, c1, c2, c3 )

  rules upsert all

    source[ 1 ] = 'C1' ,

    source[ 2 ] = 'C2' ,

    source[ 3 ] = 'C3' ,

    VAL[ 1 ] = C1[ 0 ] ,

    VAL[ 2 ] = C2[ 0 ] ,

    VAL[ 3 ] = C3[ 0 ]

order by i ;

KEY   I SOURCE VAL   C1    C2    C3

--- --- ------ ----- ----- ----- -----

r1    0 xx     xxxxx v1    v2    v3

r1    1 C1     v1

r1    2 C2     v2

r1    3 C3     v3

select key, i, source, val

from   t

where  key = 'r1'

model

  RETURN UPDATED ROWS

  partition by ( key )

  dimension by ( 0 as i )

  measures     ( 'xx' as source, 'xxxxx' as val, c1, c2, c3 )

  rules upsert all

    source[ 1 ] = 'C1' ,

    source[ 2 ] = 'C2' ,

    source[ 3 ] = 'C3' ,

    val[ 1 ] = c1[ 0 ] ,

    val[ 2 ] = c2[ 0 ] ,

    val[ 3 ] = c3[ 0 ]

order by i ;

KEY   I SOURCE VAL

--- --- ------ -----

r1    1 C1     v1

r1    2 C2     v2

r1    3 C3     v3

Performance Comparison

The following tables show performance metrics for one run each of the three Column to Row techniques described in the preceeding tutorials.

· UNPIVOT (11g)

· UNION ALL Method

· Collection Method

· MODEL Method

Each run transforms 10 columns from an 800 row table into a 8,000 row result set. See the log file from these tests for more details.

The tests were performed using SQL*Plus against an Oracle 11g Enterprise Edition Release 11.1.0.6.0 database running on Oracle Enterprise Linux 5.

Statistics

The following table shows database statistics where values for one method differ by more than 100 from another method.

METRIC_NAME UNPIVOT UNION ALL Collection MODEL

------------------------------ ------------ ------------ ------------ ------------

Elapsed Time (1/100 sec) 30 32 32 43

table scan rows gotten 8,800 157,493 142,869 1,600

session uga memory max 123,452 123,452 123,452 246,904

bytes sent via SQL*Net to clie 127,123 118,454 118,719 127,935

session logical reads 135 755 641 72

consistent gets 83 703 588 20

consistent gets from cache 83 703 588 20

consistent gets from cache (fa 82 691 551 19

no work - consistent read gets 44 635 540 8

table scan blocks gotten 44 614 540 8

recursive calls 36 295 9 9

session uga memory 65,464 0 65,464 65,464

sorts (rows) 8,000 0 0 0

See Statistics Descriptions for a description of each metric.

Latch Gets

The following table shows total latch gets for each method.

     UNPIVOT    UNION ALL   Collection        MODEL

------------ ------------ ------------ ------------

       2,150        3,771        2,463        1,863

Techniques that use a small number of latches scale better than techniques that use a large number of latches
Warning: Results on your own systems with your own data will differ from these results. Results will even differ from one set of test runs to the next on the same machine. Run your own tests and average the results from multiple runs before making performance decisions.

Oracle

Tuesday, September 20, 2011

Strings to Rows 2

Strings to Rows 1

How It Works

Component 1

Component 2

Components 1 + 2

Component 3

Final Solution

Strings to Rows

String to Rows

Number to Words

Numbers to Words

For Numbers Between 0 and 9,999,999

For Numbers Between -999,999,999 and 999,999,999

Gotchas

Extra Spaces

JSP Approach

Columns to Rows

UNION ALL Method

MODEL Method

Understanding How it Works

Latch Gets

Oracle Escape Characters

Search This Blog