Oracle: Rows to Strings in Different Methods

In this section show how to collect values from multiple rows into a single, delimited string. This process is commonly known as "string aggregation". In practice it takes a table that looks like this

GROUP_KEY  VAL

---------- ----------

Group 1    a

Group 2    a

Group 2    b

Group 3    a

Group 3    b

Group 3    c

Group 4    a

Group 4    a

Group 4    b

Group 4    b

Group 5    a

Group 5    b

Group 5    d

Group 5    e

Group 5    (null)

Group 6    (null)

Group 7    (null)

Group 7    (null)

Group 8    a

Group 8    a

Group 8    b

Group 8    b

Group 8    (null)

Group 8    (null)

(null)     (null)

and yields a set of strings that look like this.

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    ,

Group 8    a,a,b,b,

(null)     (null)

Over the years many techniques for performing string aggregation have appeared. They all have their strengths and weaknesses so a few different methods will be examined here. Of the methods I've found on the web to-date we will explore four popular approaches.

Hierarchical Method

This shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY VAL

---------- ----------

Group 3 a

Group 3 b

Group 3 c

and yields a string like this.

GROUP_KEY STRING

---------- ---------------

Group 3 a,b,c

Oracle's hierarchical features will be used to achieve this result. Other tutorials in this section present alternate approaches. For tutorials on understanding Oracle's hierarchical features see SQL Features Tutorials: Hierarchical Data.

First, let's take a look at the test data.

desc t

Name Null? Type

----------------------------- -------- --------------------

GROUP_KEY VARCHAR2(10)

VAL VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY VAL

---------- ----------

Group 1 a

Group 2 a

Group 2 b

Group 3 a

Group 3 b

Group 3 c

Group 4 a

Group 4 b

Group 5 a

Group 5 b

Group 5 d

Group 5 e

Group 5 (null)

Group 6 (null)

Group 7 (null)

Group 8 a

Group 8 b

Group 8 (null)

(null) (null)

Now here is the solution.

clear breaks

select

tb.group_key ,

substr

( sys_connect_by_path( tb.val, ',' ) ,

) as string

from

( select

group_key ,

val ,

row_number() over

( partition by group_key

order by val

)

as val_index

from

) tb

where

connect_by_isleaf = 1

connect by

val_index = prior val_index + 1 and

group_key = prior group_key

start with

val_index = 1

;

GROUP_KEY STRING

---------- ---------------

Group 1 a

Group 2 a,b

Group 3 a,b,c

Group 4 a,a,b,b

Group 5 a,b,d,e,

Group 6 (null)

Group 7 ,

Group 8 a,a,b,b,,

(null) (null)

To control the order of the delimited values, change the "ORDER BY" clause in the ROW_NUMBER() function.

select

tb.group_key ,

substr

( sys_connect_by_path( tb.val, ',' ) ,

) as string

from

( select

group_key ,

val ,

row_number() over

( partition by group_key

ORDER BY VAL DESC -- sort in descending order this time

)

as val_index

from

) tb

where

connect_by_isleaf = 1

connect by

val_index = prior val_index + 1 and

group_key = prior group_key

start with

val_index = 1

;

GROUP_KEY STRING

---------- ---------------

Group 1 a

Group 2 b,a

Group 3 c,b,a

Group 4 b,b,a,a

Group 5 ,e,d,b,a

Group 6 (null)

Group 7 ,

Group 8 ,,b,b,a,a

(null) (null)

To filter out duplicate VAL values, like those in Group 4, use a subquery as indicated below.

select

tb.group_key ,

substr

( sys_connect_by_path( tb.val, ',' ) ,

) as string

from

( select

group_key ,

val ,

row_number() over

( partition by group_key

order by val

)

as val_index

from

( SELECT DISTINCT GROUP_KEY, VAL FROM T ) tc -- provides distinct values

) tb

where

connect_by_isleaf = 1

connect by

val_index = prior val_index + 1 and

group_key = prior group_key

start with

val_index = 1

;

GROUP_KEY STRING

---------- ---------------

Group 1 a

Group 2 a,b

Group 3 a,b,c

Group 4 a,b

Group 5 a,b,d,e,

Group 6 (null)

Group 7 (null)

Group 8 a,b,

(null) (null)

To filter null values out of your results, like the ones in Groups 5 and 6, include a WHERE clause in the FROM clause subquery.

select

tb.group_key ,

substr

( sys_connect_by_path( tb.val, ',' ) ,

) as string

from

( select

group_key ,

val ,

row_number() over

( partition by group_key

order by val

)

as val_index

from

WHERE

VAL IS NOT NULL -- filters out null VAL values

) tb

where

connect_by_isleaf = 1

connect by

val_index = prior val_index + 1 and

group_key = prior group_key

start with

val_index = 1

;

GROUP_KEY STRING

---------- ---------------

Group 1 a

Group 2 a,b

Group 3 a,b,c

Group 4 a,a,b,b

Group 5 a,b,d,e

Group 8 a,a,b,b

Object Method

This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL

---------- ----------

Group 3    a

Group 3    b

Group 3    c

and yields a string like this.

GROUP_KEY  STRING

---------- ---------------

Group 3    a,b,c

Oracle's object relational features will be used to achieve this result. Other tutorials in this section present alternate approaches.

Prerequisites

This method uses the object relational SQL COLLECT function which was introduced in Oracle 10g. It therefore will not work in earlier versions. It also requires a VARCHAR2 table type and a custom function to work. The VARCHAR2 table type is created in the Setup topic for this section.

desc varchar2_table_type

 varchar2_table_type TABLE OF VARCHAR2(10)

The function is created as follows.

create function nested_table_to_csv

( p_table     in varchar2_table_type ,

  p_delimiter in varchar2 default ','

return varchar2

is

  ----------------------------------------------------------------------

  -- Usage:

--

  -- p_table

  --   - contains the table you wish to convert to a string

--

  -- p_delimiter

  --   - contains the varchar2 string you want to see between each

  --     value in the returned string

  --   - default value is ',' (comma)

  --   - set it to null if you don't want any delimiters

  --   - e.g. p_delimiter => '/' returns

--

  --       a/b/c/d

  ----------------------------------------------------------------------

  i        binary_integer  := p_table.first ;

  v_return varchar2(32767) := null ;

begin

  while i is not null

  loop

    if i = p_table.first then

      v_return := p_table(i);

    else

      v_return := v_return || p_delimiter || p_table(i);

    end if;

    i := p_table.next(i) ;

  end loop;

  return( v_return );

end nested_table_to_csv;

show errors

No errors.

Test Data

desc t

 Name                          Null?    Type

 ----------------------------- -------- --------------------

 GROUP_KEY                              VARCHAR2(10)

 VAL                                    VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY  VAL

---------- ----------

Group 1    a

Group 2    a

Group 2    b

Group 3    a

Group 3    b

Group 3    c

Group 4    a

Group 4    a

Group 4    b

Group 4    b

Group 5    a

Group 5    b

Group 5    d

Group 5    e

Group 5    (null)

Group 6    (null)

Group 7    (null)

Group 7    (null)

Group 8    a

Group 8    a

Group 8    b

Group 8    b

Group 8    (null)

Group 8    (null)

(null)     (null)

The Solution

We can now run the solution, which uses the type and function we created above in conjunction with COLLECT .

clear breaks

select

  group_key ,

  NESTED_TABLE_TO_CSV

    ( cast( COLLECT(val) as VARCHAR2_TABLE_TYPE )

    ) as string

from

group by

  group_key

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e

Group 6    (null)

Group 7    (null)

Group 8    a,a,b,b

(null)     (null)

There are two important things to note about this solution. First, unlike other solutions we have seen so far, these results do not include empty values in the strings for Groups 5 and 8. As well, the string for "Group 7" only contains one null, not two. This is because COLLECT is an aggregate function and all aggregate functions, except COUNT(*) and GROUPING, ignore null values.
Second, since COLLECT returns a nested table which is, by definition, an unordered set of data we cannot guarantee the values in our delimited string will appear in any given order. If you need ordered values you can add sorting logic to the NESTED_TABLE_TO_CSV function.
To filter out duplicate values, like those in Groups 4 and 8, we simply add the DISTINCT argument to the COLLECT function.

select

  group_key ,

  nested_table_to_csv

    ( cast( collect( DISTINCT val ) as varchar2_table_type )

    ) as string

from

group by

  group_key

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b

Group 5    a,b,d,e

Group 6    (null)

Group 7    (null)

Group 8    a,b

(null)     (null)

Tom Kyte's STRAGG

This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL

---------- ----------

Group 3    a

Group 3    b

Group 3    c

and yields a string like this.

GROUP_KEY  STRING

---------- ---------------

Group 3    a,b,c

A technique originally published by Tom Kyte at Ask Tom "Function to concatenate output" employing a user-defined function called STRAGG will be demonstrated. Other tutorials in this section discuss alternate approaches.
While Tom Kyte himself says he no longer uses STRAGG (see The Tom Kyte Blog:Stringing them up...) some people still prefer it over other solutions. STRAGG is also a good case study for anyone interested in using the Extensibility Framework to write their own aggregate functions.

Prerequisites

create or replace type stragg_type as object

  string varchar2(4000),

  static function ODCIAggregateInitialize

    ( sctx in out stragg_type )

    return number ,

  member function ODCIAggregateIterate

    ( self  in out stragg_type ,

      value in     varchar2

    ) return number ,

  member function ODCIAggregateTerminate

    ( self        in  stragg_type,

      returnvalue out varchar2,

      flags in number

    ) return number ,

  member function ODCIAggregateMerge

    ( self in out stragg_type,

      ctx2 in     stragg_type

    ) return number

);

create or replace type body stragg_type

is

  static function ODCIAggregateInitialize

  ( sctx in out stragg_type )

  return number

is

  begin

    sctx := stragg_type( null ) ;

    return ODCIConst.Success ;

  end;

  member function ODCIAggregateIterate

  ( self  in out stragg_type ,

    value in     varchar2

  ) return number

is

  begin

    self.string := self.string || ',' || value ;

    return ODCIConst.Success;

  end;

  member function ODCIAggregateTerminate

  ( self        in  stragg_type ,

    returnvalue out varchar2 ,

    flags       in  number

  ) return number

is

  begin

    returnValue := ltrim( self.string, ',' );

    return ODCIConst.Success;

  end;

  member function ODCIAggregateMerge

  ( self in out stragg_type ,

    ctx2 in     stragg_type

  ) return number

is

  begin

    self.string := self.string || ctx2.string;

    return ODCIConst.Success;

  end;

end;

create or replace function stragg

  ( input varchar2 )

  return varchar2

  deterministic

  parallel_enable

  aggregate using stragg_type

Test Data

desc t

 Name                          Null?    Type

 ----------------------------- -------- --------------------

 GROUP_KEY                              VARCHAR2(10)

 VAL                                    VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY  VAL

---------- ----------

Group 1    a

Group 2    a

Group 2    b

Group 3    a

Group 3    b

Group 3    c

Group 4    a

Group 4    a

Group 4    b

Group 4    b

Group 5    a

Group 5    b

Group 5    d

Group 5    e

Group 5    (null)

Group 6    (null)

Group 7    (null)

Group 7    (null)

Group 8    a

Group 8    a

Group 8    b

Group 8    b

Group 8    (null)

Group 8    (null)

(null)     (null)

Aggregate Solution

select

  group_key ,

  STRAGG( VAL ) as string

from

group by

  group_key

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b,b,a

Group 5    a,b,d,e

Group 6    (null)

Group 7    (null)

Group 8    a,a,b,b

(null)     (null)

It is important to note that, unlike some other solutions we have seen, these results do not include empty values in the strings for Groups 5 and 8. As well, the string for "Group 7" only contains one null, not two. This is because STRAGG, and all other aggregate functions except COUNT(*) and GROUPING, ignore null values.
To only see distinct values in our result strings we can add the DISTINCT argument to the STRAGG function call. Note how the results for Groups 4 and 8 below differ from those above.

select

  group_key ,

  stragg( DISTINCT val ) as string

from

group by

  group_key

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b

Group 5    a,b,d,e

Group 6    (null)

Group 7    (null)

Group 8    a,b

(null)     (null)

Analytic Version

STRAGG can also be used as an analytical function without changing any of the underlying code.

break on group_key skip 1 duplicates

select

  group_key ,

  val ,

  stragg( val ) OVER

      PARTITION BY GROUP_KEY

      ORDER BY VAL

      RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

    as string

from

GROUP_KEY  VAL        STRING

---------- ---------- ---------------

Group 1    a          a

Group 2    a          a,b

Group 2    b          a,b

Group 3    a          a,b,c

Group 3    b          a,b,c

Group 3    c          a,b,c

Group 4    a          a,a,b,b

Group 4    a          a,a,b,b

Group 4    b          a,a,b,b

Group 4    b          a,a,b,b

Group 5    a          a,b,d,e

Group 5    b          a,b,d,e

Group 5    d          a,b,d,e

Group 5    e          a,b,d,e

Group 5    (null)     a,b,d,e

Group 6    (null)     (null)

Group 7    (null)     (null)

Group 7    (null)     (null)

Group 8    a          a,a,b,b

Group 8    a          a,a,b,b

Group 8    b          a,a,b,b

Group 8    b          a,a,b,b

Group 8    (null)     a,a,b,b

Group 8    (null)     a,a,b,b

(null)     (null)     (null)

To change the sort order of values in the string simply change the ORDER BY VAL clause.

break on group_key skip 1 duplicates

select

  group_key ,

  val ,

  stragg( val ) OVER

      partition by group_key

      order by val DESC

      range between unbounded preceding and unbounded following

    as string

from

GROUP_KEY  VAL        STRING

---------- ---------- ---------------

Group 1    a          a

Group 2    b          b,a

Group 2    a          b,a

Group 3    c          c,b,a

Group 3    b          c,b,a

Group 3    a          c,b,a

Group 4    b          b,b,a,a

Group 4    b          b,b,a,a

Group 4    a          b,b,a,a

Group 4    a          b,b,a,a

Group 5    (null)     e,d,b,a

Group 5    e          e,d,b,a

Group 5    d          e,d,b,a

Group 5    b          e,d,b,a

Group 5    a          e,d,b,a

Group 6    (null)     (null)

Group 7    (null)     (null)

Group 7    (null)     (null)

Group 8    (null)     b,b,a,a

Group 8    (null)     b,b,a,a

Group 8    b          b,b,a,a

Group 8    b          b,b,a,a

Group 8    a          b,b,a,a

Group 8    a          b,b,a,a

(null)     (null)     (null)

XML - CSV Method

This tutorial shows how to collect values from multiple rows into a single, CSV delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL

---------- ----------

Group 3    a

Group 3    b

Group 3    c

and yields a string like this one.

GROUP_KEY  STRING

---------- ---------------

Group 3    a,b,c

Oracle's XML features will be used to achieve this result.
In the XML Method tutorial we saw how to use the XMLAGG and XMLELEMENT functions to create an XML delimited string like this.

column string format a62

select

  group_key ,

  XMLAGG

    ( XMLELEMENT( "V", val )

    ).GETSTRINGVAL()

    as string

from

group by

  group_key

GROUP_KEY  STRING

---------- --------------------------------------------------------------

Group 1    <V>a</V>

Group 2    <V>a</V><V>b</V>

Group 3    <V>a</V><V>b</V><V>c</V>

Group 4    <V>a</V><V>b</V><V>b</V><V>a</V>

Group 5    <V>a</V><V>b</V><V></V><V>e</V><V>d</V>

Group 6    <V></V>

Group 7    <V></V><V></V>

Group 8    <V>a</V><V>b</V><V>b</V><V></V><V></V><V>a</V>

(null)     <V></V>

By adding calls to EXTRACT and LTRIM we can convert this solution into one that gives us comma separated values.

select

  group_key ,

  LTRIM

  ( EXTRACT

    ( xmlagg( xmlelement( "V", ',' || val ) ),

      '/V/text()'

) ,

','

  ) as string

from

group by

  group_key

GROUP_KEY  STRING

---------- --------------------------------------------------------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b,b,a

Group 5    a,b,,e,d

Group 6    (null)

Group 7    (null)

Group 8    a,b,b,,,a

(null)     (null)

To control the sort order of values within the string we can use XMLAGG's ORDER BY clause.

select

  group_key ,

  ltrim

  ( extract

    ( xmlagg

      ( xmlelement( "V", ',' || val )

        ORDER BY VAL DESC

),

      '/V/text()'

) ,

','

  ) as string

from

group by

  group_key

GROUP_KEY  STRING

---------- --------------------------------------------------------------

Group 1    a

Group 2    b,a

Group 3    c,b,a

Group 4    b,b,a,a

Group 5    e,d,b,a

Group 6    (null)

Group 7    (null)

Group 8    b,b,a,a

(null)     (null)

As with the original solution, you can use a FROM clause subquery to produce distinct values and filter out null values if you need to.

select

  group_key ,

  ltrim

  ( extract

    ( xmlagg

      ( xmlelement( "V", ',' || val )

        order by val

),

      '/V/text()'

) ,

','

  ) as string

from

  ( SELECT DISTINCT GROUP_KEY, VAL -- produces distinct values

    FROM   T

    WHERE  VAL IS NOT NULL         -- filters our null values

  ) tb

group by

  group_key

GROUP_KEY  STRING

---------- --------------------------------------------------------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b

Group 5    a,b,d,e

Group 8    a,b

MAX DECODE Method

This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL

---------- ----------

Group 3    a

Group 3    b

Group 3    c

and yields a string like this.

GROUP_KEY  STRING

---------- ---------------

Group 3    a,b,c

The solution to follow is a good choice when the maximum number of values is known and small. See other solutions in this section when this is not the case.

Test Data

desc t

 Name                          Null?    Type

 ----------------------------- -------- --------------------

 GROUP_KEY                              VARCHAR2(10)

 VAL                                    VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY  VAL

---------- ----------

Group 1    a

Group 2    a

Group 2    b

Group 3    a

Group 3    b

Group 3    c

Group 4    a

Group 4    a

Group 4    b

Group 4    b

Group 5    a

Group 5    b

Group 5    d

Group 5    e

Group 5    (null)

Group 6    (null)

Group 7    (null)

Group 7    (null)

Group 8    a

Group 8    a

Group 8    b

Group 8    b

Group 8    (null)

Group 8    (null)

(null)     (null)

Solution

In this solution, which uses the DECODE and MAX functions, we assume the total number of values ever collected will not exceed 6.

clear breaks

select

  group_key ,

  max( decode( val_number, 1 , val, null ) ) ||

  max( decode( val_number, 2 , ',' || val, null ) ) ||

  max( decode( val_number, 3 , ',' || val, null ) ) ||

  max( decode( val_number, 4 , ',' || val, null ) ) ||

  max( decode( val_number, 5 , ',' || val, null ) ) ||

  max( decode( val_number, 6 , ',' || val, null ) ) as string

from

  ( select

      group_key,

      row_number() over ( partition by group_key order by val ) as val_number ,

val

    from t

  ) ta

group by group_key

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    ,

Group 8    a,a,b,b,,

(null)     (null)

To control the order of the delimited values, change the ORDER BY clause in the ROW_NUMBER() argument.

select

  group_key ,

  max( decode( val_number, 1 , val, null ) ) ||

  max( decode( val_number, 2 , ',' || val, null ) ) ||

  max( decode( val_number, 3 , ',' || val, null ) ) ||

  max( decode( val_number, 4 , ',' || val, null ) ) ||

  max( decode( val_number, 5 , ',' || val, null ) ) ||

  max( decode( val_number, 6 , ',' || val, null ) ) as string

from

  ( select

      group_key,

      row_number() over ( partition by group_key ORDER BY VAL DESC ) as val_number ,

val

    from t

  ) t_a

group by group_key

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    b,a

Group 3    c,b,a

Group 4    b,b,a,a

Group 5    ,e,d,b,a

Group 6    (null)

Group 7    ,

Group 8    ,,b,b,a,a

(null)     (null)

To filter duplicate VAL values, like those in Groups 4, 7, and 8 from the end result we use an inline view in the FROM clause.

clear breaks

select

  group_key ,

  max( decode( val_number, 1 , val, null ) ) ||

  max( decode( val_number, 2 , ',' || val, null ) ) ||

  max( decode( val_number, 3 , ',' || val, null ) ) ||

  max( decode( val_number, 4 , ',' || val, null ) ) ||

  max( decode( val_number, 5 , ',' || val, null ) ) ||

  max( decode( val_number, 6 , ',' || val, null ) ) as string

from

  ( select

      group_key,

      row_number() over ( partition by group_key order by val ) as val_number ,

val

    from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) T_A

  ) t_b

group by group_key

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    (null)

Group 8    a,b,

(null)     (null)

To filter null VAL values, like the ones in Groups 5 and higher from the end result we can add a WHERE clause to the query.

select

  group_key ,

  max( decode( val_number, 1 , val, null ) ) ||

  max( decode( val_number, 2 , ',' || val, null ) ) ||

  max( decode( val_number, 3 , ',' || val, null ) ) ||

  max( decode( val_number, 4 , ',' || val, null ) ) ||

  max( decode( val_number, 5 , ',' || val, null ) ) ||

  max( decode( val_number, 6 , ',' || val, null ) ) as string

from

  ( select

      group_key,

      row_number() over ( partition by group_key order by val ) as val_number ,

val

    from t

    WHERE VAL IS NOT NULL

  ) ta

group by group_key

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e

Group 8    a,a,b,b

LEAD Method

This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL

---------- ----------

Group 3    a

Group 3    b

Group 3    c

and yields a string like this.

GROUP_KEY  STRING

---------- ---------------

Group 3    a,b,c

The solution to follow is a good choice when the maximum number of values is known and small. See other solutions in this section when this is not the case.

Test Data

desc t

 Name                          Null?    Type

 ----------------------------- -------- --------------------

 GROUP_KEY                              VARCHAR2(10)

 VAL                                    VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY  VAL

---------- ----------

Group 1    a

Group 2    a

Group 2    b

Group 3    a

Group 3    b

Group 3    c

Group 4    a

Group 4    a

Group 4    b

Group 4    b

Group 5    a

Group 5    b

Group 5    d

Group 5    e

Group 5    (null)

Group 6    (null)

Group 7    (null)

Group 7    (null)

Group 8    a

Group 8    a

Group 8    b

Group 8    b

Group 8    (null)

Group 8    (null)

(null)     (null)

Solution

In this solution, which uses the LEAD analytic function, we assume the total number of values ever collected will not exceed 6.

clear breaks

select group_key, string

from

    select

      group_key ,

      row_number() over ( partition by group_key order by val ) as position ,

      val ||

      lead( ','||val, 1 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 2 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 3 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 4 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 5 ) over ( partition by group_key order by val )

      as string

    from t

where position = 1

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    ,

Group 8    a,a,b,b,,

(null)     (null)

To control the order of the delimited values, change the ORDER BY clauses.

select group_key, string

from

    select

      group_key ,

      row_number() over ( partition by group_key ORDER BY VAL DESC) as position ,

      val ||

      lead( ','||val, 1 ) over ( partition by group_key ORDER BY VAL DESC) ||

      lead( ','||val, 2 ) over ( partition by group_key ORDER BY VAL DESC) ||

      lead( ','||val, 3 ) over ( partition by group_key ORDER BY VAL DESC) ||

      lead( ','||val, 4 ) over ( partition by group_key ORDER BY VAL DESC) ||

      lead( ','||val, 5 ) over ( partition by group_key ORDER BY VAL DESC)

      as string

    from t

where position = 1

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    b,a

Group 3    c,b,a

Group 4    b,b,a,a

Group 5    ,e,d,b,a

Group 6    (null)

Group 7    ,

Group 8    ,,b,b,a,a

(null)     (null)

To filter duplicate VAL values, like those in Groups 4, 7, and 8 from the end result we use an inline view in the FROM clause.

clear breaks

select group_key, string

from

    select

      group_key ,

      row_number() over ( partition by group_key order by val ) as position ,

      val ||

      lead( ','||val, 1 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 2 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 3 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 4 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 5 ) over ( partition by group_key order by val )

      as string

    from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) T_A

where position = 1

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    (null)

Group 8    a,b,

(null)     (null)

To filter null VAL values, like the ones in Groups 5 and higher from the end result we can add a WHERE clause to the query.

select group_key, string

from

    select

      group_key ,

      row_number() over ( partition by group_key order by val ) as position ,

      val ||

      lead( ','||val, 1 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 2 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 3 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 4 ) over ( partition by group_key order by val ) ||

      lead( ','||val, 5 ) over ( partition by group_key order by val )

      as string

    from t

    WHERE VAL IS NOT NULL

where position = 1

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e

Group 8    a,a,b,b

MODEL Method 1

This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these

GROUP_KEY  VAL

---------- ----------

Group 3    a

Group 3    b

Group 3    c

and yields a string like this.

GROUP_KEY  STRING

---------- ---------------

Group 3    a,b,c

While the solution to follow, which we will refer to as MODEL Method 1, is the simplest of the three MODEL solution presented in this section it unfortunately scales the worst when a large number of values is selected or very wide STRING values are produced (see MODEL Clause: MODEL Performance Tuning). MODEL Method 2 and MODEL Method 3 provide alternative MODEL techniques that scale better than MODEL Method 1, though they are more complex.

Test Data

desc t

 Name                          Null?    Type

 ----------------------------- -------- --------------------

 GROUP_KEY                              VARCHAR2(10)

 VAL                                    VARCHAR2(10)

set null "(null)"

break on group_key skip 1 duplicates

select * from t order by group_key, val ;

GROUP_KEY  VAL

---------- ----------

Group 1    a

Group 2    a

Group 2    b

Group 3    a

Group 3    b

Group 3    c

Group 4    a

Group 4    a

Group 4    b

Group 4    b

Group 5    a

Group 5    b

Group 5    d

Group 5    e

Group 5    (null)

Group 6    (null)

Group 7    (null)

Group 7    (null)

Group 8    a

Group 8    a

Group 8    b

Group 8    b

Group 8    (null)

Group 8    (null)

(null)     (null)

Solution

This solution uses the SQL MODEL clause which was introduced in Oracle 10g. If you are unfamiliar with MODEL you may wish to review the SQL Features Tutorials: MODEL Clause tutorial before proceeding.

clear breaks

select group_key, substr( string, 2 ) as string

from   t

model

  return updated rows

  partition by ( group_key )

  dimension by ( row_number() over (partition by group_key order by val) as position )

  measures     ( cast( val as varchar2(65) ) as string  ) -- Note 1

  rules

    upsert

    iterate( 6 ) -- Note 2

      until ( presentv(string[iteration_number+2],1,0) = 0 )

    ( string[0] = string[0] || ',' || string[iteration_number+1] )

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    ,

Group 8    a,a,b,b,,

(null)     (null)

Notes:

1. 65 = v * n + (n-1) where

o v = maximum length of VAL values (this is 10 in our test table)

o n = maximum number of rows with the same GROUP_KEY (6 in our test table)

2. Use any number here greater than or equal to n.

To control the order of the delimited values, change the ORDER BY clause in the arguments for ROW_NUMBER().

select group_key, substr( string, 2 ) as string

from   t

model

  return updated rows

  partition by ( group_key )

  dimension by ( row_number() over (partition by group_key ORDER BY VAL DESC) as position )

  measures     ( cast( val as varchar2(65) ) as string  )

  rules

    upsert

    iterate( 6 )

      until ( presentv(string[iteration_number+2],1,0) = 0 )

    ( string[0] = string[0] || ',' || string[iteration_number+1] )

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    b,a

Group 3    c,b,a

Group 4    b,b,a,a

Group 5    ,e,d,b,a

Group 6    (null)

Group 7    ,

Group 8    ,,b,b,a,a

(null)     (null)

To filter duplicate VAL values, like those in Groups 4, 7, and 8 from the end result we use an inline view in the FROM clause.

select group_key, substr( string, 2 ) as string

from   ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) T

model

  return updated rows

  partition by ( group_key )

  dimension by ( row_number() over (partition by group_key order by val) as position )

  measures     ( cast( val as varchar2(65) ) as string  )

  rules

    upsert

    iterate( 6 )

      until ( presentv(string[iteration_number+2],1,0) = 0 )

    ( string[0] = string[0] || ',' || string[iteration_number+1] )

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    (null)

Group 8    a,b,

(null)     (null)

To filter null VAL values, like the ones in Groups 5 and higher from the end result we can add a WHERE clause to the query.

select group_key, substr( string, 2 ) as string

from   t

WHERE  VAL IS NOT NULL

model

  return updated rows

  partition by ( group_key )

  dimension by ( row_number() over (partition by group_key order by val) as position )

  measures     ( cast( val as varchar2(65) ) as string  ) -- Note 1

  rules

    upsert

    iterate( 6 ) -- Note 2

      until ( presentv(string[iteration_number+2],1,0) = 0 )

    ( string[0] = string[0] || ',' || string[iteration_number+1] )

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e

Group 8    a,a,b,b

In the next tutorial, MODEL Method 1 - Explained, we look at how this technique works.

Gotchas

Distinct

If we attempt to produce distinct values in our strings using SELECT DISTINCT instead of an inline view we will be rewarded with incorrect results. This is because SELECT DISTINCT is evaluated after the MODEL clause is applied, not before.

select DISTINCT group_key, substr( string, 2 ) as string

from   t

model

  return updated rows

  partition by ( group_key )

  dimension by ( row_number() over (partition by group_key order by val) as position )

  measures     ( cast( val as varchar2(65) ) as string  )

  rules

    upsert

    iterate( 6 )

      until ( presentv(string[iteration_number+2],1,0) = 0 )

    ( string[0] = string[0] || ',' || string[iteration_number+1] )

order by group_key ;

GROUP_KEY  STRING

---------- ---------------

Group 1    a

Group 2    a,b

Group 3    a,b,c

Group 4    a,a,b,b

Group 5    a,b,d,e,

Group 6    (null)

Group 7    ,

Group 8    a,a,b,b,,

(null)     (null)

IS PRESENT

When I originally wrote this solution I tried to use the following UNTIL clause

iterate( 6 )

  until ( NOT( STRING[ITERATION_NUMBER+2] IS PRESENT ) = 0 )

Unfortunately it produced some rather nasty errors.

ERROR at line 1:

ORA-03113: end-of-file on communication channel

ERROR:

ORA-03114: not connected to ORACLE

ksedmp: internal or fatal error

ORA-07445: exception encountered: core dump

  [ACCESS_VIOLATION] [_qcss_process_expr+226] [PC:0x1E57EC6] [ADDR:0x18] [UNABLE_TO_READ] []

Current SQL statement for this session:

select group_key, string

from   t

...

----- Call Stack Trace -----

calling              call     entry                argument values in hex

location             type     point                (? means dubious value)

-------------------- -------- -------------------- ----------------------------

_qcss_process_expr+           00000000

_qcss_process_cond+  CALLrel  _qcss_process_expr+  8CA6754 8171468 841BB64

76                            0                    8CA6698

_qcss_process_sprea  CALLrel  _qcss_process_cond+  8CA6754 8171468 841BB58

dsheet+102                    0                    8CA6698

_qcssSemanticCbks+4  CALLrel  _qcss_process_sprea  8CA6754 8171468 841A5E0

5                             dsheet+0

Oracle

Tuesday, September 20, 2011

Rows to Strings in Different Methods

Object Method

Prerequisites

Test Data

The Solution

Tom Kyte's STRAGG

Prerequisites

Test Data

Aggregate Solution

Analytic Version

XML - CSV Method

MAX DECODE Method

Test Data

Solution

LEAD Method

Test Data

Solution

MODEL Method 1

Test Data

Solution

Gotchas

Distinct

IS PRESENT

No comments:

Post a Comment

Oracle Escape Characters

Search This Blog