In this section show how to collect values from multiple rows into a single, delimited string. This process is commonly known as "string aggregation". In practice it takes a table that looks like this
Second, since COLLECT returns a nested table which is, by definition, an unordered set of data we cannot guarantee the values in our delimited string will appear in any given order. If you need ordered values you can add sorting logic to the NESTED_TABLE_TO_CSV function.
To filter out duplicate values, like those in Groups 4 and 8, we simply add the DISTINCT argument to the COLLECT function.
While Tom Kyte himself says he no longer uses STRAGG (see The Tom Kyte Blog:Stringing them up...) some people still prefer it over other solutions. STRAGG is also a good case study for anyone interested in using the Extensibility Framework to write their own aggregate functions.
To only see distinct values in our result strings we can add the DISTINCT argument to the STRAGG function call. Note how the results for Groups 4 and 8 below differ from those above.
In the XML Method tutorial we saw how to use the XMLAGG and XMLELEMENT functions to create an XML delimited string like this.
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
and yields a set of strings that look like this. GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 ,
Group 8 a,a,b,b,
(null) (null)
Over the years many techniques for performing string aggregation have appeared. They all have their strengths and weaknesses so a few different methods will be examined here. Of the methods I've found on the web to-date we will explore four popular approaches. Hierarchical Method
This shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like these
GROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this.
GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
Oracle's hierarchical features will be used to achieve this result. Other tutorials in this section present alternate approaches. For tutorials on understanding Oracle's hierarchical features see SQL Features Tutorials: Hierarchical Data.
First, let's take a look at the test data.
desc t
Name Null? Type
----------------------------- -------- --------------------
GROUP_KEY VARCHAR2(10)
VAL VARCHAR2(10)
set null "(null)"
break on group_key skip 1 duplicates
select * from t order by group_key, val ;
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
Now here is the solution.
clear breaks
select
tb.group_key ,
substr
( sys_connect_by_path( tb.val, ',' ) ,
2
) as string
from
( select
group_key ,
val ,
row_number() over
( partition by group_key
order by val
)
as val_index
from
t
) tb
where
connect_by_isleaf = 1
connect by
val_index = prior val_index + 1 and
group_key = prior group_key
start with
val_index = 1
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 ,
Group 8 a,a,b,b,,
(null) (null)
To control the order of the delimited values, change the "ORDER BY" clause in the ROW_NUMBER() function.
select
tb.group_key ,
substr
( sys_connect_by_path( tb.val, ',' ) ,
2
) as string
from
( select
group_key ,
val ,
row_number() over
( partition by group_key
ORDER BY VAL DESC -- sort in descending order this time
)
as val_index
from
t
) tb
where
connect_by_isleaf = 1
connect by
val_index = prior val_index + 1 and
group_key = prior group_key
start with
val_index = 1
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 b,a
Group 3 c,b,a
Group 4 b,b,a,a
Group 5 ,e,d,b,a
Group 6 (null)
Group 7 ,
Group 8 ,,b,b,a,a
(null) (null)
To filter out duplicate VAL values, like those in Group 4, use a subquery as indicated below.
select
tb.group_key ,
substr
( sys_connect_by_path( tb.val, ',' ) ,
2
) as string
from
( select
group_key ,
val ,
row_number() over
( partition by group_key
order by val
)
as val_index
from
( SELECT DISTINCT GROUP_KEY, VAL FROM T ) tc -- provides distinct values
) tb
where
connect_by_isleaf = 1
connect by
val_index = prior val_index + 1 and
group_key = prior group_key
start with
val_index = 1
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 (null)
Group 8 a,b,
(null) (null)
To filter null values out of your results, like the ones in Groups 5 and 6, include a WHERE clause in the FROM clause subquery.
select
tb.group_key ,
substr
( sys_connect_by_path( tb.val, ',' ) ,
2
) as string
from
( select
group_key ,
val ,
row_number() over
( partition by group_key
order by val
)
as val_index
from
t
WHERE
VAL IS NOT NULL -- filters out null VAL values
) tb
where
connect_by_isleaf = 1
connect by
val_index = prior val_index + 1 and
group_key = prior group_key
start with
val_index = 1
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e
Group 8 a,a,b,b
Object Method
This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like theseGROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this. GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
Oracle's object relational features will be used to achieve this result. Other tutorials in this section present alternate approaches. Prerequisites
This method uses the object relational SQL COLLECT function which was introduced in Oracle 10g. It therefore will not work in earlier versions. It also requires a VARCHAR2 table type and a custom function to work. The VARCHAR2 table type is created in the Setup topic for this section.desc varchar2_table_type
varchar2_table_type TABLE OF VARCHAR2(10)
The function is created as follows. create function nested_table_to_csv
( p_table in varchar2_table_type ,
p_delimiter in varchar2 default ','
)
return varchar2
is
----------------------------------------------------------------------
-- Usage:
--
-- p_table
-- - contains the table you wish to convert to a string
--
-- p_delimiter
-- - contains the varchar2 string you want to see between each
-- value in the returned string
-- - default value is ',' (comma)
-- - set it to null if you don't want any delimiters
-- - e.g. p_delimiter => '/' returns
--
-- a/b/c/d
----------------------------------------------------------------------
i binary_integer := p_table.first ;
v_return varchar2(32767) := null ;
begin
while i is not null
loop
if i = p_table.first then
v_return := p_table(i);
else
v_return := v_return || p_delimiter || p_table(i);
end if;
i := p_table.next(i) ;
end loop;
return( v_return );
end nested_table_to_csv;
/
show errors
No errors.
Test Data
desc t
Name Null? Type
----------------------------- -------- --------------------
GROUP_KEY VARCHAR2(10)
VAL VARCHAR2(10)
set null "(null)"
break on group_key skip 1 duplicates
select * from t order by group_key, val ;
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
The Solution
We can now run the solution, which uses the type and function we created above in conjunction with COLLECT .clear breaks
select
group_key ,
NESTED_TABLE_TO_CSV
( cast( COLLECT(val) as VARCHAR2_TABLE_TYPE )
) as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e
Group 6 (null)
Group 7 (null)
Group 8 a,a,b,b
(null) (null)
There are two important things to note about this solution. First, unlike other solutions we have seen so far, these results do not include empty values in the strings for Groups 5 and 8. As well, the string for "Group 7" only contains one null, not two. This is because COLLECT is an aggregate function and all aggregate functions, except COUNT(*) and GROUPING, ignore null values. Second, since COLLECT returns a nested table which is, by definition, an unordered set of data we cannot guarantee the values in our delimited string will appear in any given order. If you need ordered values you can add sorting logic to the NESTED_TABLE_TO_CSV function.
To filter out duplicate values, like those in Groups 4 and 8, we simply add the DISTINCT argument to the COLLECT function.
select
group_key ,
nested_table_to_csv
( cast( collect( DISTINCT val ) as varchar2_table_type )
) as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e
Group 6 (null)
Group 7 (null)
Group 8 a,b
(null) (null)
Tom Kyte's STRAGG
This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like theseGROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this. GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
A technique originally published by Tom Kyte at Ask Tom "Function to concatenate output" employing a user-defined function called STRAGG will be demonstrated. Other tutorials in this section discuss alternate approaches. While Tom Kyte himself says he no longer uses STRAGG (see The Tom Kyte Blog:Stringing them up...) some people still prefer it over other solutions. STRAGG is also a good case study for anyone interested in using the Extensibility Framework to write their own aggregate functions.
Prerequisites
create or replace type stragg_type as object
(
string varchar2(4000),
static function ODCIAggregateInitialize
( sctx in out stragg_type )
return number ,
member function ODCIAggregateIterate
( self in out stragg_type ,
value in varchar2
) return number ,
member function ODCIAggregateTerminate
( self in stragg_type,
returnvalue out varchar2,
flags in number
) return number ,
member function ODCIAggregateMerge
( self in out stragg_type,
ctx2 in stragg_type
) return number
);
/
create or replace type body stragg_type
is
static function ODCIAggregateInitialize
( sctx in out stragg_type )
return number
is
begin
sctx := stragg_type( null ) ;
return ODCIConst.Success ;
end;
member function ODCIAggregateIterate
( self in out stragg_type ,
value in varchar2
) return number
is
begin
self.string := self.string || ',' || value ;
return ODCIConst.Success;
end;
member function ODCIAggregateTerminate
( self in stragg_type ,
returnvalue out varchar2 ,
flags in number
) return number
is
begin
returnValue := ltrim( self.string, ',' );
return ODCIConst.Success;
end;
member function ODCIAggregateMerge
( self in out stragg_type ,
ctx2 in stragg_type
) return number
is
begin
self.string := self.string || ctx2.string;
return ODCIConst.Success;
end;
end;
/
create or replace function stragg
( input varchar2 )
return varchar2
deterministic
parallel_enable
aggregate using stragg_type
;
/
Test Data
desc t
Name Null? Type
----------------------------- -------- --------------------
GROUP_KEY VARCHAR2(10)
VAL VARCHAR2(10)
set null "(null)"
break on group_key skip 1 duplicates
select * from t order by group_key, val ;
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
Aggregate Solution
select
group_key ,
STRAGG( VAL ) as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b,b,a
Group 5 a,b,d,e
Group 6 (null)
Group 7 (null)
Group 8 a,a,b,b
(null) (null)
It is important to note that, unlike some other solutions we have seen, these results do not include empty values in the strings for Groups 5 and 8. As well, the string for "Group 7" only contains one null, not two. This is because STRAGG, and all other aggregate functions except COUNT(*) and GROUPING, ignore null values. To only see distinct values in our result strings we can add the DISTINCT argument to the STRAGG function call. Note how the results for Groups 4 and 8 below differ from those above.
select
group_key ,
stragg( DISTINCT val ) as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e
Group 6 (null)
Group 7 (null)
Group 8 a,b
(null) (null)
Analytic Version
STRAGG can also be used as an analytical function without changing any of the underlying code.break on group_key skip 1 duplicates
select
group_key ,
val ,
stragg( val ) OVER
(
PARTITION BY GROUP_KEY
ORDER BY VAL
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
as string
from
t
;
GROUP_KEY VAL STRING
---------- ---------- ---------------
Group 1 a a
Group 2 a a,b
Group 2 b a,b
Group 3 a a,b,c
Group 3 b a,b,c
Group 3 c a,b,c
Group 4 a a,a,b,b
Group 4 a a,a,b,b
Group 4 b a,a,b,b
Group 4 b a,a,b,b
Group 5 a a,b,d,e
Group 5 b a,b,d,e
Group 5 d a,b,d,e
Group 5 e a,b,d,e
Group 5 (null) a,b,d,e
Group 6 (null) (null)
Group 7 (null) (null)
Group 7 (null) (null)
Group 8 a a,a,b,b
Group 8 a a,a,b,b
Group 8 b a,a,b,b
Group 8 b a,a,b,b
Group 8 (null) a,a,b,b
Group 8 (null) a,a,b,b
(null) (null) (null)
To change the sort order of values in the string simply change the
ORDER BY VAL
clause. break on group_key skip 1 duplicates
select
group_key ,
val ,
stragg( val ) OVER
(
partition by group_key
order by val DESC
range between unbounded preceding and unbounded following
)
as string
from
t
;
GROUP_KEY VAL STRING
---------- ---------- ---------------
Group 1 a a
Group 2 b b,a
Group 2 a b,a
Group 3 c c,b,a
Group 3 b c,b,a
Group 3 a c,b,a
Group 4 b b,b,a,a
Group 4 b b,b,a,a
Group 4 a b,b,a,a
Group 4 a b,b,a,a
Group 5 (null) e,d,b,a
Group 5 e e,d,b,a
Group 5 d e,d,b,a
Group 5 b e,d,b,a
Group 5 a e,d,b,a
Group 6 (null) (null)
Group 7 (null) (null)
Group 7 (null) (null)
Group 8 (null) b,b,a,a
Group 8 (null) b,b,a,a
Group 8 b b,b,a,a
Group 8 b b,b,a,a
Group 8 a b,b,a,a
Group 8 a b,b,a,a
(null) (null) (null)
XML - CSV Method
This tutorial shows how to collect values from multiple rows into a single, CSV delimited string. This is also known as "string aggregation". It takes values like theseGROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this one. GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
Oracle's XML features will be used to achieve this result. In the XML Method tutorial we saw how to use the XMLAGG and XMLELEMENT functions to create an XML delimited string like this.
column string format a62
select
group_key ,
XMLAGG
( XMLELEMENT( "V", val )
).GETSTRINGVAL()
as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- --------------------------------------------------------------
Group 1 <V>a</V>
Group 2 <V>a</V><V>b</V>
Group 3 <V>a</V><V>b</V><V>c</V>
Group 4 <V>a</V><V>b</V><V>b</V><V>a</V>
Group 5 <V>a</V><V>b</V><V></V><V>e</V><V>d</V>
Group 6 <V></V>
Group 7 <V></V><V></V>
Group 8 <V>a</V><V>b</V><V>b</V><V></V><V></V><V>a</V>
(null) <V></V>
By adding calls to EXTRACT and LTRIM we can convert this solution into one that gives us comma separated values. select
group_key ,
LTRIM
( EXTRACT
( xmlagg( xmlelement( "V", ',' || val ) ),
'/V/text()'
) ,
','
) as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- --------------------------------------------------------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b,b,a
Group 5 a,b,,e,d
Group 6 (null)
Group 7 (null)
Group 8 a,b,b,,,a
(null) (null)
To control the sort order of values within the string we can use XMLAGG's ORDER BY clause. select
group_key ,
ltrim
( extract
( xmlagg
( xmlelement( "V", ',' || val )
ORDER BY VAL DESC
),
'/V/text()'
) ,
','
) as string
from
t
group by
group_key
;
GROUP_KEY STRING
---------- --------------------------------------------------------------
Group 1 a
Group 2 b,a
Group 3 c,b,a
Group 4 b,b,a,a
Group 5 e,d,b,a
Group 6 (null)
Group 7 (null)
Group 8 b,b,a,a
(null) (null)
As with the original solution, you can use a FROM clause subquery to produce distinct values and filter out null values if you need to. select
group_key ,
ltrim
( extract
( xmlagg
( xmlelement( "V", ',' || val )
order by val
),
'/V/text()'
) ,
','
) as string
from
( SELECT DISTINCT GROUP_KEY, VAL -- produces distinct values
FROM T
WHERE VAL IS NOT NULL -- filters our null values
) tb
group by
group_key
;
GROUP_KEY STRING
---------- --------------------------------------------------------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e
Group 8 a,b
MAX DECODE Method
This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like theseGROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this. GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
The solution to follow is a good choice when the maximum number of values is known and small. See other solutions in this section when this is not the case. Test Data
desc t
Name Null? Type
----------------------------- -------- --------------------
GROUP_KEY VARCHAR2(10)
VAL VARCHAR2(10)
set null "(null)"
break on group_key skip 1 duplicates
select * from t order by group_key, val ;
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
Solution
In this solution, which uses the DECODE and MAX functions, we assume the total number of values ever collected will not exceed 6.clear breaks
select
group_key ,
max( decode( val_number, 1 , val, null ) ) ||
max( decode( val_number, 2 , ',' || val, null ) ) ||
max( decode( val_number, 3 , ',' || val, null ) ) ||
max( decode( val_number, 4 , ',' || val, null ) ) ||
max( decode( val_number, 5 , ',' || val, null ) ) ||
max( decode( val_number, 6 , ',' || val, null ) ) as string
from
( select
group_key,
row_number() over ( partition by group_key order by val ) as val_number ,
val
from t
) ta
group by group_key
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 ,
Group 8 a,a,b,b,,
(null) (null)
To control the order of the delimited values, change the ORDER BY clause in the ROW_NUMBER() argument. select
group_key ,
max( decode( val_number, 1 , val, null ) ) ||
max( decode( val_number, 2 , ',' || val, null ) ) ||
max( decode( val_number, 3 , ',' || val, null ) ) ||
max( decode( val_number, 4 , ',' || val, null ) ) ||
max( decode( val_number, 5 , ',' || val, null ) ) ||
max( decode( val_number, 6 , ',' || val, null ) ) as string
from
( select
group_key,
row_number() over ( partition by group_key ORDER BY VAL DESC ) as val_number ,
val
from t
) t_a
group by group_key
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 b,a
Group 3 c,b,a
Group 4 b,b,a,a
Group 5 ,e,d,b,a
Group 6 (null)
Group 7 ,
Group 8 ,,b,b,a,a
(null) (null)
To filter duplicate VAL values, like those in Groups 4, 7, and 8 from the end result we use an inline view in the FROM clause. clear breaks
select
group_key ,
max( decode( val_number, 1 , val, null ) ) ||
max( decode( val_number, 2 , ',' || val, null ) ) ||
max( decode( val_number, 3 , ',' || val, null ) ) ||
max( decode( val_number, 4 , ',' || val, null ) ) ||
max( decode( val_number, 5 , ',' || val, null ) ) ||
max( decode( val_number, 6 , ',' || val, null ) ) as string
from
( select
group_key,
row_number() over ( partition by group_key order by val ) as val_number ,
val
from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) T_A
) t_b
group by group_key
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 (null)
Group 8 a,b,
(null) (null)
To filter null VAL values, like the ones in Groups 5 and higher from the end result we can add a WHERE clause to the query. select
group_key ,
max( decode( val_number, 1 , val, null ) ) ||
max( decode( val_number, 2 , ',' || val, null ) ) ||
max( decode( val_number, 3 , ',' || val, null ) ) ||
max( decode( val_number, 4 , ',' || val, null ) ) ||
max( decode( val_number, 5 , ',' || val, null ) ) ||
max( decode( val_number, 6 , ',' || val, null ) ) as string
from
( select
group_key,
row_number() over ( partition by group_key order by val ) as val_number ,
val
from t
WHERE VAL IS NOT NULL
) ta
group by group_key
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e
Group 8 a,a,b,b
LEAD Method
This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like theseGROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this. GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
The solution to follow is a good choice when the maximum number of values is known and small. See other solutions in this section when this is not the case. Test Data
desc t
Name Null? Type
----------------------------- -------- --------------------
GROUP_KEY VARCHAR2(10)
VAL VARCHAR2(10)
set null "(null)"
break on group_key skip 1 duplicates
select * from t order by group_key, val ;
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
Solution
In this solution, which uses the LEAD analytic function, we assume the total number of values ever collected will not exceed 6.clear breaks
select group_key, string
from
(
select
group_key ,
row_number() over ( partition by group_key order by val ) as position ,
val ||
lead( ','||val, 1 ) over ( partition by group_key order by val ) ||
lead( ','||val, 2 ) over ( partition by group_key order by val ) ||
lead( ','||val, 3 ) over ( partition by group_key order by val ) ||
lead( ','||val, 4 ) over ( partition by group_key order by val ) ||
lead( ','||val, 5 ) over ( partition by group_key order by val )
as string
from t
)
where position = 1
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 ,
Group 8 a,a,b,b,,
(null) (null)
To control the order of the delimited values, change the ORDER BY clauses. select group_key, string
from
(
select
group_key ,
row_number() over ( partition by group_key ORDER BY VAL DESC) as position ,
val ||
lead( ','||val, 1 ) over ( partition by group_key ORDER BY VAL DESC) ||
lead( ','||val, 2 ) over ( partition by group_key ORDER BY VAL DESC) ||
lead( ','||val, 3 ) over ( partition by group_key ORDER BY VAL DESC) ||
lead( ','||val, 4 ) over ( partition by group_key ORDER BY VAL DESC) ||
lead( ','||val, 5 ) over ( partition by group_key ORDER BY VAL DESC)
as string
from t
)
where position = 1
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 b,a
Group 3 c,b,a
Group 4 b,b,a,a
Group 5 ,e,d,b,a
Group 6 (null)
Group 7 ,
Group 8 ,,b,b,a,a
(null) (null)
To filter duplicate VAL values, like those in Groups 4, 7, and 8 from the end result we use an inline view in the FROM clause. clear breaks
select group_key, string
from
(
select
group_key ,
row_number() over ( partition by group_key order by val ) as position ,
val ||
lead( ','||val, 1 ) over ( partition by group_key order by val ) ||
lead( ','||val, 2 ) over ( partition by group_key order by val ) ||
lead( ','||val, 3 ) over ( partition by group_key order by val ) ||
lead( ','||val, 4 ) over ( partition by group_key order by val ) ||
lead( ','||val, 5 ) over ( partition by group_key order by val )
as string
from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) T_A
)
where position = 1
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 (null)
Group 8 a,b,
(null) (null)
To filter null VAL values, like the ones in Groups 5 and higher from the end result we can add a WHERE clause to the query. select group_key, string
from
(
select
group_key ,
row_number() over ( partition by group_key order by val ) as position ,
val ||
lead( ','||val, 1 ) over ( partition by group_key order by val ) ||
lead( ','||val, 2 ) over ( partition by group_key order by val ) ||
lead( ','||val, 3 ) over ( partition by group_key order by val ) ||
lead( ','||val, 4 ) over ( partition by group_key order by val ) ||
lead( ','||val, 5 ) over ( partition by group_key order by val )
as string
from t
WHERE VAL IS NOT NULL
)
where position = 1
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e
Group 8 a,a,b,b
MODEL Method 1
This tutorial shows how to collect values from multiple rows into a single, comma delimited string. This is also known as "string aggregation". It takes values like theseGROUP_KEY VAL
---------- ----------
Group 3 a
Group 3 b
Group 3 c
and yields a string like this. GROUP_KEY STRING
---------- ---------------
Group 3 a,b,c
While the solution to follow, which we will refer to as MODEL Method 1, is the simplest of the three MODEL solution presented in this section it unfortunately scales the worst when a large number of values is selected or very wide STRING values are produced (see MODEL Clause: MODEL Performance Tuning). MODEL Method 2 and MODEL Method 3 provide alternative MODEL techniques that scale better than MODEL Method 1, though they are more complex. Test Data
desc t
Name Null? Type
----------------------------- -------- --------------------
GROUP_KEY VARCHAR2(10)
VAL VARCHAR2(10)
set null "(null)"
break on group_key skip 1 duplicates
select * from t order by group_key, val ;
GROUP_KEY VAL
---------- ----------
Group 1 a
Group 2 a
Group 2 b
Group 3 a
Group 3 b
Group 3 c
Group 4 a
Group 4 a
Group 4 b
Group 4 b
Group 5 a
Group 5 b
Group 5 d
Group 5 e
Group 5 (null)
Group 6 (null)
Group 7 (null)
Group 7 (null)
Group 8 a
Group 8 a
Group 8 b
Group 8 b
Group 8 (null)
Group 8 (null)
(null) (null)
Solution
This solution uses the SQL MODEL clause which was introduced in Oracle 10g. If you are unfamiliar with MODEL you may wish to review the SQL Features Tutorials: MODEL Clause tutorial before proceeding.clear breaks
select group_key, substr( string, 2 ) as string
from t
model
return updated rows
partition by ( group_key )
dimension by ( row_number() over (partition by group_key order by val) as position )
measures ( cast( val as varchar2(65) ) as string ) -- Note 1
rules
upsert
iterate( 6 ) -- Note 2
until ( presentv(string[iteration_number+2],1,0) = 0 )
( string[0] = string[0] || ',' || string[iteration_number+1] )
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 ,
Group 8 a,a,b,b,,
(null) (null)
Notes: 1. 65 = v * n + (n-1) where
o v = maximum length of VAL values (this is 10 in our test table)
o n = maximum number of rows with the same GROUP_KEY (6 in our test table)
2. Use any number here greater than or equal to n.
To control the order of the delimited values, change the ORDER BY clause in the arguments for ROW_NUMBER(). select group_key, substr( string, 2 ) as string
from t
model
return updated rows
partition by ( group_key )
dimension by ( row_number() over (partition by group_key ORDER BY VAL DESC) as position )
measures ( cast( val as varchar2(65) ) as string )
rules
upsert
iterate( 6 )
until ( presentv(string[iteration_number+2],1,0) = 0 )
( string[0] = string[0] || ',' || string[iteration_number+1] )
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 b,a
Group 3 c,b,a
Group 4 b,b,a,a
Group 5 ,e,d,b,a
Group 6 (null)
Group 7 ,
Group 8 ,,b,b,a,a
(null) (null)
To filter duplicate VAL values, like those in Groups 4, 7, and 8 from the end result we use an inline view in the FROM clause. select group_key, substr( string, 2 ) as string
from ( SELECT DISTINCT GROUP_KEY, VAL FROM T ) T
model
return updated rows
partition by ( group_key )
dimension by ( row_number() over (partition by group_key order by val) as position )
measures ( cast( val as varchar2(65) ) as string )
rules
upsert
iterate( 6 )
until ( presentv(string[iteration_number+2],1,0) = 0 )
( string[0] = string[0] || ',' || string[iteration_number+1] )
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 (null)
Group 8 a,b,
(null) (null)
To filter null VAL values, like the ones in Groups 5 and higher from the end result we can add a WHERE clause to the query. select group_key, substr( string, 2 ) as string
from t
WHERE VAL IS NOT NULL
model
return updated rows
partition by ( group_key )
dimension by ( row_number() over (partition by group_key order by val) as position )
measures ( cast( val as varchar2(65) ) as string ) -- Note 1
rules
upsert
iterate( 6 ) -- Note 2
until ( presentv(string[iteration_number+2],1,0) = 0 )
( string[0] = string[0] || ',' || string[iteration_number+1] )
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e
Group 8 a,a,b,b
In the next tutorial, MODEL Method 1 - Explained, we look at how this technique works. Gotchas
Distinct
If we attempt to produce distinct values in our strings using SELECT DISTINCT instead of an inline view we will be rewarded with incorrect results. This is because SELECT DISTINCT is evaluated after the MODEL clause is applied, not before.
select DISTINCT group_key, substr( string, 2 ) as string
from t
model
return updated rows
partition by ( group_key )
dimension by ( row_number() over (partition by group_key order by val) as position )
measures ( cast( val as varchar2(65) ) as string )
rules
upsert
iterate( 6 )
until ( presentv(string[iteration_number+2],1,0) = 0 )
( string[0] = string[0] || ',' || string[iteration_number+1] )
order by group_key ;
GROUP_KEY STRING
---------- ---------------
Group 1 a
Group 2 a,b
Group 3 a,b,c
Group 4 a,a,b,b
Group 5 a,b,d,e,
Group 6 (null)
Group 7 ,
Group 8 a,a,b,b,,
(null) (null)
IS PRESENT
When I originally wrote this solution I tried to use the following UNTIL clauseiterate( 6 )
until ( NOT( STRING[ITERATION_NUMBER+2] IS PRESENT ) = 0 )
Unfortunately it produced some rather nasty errors. ERROR at line 1:
ORA-03113: end-of-file on communication channel
ERROR:
ORA-03114: not connected to ORACLE
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump
[ACCESS_VIOLATION] [_qcss_process_expr+226] [PC:0x1E57EC6] [ADDR:0x18] [UNABLE_TO_READ] []
Current SQL statement for this session:
select group_key, string
from t
...
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_qcss_process_expr+ 00000000
226
_qcss_process_cond+ CALLrel _qcss_process_expr+ 8CA6754 8171468 841BB64
76 0 8CA6698
_qcss_process_sprea CALLrel _qcss_process_cond+ 8CA6754 8171468 841BB58
dsheet+102 0 8CA6698
_qcssSemanticCbks+4 CALLrel _qcss_process_sprea 8CA6754 8171468 841A5E0
5 dsheet+0
No comments:
Post a Comment