Articles
SQL & PL/SQL
![]() | Oracle Database 11g: by Arup Nanda |
Present information in a spreadsheet-type crosstab report from any relational table using simple SQL, and store any data from a crosstab table to a relational table.
As you know, relational tables are, well, tabular—that is, they are presented in a column-value pair. Consider the case of a table named CUSTOMERS.
SQL> desc customers Name Null? Type ----------------------------------------- -------- --------------------------- CUST_ID NUMBER(10) CUST_NAME VARCHAR2(20) STATE_CODE VARCHAR2(2) TIMES_PURCHASED NUMBER(3)
select cust_id, state_code, times_purchased from customers order by cust_id;
CUST_ID STATE_CODE TIMES_PURCHASED
------- ---------- ---------------
1 CT 1
2 NY 10
3 NJ 2
4 NY 4
...
and so on ...
Note how the data is represented as rows of values: For each customer, the record shows the customer's home state and how many times the customer purchased something from the store. As the customer purchases more items from the store, the column times_purchased is updated. Now consider a case where you want to have a report of the purchase frequency each state - that is, how many customers bought something only once, twice, thrice and so on, from each state. In regular SQL, you can issue the following statement:
select state_code, times_purchased, count(1) cnt from customers group by state_code, times_purchased;
ST TIMES_PURCHASED CNT
-- --------------- ----------
CT 0 90
CT 1 165
CT 2 179
CT 3 173
CT 4 173
CT 5 152
...
and so on ...
This is the information you want but it's a little hard to read. A better way to represent the same data may be through the use of crosstab reports, in which you can organized the data vertically and states horizontally, just like a spreadsheet:
Times_purchased
CT NY NJ ...
and so on ...
1 0 1 0 ...
2 23 119 37 ...
3 17 45 1 ...
...
and so on ...
Prior to Oracle Database 11g, you would do that via some sort of a decode function for each value and write each distinct value as a separate column. The technique is quite nonintuitive however. Fortunately, you now have a great new feature called PIVOT for presenting any query in the crosstab format using a new operator, appropriately named pivot. Here is how you write the query:
select * from (
select times_purchased, state_code
from customers t
)
pivot
(
count(state_code)
for state_code in ('NY','CT','NJ','FL','MO')
)
order by times_purchased
/
Here is the output: . TIMES_PURCHASED 'NY' 'CT' 'NJ' 'FL' 'MO' --------------- ---------- ---------- ---------- ---------- ---------- 0 16601 90 0 0 0 1 33048 165 0 0 0 2 33151 179 0 0 0 3 32978 173 0 0 0 4 33109 173 0 1 0 ... and so on ...

Figure 1 Traditional tabular representation
In a crosstab report, you want to transpose the Times Purchased column to the header row as shown in Figure 2. The column becomes the row, as if the column were rotated 90 degrees anti-clockwise to become the header row. This figurative rotation needs to have a pivot point and in this case the pivot point happens to be the count(state_code) expression.
Figure 2 Pivoted representation
This expression needs to be in the syntax of the query:
...
pivot
(
count(state_code)
for state_code in ('NY','CT','NJ','FL','MO')
)
...
Note the header rows in the output:
. TIMES_PURCHASED 'NY' 'CT' 'NJ' 'FL' 'MO' --------------- ---------- ---------- ---------- ---------- ----------
select * from (
select times_purchased as "Puchase Frequency", state_code
from customers t
)
pivot
(
count(state_code)
for state_code in ('NY' as "New York",'CT' "Connecticut",'NJ' "New Jersey",'FL' "Florida",'MO' as "Missouri")
)
order by 1
/
Puchase Frequency New York Connecticut New Jersey Florida Missouri
----------------- ---------- ----------- ---------- ---------- ----------
0 16601 90 0 0 0
1 33048 165 0 0 0
2 33151 179 0 0 0
3 32978 173 0 0 0
4 33109 173 0 1 0
...
and so on ...
The FOR clause can have aliases for the values there, which will become the column headers. For matter there is anti-matter; for pivot there should be "unpivot," right?
Humor aside, there is a genuine need for the reverse of the pivot operation. Suppose you have a spreadsheet that shows the crosstab report shown below:
| Purchase Frequency | New York | Connecticut | New Jersey | Florida | Missouri |
| 0 | 12 | 11 | 1 | 0 | 0 |
| 1 | 900 | 14 | 22 | 98 | 78 |
| 2 | 866 | 78 | 13 | 3 | 9 |
| ... | . |
SQL> desc customers Name Null? Type ----------------------------------------- -------- --------------------------- CUST_ID NUMBER(10) CUST_NAME VARCHAR2(20) STATE_CODE VARCHAR2(2) TIMES_PURCHASED NUMBER(3)
It might be easier to demonstrate this via an example. Let's create a crosstab table first, using the pivot operation:
1 create table cust_matrix
2 as
3 select * from (
4 select times_purchased as "Puchase Frequency", state_code
5 from customers t
6 )
7 pivot
8 (
9 count(state_code)
10 for state_code in ('NY' as "New York",'CT' "Conn",'NJ' "New Jersey",'FL' "Florida",
'MO' as "Missouri")
11 )
12* order by 1
SQL> select * from cust_matrix
2 /
Puchase Frequency New York Conn New Jersey Florida Missouri
----------------- ---------- ---------- ---------- ---------- ---------
1 33048 165 0 0 0
2 33151 179 0 0 0
3 32978 173 0 0 0
4 33109 173 0 1 0
... and so on ...
SQL> desc cust_matrix Name Null? Type ----------------------------------------- -------- --------------------------- Puchase Frequency NUMBER(3) New York NUMBER Conn NUMBER New Jersey NUMBER Florida NUMBER Missouri NUMBER
select *
from cust_matrix
unpivot
(
state_counts
for state_code in ("New York","Conn","New Jersey","Florida","Missouri")
)
order by "Puchase Frequency", state_code
/
Puchase Frequency STATE_CODE STATE_COUNTS
----------------- ---------- ------------
1 Conn 165
1 Florida 0
1 Missouri 0
1 New Jersey 0
1 New York 33048
2 Conn 179
2 Florida 0
2 Missouri 0
...
and so on ...
Note how the each column name has become a value in the STATE_CODE column. How did Oracle know that state_code is a column name? It knew that from the following clause in the query:
for state_code in ("New York","Conn","New Jersey","Florida","Missouri")
Purchase Frequency New York Conn New Jersey Florida Missouri
----------------- ---------- ---------- ---------- ---------- - ---------
1 33048 165 0 0 0
Unpivot may be the opposite action of pivot but don't assume that former can reverse what latter has done. For instance, in the above example, you created a new table, CUST_MATRIX, using a pivot operation on the table CUSTOMERS. Later you used unpivot on the table CUST_MATRIX but that didn't get back the details of the original table CUSTOMERS. Instead the crosstab report was displayed in a different way for you to load into a relational table. So unpivot is not for undoing what pivot did, a fact you should carefully consider before creating a pivoted table and then dropping the original one.
Some of the very interesting uses of unpivot go beyond the usual powerful data manipulation such as the example shown earlier. Oracle ACE Director Lucas Jellema of Amis Technologies has shown how you can generate rows of specific data for testing purposes. Here I will use a slightly modified form of his original code to generate vowels of the English alphabet:
select value from ( ( select 'a' v1, 'e' v2, 'i' v3, 'o' v4, 'u' v5 from dual ) unpivot ( value for value_type in (v1,v2,v3,v4,v5) ) ) /
V - a e i o u
In the above example, note how you had to specify the valid state_codes:
for state_code in ('NY','CT','NJ','FL','MO')
Well, there is another clause in the pivot operation, XML, that allows you to create the pivoted output as XML where you can specify a special clause, ANY, instead of literal values. Here is the example:
select * from ( select times_purchased as "Purchase Frequency", state_code from customers t )The output comes back as CLOB so make sure the LONGSIZE is set to a large value before running the query.
pivot xml ( count(state_code)
for state_code in (any) ) order by 1 /
SQL> set long 99999
Purchase Frequency STATE_CODE_XML ------------------ -------------------------------------------------- 1 <PivotSet><item><column name = "STATE_CODE">CT</co lumn><column name = "COUNT(STATE_CODE)">165</colum n></item><item><column name = "STATE_CODE">NY</col umn><column name = "COUNT(STATE_CODE)">33048</colu mn></item></PivotSet> 2 <PivotSet><item><column name = "STATE_CODE">CT</co lumn><column name = "COUNT(STATE_CODE)">179</colum n></item><item><column name = "STATE_CODE">NY</col umn><column name = "COUNT(STATE_CODE)">33151</colu mn></item></PivotSet> ... and so on ...
In addition to the ANY clause, you can write a subquery. Suppose you have a list of preferred states and you want to select the rows for those states only. You placed the preferred states in a new table called preferred_states:
SQL> create table preferred_states
2 (
3 state_code varchar2(2)
4 )
5 /
Table created.
SQL> insert into preferred_states values ('FL')
2> /
1 row created.
SQL> commit;
Commit complete.
select * from ( select times_purchased as "Puchase Frequency", state_code from customers t ) pivot xml ( count(state_code) for state_code in (select state_code from preferred_states) ) order by 1 /
for state_code in (select distinct state_code from customers)
Pivot adds a very important and practical functionality to the SQL language. Instead of writing convoluted non-intuitive code with a lot of decode functions, you can use the pivot function to create a crosstab report on any relational table. Similarly, you can convert any crosstab report to be stored as a regular relational table using the unpivot operation. Pivot can produce the output in regular text or XML. In the latter case, you don't need to specify the domain of values the pivot operation needs to search.
For more information on pivot and unpivot operations, refer to the Oracle Database 11g SQL Language Reference .
Back to Series TOC