As Published In
Oracle Magazine
July/August 2012

TECHNOLOGY: Ask Tom

  

On Characters, Pivots, and Endings

By Tom Kyte

 

Our technologist substitutes characters, looks at fishy results, and avoids redo.

I want to insert a value into a table as follows:

create table test (name varchar2(35));
insert into test values ('&Vivek');

But the system asks for a value for the substitution variable. How can I insert an ampersand (&)?

This is probably one of the most frequently asked questions out there, not only on AskTom (asktom.oracle.com) but on any Oracle Database forum.

Before I give the answers, I’ll explain the problem fully. By default, SQL*Plus will scan each line of input and look for an & character. Upon finding it, SQL*Plus will scan the characters after the ampersand and use those as a variable name (the variable name in the above example is Vivek). SQL*Plus will then prompt the user for a value for Vivek—like this:

 

SQL> insert into test (name) 
values ('&Vivek');
Enter value for vivek: Hello World
old   1: insert into test (name) 
values ('&Vivek')
new   1: insert into test (name) 
values ('Hello World')

1 row created.

 

Here you can see how SQL*Plus turned &Vivek into Hello World. Now the question is how to stop it from doing that. The easiest method is simply to issue the SQL*Plus set define off command:

 

SQL> set define off
SQL> insert into test (name) 
values ('&Vivek');
1 row created.

SQL> select * from test;

NAME
———————————————————————
Hello World
&Vivek

 

That prevents SQL*Plus from scanning the input to try to find the substitution character. Another approach is to use a different substitution character:

 

SQL> set define @

SQL> insert into test (name) 
values ( '&Vivek @X' );
Enter value for x: this was x
old   1: insert into test (name) 
values ( '&Vivek @X' )
new   1: insert into test (name) 
values ( '&Vivek this was x' )

1 row created.

 

In this case, the @ character is doing what the & used to do.

There are other approaches, such as avoiding the & character in your SQL:

 

SQL> insert into test 
values (chr(38)||'Vivek  xxx'); 

1 row created.

SQL> select * from test 
where name like '% xxx';

NAME
—————————————————————————
&Vivek  xxx

 

Although that approach works, I’m not a fan of it, because you have to change your SQL statement.

Yet another approach is to use a zero-length substitution variable name, which will make SQL*Plus just leave that & character alone:

 

SQL> insert into test 
values ('&'||'Vivek yyy'); 

1 row created.

SQL> select * from test 
where name like '% yyy';

NAME
—————————————————————————
&Vivek yyy

 

This is perhaps better than using chr(38) to avoid the & character in your SQL, but it still is not something I recommend. By the way, even though I’ve been using SQL*Plus for almost 25 years, I had no idea that the last solution—with '&'||'Vivek'—would work that way. That was something new I learned from a reader of AskTom!
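One more SQL*Plus option worth knowing: the SET ESCAPE setting lets you mark an individual & as a literal character. A quick sketch (not from the original column):

```sql
SQL> set escape \
SQL> insert into test (name) 
values ('\&Vivek');

1 row created.
```

With an escape character set, only ampersands preceded by it are left alone; any other & in the script is still treated as a substitution variable.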

Dynamic Pivot

I have a table like this:

create table fish (
fish_id number,
fish_type varchar2(3),
fish_weight number);

insert into fish values (1,'COD',20);
insert into fish values (1,'HAD',30);
insert into fish values (2,'COD',45);
insert into fish values (2,'HKE',10);
insert into fish values (2,'LIN',55);
insert into fish values (3,'CTY',90);
insert into fish values (3,'HAD',60);
insert into fish values (3,'COD',52);

I would like it to be displayed as

COD HAD HKE LIN CTY .......
1 20 30 X X
2 45 X 10 55
3 52 60 X X
....

The columns aren’t fixed in number or name, because there can be multiple species.

How can I create this display?

In SQL you need to know the number, name, and datatype of every single column at parse time, so you’ll have to use a bit of dynamic SQL. Before I show the dynamic SQL, I will first develop a static SQL statement that works against the existing data. Listing 1 contains a query that works in all releases of Oracle Database. (Note that in Oracle Database 11g and later releases, I could have used the built-in PIVOT syntax, but it too requires dynamic SQL when the set of columns is not known in advance.)

Code Listing 1: Static SQL for existing data

 

SQL> select fish_id,
  2         sum(decode(fish_type,'COD',fish_weight)) cod,
  3         sum(decode(fish_type,'HAD',fish_weight)) had,
  4         sum(decode(fish_type,'HKE',fish_weight)) hke,
  5         sum(decode(fish_type,'LIN',fish_weight)) lin,
  6         sum(decode(fish_type,'CTY',fish_weight)) cty
  7    from fish
  8   group by fish_id
  9   order by fish_id
 10  /

   FISH_ID        COD        HAD        HKE        LIN        CTY
——————————  —————————  ————————— ——————————  —————————  —————————
         1         20         30
         2         45                    10         55
         3         52         60                               90
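For reference, the built-in PIVOT syntax mentioned above would express the same static query in Oracle Database 11g; a sketch with the same hard-coded column list:

```sql
SQL> select *
  2    from (select fish_id, fish_type, fish_weight from fish)
  3   pivot (sum(fish_weight)
  4          for fish_type in ('COD' as cod, 'HAD' as had,
  5                            'HKE' as hke, 'LIN' as lin, 'CTY' as cty))
  6   order by fish_id
  7  /
```

The IN list is still fixed at parse time, which is why an open-ended set of fish types needs the dynamic SQL developed next.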

 

Now, to make the SQL in Listing 1 dynamic, I will create a stored procedure that executes a query to determine what the column names are and use that information to dynamically construct the pivot query, as shown in Listing 2.

Code Listing 2: Stored procedure that determines column names

 

SQL> create or replace procedure go_fishing( p_cursor in out sys_refcursor )
  2  as
  3      l_query long := 'select fish_id';
  4  begin
  5      for x in (select distinct fish_type from fish order by 1 )
  6      loop
  7          l_query := l_query ||
  8             replace( q'|, sum(decode(fish_type,'$X$',fish_weight)) $X$|',
  9                      '$X$',
 10                      dbms_assert.simple_sql_name(x.fish_type) );
 11      end loop;
 12  
 13      l_query := l_query || ' from fish group by fish_id order by fish_id';
 14  
 15      open p_cursor for l_query;
 16  end;
 17  /

Procedure created.

 

Note: In Listing 2, the fish_type column is presumably a foreign key to a lookup table of valid fish types—the table in which fish_type is the primary key. Replace my SELECT DISTINCT... with a simple SELECT against that lookup table.

What I did in the stored procedure in Listing 2 was to generate a list of distinct fish types and add a column to the query for each one. I used the string

 

q'|, sum(decode(fish_type,'$X$',
fish_weight)) $X$|'

 

as a template for the original static SQL:

 

sum(decode(fish_type,'COD',
fish_weight)) cod,

 

The only thing I had to do with the template was replace $X$ (a string I chose at random to represent COD, HAD, and so on) with the value x.fish_type. Note that I did not just replace $X$ with the value x.fish_type blindly. I used the DBMS_ASSERT package to validate that the data I was concatenating into the SQL statement was “safe”—that it was a simple SQL name and not some SQL that would change the meaning of my SQL statement. In short, that DBMS_ASSERT call is protecting against SQL injection. If SQL injection is a topic you are not up to speed on, or if you just want to see some really interesting SQL injection techniques, I encourage you to check out bit.ly/IgU3YQ and bit.ly/K7aAKW.
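You can see DBMS_ASSERT doing its job by calling it directly. A sketch (the malicious string is my own illustration, and the exact error text may vary by release):

```sql
SQL> select dbms_assert.simple_sql_name('COD') from dual;

DBMS_ASSERT.SIMPLE_SQL_NAME('COD')
——————————————————————————————————
COD

SQL> select dbms_assert.simple_sql_name(
  2    'COD'',fish_weight)) cod from fish--' ) from dual;
ERROR:
ORA-44003: invalid SQL name
```

A value that would have changed the meaning of the generated statement never makes it into the query string.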

After I build the query in a string, I use a ref cursor to open a cursor that can be sent back to a client application, as shown in Listing 3.

Code Listing 3: Ref cursor, cursor, and fetch

 

SQL> variable x refcursor
SQL> exec go_fishing( :x )

PL/SQL procedure successfully completed.

SQL> print x

   FISH_ID        COD        CTY        HAD        HKE        LIN
——————————  —————————  ————————— ——————————  —————————  —————————
         1         20                    30
         2         45                               10         55
         3         52         90         60

 

Finding the Last Row

I have created a view based on multiple tables in Oracle Database, and I am able to fetch the records within a view via SELECT. My question is: If there are newly added records and I want to fetch only the newly added records inside the view, what is the SQL query for fetching those? I tried using the rowid, but it didn’t return the results as expected.

You would have to tell me how you could identify “new rows” if you printed them out on a piece of paper. If you cannot tell which rows are “new” or what the “last row” was, neither can I. A rowid is the address of a row on a block in a file—rowids are not monotonically increasing values. Even if you only insert into a table, you will find that the rows might not be in “insert order”—sometimes they might be, but other times they won’t. In short, you can never count on their being in any sort of order. For example, I used an automatic segment space managed (ASSM) tablespace with an 8K block size and inserted some data, as shown in Listing 4.

Code Listing 4: Creating table for “last row” test

 

SQL> create table t ( x int, y varchar2(4000), z varchar2(4000) );
Table created.

SQL> insert into t values ( 1, rpad('*',1,'*'), rpad('*',1,'*') );
1 row created.

SQL> connect /
Connected.

SQL> insert into t values ( 2, rpad('*',3000,'*'), rpad('*',3000,'*') );
1 row created.

SQL> connect /
Connected.

SQL> insert into t values ( 3, rpad('*',3000,'*'), rpad('*',3000,'*') );
1 row created.

SQL> connect /
Connected.

SQL> insert into t values ( 4, rpad('*',1,'*'), rpad('*',1,'*') );
1 row created.

 

Now, arguably row “4” is the “last” and newest row in the T table, but if I start looking at the rows—and what database blocks they are on—I’ll see a different story, as shown in Listing 5.

Code Listing 5: Query to see order of rowids

 

SQL> connect /
Connected.

SQL> select dbms_rowid.rowid_block_number(rowid), x from t;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)          X
————————————————————————————————————  —————————
                               18948          1
                               18948          2
                               18948          4
                               18949          3

SQL> select rowid, x from t order by rowid desc;

ROWID                       X
————————————————————  ———————
AAAaxGAAEAAAEoFAAA          3
AAAaxGAAEAAAEoEAAC          4
AAAaxGAAEAAAEoEAAB          2
AAAaxGAAEAAAEoEAAA          1

 

If I were to try to use ROWID to find the “last” row or the newest row, I’d be very much let down at this point. What I did was put a small row 1 on the first block and place a larger row 2 on the same first block, but then when I tried to insert row 3, it was too big to fit on the first block with those rows, so it went to the second block in the table. However, when row 4 came along—it was small again—there was room for it on the first block.

Rows go on a block that has free space on it—enough free space to hold those rows. They do not go to the “last block” in the table; if they did, tables would only grow. We would never be able to reuse space left behind after a delete.

If you need to find the “newest rows” in a table, you’ll have to have some bit of data—a time stamp, for example—associated with each row that would enable you to identify a new row. Some people might mention ORA_ROWSCN as a solution, but the overhead of using it to find “new rows” would be overwhelmingly huge—you’d have to inspect every single row every single time.
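A minimal sketch of the time stamp approach (the column and bind variable names here are mine, purely for illustration):

```sql
SQL> alter table t add created_ts timestamp default systimestamp;

SQL> select *
  2    from t
  3   where created_ts > :last_checked_ts;
```

Note that the default applies only to newly inserted rows; preexisting rows would need a one-time backfill before the query above could see them as anything but missing.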

Redo and Global Temporary Tables

Do temporary tables generate redo for standard data manipulation language (DML) operations?

I think that because every DML operation generates undo, every DML operation on global temporary tables will generate redo.

In short, there is no redo generated on temporary table blocks. However, any undo generated for those blocks will have redo generated. Thus, many operations against global temporary tables will generate redo as a side effect of generating undo.

The example in Listing 6 demonstrates operations that generate and do not generate redo.

Code Listing 6: No redo created in direct path insert

 

SQL> create global temporary table gtt 
  2  on commit preserve rows 
  3  as 
  4  select * from all_objects where 1=0;
Table created.

SQL> set autotrace on statistics
SQL> insert into gtt select * from all_objects;
72259 rows created.

Statistics
———————————————————————————————
…
     412112  redo size
…
      72259  rows processed

SQL> insert /*+ append */ into gtt select * from all_objects;
72259 rows created.

Statistics
———————————————————————————————
…
          0  redo size
… 
      72259  rows processed

SQL> set autotrace off

 

In Listing 6 there was 0 redo size with a direct path INSERT (but you have to commit the INSERT before you can read it) because the direct path INSERT bypassed undo generation. The conventional path load generated 412,112 bytes of redo, but that was to protect the UNDO information only—not the data loaded. That is apparent if you use a conventional path INSERT into a “normal” table:

 

SQL> create table t
  2  as
  3  select * from all_objects where 1=0;
Table created.

SQL> set autotrace traceonly statistics;
SQL> insert into t select * from 
all_objects;

72862 rows created.

Statistics
——————————————————————————————
…
    8546004  redo size
…
      72862  rows processed
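The earlier caveat—that you have to commit a direct path INSERT before you can read the table—can be demonstrated directly. A sketch of querying the global temporary table in the same transaction as the append:

```sql
SQL> insert /*+ append */ into gtt select * from all_objects;
72259 rows created.

SQL> select count(*) from gtt;
select count(*) from gtt
                     *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> commit;
Commit complete.

SQL> select count(*) from gtt;
```

After the commit, the count succeeds.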

 


A global temporary table typically significantly reduces the amount of redo generated, but it will not typically eliminate it.

Tricky Unique Constraint

I have a data rule for a table that says a certain pair of columns must be unique if some other field is a certain value. Specifically, if a column t_resource_type value is in the set of values 100000, 1000001, and 1000002, the t_resource_address1 and the t_resource_hst_id values must be unique. How can I construct this?

This is easy to accomplish via a function-based index or virtual columns (in Oracle Database 11g and later only). I’ll demonstrate both. First, I’ll create your table:

 

SQL> create table test_data
  2  (
  3  t_resource_type number(8)
  4  ,t_resource_address1 varchar2(50)
  5  ,t_resource_hst_id number(11)
  6  );
Table created.

 

And then I’ll add a unique index—more specifically, a unique function-based index:

 

SQL> create unique index t_idx 
     on test_data (
  2  case when t_resource_type in 
     (100000, 1000001, 1000002) 
     then t_resource_address1 
     end,
  3  case when t_resource_type in 
     (100000, 1000001, 1000002) then 
     t_resource_hst_id   
     end
  4  ); 

Index created.

 

The two CASE expressions will return either

Null, Null – when the t_resource_type column value is not in the set of values 100000, 1000001, and 1000002

or

t_resource_address1, t_resource_hst_id – when the t_resource_type column is in that set of values.

Because entirely null key values are never entered into a B-tree index, those rows will not appear in the index—the index will contain entries only for rows whose resource type is in the specified set of values. You’ll have a unique index on only some of the rows in the table.
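A quick sketch of the behavior (the values are mine, for illustration only): duplicates are rejected when the resource type is in the set, and allowed otherwise:

```sql
SQL> insert into test_data values (100000, 'addr-1', 42);
1 row created.

SQL> insert into test_data values (100000, 'addr-1', 42);
ERROR at line 1:
ORA-00001: unique constraint (T_IDX) violated

SQL> insert into test_data values (50, 'addr-1', 42);
1 row created.

SQL> insert into test_data values (50, 'addr-1', 42);
1 row created.
```

(The real ORA-00001 message includes the owning schema name before the index name.)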

With Oracle Database 11g and later, I can use a real constraint by using virtual columns, as shown in Listing 7.

Code Listing 7: A real constraint with virtual columns

 

SQL> alter table test_data
  2  add (
  3  t_resource_address1_unq varchar2(50)
  4  generated always as
  5  (case when t_resource_type in (100000, 1000001, 1000002) then 
      t_resource_address1 end)
  6  )
  7  /

Table altered.

SQL> alter table test_data
  2  add (
  3  t_resource_hst_id_unq number(11)
  4  generated always as
  5  (case when t_resource_type in (100000, 1000001, 1000002) then 
      t_resource_hst_id end)
  6  )
  7  /

Table altered.

SQL> alter table test_data
  2  add constraint address_hst_id_unique
  3  unique (t_resource_address1_unq,t_resource_hst_id_unq);

Table altered.

 

This approach adds two columns to the table, and these columns appear to have values only when the resource type is in the specified set of values—otherwise they are null. Because they are “real” columns in the table, you can apply a traditional constraint to them. They consume no storage, because they are virtual columns, but you can do pretty much anything to them that you can do to a “regular” column. In the end, the two approaches are nearly equivalent, in that they both create a function-based index.




Tom Kyte is a database evangelist in Oracle’s Server Technologies division and has worked for Oracle since 1993. He is the author of Expert Oracle Database Architecture (Apress, 2005, 2010) and Effective Oracle by Design (Oracle Press, 2003), among other books.

 

 


