TECHNOLOGY: SQL 101
A Function of CharacterBy Melanie Caffrey
Part 6 in a series on the basics of the relational database and SQL
Part 5 in this series, “An Order of Sorts“ (Oracle Magazine, May/June 2012), introduced the ORDER BY clause of a SQL SELECT statement (or query) and how it behaves in conjunction with certain options and keywords to order (or sort) the data in query results. Now you are ready to start learning how to use SQL functions in your queries to transform result set data so that it displays differently from how it is stored in the database. This article focuses on the SQL character functions (also known as string functions or text functions), which enable you to manipulate how character data is displayed. Subsequent articles in this series will introduce the date and number functions.
To try out the examples in this and subsequent articles in the series, you need access to an Oracle Database instance. If necessary, download and install an Oracle Database edition for your operating system. I recommend installing Oracle Database, Express Edition.
If you install the Oracle Database software, choose the installation option that enables you to create and configure a database. A new database, including sample user accounts and their associated schemas, will be created for you. (Note that SQL_101 is the user account you’ll use for the examples in this series; it’s also the schema in which you’ll create database tables and other objects.) When the installation process prompts you to specify schema passwords, enter and confirm passwords for SYS and SYSTEM and make a note of them.
Finally—whether you installed the database software from scratch or have access to an existing Oracle Database instance—download, unzip, and execute the SQL script to create the tables for the SQL_101 schema that are required for this article’s examples. (View the script in a text editor for execution instructions.)
The most-basic character functions enable you to change the way alphanumeric data is formatted in a result set. For example, the simple query in Listing 1 obtains all unique last name values from the EMPLOYEE table and displays them in all capital letters. It does this by applying the UPPER character function to the LAST_NAME column. Similarly, the query in Listing 2 uses the LOWER character function to display all department location names from the DEPARTMENT table in lowercase letters. All functions take some kind of input parameter(s). Character functions require input parameters that are alphanumeric—either a character (or string) literal or a column with a VARCHAR2, CHAR, or CLOB datatype. The data in the EMPLOYEE table’s LAST_NAME column and the DEPARTMENT table’s LOCATION column is stored with a datatype of VARCHAR2. Recall that a literal character value is any list of alphanumeric characters enclosed in single quotation marks, such as ‘Smith’, ‘73abc’, or ‘15-MAR-1965’.
Code Listing 1: Query that lists every unique last name value in uppercase letters
SQL> set feedback on SQL> select distinct UPPER(last_name) "Uppercase Employee Last Name" 2 from employee 3 order by UPPER(last_name); Uppercase Employee Last Name ————————————————————————————— ECKHARDT FRIEDLI JAMES LEBLANC MICHAELS NEWTON PETERSON 7 rows selected.
Code Listing 2: Query that displays all department locations in lowercase letters
SQL> select name, LOWER(location) "Lowercase Department Location" 2 from department 3 order by location; NAME Lowercase Department Location —————————————— —————————————————————————————— Accounting los angeles Payroll new york 2 rows selected.
Listings 3 and 4 demonstrate the INITCAP function. The query in Listing 3 uses INITCAP to convert certain first and last names from being stored in all lowercase in the EMPLOYEE table to being displayed with initial capital letters. The INITCAP function capitalizes the first letter of a string and lowercases the remainder of the string, as demonstrated by the query in Listing 4. That query also shows that the input parameter for an INITCAP function can consist of a character function’s application to a string or a database column that stores alphanumeric data. Specifically, the query applies the UPPER function to the LAST_NAME column of the EMPLOYEE table for certain employees. The UPPER function is said to be nested inside the INITCAP function. The Oracle Database server applies nested functions in order, from innermost function to outermost function. In Listing 4, the UPPER function converts the values peterson and leblanc to PETERSON and LEBLANC. Then the INITCAP function converts those uppercase values to Peterson and Leblanc.
Code Listing 3: Query that shows certain names converted and with the initial letter capitalized
SQL> set lines 32000 SQL> select first_name, last_name, INITCAP(first_name) "First Name", INITCAP(last_name) "Last Name" 2 from employee 3 where employee_id in (6569, 6570); FIRST_NAME LAST_NAME First Name Last Name —————————— ——————————— ——————————— ————————————— michael peterson Michael Peterson mark leblanc Mark Leblanc 2 rows selected.
Code Listing 4: Query that demonstrates the INITCAP function
SQL> select INITCAP('eMPLOYEE lAST nAMES') "INITCAP Literal", INITCAP(UPPER(last_name)) "Converted Employee Last Name" 2 from employee 3 where employee_id in (6569, 6570); INITCAP Literal Converted Employee Last Name ——————————————————— ————————————————————————————— Employee Last Names Peterson Employee Last Names Leblanc 2 rows selected.
Padding Your Results
To pad something is to add to it. The LPAD and RPAD functions enable you to pad your character-data results by repeating a character, space, or symbol to the left or right of any string. LPAD pads to the left of a string; RPAD pads to the right.
Listing 5 demonstrates the power of the RPAD and LPAD functions. Note that each takes three input parameters: the column name or string literal you want to pad; the length to which the string should be padded; and the character, space, or symbol (the filler) to pad with. For example, the query in Listing 5 specifies that the department name should be right-padded to a total length of 15 with the “.“ filler character. If any department name is exactly 15 characters or longer, no filler character will be added. Because Accounting is 10 characters long, the RPAD function adds five filler characters to its right. The query also specifies that the location should be left-padded to a total length of 15. Because LOS ANGELES is 11 characters long, counting the space, the LPAD function adds four filler characters to its left.
Code Listing 5: Query that applies the RPAD and the LPAD functions
SQL> select RPAD(name, 15, '.') department, LPAD(location, 15, '.') location 2 from department; DEPARTMENT LOCATION ——————————————— ———————————————— Accounting..... ....LOS ANGELES Payroll........ .......NEW YORK 2 rows selected.
The Helpful Dual
Oracle Database provides a single-row, single-column table called DUAL that is useful for many purposes, not the least of which is learning about Oracle functions. DUAL is an Oracle system table owned by the SYS user, not the SQL_101 schema. Many Oracle system tables are made available to all users via public synonyms. Synonyms will be discussed in subsequent articles in this series.
The DUAL table contains no data that’s useful in and of itself. (It has one row with one column—called the DUMMY column—that contains the value X.) You can use DUAL to try out functions that work on string literals and, as you’ll see in subsequent articles in this series, on number literals and even on today’s date.
The following demonstrates the single-row, single-column output of a SELECT statement executed against the DUAL table:
SQL> select * 2 from dual; D - X 1 row selected.
To display the current date, you can query the DUAL table as follows:
SQL> select sysdate 2 from dual; SYSDATE —————————— 18-APR-12 1 row selected.
And finally, the following example shows how you can practice any function in the SELECT clause of a SQL statement, using the DUAL table:
SQL> select rpad('Melanie', 10, '*') Melanie, lpad('Caffrey', 10, '.') Caffrey 2 from dual; MELANIE CAFFREY —————————— —————————— Melanie*** ...Caffrey 1 row selected.
Note that functions work even though there is no usable data in DUAL. In the preceding examples, the SYSDATE function displays the current date and time of the operating system hosting the database, and the RPAD and LPAD functions add padding to my name.
Stringing Strings Together
Sometimes it makes sense to combine certain strings, such as the FIRST_NAME and LAST_NAME values from the EMPLOYEE table, in the result set display. You can use concatenation to accomplish this task—with either the CONCAT function, illustrated in Listing 6, or the (more commonly used) concatenation operator || (two pipe characters), illustrated in Listing 7.
Code Listing 6: Query that demonstrates the CONCAT function
SQL> select CONCAT(first_name, last_name) employee_name 2 from employee 3 order by employee_name; EMPLOYEE_NAME ———————————————————————————— BetsyJames DonaldNewton EmilyEckhardt FrancesNewton MatthewMichaels RogerFriedli markleblanc michaelpeterson 8 rows selected.
Code Listing 7: Query that demonstrates the concatenation operator, ||
SQL> select first_name||' '||last_name employee_name 2 from employee 3 order by employee_name; EMPLOYEE_NAME ——————————————————————— Betsy James Donald Newton Emily Eckhardt Frances Newton Matthew Michaels Roger Friedli mark leblanc michael peterson 8 rows selected.
The CONCAT function takes two parameters and concatenates them. You can also nest multiple CONCAT function calls, as shown in Listing 8. The queries in Listings 7 and 8 concatenate literal strings with column data values. (I prefer the concatenation operator, because it has unlimited input parameters and makes the concatenated output more readable.)
Code Listing 8: Query that demonstrates nested CONCAT calls
SQL> select CONCAT(first_name, CONCAT(' ', last_name)) employee_name 2 from employee 3 order by employee_name; EMPLOYEE_NAME ———————————————————————— Betsy James Donald Newton Emily Eckhardt Frances Newton Matthew Michaels Roger Friedli mark leblanc michael peterson 8 rows selected.
Giving Your Data a Trim
Sometimes you want to remove unwanted spaces or characters from data when you display it. For example, data inserted into a table column via a form application might include extraneous characters or spaces—preceding and/or following the actual data value—that the form input field doesn’t trim.
Listing 9 shows a query that trims extra spaces from string values. The TRIM function in Listing 9 takes two parameters. The first parameter is the character, symbol, or space (delimited by single quotes) to be removed. The second parameter specifies the string literal or column value to be trimmed. The TRIM function supports three keywords: LEADING, TRAILING, and BOTH. The example in Listing 9 uses the TRAILING keyword to right-trim the FIRST_NAME value. The TRIM function applied to the LAST_NAME value specifies the LEADING keyword to left-trim the spaces from that value. And, as you can see, the spaces to the right of the LAST_NAME value remain and are included in the output.
Code Listing 9: Query that trims extra spaces
SQL> select '''' ||TRIM(TRAILING ' ' FROM 'Ashton ') || '''' first_name, '''' || TRIM(LEADING ' ' FROM ' Cinder ') || '''' last_name 2 from dual; FIRST_NA LAST_NAME ———————— ——————————— 'Ashton' 'Cinder ' 1 row selected.
Compare the output in Listing 9 with that in Listing 10, which trims the rightmost extra spaces from the LAST_NAME value. When no keyword is specified, the default behavior for the TRIM function is to trim leading as well as trailing characters. The older RTRIM and LTRIM functions are available for backward compatibility.
Code Listing 10: Query that trims extra spaces, including rightmost extra spaces
SQL> select '''' || TRIM(TRAILING ' ' FROM 'Ashton ') || '''' first_name, '''' || TRIM(' Cinder ') || '''' last_name 2 from dual; FIRST_NA LAST_NAM ———————— ———————— 'Ashton' 'Cinder' 1 row selected.
Searching for Strings Within Strings
When you need to search column values for similar string pattern values, you can do so with the INSTR character function. INSTR—which stands for in string—returns the position of a substring within a string value. Listing 11 demonstrates the INSTR function applied to the LAST_NAME column of the EMPLOYEE table to locate all occurrences of the “ton“ substring. As you can see, the INSTR function takes as input the literal or column value you want to search, followed by the substring pattern to search for. In Listing 11, the INSTR function finds the “ton“ pattern in only two column data values—both of them Newton—and returns 4 as their position. Because it did not find the search string in any other values, the output for those values is 0.
Code Listing 11: Query that demonstrates the INSTR character function
SQL> select last_name, INSTR(last_name, 'ton') ton_starting_point 2 from employee 3 order by last_name; LAST_NAME TON_STARTING_POINT ——————————————— ————————————————————————— Eckhardt 0 Friedli 0 James 0 Michaels 0 Newton 4 Newton 4 leblanc 0 peterson 0 8 rows selected.
Two additional parameters—starting position and occurrence—are optional. The starting position specifies the character in the string from which to begin your search. The default behavior is for the search to begin at the first character—otherwise known as character position 1. The occurrence parameter lets you specify which occurrence of the substring you’d like to find. For example, the word Mississippi includes two occurrences of the “issi“ substring. To search for the starting-position location of the second occurrence of this pattern, you must provide the INSTR function with an occurrence parameter of 2:
SQL> select INSTR('Mississippi', 'issi', 1, 2) 2 from dual; INSTR('MISSISSIPPI','ISSI',1,2) —————————————————————————————————— 5 1 row selected.
Extracting Strings from Strings
Sometimes you need to extract a portion of a string for your desired output. The SUBSTR (for substring) character function can assist you with this task. Listing 12 shows a query that uses the SUBSTR function to extract the first three characters of every LAST_NAME value from the EMPLOYEE table. The SUBSTR function takes two required parameters and one optional input parameter. The first parameter is the literal or column value on which you want the SUBSTR function to operate. The second parameter is the position of the starting character for the substring, and the optional third parameter is the number of characters to be included in the substring. If the third parameter is not specified, the SUBSTR function will return the remainder of the string.
Code Listing 12: Query that demonstrates the SUBSTR character function
SQL> select last_name, SUBSTR(last_name, 1, 3) 2 from employee 3 order by last_name; LAST_NAME SUB ————————————————————— ———— Eckhardt Eck Friedli Fri James Jam Michaels Mic Newton New Newton New leblanc leb peterson pet 8 rows selected.
Listing 13 demonstrates the SUBSTR and INSTR functions working together to display the portion of every LAST_NAME value from the EMPLOYEE table that contains the “ton“ substring. In this example, the output from the INSTR function provides the value for the input parameter that specifies the position for the SUBSTR function’s starting character. In the LAST_NAME values in which the substring “ton“ is not found, the entire LAST_NAME value is returned, for two reasons: SUBSTR treats a starting position of 0 the same as a starting position of 1 (that is, as the first position in the string), and because the query omits the optional length parameter, the full remainder of the string is returned.
Code Listing 13: Query that demonstrates the INSTR and SUBSTR character functions
SQL> select last_name, INSTR(last_name, 'ton') ton_position, SUBSTR(last_name, INSTR(last_name, 'ton')) substring_ton 2 from employee 3 order by last_name; LAST_NAME TON_POSITION SUBSTRING_TON ————————— ———————————— ————————————————— Eckhardt 0 Eckhardt Friedli 0 Friedli James 0 James Michaels 0 Michaels Newton 4 ton Newton 4 ton leblanc 0 leblanc peterson 0 peterson 8 rows selected.
When Size Matters
Occasionally you need to determine a string’s length—for example, to determine the maximum number of characters a form entry field should permit. Listing 14 shows a query that uses the LENGTH function to display the length of all FIRST_NAME values from the EMPLOYEE table.
Code Listing 14: Query that demonstrates the LENGTH function
SQL> select first_name, LENGTH(first_name) length 2 from employee 3 order by length desc, first_name; FIRST_NAME LENGTH ———————————— —————————————— Frances 7 Matthew 7 michael 7 Donald 6 Betsy 5 Emily 5 Roger 5 mark 4 8 rows selected.
Character functions can also be placed in WHERE and ORDER BY clauses, as illustrated in Listings 15 and 16.
In Listing 15, the total length of those employee first and last names concatenated together, separated by a single space, is calculated with the LENGTH function. And only values that are more than 15 characters long are returned in the result set. In Listing 16, the WHERE clause uses the INSTR function nested inside the SUBSTR function to return only those employees whose last names contain the substring “ton“—the resultant employee first and last name values are concatenated and separated with a space. Finally, the query’s ORDER BY clause sorts the concatenated first and last name values from the SELECT list by character length in descending order, by using the LENGTH character function.
Code Listing 15: Query that demonstrates a function in a WHERE clause
SQL> select first_name||' '||last_name employee_name 2 from employee 3 where LENGTH(first_name||' '||last_name) > 15 4 order by employee_name; EMPLOYEE_NAME ——————————————————— Matthew Michaels michael peterson 2 rows selected.
Code Listing 16: Query that demonstrates functions in a WHERE and an ORDER BY clause
SQL> select first_name||' '||last_name employee_name 2 from employee 3 where SUBSTR(last_name, INSTR(last_name, 'ton')) = 'ton' 4 order by LENGTH(employee_name) desc; EMPLOYEE_NAME ——————————————————— Frances Newton Donald Newton 2 rows selected.
This article has shown you how character functions can be used in SELECT statements to manipulate the ways data is displayed. You’ve seen how to convert data values to uppercase, lowercase, and mixed cases and how to search for strings within strings. You’ve also seen how to pad and trim data and how to specify a string’s total length. By no means does this article provide an exhaustive list of the Oracle character functions. Review the documentation for more details: bit.ly/HZUBC5.
The next installment of SQL 101 will discuss number functions and other miscellaneous functions.
Melanie Caffrey is a senior development manager at Oracle. She is a coauthor of Expert PL/SQL Practices for Oracle Developers and DBAs (Apress, 2011) and Expert Oracle Practices: Oracle Database Administration from the Oak Table (Apress, 2010).
Send us your comments