What You See Is What You Get Element

Learning R for PL/SQL Developers, Part 1

by Arup Nanda

Part 1 of a two-part series that presents an easier way to learn R by comparing and contrasting it to PL/SQL. Part 2 has been published here.

Published August 2017


Table of Contents
Why Learn R?
Structure of this Course
How Will You Practice?
Install R
Help
Case Sensitivity
Comments
Variables
Operations on Variables
Conversion Functions
Operators
Read a File and Save the Output
Summary
Quiz
About the Author

Why Learn R?

Why learn R? Of course you want to learn R; that's why you are here. But just in case you landed here accidentaly and need some convincing, there are a lot of reasons. The main one is that it's the de facto language of analytics. Many analytics applications are written in R; and a whole lot more will be. Learning this language allows you to be in the game for those analytics roles in the future or at least understand what's already available.

R can handle a lot data at once. And because data manipulation is a CPU-intensive operations, R makes data manipulation better by processing the dataset in memory, even for large datasets. With today's machines with relatively abundant memory, this is a real boon to data analytics professionals.

And, perhaps the best of all, it's free—for development and commercial applications. I don't think you need any additional reasons for learning R.

Structure of this Course

Because you are a PL/SQL developer, you are obviously working with Oracle Database and, therefore, you want to do more work in the data field. This course is directed at learning R on a fast track for people who are already familiar with PL/SQL. In this series of articles, you will learn the R language elements parallel to the PL/SQL language. Your familiarity in PL/SQL will help you associate the corresponding elements in R.

The articles in this series are designed to introduce you to R gently starting from the basics and progressing to the more complex elements. The last part of the series will introduce R in Oracle Database. Each article has

  1. A summary for you to quickly recap what you learned.
  2. A quiz to test your understanding.
  3. A video for you to actually see the language in action. The objective of the video is not to replace the text you are reading now. Instead it is provided so:
    • You will be able to see some mouse actions.
    • You can watch it and reinforce your knowledge.
    • You can listen to the self-explanatory audio—even if you don't get a chance to watch the video—while gardening, working around the house, and so on.

Welcome to the world of data science and machine learning. Let's start right away.

How Will You Practice?

It's a breeze. You can download R on your Windows or Linux desktop. It doesn't need much RAM or CPU for learning and it's 100 percent free. You can, and are encouraged to, try the commands shown in this series on your own.

Install R

The repository for downloading R is called Comprehensive R Archive Network (CRAN). Download your OS-specific version of the R installation from https://cran.r-project.org and install it per the instructions. In this article, I will be using R for Windows. In addition to the R executable, the installation files also install R-Gui, which is a graphical interface for R. Here is how you call R:

C:\Users\ananda>r

R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

This single greater-than symbol is the R prompt, sort of like the SQL> prompt for the SQL*Plus environment. You can run an R command here. Let's start with the traditional start for any language learners—displaying "Hello World"! In PL/SQL, you would use dbms_output.put_line("Hello World"). In R, you use the print() function.

> print("Hello World")
[1] "Hello World"

If you were wondering what the [1] is before the output, don't worry about it now. You will understand it later in the course. But note a very important property of R: it can process a value directly entered. If you enter just a value, for example, a string called "Hello World", R will display it. Unlike PL/SQL, the print() function is not even needed.

> "Hello World"
[1] "Hello World"

This property is very helpful as a data processing mechanism. For instance, if you enter an expression directly at the R prompt, R will process it and display the result. The print() function is not required. Here is an example of entering 1 + 2, and R responds with the result: 3:

> 1+2
[1] 3

Unlike PL/SQL, you don't need to use a dbms_output explicitly. R's functionality for displaying results is extended to logical expressions as well, unlike PL/SQL:

> 1<2
[1] TRUE

In addition to the R command-line interpreter, the installation files install two other important components of R:

  1. RGui, which is the graphical shell interface to the R command prompt. It's merely an environment; nothing more. Under the covers, it calls the R command anyway. It's sort of like the Oracle SQL Developer interface, which calls the SQL commands.
  2. RScript, which is a script execution engine for R scripts.

To invoke RGui, go to the application launcher of your OS, for example, by clicking Start -> Programs on Windows, look under R folder, and click RGui. That brings up a window like the one shown in Figure 1. It also shows you an R prompt where you can enter all the commands. This is called the R Console.

Learning R Part 1 fig 1

Figure 1. The RGui window

So what's the advantage of RGui? Well, it depends. First, if you click File in the top menu bar (Figure 2), you will see an item called New script. Clicking it will open a script editor inside the RGui window. This is just a text editor with no bells and whistles; but it's right inside the RGui window, which allows you to copy and paste commands from there to the R Console without leaving RGui.

Learning R Part 1 Fig 2

Figure 2. Menu item for opening the script editor

RGui also allows some commands to be invoked as menu items. For instance, if you click Packages in the top menu bar and then you select Install package(s) (Figure 3), you will be able to select a package to install. There is a command to install packages in R; but this menu item merely calls that command.

Learning R Part 1 Fig 3

Figure 3. Menu item for installing packages

In this article, we will mostly use the R command-line interpreter invoked from the Windows prompt.

Sometimes you might have scripts written with R commands to be executed from the command prompt, not to be passed as individual interactive commands to R, similar to SQL*Plus scripts. To execute such scripts, there is an executable installed along with R, which is known as Rscript. If you type just the executable name—Rscript—you will get the help:

C:\Users\ananda> rscript
Usage: /path/to/Rscript [--options] [-e expr [-e expr2 ...] | file] [args]

--options accepted are
--help Print usage and exit
--version Print version and exit
--verbose Print information on progress
--default-packages=list
Where 'list' is a comma-separated set
of package names, or 'NULL'
or options to R, in addition to --slave --no-restore, such as
--save Do save workspace at the end of the session
--no-environ Don't read the site and user environment files
--no-site-file Don't read the site-wide Rprofile
--no-init-file Don't read the user R profile
--restore Do restore previously saved objects at startup
--vanilla Combine --no-save, --no-restore, --no-site-file
--no-init-file and --no-environ

'file' may contain spaces but not shell metacharacters
Expressions (one or more '-e <expr>') may be used *instead* of 'file'
See also ?Rscript from within R

Suppose you have a file named myfile.r with the contents shown below:

print("Hello World")

When you want to run that file as a script, you use the following:

C:\Users\ananda> rscript myfile.txt
[1] "Hello World"

Help

When I learn something new such as a system, a tool or a language, the first thing I look for is a way to access help. R has a built-in help system. R installation installs the help files by default. To access the overall help, use the help() call from the R command. It will bring up a separate web browser window with the home page of the help files. It's not very helpful; you are better off using the Help function on the RGui interface. But if you are at the R prompt and want look up something on a specific topic quickly, R allows you to access help on that topic alone by prefixing the command with a question mark. For instance, if you want to access help on the print() function, use

> ?print

This will bring up a web browser showing the help for the print() function. This works when you know the precise name of the function. What if you don't? To perform a fuzzy search, that is, search where "print" appears, you can issue the following:

> ??print

This will bring up the help on all items where "print" appears. Remember, the help will be in a web browser, not in the window where the R command line is displayed.

Case Sensitivity

PL/SQL is not case-sensitive; that is, BEGIN and begin can be used interchangeably. Its variable names are not case-sensitive either; that is, Var1 and var1 are the same. R is case-sensitive. So print() is the correct syntax, not Print(). Most commands are in lowercase; so it makes sense to use everything in lowercase. Some function names are in mixed case; so pay attention to the syntax.

Comments

Comments in PL/SQL are lines starting with -- or blocks of text enclosed within /* and */, as shown below:

SQL> -- this is a comment
SQL> this is not a comment
SP2-0734: unknown command beginning "this is no..." - rest of line ignored.
SQL> /* this is a
SQL> multiline
SQL> comment */
SQL>

Comments in R start with #, as shown below.

> # this is a comment
> # this is another comment
> this is not a comment
Error: unexpected symbol in "this is"

Unlike PL/SQL, all comments are a single line. There are no multiline comments. So, what do you do if you want to create a multi line comment? One option is to place a # before each line. Painful, but possible. Another option is to use a little trick like the following:

> if (FALSE){
+ "This is
+ multiline
+ comment"
+ }

You will learn about the if statement later. But the above code is actually legal. Since (FALSE) evaluates to false, this code segment is not evaluated; hence, the strings inside the double quotes are never evaluated. The code logic is not changed.

Variables

With the formalities now out of the way, let's jump right into learning the R language. Let's start with understanding variables. In PL/SQL, when you use a variable, it must be first defined prior to being called. And you use the := notation to assign value, for example:

declare
 v1 number; -- this is the "declaration" of the variable
 begin
 v1 := 1; -- this is the assignment of the variable

R is different in this respect. Here is how you create and assign a value to the variable:

> v1 <- 1
> v1
[1] 1
> print (v1)
[1] 1

Note some important characteristics in R, which are unlike PL/SQL:

  1. There was no need to "declare" a variable beforehand. You just assign a value; the variable is created immediately.
  2. The assignment operator is <-. You can also use an equals sign (=) and <<- for assignment.
  3. There is no line-ending character, such as a semi-colon (;).

Let's visit the topic of the value assignments. The standard assignment operator in R is <-, but = also works. Both of the following mean the same thing: assign a value of 1 to v1.

> v1 <- 1

> v1 = 1

The <- operator is called "pointing to." You can think of <- as a symbolic link or a shortcut for the variable. Because the meaning is "to point to," it can point in either direction. For instance, to assign 1 to variable v1, you can write the following—something you can't do with the = operator.

> 1 -> v1

But there is a cool thing you can do in R that is not possible in PL/SQL. If you want to define two variables and assign the same value to both, you have to write something like this in PL/SQL:

declare
 v1 number;
 v2 number;
begin
 v1 := 2;
 v2 := 2;

If you want two variables to be assigned the same value in R, you can do it in one step:

> v1 <- v2 <- 2

Because the operator is directional, you can also write this:

> 2 -> v1 -> v2

This opens some interesting possibilities. You can write the directional operator in any direction. Consider the following case where you assign values to two variables, v1 and v3, from another variable, v2:

> v1 <- v2 -> v3
> v3
[1] 2

Another way to assign the value is to use the assign function. The function takes two arguments: the first one is the variable name. It's the name of the variable; not the variable itself. So, you have to enclose it within double quotes.

> assign("v1",1)
> v1
[1] 1

Strings

So far, we've talked about numbers. Character variables are defined by enclosing the values in single or double quotes. Note how the values are enclosed in double quotes even if you enclosed them in single quotes.

> v1 <- 'This is a char'
> v1
[1] "This is a char"
> print(v1)
[1] "This is a char"
> v1 <- "This is a char"
> v1
[1] "This is a char"

What if you want to have a single quote inside the value, for example, "Joe's Pizza"? In PL/SQL, you would use the single quote as an escape character:

'Joe''s Pizza'

In R, you can use double quotes as enclosure and put a single quote inside without a problem:

> v1 <- "Joe's Pizza"
> v1
[1] "Joe's Pizza"

However, if you want to put the character inside single quotes, the single quote inside must be escaped; otherwise, the code will throw an error.

> v1 <- 'Joe's Pizza'
Error: unexpected symbol in "v1 <- 'Joe's"
> v1 <- 'Joe\'s Pizza'
> v1
[1] "Joe's Pizza"

A huge problem in not declaring the variable before assigning a value is that you don't know the data type of that variable later. For instance, if you use v1 <- 1, then v1 is a numeric; but if you use v1 <- "c", then v1 is a character. Actually that's not that surprising. PL/SQL is a strongly typed language; that is, you have to declare the data type of the variable and once declared, you can't change it. In R, you don't declare anything at all; so there is no such thing called the data type of the variable. Instead the variable assumes the data type of whatever object it holds at a point in time. So when you issue this:

v1 <- 1

v1 is a number; but when in the next line you issue this:

v1 <- "c"

v1 is a character.

The data type keeps on changing. That's why R is called a dynamically typed language. But that could pose a problem. Because R has many types of variables, such as vectors, factors, data frames, and so on (you will learn about them later), it's important to know the data type in order to properly use them. To find out what the data type of a variable is, use the class() function.

> class(v1) 
[1] "character"

Similarly, you can define logical variables (TRUE or FALSE). Note, you must use uppercase, that is, TRUE not true. You can also use T and F.

> v1 <- TRUE
> v1
[1] TRUE
> print(v1)
[1] TRUE
> class(v1)
[1] "logical"

By the way, the variable you defined first, assigning the value 1, was of the type numeric.

> v1 = 1
> class(v1)
[1] "numeric"

Numeric data types are equivalent to floating point numbers, that is, non-integers. To define an integer, put the letter "L" after the value. Note that, unlike in the Python language, you can't use a lowercase "l"; you must use uppercase "L" for good reasons. Otherwise, it might be confused with the number 1.

> v1 = 1L
> class(v1)
[1] "integer"

Naming Variables

Variable names contain, of course, letters and numbers, and two special characters: _ (underscore) and a period (dot). Be aware of three special rules. Variable names

  1. Cannot start with a number or an underscore; for example, _myVar and 5thVar are invalid
  2. Can start with a dot, but that has a special implication you will learn in the next section
  3. Can start with a dot, but the next character can't be a number; for example, .5thVar is invalid

The special character # cannot be used in a variable name. But there is an interesting side effect. If you use #, R will not produce an error. R will interpret it and the rest of the line as comments. Here is a demo:

> v1#1 = 1
[1] 1
> v1#2 = 1
[1] 1

In the first line, all characters after the # characters are considered comments. So, all R understood was to print the value of variable v1. Because the variable was defined earlier, the command was successful. The same happened to the second line as well. So, if your intention was to declare a variable named v1#1 or v1#2, it wasn't successful. You can examine the effect further by issuing the following:

> v2#1
Error: object 'v2' not found

This time, R looked for a variable named v2; but it failed because v2 had not been defined.

Because you don't declare variables before using them, you might lose track of what variables you have created, which could be problematic when you write R expressions. Variables are also known as objects in R. When you define objects, some of which can be variables, the collection of all such objects is called a workspace. To find the objects defined in the current workspace, you can use the objects() function.

First, let's define a few variables.

> v1.1 = 1
> v1.2 = 2
> v1.3 = 3
> objects()

[1] "v1"   "v1.1" "v1.2" "v1.3"

You can also use the ls() function to display all the variables defined in the workspace.

Now, remember, you can use "." to begin a variable name. But it has an interesting repercussion. Let's see by first declaring a variable named .v2.

> .v2 <- 1

You can reference the variable like any other variable:

> .v2
[1] 1

However, if you use the ls() function, this variable will not be listed.

> ls()
[1] "v1"   "v1.1" "v1.2" "v1.3"

Note: The variable .v2 is not listed above. To display this variable, you will need to set a special parameter, all.names, in the function ls() to TRUE.

> ls(all.names=TRUE)
[1] ".Random.seed" ".v2"          "v1"           "v1.1"         "v1.2"         "v1.3" 

The variable .v2 is now displayed.

To save memory space, you can remove a variable by using the rm() function.

> rm (v1)
> ls()
[1] "v1.1" "v1.2" "v1.3" "v1_1"

Operations on Variables

One of the useful functions, paste(), is needed when you want to concatenate multiple values to one:

> v1 <- 'This'
> v2 <- 'is'
> v3 <- 'a'
> v4 <- 'Test'
> paste(v1,v2,v3,v4)
[1] "This is a Test"

If you want to use a different character than a space, you would use the sep parameter. Here is how to use "_" as a separator parameter.

> paste(v1,v2,v3,v4,sep='_')
[1] "This_is_a_Test"

If you want to just paste the values without anything in between, just use '' (with no space in between two single quotes) for the separator.

Case Conversion

In PL/SQL, you use UPPER() and LOWER() to convert the case of strings. In R, you use toupper() and tolower(), respectively. Here is the PL/SQL code:

SQL> declare
2 v1 varchar2(200);
3 begin
4 v1 := 'This';
5 dbms_output.put_line(upper(v1));
6 dbms_output.put_line(lower(v1));
7 end;
8 /

THIS

this

Here is the equivalent R code:

> v1 <- "This"
> toupper(v1)
[1] "THIS"
> tolower(v1)
[1] "this"

Substringing

When you want to choose specific characters from the string, in PL/SQL you use the SUBSTR() function. The R equivalent is substring(). In PL/SQL, you would write something like this:

SQL> declare
2 v1 varchar2(200);
3 begin
4 v1 := 'This';
5 dbms_output.put_line(substr(v1,2,2));
6 end;
7 /
 hi

In R, you'd write this:

> substring(v1,2,2)
[1] "h"

Character Conversion and Formatting

When you display information, you would want to do so in a reasonably readable format. In PL/SQL, you would use TO_CHAR(), LPAD(), and RPAD() functions to convert to a specific string format and then pad with spaces. Here is an example:

declare
  v1 number;
begin
  v1 := 100.12345;
  dbms_output.put_line('v1='||to_char(v1,'9999999999.999'));
end;
/

Output:

v1= 100.123

In R, you would use the simple format() function:

> v1 <- 100.12345
> format(v1, digits=10)
[1] "100.12345"

Note that the value is not padded with spaces, as the case was with PL/SQL. To pad, you need to use the optional parameter width and set it to the number of digits to display. R will pad the unavailable digits with space:

> format(v1, digits=10, width=13)
[1] " 100.12345"

Conversion Functions

The TO_XXX functions in PL/SQL, such as TO_CHAR and TO_DATE, convert one type to other. R also has similar functions. Let's see them compared with PL/SQL.

PL/SQL Function Equivalent R Function
TO_CHAR as.character()

Example:

> v1 = 100.12345
> as.character(v1)
[1] "100.12345"
TO_NUMBER as.numeric

Example:

> v1 <- "12345.12345"
> as.numeric(v1)
[1] 12345.12
TO_DATE as.date

Example:

The default date format in R is YYYY-MM-DD. So when passed a string in that format, you get this:

> v1 <- "1975-12-11"
> as.Date(v1)
[1] "1975-12-11"

If the format is different, for example, mm/dd/yyyy, you have to use the format parameter:

> v1 <- "12/11/1975"
> as.Date(v1, format="%m/%d/%Y")
[1] "1975-12-11"

Be careful about the case of the year parameter in format strings. You'll get different results if it is changed from Y to y:

> as.Date(v1, format="%m/%d/%y")
[1] "2019-12-11"

Operators

You already learned about the assignment operators: =, <-, and ->. There are other operators that are normally the same as in PL/SQL, such as the following:

Operator Description
+ Addition
- Subtraction
/ Division
* Multiplication
^ or ** Raise to the specified power; for example, 3^2 returns 9
> Greater than
< Less than
<= Less than or equal to
>= Greater than or equal to
== Is equal to
!= Is not equal to

There is no point explaining each operator; they are pretty self-explanatory. Let's see some examples with relational operators:

> v1 <- 100
> v2 <- 200
> v1 <= v2
[1] TRUE
> v1 != v2
[1] TRUE
> v1 == v2
[1] FALSE

The PL/SQL operator <> (not equal to) is not present in R; only != is allowed. The important thing here to note here is that you don't have to explicitly print the results of the operations. The results are implicitly stored in the workspace. If you are in the R command-line interface, the results are printed automatically; but that's not very useful in real R programming. You probably want to store the results in some variables to be processed later. You can easily do that. Here is an example where you assign values 1 and 2 to variables v1 and v2 respectively, check if v2 is greater than v1, and store the result, which is boolean, in the variable v3:

> v1 <- 1
> v2 <- 2
> v3 <- v2 > v1
> v3
[1] TRUE

You will learn about other types of variables, especially collections, which are fairly common in a data oriented language such as R. Let's see some other operators you will be using a lot:

Operator Description
& Logical AND, where you evaluate two boolean variables. If they are both TRUE, it returns TRUE; otherwise, it returns FALSE.

Example:

> v1 = TRUE
> v2 = FALSE
> v1 & v2
[1] FALSE

Remember, the values can be made to be boolean. A value of 0 is considered FALSE and so is any non-zero value, including negative numbers.

Example:

> v3 = 0
> v4 = 1
> v3 & v4
[1] FALSE
| Logical OR, where two boolean variables are evaluated. The result is TRUE if at least one of them is TRUE. Let's see the result using the same two variables we used before.

Example:

> v1 | v2
[1] TRUE

Example of testing with numbers coerced into logical values:

> v3 = 0
> v4 = 1
> v3 | v4
[1] TRUE

This is just touching the tip of the iceberg. In the subsequent articles in this series, you will learn a lot about other types of variables and operations for those.

Read a File and Save the Output

Sometimes you may have the commands available in a file, which is called a script file. In the SQL*Plus environment, you can call the script file as follows:

SQL> @scriptfile.name

This will execute the script file.

In R, the function you'd use is source(), which executes a script file in the R environment. Here are the contents of a file named f1.rscript:

n1 <- 1
n2 <- 2
if (n1<n2) {
  print("Yes; n1 is less than n2")
} else {
  print ("No")
}

Here is how you call the script file. The script file gets executed and then results are printed immediately afterwards.

> source("f1.rscript")
[1] "Yes; n1 is less than n2"

Similarly, if you want to save the output, you use the SPOOL filename command in the SQL*Plus environment. In R, you would use the sink() command and you would pass a parameter as the filename:

> sink("rout.txt")

Run the source function again:

> source("f1.rscript")

If you check the directory, the file will be there.

Summary

Here is a quick summary for you to recap what you learned in this article. I have explained the components in both PL/SQL and R for you to follow along well.

Element PL/SQL R
Case sensitivity Case-insensitive, that is, keywords, function and variables can be written in any case. For instance, BEGIN and begin are the same. Variables v1 and V1 are the same. Case-sensitive, as in UNIX systems. Function print() is correct; PRINT() is not correct. Variable v1 and V1 are different variables.
Comment Single-line comment: starts with --
Multiple lines are commented within /* and */
Only single-line comments are allowed and they starts with #
End of line Code lines terminate with a semicolon (;) , for example:

dbms_output.put_line('This is an output');
No terminating character. End of code line is marked with a newline character.
Variable declaration Variables need to be declared with data types and precision, for example:

declare
   v1 number(10);
   v2 varchar2(200);
Variables are not declared. They are created when they are assigned a value.
Variable naming Variable names have to start with a character, may contain numbers and _ and # as symbols. The "." in an expression usually means a subelement, for example:

EMP.DEPT_NO means DEPT_NO column of EMP table; PKG1.PROC1

means PROC1 procedure of package PKG1; PKG1.VAR1 means a variable VAR1 in package PKG1, and so on.
Variable names can contain any alphanumeric characters and two special characters: "." and "_"

They can't start with a number or with "_"

If the first character is "." the next character can't be a number
Variable assignment operator The assignment operator is :=

Example:

begin
   v1 := 1;
   v2 := 'Char';
The assignment operator can be any of these three operators:

=
<- (it can point in either direction, so -> is also valid)
<<-

Example:

v1 <- 1
1 -> v1
Variable data type stickiness Once a variable is declared, the data type and precision is frozen; it can't be changed. For instance (in the example above), v1 is a number and v2 is a varchar2. If you use the following, you will get a syntax error because v1 is declared as a number:

v1 := 'Char'
Because variables are not declared, the concept of data type is fluid for variables. When a variable is assigned, the data type is set to that type; but it can change when a new data type is assigned. For instance:

v1 <- 1  # here v1 is a number
v1 <- 'Char'  # now v1 is a character
Determining data type Not needed, because the declaration section clearly states the type Use the class() function:
class(variablename)
Multiple variable assignment If you want to assign the same value to multiple variables, you have to do it one by one:

v1 := 100;
v2 := 100;
Multiple variables can be assigned in one code line:

v1 <- v2 <- 100


Or, even like this:

v1 <- 100 -> v2
Displaying variables in the environment Not needed because the variables are explicitly declared ls() shows all variables except the ones starting with a period.

ls(all=TRUE) shows all variables.
Remove a variable defined already Not possible
rm(variableName)
Integer variables NUMBER is a generic type. PLS_INTEGER is generally faster if you can use it. Because there is no declaration, there is no way to specify a variable of a specific data type. Appending "L" to the number makes the variable an integer:

v1 <- 1L
Writing strings Enclosed in single quotes, for example:

v1 := 'This is a string';
Enclosed in either single or double quotes, for example:

v1 <- "This is a string"


OR

v1 <- 'This is a string'


You can't mix the two types of quotes in the same value, for example, "This is a string' is not valid. However, using a different type of quote inside another type is allowed, for example:

v1 <- "Joe's Pizza"


OR

v1 <- 'Joe "Special" Pizza'
String quote escape To use a single quote inside a string, use the single quote as an escape, for example:

  v1 := 'Joe''s Pizza';
The escape character is \, for example:

v1 <- 'Joe\'s Pizza'


OR

v1 <- "Joe \"Special\" Pizza"
Display something
dbms_output.put_line('....');
print()
paste()
Combining strings Use the || operator. Use the paste() function, for example:

paste(v1,v2,v3,v4,sep=',')
Case conversion Use the UPPER() and LOWER() functions. Use the toupper() and tolower() functions.
Substringing Use the SUBSTR() function. Use the substring() function.
Formatting
TO_CHAR(n, '999999999.99')
format(n, digits=x)
Conversion functions TO_CHAR, TO_DATE, and TO_NUMBER as.character(), as.numeric(), and as.Date()
Operators Most operators are the same. Important different ones include:

= is for equality comparisons
!= and <> for are for non-equality comparisons
== is for equality comparisons
Only != is for non-equality comparisons
Logical value assignments
declare
   v1 boolean;
   v2 boolean;
begin
   v1 := true;
   v2 := false;
TRUE (only uppercase; lowercase true is not valid), T or any non-zero number indicates a true logical value. FALSE, F, or 0 indicates false. For example, all of the following indicate the value of v1 is true.
v1 <- 1
v1 <- T
v1 <- TRUE
Null value NULL, for example:

v1 := NULL;
NA, for example:

  v1 <- NA


But this makes the variable a logical variable.
AND operator AND, for example:

WHERE x = 1 and y = 2
&, for example:

x==1 & y==2
OR operator OR, for example:

WHERE x = 1 or y = 2
|, for example:

x == 1 | y == 2
Read a file at the command prompt
SQL> @file.sql
> source("filename")
Redirect output to a file
SQL> spool out.txt
> sink("out.txt")

Quiz

Let's test your understanding with these simple questions.

Questions

1. What will be result of the following? If this will result in a syntax error, mention that as well.

v1 <- 2
v2 <- v1 ** 5v1 <- 2
v2 <- v1 ** 5
print(v2)

2. What will be the result of the following? If this will result in a syntax error, mention that as well.

2 -> v1
v2 <- v1 ^ 5
print(v2)

3. What will be the result of the following? If this will result in a syntax error, mention that as well.

v1 -> 2
v2 <- v1 ^ 5
print(v2)

4. Which of the following variable names are invalid in R?

  1. v1
    
  2. v1.1
    
  3. v1_1
    
  4. .v1
    
  5. 1v
    
  6. _v1
    
  7. v#1
    
  8. .1v
    

5. A script has the following contents:

v1.1 <- 1
v1.2 <- 2
.v1.3 <- 3

You ran the script and then issued the ls() function. What will be the output?

6. A script file has the following contents:

student <- "Andy"
student#1 <- "Billy"
student#2 <- "Charlie"

You ran the script in R and then ran the ls() command to see the variables. How many variables will be you see? Or, will the script fail with a syntax error?

7. What will be the result of the following? Mention if it will result in a syntax error.

v1 = 1
v2 = FALSE
v1 | v2

8. What will be output of the following code? Indicate if it will raise an error:

> l1 <- 1
> l2 <- 2
> l3 <- 3
> l1 < l2 -> l3
> l3 + 1

9. What is the purpose of the code below? Does the last line redirect the output to l3? Or, will it fail with a syntax error?

> l1 <- 1
> l2 <- 2
> l3 <- 3
> l1 <- l2 > l3

10. What is the output of the following code:

> v1 <- NA
> v2 <- 1
> v1 < v2

Answers

1. This is 2 raised to the power 5, which is 32.

2. This will not produce an error. 2 -> v1 is perfectly valid. It's the same as v1 <- 2 or v1 = 2. So the output will be 32.

3. This will result in a syntax error. Note the line v1 -> 2, which assigns v1 (a constant) to 2 and, hence, is not possible. If the intention was to assign 2 to v1, the correct expression is v1 < 2 or 2 -> v1.

4. Answer "e" is invalid because it starts with a number. Answer "f" is invalid because it starts with _. Answer "h" is invalid because it starts with a "." which is allowed unless it is immediately followed by a number.

5. The output will show only v1.1 and v1.2. The third variable name starts with "."; so it will not be shown. To show it, use ls(all=T).

6. You will not see any syntax error; but you will see only one variable named student. Why? It's because student#1 is not really a variable. Everything after # will be considered a comment and therefore will be ignored. The only variable will be student.

7. The syntax is fine. v1 = 1 will be numeric. However in the third line, when you use the | operator (which is "or"), v1 will be coerced into a logical value. Because any non-zero number will be TRUE, v1 will be TRUE. Hence v1 | v2 will be evaluated as TRUE ("TRUE or FALSE" yields TRUE) and that will be the output.

8. The result will be 2. Variable l1 is less than l2. So, l1 < l2 will yield TRUE, which is assigned as a value to variable l3. When you add 1 to l3, l3 is coerced into 1 (remember, 0 equals false and 1 equals true). So l3 + 1 will yield 2.

9. No; it will not produce a syntax error. The last line merely checks whether l2 is greater than l3 and stores the result in l1. Because 2 > 3 will yield false, l1 will hold the value FALSE. You can check that by printing l1 at the prompt.

10. We are comparing NA (the equivalent of NULL in PL/SQL) to a number value. They can't be compared. So the result will be NA.

About the Author

Arup Nanda (arup@proligence.com) has been an Oracle DBA since 1993, handling all aspects of database administration, from performance tuning to security and disaster recovery. He was Oracle Magazine's DBA of the Year in 2003 and received an Oracle Excellence Award for Technologist of the Year in 2012.