JSON, Part 1: Store semi-structured documents in Oracle Database with SQL

Leverage the JSON data interchange standard and the SQL language to simplify information storage and retrieval.

By Arup Nanda | April 2021


[Due to recent interest in JSON—including the release of Oracle Autonomous JSON Database—we have updated this two-part series, first published in Oracle Magazine in 2015. —Ed.]

Recently, Acme Bank has been accepting transactions from business partners such as convenience stores and third-party billing companies, including partners outside the country. Those transactions contain different fields but conform to the JavaScript Object Notation (JSON) data interchange standard for semi-structured documents.

Mark, the chief technology officer, has been cornered by two colleagues who are concerned about the use of JSON.

JSON enables any set of data to be transmitted immediately, without a predetermined format expected by a relational database, and this makes it attractive for integrating outside transactions quickly. Acme’s partners can send any pertinent data they want without first waiting for a mutually agreeable format. However, even though Acme allows semi-structured data to come in as JSON, the data is stored in a structured manner in the database, in a relational format.

The two colleagues lay out their concerns to Mark:

Dave, the lead developer, and his team spend a lot of time accepting, parsing, and deciphering the JSON documents before storing them in the database. The JSON format enables any data to be included, but if there is no corresponding database column (or table), the data can’t be accepted into Acme’s systems.

Debbie, the lead database administrator, has to alter the database structures quickly and often to accommodate the new data. It’s not OK with her that fast and frequent responses are required to make things work, and she wants Dave to stop sending her a continuous stream of data management projects that use semi-structured documents and ever-changing data fields.

But Dave has no choice. Acme must accept semi-structured data from its partners to be competitive in the marketplace, and the JSON format enables the company to develop a repository for valuable but loosely structured data. This repository must be flexible enough to store any type of data coming in while being easy to query with a language everyone understands: SQL.

So, Dave explains, JSON is here to stay—and its use will grow rapidly at Acme.

Altering existing database structures takes time, and if a specific data item in an incoming JSON document has no corresponding column in the Acme database, that transaction has to wait until the column is created. This means an interruption of the normal business of the company. Dave wants new database columns to somehow be created on the fly, although he is fully aware that Debbie insists it’s impossible.

In short, everyone is unhappy with the present arrangement. And they all turn to Mark, who brought in JSON as a transaction data exchange format in the first place.

There is a very easy solution, responds Mark: Store JSON documents directly in the database without any parsing, preprocessing, or prior alteration of database structures. The solution delivers the flexibility of JSON for semi-structured data and the power, reliability, and familiarity of Oracle Database—with no special actions required of the DBAs. Dave and Debbie, intrigued, press Mark to explain.

What is JSON?

JSON is a text-based standard in which any kind of data can be included, along with a descriptor. The descriptor for the data is called a key, and the actual data is called a value. A collection of related key-value pairs is placed in a single JSON document. Any type of data can be represented as key-value pairs.
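To make the key-value idea concrete, here is a minimal Python sketch that uses the standard `json` module. The two documents, and all of their key names and values, are hypothetical and invented for illustration; they are not Acme's actual transaction formats.

```python
import json

# Two hypothetical partner transactions. The keys and values are
# invented for illustration and differ between the two documents.
store_txn = '{"TransId": 101, "StoreName": "Corner Mart", "Amount": 25.5}'
billing_txn = '{"TransId": 102, "BillerCode": "UTIL-7", "Amount": 80, "AutoPay": true}'


def describe(document_text):
    """Parse one JSON document and return its keys in sorted order."""
    parsed = json.loads(document_text)
    return sorted(parsed.keys())


store_keys = describe(store_txn)
billing_keys = describe(billing_txn)

# Each document carries its own descriptors (keys), so no shared,
# predefined format is needed to read either one.
print(store_keys)
print(billing_keys)
```

Both documents parse with the same code even though they share only some of their keys, which is the flexibility that makes JSON attractive for partner data.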

Mark shows everyone an example of a bank transaction in a JSON document (Listing 1) and explains its most important lines in Table 1. (The line numbers are not part of the document; they are there only to aid the explanation.)

Listing 1. A check-transaction JSON document

Line	Explanation
1 and 33	JSON keys and values are enclosed in curly braces `{` and `}`.
2	The `TransId` key is a unique identifier for the transaction. Key names are enclosed in double quotation marks, and the value for a key is shown after the colon. Because this `TransId` value is a number data type, quotation marks are not needed.
3	The value for `TransDate` is a date. JSON has no date data type, so the date is written as a string and enclosed in quotation marks. (Character data, time stamps, and other values that are not numbers, booleans, or null are enclosed in quotation marks.)
10	The `CashierId` value is null, which means it is not known. Because this is an ATM transaction, a cashier is irrelevant. The JSON document could have omitted the `CashierId` key and value, but it included the key for completeness and assigned a null value.
11–14	`ATMDetails` is a key whose value is an embedded object, which in turn contains the `ATMId` and `ATMLocation` keys and values.
19–32	`CheckDetails` also contains embedded keys and values, but instead of holding values for a single activity (as `ATMDetails` does for the ATM transaction), it holds an array with one set of subkeys and values for each check deposited. There are two checks in this array, and arrays are enclosed in square brackets `[` and `]`.
23	In JSON, `true` is a boolean value (written without quotation marks).
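The rules in the table can be checked with a short Python sketch. The document below is an abbreviated, hypothetical stand-in and not Listing 1 itself: it reuses the key names mentioned in the table, but every value, along with the `CheckNumber` and `Cleared` keys, is an assumption made for illustration.

```python
import json

# Abbreviated, hypothetical document. Key names from the table are
# reused; all values, "CheckNumber", and "Cleared" are invented here.
doc_text = """
{
  "TransId": 1,
  "TransDate": "2015-01-01",
  "CashierId": null,
  "ATMDetails": {"ATMId": 301, "ATMLocation": "Main Street"},
  "CheckDetails": [
    {"CheckNumber": 101, "Cleared": true},
    {"CheckNumber": 102, "Cleared": false}
  ]
}
"""

doc = json.loads(doc_text)

# Map each top-level key to the Python type its JSON value parsed into.
value_types = {key: type(value).__name__ for key, value in doc.items()}
print(value_types)

# The array holds one embedded object per check deposited.
check_count = len(doc["CheckDetails"])
print(check_count)
```

The unquoted number parses as an integer, the quoted date stays a string, `null` becomes Python's `None`, the braces produce a nested dictionary, and the square brackets produce a list with one entry per check.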