Converged Data Architecture for App Development

Reference architectures apply specific design principles to satisfy the core requirements of modern app development. These principles allow developers to build web and mobile apps that integrate AI/ML, data-driven analytics, and messaging platforms. Event-driven architectures react to events in real time. These architecture patterns help accelerate custom app development in finance, retail, healthcare, energy, and manufacturing, and also help extend packaged Oracle applications. Most enterprise applications are data-centric, work with a variety of data, and are best developed on a converged database. Popular programming languages such as Java, Python, JavaScript, and Go are well supported in these architecture patterns. Low-code app development compresses the technology tiers needed and is a good option for some applications. Apps and database containers managed by Kubernetes can then be deployed on Oracle Cloud Infrastructure (OCI), in standalone environments, and on other major public clouds.

Web/Mobile guidance

Web and mobile applications typically contain a user-visible front end, a query generator, and a back end which does the data computation and serves the front end. In response to a user or API request, a web application interacts with the API or with persistent data stored in a database. The application must support different clients, such as browsers and mobile devices, and interact with other systems and apps by using APIs and events. The backend must be secure and scale on demand.

Build your web application as a set of microservices that can be independently tested, deployed, and owned by different application teams. Expose services as REST APIs and communicate with other microservices using a built-in event mesh for events & messages, or APIs created per microservice. Incorporate machine learning in intelligent apps with ML models through REST endpoints in the database.

Web and mobile apps need to be scalable to handle spikes in demand and operate under stress with low latency. They must be available 24/7, resilient enough to produce data when requested, and must not lose stored information.

Mobile Apps are built with front-end frameworks like React Native or SwiftUI, creating the user interface which interacts with a backend for all its data and enrichment services. The developer can focus on the workflow and the application logic, accessing the backend through well-defined APIs that are resilient, secure, and scale autonomously.

Security is of paramount importance for web apps exposed to the internet. Data encryption, TLS, DDoS protection, firewalls, and granular user and data access management are critical. Both database security and application security are important, and both are handled with passwords and keys accessed through Oracle Database Wallet and the OCI Vault service.

Design principles

Use lightweight open-source frameworks and mature programming languages

Build a mobile and web front end in JavaScript (React) or SwiftUI. For back-end data and app services, include polyglot language support (Python, Node.js, Java, PL/SQL, Go, and so on) to enable use cases and microservices written in these languages. Process data close to where it is stored.

Build apps as services that communicate over APIs

A web or mobile application often needs to interface with other business systems and services that are outside the organization. Services that are part of a web application should enable interaction and collaboration via well-defined APIs. Use Oracle REST Data Services (ORDS) to use data APIs or build new ones. Use ready-made REST endpoints through OML services and OML4Py embedded Python execution to enable machine learning. Use available API Gateways on the Oracle Cloud as the single point of entry for all clients and route API requests to the appropriate service. Configure load balancer services and ingress controllers for secure API communication between microservices.
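The single point of entry described above can be pictured as a routing table that maps request paths to back-end services. The following is a minimal sketch only: the path prefixes and service names are hypothetical, and in practice this routing is configured in an API gateway rather than written as application code.

```python
from typing import Optional

# Hypothetical route table: path prefix -> back-end service name.
ROUTES = {
    "/orders": "order-service",
    "/inventory": "inventory-service",
    "/ml/score": "scoring-service",
}


def route(path: str) -> Optional[str]:
    """Return the service for the longest matching prefix, or None."""
    matches = [
        prefix
        for prefix in ROUTES
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        return None  # Unknown paths are rejected at the gateway.
    return ROUTES[max(matches, key=len)]
```

Matching on whole path segments (rather than raw string prefixes) keeps a request such as `/ordersX` from reaching the order service by accident.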

Use fully-managed services to eliminate complexity in application development, reduce runtimes, and simplify data management

Maintaining a web or mobile application infrastructure comes with the responsibility of deploying, upgrading, patching, scaling, and securing the setup. Use managed services such as Oracle Autonomous Database and other managed services on Oracle Cloud to maximize availability and scalability, and to respond to the changing demands of web and mobile apps. Managed services ensure that the application is available 24/7/365 and protected when a failure occurs in the data center where the infrastructure is hosted. Use self-managed services only when a vendor-managed service is not available.

Keep application tier stateless

Keep the middle-tier components of the application stateless. If application state must be stored, use Oracle Autonomous Database to store application data and state for consistency, durability, and fast recovery from a single source of application state. Keeping the state in one database makes overall application recovery simpler and more efficient.
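A minimal sketch of a stateless handler follows. It uses Python's built-in sqlite3 module purely as a stand-in for the database, and the table name, column names, and `add_to_cart` function are hypothetical. The point is that the handler keeps nothing in process memory between calls, so any middle-tier instance can serve any request.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the state table (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE session_state ("
        "session_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def add_to_cart(conn: sqlite3.Connection, session_id: str, item: str) -> list:
    """Load state from the database, update it, and write it back."""
    with conn:  # Commits on success, rolls back if an exception is raised.
        row = conn.execute(
            "SELECT state FROM session_state WHERE session_id = :sid",
            {"sid": session_id},
        ).fetchone()
        state = json.loads(row[0]) if row else {"cart": []}
        state["cart"].append(item)
        conn.execute(
            "INSERT OR REPLACE INTO session_state (session_id, state) "
            "VALUES (:sid, :state)",
            {"sid": session_id, "state": json.dumps(state)},
        )
    return state["cart"]
```

Because every call starts from the stored state, restarting or replacing a middle-tier instance loses nothing.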

Use converged databases with full featured support across all data

Web and mobile applications use data in different formats, and they need to store, search, and process that data in a data store. The data might be tabular (relational), unstructured, formatted as XML and JSON, spatial, or graph. Traditionally, this disparity meant using a relational database for relational data, a document store for unstructured data, and graph databases for hierarchical linked data. But the use of multiple databases leads to operational complexity and data inconsistency. To solve this problem, use the converged Oracle Autonomous Database to store multiple types of data, index them, search them, and run unified analytics across all data.

Instrument end-to-end monitoring and tracing

A web or mobile application can contain hundreds of services, owned by different application and business teams. Unified observability with open tool interfaces is important for gaining visibility into the behavior of these inherently distributed systems. Centralize the observability solution by using the metrics, logs, and tracing exported from all layers of the application and data tiers. These services monitor the entire stack, from front end to back end, helping you find and fix issues in the application quickly, and they become the platform for continuous performance tuning.

Eliminate single point of failure through automated data replication and failure recovery

Web and mobile applications must be resilient: able to recover from failures, minimize downtime, and eliminate data loss. Redundancy helps eliminate single points of failure. Use Kubernetes to manage the resiliency of database containers and app containers. The Oracle Database Kubernetes operator is designed for this, using CI/CD pipelines that include the data tier. In Kubernetes clusters, set up node pools with a minimum of three nodes, with each node in a separate availability domain in a multi-availability domain region on OCI. In a single availability domain region on OCI, set up node pools in Kubernetes with each node in a separate fault domain. Use a single public load balancer and multiple private load balancers with multiple ingress controllers for redundancy in Kubernetes.
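The node placement rule above (one node per availability domain or fault domain) can be illustrated with a short sketch. The node names, domain labels, and function names are illustrative assumptions; actual placement is configured on the node pool, not computed in application code.

```python
from collections import Counter
from itertools import cycle


def place_nodes(node_count: int, domains: list) -> dict:
    """Assign nodes to domains round-robin so they are spread evenly."""
    if not domains:
        raise ValueError("at least one domain is required")
    return {
        f"node-{i}": domain
        for i, domain in zip(range(node_count), cycle(domains))
    }


def survives_single_domain_loss(placement: dict) -> bool:
    """True if at least one node remains after any one domain fails."""
    return len(Counter(placement.values())) >= 2
```

With three nodes and three domains, each domain hosts exactly one node, so losing a domain removes only one third of the capacity.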

Set up the Autonomous Database for maximum availability by using Autonomous Data Guard to minimize operational downtime and data loss.

Follow the principle of least privilege to ensure that users and service accounts have only the minimal privilege necessary to perform their tasks. Control who has access to the web application components, including the database, by using Cloud Identity and Access Management (IAM). Use multifactor authentication in IAM to enforce strong authentication for administrators and to restrict access to the application components and the database.
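As an illustration of least privilege with an extra multifactor check for administrative actions, here is a minimal deny-by-default sketch. The principals, resources, actions, and the `is_allowed` function are all hypothetical; real deployments express these rules as IAM policies rather than application code.

```python
# Hypothetical grants: principal -> set of (resource, action) pairs.
GRANTS = {
    "order-service": {("orders", "read"), ("orders", "write")},
    "reporting-service": {("orders", "read")},
    "db-admin": {("orders", "read"), ("database", "administer")},
}

# Actions that additionally require a completed multifactor check.
MFA_REQUIRED_ACTIONS = {"administer"}


def is_allowed(principal, resource, action, mfa_verified=False):
    """Deny by default; allow only explicitly granted pairs."""
    if (resource, action) not in GRANTS.get(principal, set()):
        return False
    if action in MFA_REQUIRED_ACTIONS and not mfa_verified:
        return False
    return True
```

Anything not listed is refused, so adding a new service grants it nothing until someone decides what it needs.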

Architecture

This converged data architecture pattern for a web or mobile application is composed of a front end and microservices, with the back end using the app and data services built into Oracle Autonomous Database. Optionally, for services not in the database, microservices in containers are managed in a Kubernetes cluster.

The following diagram illustrates this reference architecture.

[Figure: web reference architecture]

The architecture has the following components. Components can be fully managed services on Oracle Cloud Infrastructure or equivalent services in other deployment environments. For example, Oracle Autonomous Database is a managed service on OCI, and Oracle Database runs in other environments.

  • Oracle Autonomous Database
    Oracle Autonomous Database provides data services for all types of data to be stored, processed and analyzed – operational database plus data warehousing and analytics. In addition it provides application services that include a built-in Event and messaging platform (Transactional Event Queues), machine learning, REST APIs and a low code development environment. This is a full data platform for modern web/mobile apps written as microservices.
  • Container Engine for Kubernetes (OKE)
    Container Engine for Kubernetes is a fully managed, scalable, and highly available service that you can use to deploy your containerized apps to the cloud. You specify the compute resources that your apps require, and Container Engine provisions them on OCI in an existing tenancy. Container Engine uses Kubernetes to automate the deployment, scaling, and management of containerized apps across clusters of hosts.
  • Load balancer
    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.
  • Oracle Cloud Infrastructure Identity and Access Management (OCI IAM)
    OCI IAM provides robust authentication, MFA, social login, self-registration for end-users, identity management, single sign-on (SSO), and identity governance for applications.

Use Cases

This use case describes a sample mobile food delivery application called GrabDish that uses a microservices architecture with the Autonomous Database and containerized databases (a Pluggable Database in a Container Database) for each of its services. It uses the built-in messaging platform Transactional Event Queues (TEQ) for messaging, simplifying the common microservices patterns for developers. GrabDish demonstrates AI/ML services, a variety of data types, and an app programmed in multiple languages. For details, see the LiveLabs on GrabDish at http://bit.ly/simplifymicroservices

Event-Driven guidance

Messaging solutions connect application components, including existing on-premises systems, to cloud solutions. Message payloads are events: application-generated events, user inputs, data changes, or device-generated events. Messaging enables event and data transfer as part of a well-defined distributed processing pipeline, or for publication of messages and events to multiple independent downstream systems that process, enrich, and analyze data. Most modern apps built with microservices rely on an event-driven architecture. An Event Mesh allows any event produced in a system to be securely consumed, in near real time, anywhere else in the distributed system where it is needed, using one or multiple event brokers. After an event is processed, the data (payload) is stored in a data lakehouse for analytics and AI/ML model training.

Build messaging solutions that are highly available, reliable, and flexible. Use Oracle's converged database platform, cloud services, and best practices to deliver messaging and event-driven solutions based on business needs. These recommendations help minimize development integration, deployment overhead, and long-term management burden.

Messaging solutions connect application components, including your existing on-premises systems, to cloud solutions. They enable data transfer either as part of a well-defined distributed and converged processing pipeline or publish messages and events to multiple independent downstream systems that evolve independently.

These solutions should also transparently accommodate unplanned spikes in message load by buffering data and dynamically adjusting process resources. In the past, it was challenging for enterprises to deploy and manage reliable messaging solutions that meet these objectives without undue complexity and expenses. However, this implementation can now be straightforward in the cloud with messaging services designed for scale and performance.

Design principles

Use the following design principles to build your messaging applications or platform with the converged data architecture.

Build apps as services that communicate through APIs

Use either JMS or Kafka APIs for messaging with Oracle Transactional Event Queues. The use of the standard APIs provides application portability and lets you seamlessly build hybrid and multi-cloud messaging applications.

Use fully-managed services to eliminate complexity across application development, runtimes, and data management

Run applications on fully-managed services with built-in infrastructure maintenance and security patching. You can leverage scaling automation in response to changing loads. Use Oracle Autonomous Database, a fully managed Oracle Database service on OCI. Oracle Transactional Event Queues is a built-in feature of the database and is available in all OCI regions.

Use converged databases with fully-featured support across all data

Use Oracle Autonomous Database, which natively supports different types of data: JSON, relational, graph, spatial, and so on. Use database functionality to simplify application logic. For example, use SQL for queries, joins, and analysis. Use transactions to guarantee consistency and isolation, and use built-in machine learning algorithms and analytics to avoid unnecessary data transfers. Use the database’s security features and access control to protect sensitive data, and use replication to improve the availability, scalability, and resiliency of your app.

Instrument end-to-end monitoring and tracing

A messaging application can contain hundreds of services, owned by different application and business teams. Unified observability with open tool interfaces is important for gaining visibility into the behavior of these complex distributed systems. Instead of each team building their own solution, centralize the metrics, logs, and tracing exported from all layers of the application. TEQ metrics can be exported into Prometheus, which supports Grafana dashboards for debugging and performance tuning workflows.
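To make the metrics export concrete, the sketch below renders a gauge in the Prometheus text exposition format. The metric name `teq_queue_depth`, its label, and the `render_gauge` helper are assumptions for illustration and not names exported by any Oracle component. In practice you would use an existing exporter or Prometheus client library rather than format the text by hand.

```python
def render_gauge(name: str, help_text: str, samples: list) -> str:
    """Render (labels, value) samples as Prometheus text exposition lines.

    Label values are emitted as-is; this sketch does not escape quotes,
    backslashes, or newlines, which a real exporter must do.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{val}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"
```

A scrape endpoint that returns this text lets one central Prometheus instance collect the same metric from every team's service.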

Eliminate single points of failure through horizontal scaling and automated failure recovery

Since TEQ is a part of the Oracle Autonomous Database, applications benefit from the built-in High Availability and cross-region Disaster Recovery capabilities without any additional work.

TEQ’s transactional messaging also simplifies recovery from external failures.

Implement a defense-in-depth approach to secure the app lifecycle

Implement Identity and Access Management (IAM) policies to allow only authorized users to create, send, or receive data from the streams. Apply a principle of minimum reachability to the endpoint by securing access to messaging endpoints with mTLS and a service gateway, which limits access from the Internet. Encrypt data at rest and in transit to achieve data confidentiality. Use Database Wallet to secure the credentials of connections to the database.

Architecture

The converged database architecture provides a design to accomplish messaging patterns in modern applications. This pattern uses Transactional Event Queues (TEQ), which is built into the Oracle Autonomous Database.

This architecture provides simplicity by eliminating the need to utilize external streaming or queueing services and provides transactional messaging capabilities that simplify common microservices patterns.

[Figure: converged database architecture]

Application architects should consider this architecture’s scalability, performance, and simplicity:

  • Use Transactional Event Queues (TEQ) in Oracle Database for Asynchronous Messaging
    The converged Oracle Database combines data and message processing in a scalable infrastructure, which simplifies life cycle management and security for all data and messages in the database. Data is processed closer to the data store, which provides better performance, higher security, and easier upgrades and maintenance when compared to managing a separate messaging infrastructure.

    Transactional Event Queues in the database simplifies the building of microservices by providing transactions across messaging and database operations. The Transactional Outbox pattern is implicitly supported without additional code and, together with exactly-once messaging, makes applications simpler to write. Network or server failures are easily recoverable with transaction rollbacks.
  • Oracle TEQ high throughput messaging can scale up to 1 Trillion messages per day
    For the most demanding messaging workloads, Oracle TEQ provides high throughput performance with an in-memory accelerator cache used for enqueue and dequeue operations. TEQ supports both small messages that are typical in event processing as well as larger payloads associated with business workflows. TEQ supports larger messages by separating payload from metadata management in the message cache.
  • Oracle TEQ also functions as an event mesh
    In addition to messaging, TEQ also interoperates with Kafka as a secure Event Mesh, transporting the right event to the right application in real time for enterprise applications built on the cloud. Event transformation is supported with a built-in rules engine, and event processing with callback mechanisms available at both enqueuing and dequeuing of messages. These callback mechanisms can execute Java, PL/SQL, or OCI Functions.

    TEQ can use Kafka Connectors to interoperate with a multitude of event producer and consumer types. This establishes an Event Mesh and is the backbone for a scalable microservices application development environment across the enterprise.
  • Oracle TEQ brings the best of JMS messaging and Kafka-like Pub/Sub in a single platform
    Implement the best of JMS messaging and Pub/Sub streaming in a single messaging system with Oracle TEQ. Oracle TEQ is partitioned similarly to Kafka’s Topics/Partitions, and is the modern messaging engine of choice to be used with the converged Oracle Database for the most demanding data-intensive, event-driven applications that use the messaging pattern for event exchange.
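The transactional messaging idea in the first bullet above (a data change and its event commit or roll back together) can be sketched as follows. This is not TEQ code: it uses Python's built-in sqlite3 module with an ordinary table standing in for the queue, and the table names and `place_order` function are hypothetical. It only illustrates why a shared transaction removes the need for a separate outbox relay.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """In-memory stand-in: one table for data, one acting as the queue."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE order_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def place_order(conn, order_id: int, item: str, fail: bool = False) -> None:
    """Write the order and enqueue its event in ONE transaction."""
    with conn:  # Commits both writes, or rolls back both on an exception.
        conn.execute(
            "INSERT INTO orders (id, item) VALUES (:id, :item)",
            {"id": order_id, "item": item},
        )
        conn.execute(
            "INSERT INTO order_events (payload) VALUES (:payload)",
            {"payload": f"order-placed:{order_id}"},
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


def counts(conn) -> tuple:
    """Return (number of orders, number of queued events)."""
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    events = conn.execute("SELECT COUNT(*) FROM order_events").fetchone()[0]
    return orders, events
```

When the simulated failure is raised, neither the order row nor the event survives, so consumers never see an event for an order that does not exist.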

Use Cases

FedEx uses Oracle E-Business Suite and the business event manager built with Oracle TEQ for the accounts receivable of the 15.5 million packages delivered every day.

  • FedEx has moved E-Business Suite to Oracle Cloud Infrastructure using Exadata Cloud Service. E-Business Suite workflows and business event system are entirely built on Oracle Advanced Queuing (AQ) messaging.
  • Oracle AQ is simple to use and for high performance queues, Oracle TEQ is the high performance drop-in replacement with multiple event streams per queue.

Several 2-factor authentication (2-FA) scenarios that use one-time passwords (OTP) are enabled by Oracle Transactional Event Queues:

  • 2-FA is used to validate package delivery in the retail industry worldwide. Messages are sent from the merchant app to the customer app and the delivery person’s app to be authenticated.
  • 2-FA is used to validate identity in cash withdrawals from ATMs. In this case the ATM app, the banking app, and the customer’s mobile are exchanging messages for authentication.
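The one-time password step in these scenarios can be sketched as below. The `OtpStore` class, its in-memory dictionary, and the six-digit code length are assumptions made for illustration; the source does not describe how codes are generated, and a production system would also add expiry and rate limiting, and would deliver the code through the messaging layer.

```python
import hmac
import secrets


class OtpStore:
    """In-memory stand-in for the component that issues and checks OTPs."""

    def __init__(self) -> None:
        self._pending = {}

    def issue(self, transaction_id: str) -> str:
        """Create a random six-digit code for one transaction."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[transaction_id] = code
        return code

    def verify(self, transaction_id: str, code: str) -> bool:
        """Check a code once; any attempt consumes the pending code."""
        expected = self._pending.pop(transaction_id, None)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(expected.encode(), code.encode())
```

Consuming the code on every attempt, including a wrong one, is a deliberate choice in this sketch: it makes each code strictly single use at the cost of requiring a new code after a typo.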

Low code guidance

Low-code platforms are well suited for building opportunistic applications in collaboration with business stakeholders; building data reporting and analysis apps; extending SaaS apps; and modernizing legacy applications. Every line of code has a cost associated with it—to author it, maintain it, debug it, upgrade it, and secure it. Oracle Application Express (APEX) helps developers avoid these costs by providing high-level components and common design patterns through an intuitive and graphical development experience.

Low-code platforms enable you to build enterprise apps faster than with traditional hand-coding.

With low-code platforms, you are free to focus on solving your business problem rather than on the complexity of developing web applications. These complexities include security, accessibility, efficient data access, performance, and globalization. Low-code platforms eliminate this complexity by dramatically reducing the amount of code you have to maintain.

Oracle Application Express (APEX) helps developers avoid the costs associated with traditional app development by providing you with high-level components and common design patterns through an intuitive graphical development experience.

Oracle Cloud Infrastructure (OCI) provides the secure, reliable, scalable, and highly performing infrastructure required for the most demanding applications. These applications can be scaled to support anything from small workgroups to millions of end users. This document describes the design principles and optimal implementation path for architecting a low-code application.

Design principles

When implementing a low-code pattern, use the following Modern App Development design principles:

Use fully managed services to eliminate complexity across application development, runtimes and data management

Adopt a metadata-driven, low-code approach when developing apps. Specify application logic declaratively where possible, and write code only where necessary. Interact directly with data in the database by using SQL. Use fully-managed services, such as Oracle Autonomous Database and Oracle APEX Application Development (APEX Service), that can maximize availability and scalability to handle the changing demands of your low-code apps. In addition, database capabilities such as Oracle Real Application Clusters (RAC) and Oracle Data Guard ensure that your low-code apps are available 24/7/365 and can fail over if a failure occurs in the data center where the infrastructure is hosted.

Automate build, test, and deployment

Use the OCI Resource Manager to automate the provisioning of Oracle Autonomous Databases and APEX environments. Use Oracle SQL Developer Command Line (SQLcl) with Liquibase to automate the deployment of data model changes. Use APEX one-click application deployment to manually deploy changes between environments.

Keep application tier stateless

Oracle APEX is stateless and serverless: the runtime application state is stored in tables, so connections can be reused across users. As a result, far fewer connections are needed to support high user concurrency.

Use converged databases with full featured support across all your data

Low-code applications commonly need to work with data in different formats, such as structured (relational), unstructured (XML/JSON documents), and spatial. Because APEX is embedded in the Autonomous Database, you can use SQL, PL/SQL, and server-side JavaScript to work across all these data formats. Also, because of the unique architecture of APEX, apps enjoy zero-latency data access, which provides optimal performance.

Instrument end-to-end monitoring and tracing

Monitor and trace APEX application activity by using the built-in Activity Monitoring capabilities, which include detailed tracing and debug information on a per-user level. Review workspace-level and instance-level activity from APEX Administration Services. Use Performance Hub to monitor database activity, and review Automatic Workload Repository (AWR) reports to identify top resource consumers and tuning recommendations.

Eliminate single point of failure through automated data replication and failure recovery

APEX on Oracle Autonomous Database is deployed using a highly available architecture that encompasses the data tier (Exadata and RAC) and the middle tier (redundant Oracle REST Data Services nodes). Use Autonomous Data Guard to further increase the availability of your apps and to protect against availability domain failures.

Implement a defense-in-depth approach to secure the app lifecycle

Use OCI Identity and Access Management (IAM) for authentication schemes for your APEX applications. Assign authorization schemes to APEX apps and app components to enforce access control based on user role or privilege. Use the built-in declarative capabilities of APEX to handle Session State Protection (SSP) and item-level encryption to protect your apps and data. Use bind variables in SQL queries to protect against SQL injection. Configure app-appropriate timeouts to ensure that inactive sessions are automatically terminated. Run the built-in APEX Advisor to detect potential security concerns such as unprotected pages, items, and buttons. Use declarative escaping and programmatic escaping APIs to guard against cross-site scripting (XSS).
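The bind variable and output escaping advice can be illustrated outside APEX with a short sketch. It uses Python's built-in sqlite3 and html modules as stand-ins, and the `users` table and function names are hypothetical. The `:name` placeholder style shown here resembles the named bind variables used in Oracle SQL, but this is not APEX code.

```python
import html
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table to query (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (:id, :name)",
        [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}],
    )
    conn.commit()
    return conn


def find_user(conn: sqlite3.Connection, name: str) -> list:
    """Pass user input as a bind variable, never by string concatenation."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = :name", {"name": name}
    ).fetchall()


def render_greeting(name: str) -> str:
    """Escape user-supplied text before placing it in HTML output."""
    return f"<p>Hello, {html.escape(name)}</p>"
```

Because the input is bound as data, a value such as `x' OR '1'='1` is compared literally against the name column and matches no rows.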

Architecture

This architecture is for low-code app development and deployment using Oracle APEX Application Development (APEX Service) and all Oracle Autonomous Database services. By deploying these services, all the components necessary for the full stack are automatically provisioned and fully managed. These components include gateways, load balancers, and Oracle REST Data Services.

[Figure: low-code architecture]

This image shows the architecture underlying low-code app development and deployment using Oracle APEX Application Development and all Oracle Autonomous Database services. It shows a fully-managed environment containing a public subnet and a private subnet. Outside the fully-managed environment are these services: APEX Applications, Oracle REST Data Services APIs, and External REST APIs. These services access the fully-managed environment through either an Internet gateway or a NAT gateway.

The public subnet contains a load balancer while the private subnet contains an Oracle REST Data Services instance and an Oracle Autonomous Database, upon which is an instance of Oracle APEX on Autonomous Database.

Traffic from the APEX Applications and the Oracle REST Data Services APIs is routed through an Internet gateway to the load balancer, which directs it bidirectionally to the Oracle REST Data Services instance in the private subnet. This service, in turn, communicates bidirectionally with Oracle Autonomous Database. Traffic from the Oracle APEX on Autonomous Database instance is routed directly to the external REST APIs through a NAT gateway.

All application artifacts are deployed to the database tier, which provides zero-latency data access because there’s no need for network traffic between the application and database tiers.

  • Gateways and Load Balancers
    This infrastructure is automatically provisioned, fully-managed, and is provided to support access to the APEX services. These services are fully transparent to the low-code APEX application developer.
  • Oracle APEX
    Oracle APEX on Autonomous Database provides a preconfigured, fully managed and secured environment to both develop and deploy applications.
  • Autonomous Database
    Oracle Autonomous Database is a self-driving, self-securing, self-repairing database service that’s optimized for transaction processing workloads. You don’t need to configure or manage any hardware or install any software. OCI creates the database and manages backups, patching, upgrades, and tuning. Autonomous Database provides a converged database that enables you to store, index, search, and manipulate multiple data types, such as relational, JSON, XML, and spatial data.
  • Oracle REST Data Services
    Oracle REST Data Services (ORDS) bridges HTTPS and your Oracle Database. A mid-tier Java application, Oracle REST Data Services provides a Database Management REST API, SQL Developer Web, a PL/SQL gateway to APEX applications, and the ability to publish RESTful web services for interacting with the data and stored procedures in your Oracle Database.

Other Considerations

When implementing a low-code pattern, you should also consider the following:

  • Applications developed by using APEX can integrate with external services and systems by consuming REST APIs directly or by automatically synchronizing REST data locally.
  • You can also publish the functionality developed in APEX as REST APIs for external consumption by using the built-in REST Data Workshop and Oracle REST Data Services.

Alternatives and Antipatterns

Consider the alternatives to the architecture described in this pattern and avoid attempting to implement antipatterns.

  • Alternatives
    It’s common to augment low-code apps with high-control technologies for specific edge cases. For the components of your app that aren’t suited for low-code app development, use Java or JavaScript apps to access the same data, enabling a common, single data store. This pattern provides the flexibility to use the most appropriate technology for the specific use case and low-code for the rest.
  • Antipatterns
    We do not recommend hand-coding most business applications. There are numerous complexities involved in developing an opportunistic app, including security, accessibility, efficient data access, performance, and globalization. These complexities can be addressed more efficiently by using low-code platforms.

Use Cases

Some examples of the efficacy of a low-code pattern are:

  • Opportunistic Apps
    When a new business opportunity emerges, sometimes a new application needs to be built quickly. Organizations have a huge backlog of apps that are required to meet changing business needs and remain competitive. This backlog can be poorly defined, and the business priorities might change rapidly, so the apps must be fast to build and easy to update as required. Such apps can be easily built and maintained using APEX.
  • Data Reporting and Analysis
    Obtaining a complete, accurate picture across an organization, or even within a department, is often challenging. Data is held in numerous systems. Existing reports are limited and don't always provide the detail needed to make informed business decisions. It’s hard to limit who can see what and to avoid data breaches, and running canned reports can take hours. Using APEX and its extensive reporting and data visualization capabilities makes developing appropriate dashboards for various user communities far simpler.
  • SaaS and EBS Extensions
    ERP systems provide extensive functionality but they don't always provide the specific reports that you need or might be missing functionality that’s specific to your industry or your organization. You might also have common business processes that take too many steps to complete, making them inefficient. In such cases, building an extension by using APEX can deliver the appropriate information or greatly improve productivity and user experience.
  • Legacy App Modernization
    Oracle Forms apps often provide an out-of-date client/server user experience. These legacy systems often have usability and accessibility issues, have difficulties working with various browsers, and aren’t mobile friendly. Oracle APEX is the clear platform of choice for easily transitioning Oracle Forms applications to modern web apps. The same stored procedures and PL/SQL packages work natively in APEX, making it a breeze to develop.
  • Spreadsheet Replacement
    Almost every organization uses spreadsheets to disseminate and report on data. Why? Because spreadsheets are so easy to create. Anyone can put together a spreadsheet when they have the data. After they are created, spreadsheets are often sent out to colleagues to help update, which inevitably leads to numerous copies with different data and flawed business processes. A far better solution is to have a single source of data stored in a fully secured database with a browser-based app that everyone can use to maintain the data.

AI/ML guidance

Big data is a set of capabilities and patterns that enables you to manage, collect, store, catalog, prepare, process, and analyze all data types (unstructured, semi-structured, and structured)—whether they come from sources such as databases, videos, forms, documents, log files, web pages, or images. Further, a machine-learning platform should be fully managed and allow data engineers and data scientists to perform all these steps in the model development lifecycle.

Oracle's big data capabilities span various services and tools so that you can begin your big data journey based on your skills and preferences. A converged Oracle database stores a variety of data types, scales to multiple petabytes, ingests data and events quickly (with Transactional Event Queues), and protects all data with built-in security. The converged data architecture addresses the volume, variety, velocity, and veracity of data in a converged database platform with Oracle Autonomous Database.

Data scientists and machine learning engineers don’t want to spend time provisioning, upgrading, patching, and securing infrastructure. They want to spend their time building, training and deploying models that impact the business. A machine learning platform should be fully managed and should allow them to perform all the steps in the model development lifecycle (build, train, deploy, and monitor). Data used for machine learning should be source agnostic and let data scientists access consistent and reliable data for building, training and deploying models.

Most modern machine learning toolkits are open source and written in Python. As such, a machine learning platform should provide native support for open source frameworks and Python. It should also give users the ability to customize their machine learning environments by either installing their own libraries or upgrading the ones already installed. The platform should let data scientists train their models on structured, unstructured, or semi-structured data while scaling the extract, transform, and load (ETL) or training steps vertically or horizontally across a number of compute resources.

Finally, a machine learning platform should ensure that models can be easily deployed for real-time consumption with minimal friction (ideally through a simple REST call), while preserving the lineage of the deployed model to ensure that it can be audited and reproduced.
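
As a rough illustration of the "simple REST call" idea, the sketch below builds (but does not send) a scoring request that carries the model version with the input, so each prediction can be traced back to the model that produced it. The endpoint URL, the payload field names, and the `build_scoring_request` helper are assumptions made for this example; they are not taken from any Oracle API.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, token, rows, model_version):
    """Build (but do not send) a POST request that scores `rows` against a deployed model."""
    payload = {
        # Sending the model version with each call keeps predictions traceable.
        "model_version": model_version,
        "input_records": rows,
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_scoring_request(
    "https://example.invalid/models/churn/score",  # hypothetical endpoint
    token="<access-token>",                        # placeholder, never hard-code secrets
    rows=[{"tenure_months": 14, "monthly_spend": 42.5}],
    model_version="churn-v3",                      # hypothetical version label
)
# Sending would be urllib.request.urlopen(request); it is omitted here
# because the endpoint above is fictional.
```

Logging the same version label on the server side alongside the request gives auditors both halves of the lineage trail.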

Design principles

When implementing a big data and analytics pattern, use the following design principles for Modern App Development.

Use fully managed services to eliminate complexity across application development, runtimes and data management

Your data is only as valuable as your ability to use it. Big data tools are popular in the open source community, and modern databases offer equivalents for most of their capabilities, especially for data warehousing, analytics, and AI/ML model training and deployment. The converged Oracle Database is one such big data platform.

Native capabilities such as Oracle Autonomous Data Warehouse external tables and SQL let you build a data lakehouse across the Oracle database and OCI Object Storage, and store and analyze petabytes of data in real time.
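
One way to picture the external table approach is a small helper that generates the PL/SQL call for defining a table over files in object storage. This is a sketch under assumptions: the table name, credential name, file URI, and column list are invented for illustration, and the `DBMS_CLOUD.CREATE_EXTERNAL_TABLE` parameters shown should be checked against the documentation for your database version before use.

```python
def external_table_call(table_name, credential_name, file_uri, column_list):
    """Return a PL/SQL block that defines an external table over object storage files.

    The arguments are interpolated as-is, so pass only trusted constants.
    """
    return (
        "BEGIN\n"
        "  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(\n"
        f"    table_name      => '{table_name}',\n"
        f"    credential_name => '{credential_name}',\n"
        f"    file_uri_list   => '{file_uri}',\n"
        "    format          => json_object('type' value 'csv', 'skipheaders' value '1'),\n"
        f"    column_list     => '{column_list}'\n"
        "  );\n"
        "END;"
    )


plsql = external_table_call(
    "SALES_EXT",                                # hypothetical table name
    "OBJ_STORE_CRED",                           # hypothetical credential name
    "https://example.invalid/bucket/sales.csv", # hypothetical object URI
    "region VARCHAR2(30), amount NUMBER",
)

# Once the table exists, the files are queried with ordinary SQL.
query = "SELECT region, SUM(amount) FROM sales_ext GROUP BY region"
```

The data stays in object storage; only the table definition lives in the database, which is what lets the same SQL span both stores.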

Automate build, test, and deployment

DataOps is important for ensuring that you can derive maximum benefits from your big data pipelines. Use the Oracle Cloud Infrastructure Data Integration service to ingest data, implement ETL processing and ELT pushdown, and create pipelines for connecting tasks in a sequence or in parallel to facilitate a process. Pipelines can include various popular data sources within and outside of Oracle Cloud. Use Data Integration scheduling capabilities to define when and how often to run each task. Use Oracle Database Cloud Service Management to define database jobs that run against a set of databases on a schedule. Enhance this with CI/CD pipelines and unified observability for app developers.
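
The idea of connecting tasks "in a sequence or in parallel" can be sketched in a few lines. The example below is a toy scheduler written for illustration only; it does not use the OCI Data Integration API, and the task names and the `run_pipeline` helper are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(stages):
    """Run stages in order; tasks inside one stage run in parallel.

    `stages` is a list of lists of zero-argument callables.
    Returns one list of task results per stage.
    """
    results = []
    with ThreadPoolExecutor() as pool:
        for tasks in stages:
            futures = [pool.submit(task) for task in tasks]
            # Waiting on every future here is what makes the stages sequential.
            results.append([future.result() for future in futures])
    return results


log = []


def task(name):
    """Create a stand-in task that records its name when it runs."""
    def run():
        log.append(name)
        return name
    return run


pipeline_results = run_pipeline([
    [task("ingest_orders"), task("ingest_customers")],  # run in parallel
    [task("join_and_clean")],                           # runs after both ingests
    [task("load_warehouse")],                           # runs last
])
```

A managed service adds what this sketch leaves out: scheduling, retries, and monitoring of each task run.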

Use converged databases with full featured support across all data

Use tools that simplify, automate, and accelerate the consolidation of data so that it delivers maximum business value. For data warehouses, departmental data marts, and serving and presentation layers with structured data, use Autonomous Database, which is optimized for these scenarios. Autonomous Data Warehouse also provides connectivity to analytics, business intelligence, and reporting tools like Oracle Analytics Cloud.

Implement a defense-in-depth approach to secure the app lifecycle

Plan to keep your data secure. Track all jobs that bring data in and take data out of your data lake, keep data lineage metadata, and ensure that access control policies are updated.

Follow the principle of least privilege, and ensure that users and service accounts have only the minimal privilege necessary to perform their tasks. Control who has access to the data platform components by using Oracle Cloud Infrastructure Identity and Access Management. Use multi-factor authentication in Oracle Cloud Infrastructure Identity and Access Management to enforce strong authentication for administrators. Use database security with Oracle Data Safe to assess the security posture of all data, users, and access patterns. Store sensitive information such as passwords and authentication tokens in a Vault service.
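
The principle of least privilege comes down to two checks: deny anything not explicitly granted, and periodically look for grants that are never used. The sketch below shows both with a plain dictionary. The role names, action strings, and helper functions are invented for this example and are not OCI Identity and Access Management policy syntax.

```python
# Hypothetical role-to-permission map; real policies live in the IAM service.
ROLE_PERMISSIONS = {
    "data_scientist": {"read:features", "run:notebook"},
    "etl_service": {"read:raw", "write:warehouse"},
    "admin": {"manage:users"},
}


def is_allowed(role, action):
    """Deny by default: allow only actions explicitly granted to the role."""
    return action in ROLE_PERMISSIONS.get(role, set())


def excess_grants(role, actions_actually_used):
    """Return grants the role holds but did not use; candidates for removal."""
    return ROLE_PERMISSIONS.get(role, set()) - set(actions_actually_used)


unused = excess_grants("etl_service", ["write:warehouse"])
```

Reviewing the result of `excess_grants` against audit logs is one simple way to keep privileges from accumulating over time.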

Architecture

Use a scalable converged database (transaction processing and data warehouse) to run the enterprise workloads that generate data, and to store and analyze all types of data. In this architecture, various data sources (end users, devices, events, sensors, and applications) feed data to the database through data integration (Oracle GoldenGate) and Oracle Transactional Event Queues for streaming data. The data is stored in Oracle Autonomous Database (Oracle Autonomous Transaction Processing and Oracle Autonomous Data Warehouse) along with OCI Object Store support for big data using SQL with external tables. Use Oracle Machine Learning for model building and deployment, and use Oracle Analytics Cloud for insights into the data.

Use Oracle Machine Learning (OML Notebooks, OML AutoML UI, OML Services, OML4Py) on Oracle Autonomous Database to explore and prepare data, and to build, evaluate, and deploy machine learning models.

This architecture pattern provides powerful capabilities when the data required to train the model is brought into the database, so that processing happens close to the data. A variety of data sources and events can be explored and prepared for OML training. Use OML to build, train, and deploy models with SQL, with Python using OML4Py, or with no-code AutoML. Import models trained elsewhere, such as in OCI Data Science (for example, with TensorFlow or PyTorch), through OML Services by using the ONNX model format.

This pattern uses the ‘move the algorithms to the data’ approach. All data is accessed at its source (using external tables) or ingested, and then processed and stored in the converged database for ML models. Once the models are trained in the database, they can be deployed directly in SQL queries (using PREDICTION operators) or with OML4Py APIs. Externally trained models can be deployed using OML Services. Additionally, text, spatial, and graph analysis is available in the Autonomous Database alongside machine learning.
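
To make in-database scoring concrete, the sketch below generates a query that applies the SQL `PREDICTION` and `PREDICTION_PROBABILITY` operators to every row of a table, so the rows never leave the database. The model, table, and column names are hypothetical, and the `scoring_query` helper is an illustration rather than part of OML.

```python
import re

# Accept only simple unquoted identifiers, because names are spliced into SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def scoring_query(model_name, table_name, key_column):
    """Return SQL that scores every row of `table_name` with an in-database model."""
    for name in (model_name, table_name, key_column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a simple SQL identifier: {name!r}")
    return (
        f"SELECT {key_column}, "
        f"PREDICTION({model_name} USING *) AS predicted, "
        f"PREDICTION_PROBABILITY({model_name} USING *) AS probability "
        f"FROM {table_name}"
    )


# Hypothetical names; the query would be run through your usual database driver.
sql = scoring_query("churn_model", "customers", "cust_id")
```

Because the scoring is an ordinary SELECT, it can be joined, filtered, and secured like any other query.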

AI/ML architecture

This architecture uses the following components:

  • Autonomous Database
  • OCI Data Catalog
  • Oracle Machine Learning (OML) – with SQL, OML4Py
  • OML Services
  • Oracle Transactional Event Queues (TEQ)
  • GoldenGate Data Integration
  • Oracle Spatial Studio
  • Oracle Graph Studio
  • Oracle Text

The following data sources are covered:

  • Enterprise applications
  • Devices
  • End user
  • Events
  • Sensors
  • Any digital asset

This architecture has the following components within the VCN:

  • Virtual cloud network (VCN)
    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
  • Data Integration
    Oracle Cloud Infrastructure Data Integration is a fully managed, serverless cloud service that ingests and transforms data for data science and analytics. It helps simplify complex ETL and ELT into data lakes and warehouses with Oracle’s modern, no-code data flow designer. You can use one of the ready-to-use operators (such as a join, aggregate, or expression) to shape your data.
  • Oracle Cloud Infrastructure Transactional Event Queues (TEQ) in ADB
    Oracle Transactional Event Queues in an autonomous database provide database-integrated message queuing functionality. This highly optimized, partitioned implementation uses the functions of Oracle Database so that producers and consumers can exchange messages with high throughput, store messages persistently, and propagate messages between queues on different databases. Each queue supports multiple event streams.
  • Oracle Autonomous Database
    Oracle Autonomous Database is a self-driving, self-securing, self-repairing database service that is optimized for transaction processing and data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database.

    This cloud data warehouse service eliminates all the complexities of operating a data warehouse, securing data, and developing data-driven applications. It automates provisioning, configuring, securing, tuning, scaling, and backing up the data warehouse. It includes tools for self-service data loading, data transformations, business models, automatic insights, and built-in converged database capabilities that enable simpler queries across multiple data types and machine learning analysis.
  • OCI Object Storage
    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly and frequently. Use archive storage for "cold" storage that you retain for long periods of time and rarely access.

    This internet-scale, high-performance storage platform offers reliable and cost-efficient data durability. The Object Storage service can store an unlimited amount of unstructured data of any content type, including analytic data and rich content, like images and videos.
  • Oracle Machine Learning on Oracle Autonomous Database
    Oracle Machine Learning runs on Oracle Autonomous Database (Autonomous Transaction Processing and Autonomous Data Warehouse). Together with spatial and graph processing, ML enables many use cases, such as routing simplification for package delivery and fast anomaly detection in anti-money laundering operations.
  • Oracle Analytics Cloud
    This best-in-class platform for modern analytics in the cloud empowers business analysts and consumers. Oracle Analytics Cloud offers modern AI-powered self-service analytics capabilities for data preparation, discovery, and visualization; intelligent enterprise and on demand reporting together with augmented analysis; and natural language processing and generation. Whether you’re a business analyst, data engineer, citizen data scientist, departmental manager, domain expert, or executive, Oracle Analytics Cloud can help you turn data into insights.
  • Analytics, ML, and custom apps
    Analytics services, Oracle Machine Learning, and custom applications catalog, prepare, process, and analyze big data.
  • OCI Data Catalog
    Oracle Cloud Infrastructure Data Catalog is a fully managed, self-service data discovery and governance solution for your enterprise data. It provides data engineers, data scientists, data stewards, and chief data officers a single collaborative environment to manage the organization's technical, business, and operational metadata.

    Oracle Cloud Infrastructure Data Catalog is a metadata management service that helps data professionals discover data and support data governance.
  • Oracle GoldenGate
    This fully managed service offers a real-time, log-based change data capture (CDC) and replication software platform to meet the needs of today’s transaction-driven applications. The software provides capture, routing, transformation, and delivery of transactional data across heterogeneous environments in real time.
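
The Transactional Event Queues entry in the list above describes a partitioned queue with multiple event streams. The toy model below shows only that one idea: messages with the same key go to the same stream, so their order is preserved while different keys can be processed in parallel. It is an in-memory illustration with invented names, not the TEQ API, and it has none of the persistence or transactional guarantees of the real service.

```python
import zlib
from collections import deque


class PartitionedQueue:
    """Toy in-memory queue with several ordered event streams (shards)."""

    def __init__(self, shard_count):
        self._shards = [deque() for _ in range(shard_count)]

    def _shard_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(key.encode("utf-8")) % len(self._shards)

    def enqueue(self, key, message):
        """Append to the shard chosen by key, so one key keeps its order."""
        shard = self._shard_for(key)
        self._shards[shard].append((key, message))
        return shard

    def dequeue(self, shard):
        """Pop the oldest message from one shard, or None if it is empty."""
        return self._shards[shard].popleft() if self._shards[shard] else None


queue = PartitionedQueue(shard_count=4)
first_shard = queue.enqueue("order-17", "created")
second_shard = queue.enqueue("order-17", "paid")
```

A consumer assigned to `first_shard` sees "created" before "paid" for this order, regardless of what happens in the other shards.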

Considerations and antipatterns

Consider the following for big data and analytics.

  • Reduce data copies and movement
    Data movement is costly, consumes resources and time, and can reduce data fidelity. Choose the right service to store and process your data, depending on data types, data quality, and required transformations. Use converged Oracle Database for your data lake storage for all types of raw data, managing up to multiple petabytes of operational and analytical data in real time. Extend this storage with an appropriate Object Store. Use Oracle Autonomous Data Warehouse to store transformed data for presentation. Using the right store helps you avoid copying and moving data and reduces duplicate copies of data, which can be hard to maintain and keep synchronized.
  • Provide your users the data interface they need
    Enterprise data and analytics platforms have many types of users: data engineers, data analysts, application developers, big data engineers, database admins, business analysts, data scientists, data stewards, and other consumers. All of them have different needs and preferences for consuming data. Understanding all your use cases and data consumer requirements is important. For SQL queries and interfacing with business intelligence tools, use Autonomous Data Warehouse.

When implementing machine learning and artificial intelligence, consider these options.

  • Provide horizontal scalability at every step in the model development lifecycle. Provide horizontal scalability for the ETL and data processing steps, the model training itself, and the model deployment.
  • Ensure model reproducibility. Models are audited and need to be reproduced. Reproducing a model requires that references to the source code, training and validation datasets, and environment (third-party libraries and architecture) are provided when a model is saved. Use references to Git repos and commit hashes to track code. Use Object Storage to save snapshots of training and validation datasets.
  • Version-control code, features, and models. This consideration is related to model reproducibility.
  • Package, share, and reuse third-party runtime dependencies. Reuse the same Python environment across notebooks, jobs, and model deployments. Doing so also minimizes the risk of third-party dependency mismatches between those steps.
  • Be data source agnostic while limiting data transfers. Transferring data to a model training environment is time-consuming. Use data in the database that can be shared across notebook environments or training jobs as much as possible. Keep local dataset snapshots for model training and validation purposes.
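
The reproducibility considerations above can be captured in a small manifest saved next to each model: a code reference, fingerprints of the dataset snapshots, and the runtime environment. The sketch below is one possible shape for such a manifest. The field names, the placeholder commit reference, and the library versions are illustrative assumptions, not a format defined by any Oracle service.

```python
import hashlib
import json
import platform


def dataset_fingerprint(data: bytes) -> str:
    """SHA-256 of a dataset snapshot, so later runs can verify they use the same bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(git_commit, training_data, validation_data, libraries):
    """Collect the references needed to rebuild a saved model."""
    return {
        "git_commit": git_commit,
        "training_sha256": dataset_fingerprint(training_data),
        "validation_sha256": dataset_fingerprint(validation_data),
        "python_version": platform.python_version(),
        # Sorted so that two manifests for the same environment compare equal.
        "libraries": dict(sorted(libraries.items())),
    }


manifest = build_manifest(
    git_commit="<commit-hash>",  # placeholder for the real commit reference
    training_data=b"id,label\n1,0\n2,1\n",
    validation_data=b"id,label\n3,1\n",
    libraries={"scikit-learn": "1.4.2", "numpy": "1.26.4"},  # example pins only
)
manifest_json = json.dumps(manifest, indent=2)
```

Storing `manifest_json` with the model artifact, and the snapshots themselves in object storage, gives an auditor everything needed to retrain and compare.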

Antipatterns

When designing an implementation, consider the following:

  • Not using a converged database causes data fragmentation, copy contagion, and data security risks
  • Lack of data cataloging and governance can convert data lakes into data swamps

Alternatives and Antipatterns

Consider the alternatives to the architecture described in this pattern and avoid attempting to implement antipatterns.

  • Alternatives
    It’s common to augment low-code apps with high-control technologies for specific edge cases. For the components of your app that aren’t suited for low-code app development, use Java or JavaScript apps to access the same data, enabling a common, single data store. This pattern provides the flexibility to use the most appropriate technology for the specific use case and low-code for the rest.
  • Antipatterns
    We do not recommend hand-coding most business applications. There are numerous complexities involved in developing an opportunistic app, including security, accessibility, efficient data access, performance, and globalization. However, these complexities can be addressed more efficiently by using low-code platforms.

Use Cases

The following are example implementations that use Oracle Cloud Infrastructure (OCI) data and analytics services to ingest, store, catalog, prepare, process, and analyze big data.

  • Data Warehousing and business analytics

    Use Oracle Autonomous Data Warehouse as a data warehouse or data mart with Oracle Analytics Cloud.

    • Data Integration ingests data from intended sources. The type of data integration used depends on whether the data is batch, streaming, or synchronized database records, and whether the data is on-premises or in the cloud.
    • Data can be delivered to Object Storage for shared access by cloud services and for processing before it’s stored in Autonomous Data Warehouse or Big Data. Data can also be delivered directly to Autonomous Data Warehouse and then transformed using ELT capabilities, or records from other databases can be directly ingested.
    • Oracle Analytics Cloud provides visualization of data in the database, including machine learning results. Oracle Analytics Cloud pushes down as much processing as possible to Autonomous Data Warehouse for data flow processing.
    • Object Storage is optional for active archive or data sharing. An active archive is where less frequently used data is moved from ADW to a lower-cost storage tier (Object Storage). The data can still be queried from Object Storage, but the performance is slower. Object Storage can also be used to store data that is shared between cloud services.
    • Oracle Cloud Infrastructure Data Catalog harvests metadata from Autonomous Data Warehouse and Object Storage data sources. You interact with Data Catalog to use and manage the catalog.
  • Manage all types of data with a data lake and data warehouse for a lakehouse pattern

    Manage data in Autonomous Data Warehouse, and use Oracle Analytics Cloud for visualization of the data.

    • Data Integration ingests data from intended sources. The type of data integration used depends on whether the data is batch, streaming, or synchronized database records, and whether the data is on-premises or in the cloud.
    • Data can be delivered to the converged Oracle Database by cloud services for data and event processing as it is stored in Autonomous Data Warehouse. Data can also be directly delivered to Autonomous Data Warehouse and then transformed using ELT capabilities, or records from other databases can be directly ingested.
    • Autonomous Data Warehouse can query data from Object Storage or ingest data from Object Storage with SQL or with the help of Oracle Cloud Infrastructure Data Integration.
