Deploy, scale, and monitor GenAI workloads in minutes with Oracle Cloud Infrastructure (OCI) AI Blueprints. Get prepackaged, OCI-verified deployment blueprints, complete with hardware recommendations, software components, and out-of-the-box monitoring.
Reduce the guesswork of deploying AI workloads, from scaling deployments and verifying driver and application compatibility to managing observability, with blueprints built on OCI-verified best practices.
Deploy and monitor your mission-critical GenAI workloads in minutes with blueprints that include verified hardware recommendations, software stacks, and out-of-the-box monitoring.
Adopt prebuilt connections to third-party observability applications, such as Prometheus, Grafana, and MLflow, to simplify monitoring and observability across AI workloads.
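For example, a workload running on a blueprint could log its training metrics to the preinstalled MLflow instance so they appear alongside the Prometheus and Grafana infrastructure dashboards. A minimal sketch, assuming a hypothetical in-cluster MLflow tracking URI:

```python
import mlflow

# Hypothetical in-cluster endpoint; replace with the MLflow tracking URI
# exposed by your OCI AI Blueprints deployment.
mlflow.set_tracking_uri("http://mlflow.mlflow.svc.cluster.local:5000")
mlflow.set_experiment("llm-fine-tuning")

with mlflow.start_run():
    # Log example hyperparameters and metrics so they show up in the MLflow UI.
    mlflow.log_param("base_model", "meta-llama/Llama-2-7b-hf")
    mlflow.log_param("lora_rank", 8)
    mlflow.log_metric("train_loss", 1.23, step=100)
```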
Simplify the deployment of large language models (LLMs) and vision language models (VLMs) using vLLM, an open source inference and serving engine. Deploy a custom model or select from a variety of open models on Hugging Face.
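As an illustration, serving an open Hugging Face model with the vLLM Python API can look like the sketch below; the model ID and sampling settings are examples, not blueprint defaults.

```python
from vllm import LLM, SamplingParams

# Example only: any vLLM-compatible Hugging Face model ID (or a path to a
# custom model) can be used here.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize what OCI AI Blueprints provides."], params)

for output in outputs:
    print(output.outputs[0].text)
```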
Streamline infrastructure benchmarking for fine-tuning using the MLCommons methodology. The blueprint fine-tunes a quantized Llama-2-70B model on a standard data set.
OCI AI Blueprints enables model tuning using low-rank adaptation (LoRA), a highly efficient method of LLM fine-tuning. Fine-tune a custom LLM or use most open LLMs from Hugging Face.
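As a rough sketch of what LoRA fine-tuning involves (not the blueprint's exact configuration), a Hugging Face model can be wrapped with a LoRA adapter via the peft library; the base model and adapter settings below are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Example base model; the blueprint supports custom LLMs and most open
# Hugging Face models.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# LoRA trains small low-rank adapter matrices instead of all model weights.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                    lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically a small fraction of total weights
```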
Before deploying production or research workloads, use a robust precheck blueprint for thorough GPU health validation to proactively detect and address issues. Verify that your GPU infrastructure is primed for high-demand experiments across both single- and multi-node environments.
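The blueprint's checks are more thorough, but a minimal GPU precheck of the kind it automates can be sketched with NVIDIA's NVML bindings; the temperature threshold below is an illustrative assumption.

```python
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        # Illustrative threshold only; real prechecks also cover ECC errors,
        # NVLink/RDMA health, and multi-node connectivity.
        status = "OK" if temp < 85 else "HOT"
        print(f"GPU {i} {name}: {temp}C, {mem.free // 2**20} MiB free [{status}]")
finally:
    pynvml.nvmlShutdown()
```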
Adopt a comprehensive framework for serving LLMs on CPUs using the Ollama platform with a variety of supported models, such as Mistral and Gemma.
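For instance, once the blueprint has Ollama serving a model on CPU nodes, a client could call its HTTP API roughly as follows; the service URL and model name are placeholders.

```python
import requests

# Placeholder URL; point this at the Ollama service exposed by the blueprint.
OLLAMA_URL = "http://localhost:11434/api/generate"

response = requests.post(OLLAMA_URL, json={
    "model": "mistral",            # any model pulled into Ollama, e.g., gemma
    "prompt": "Explain RDMA in one sentence.",
    "stream": False,               # return a single JSON response
})
response.raise_for_status()
print(response.json()["response"])
```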
With this blueprint, you can distribute inference serving across several compute nodes, each typically equipped with one or more GPUs. For example, deploy Llama 405B–sized LLMs across multiple H100 nodes with RDMA using vLLM and LeaderWorkerSet.
Serve LLMs with autoscaling using KEDA, which scales to multiple GPUs and nodes using application metrics, such as inference latency.
Deploy LLMs to a fraction of a GPU with NVIDIA Multi-Instance GPU (MIG) and serve them with vLLM.
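As a rough illustration (not the blueprint's mechanism, which handles this through Kubernetes), a process can be pinned to a single MIG slice by exposing only that slice through CUDA_VISIBLE_DEVICES before starting vLLM; the MIG UUID below is a placeholder.

```python
import os

# Placeholder MIG device UUID; list the real ones with `nvidia-smi -L`.
os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-00000000-0000-0000-0000-000000000000"

# Import after setting the variable so vLLM only sees the MIG slice.
from vllm import LLM

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
print(llm.generate(["Hello"])[0].outputs[0].text)
```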
Get your AI application running quickly and efficiently with opinionated hardware recommendations, prepackaged software stacks, and out-of-the-box observability tooling.
Deploy your GenAI workloads with confidence using prepackaged blueprints tested on recommended OCI GPU, CPU, and networking configurations, saving you from time-consuming performance benchmarking and guesswork.
Adopt the necessary frameworks, libraries, and model configurations for popular AI use cases, such as retrieval-augmented generation (RAG), fine-tuning, and inference, or customize use cases for your business needs.
Simplify infrastructure management with automated MLOps tasks, including monitoring, logging, and scaling. Get started quickly with preinstalled tools, such as Prometheus, Grafana, MLflow, and KEDA, for a production-grade environment with minimal effort.
Introducing OCI AI Blueprints, a Kubernetes-based platform for managing AI workloads, with a set of blueprints that can help you deploy, scale, and monitor AI workloads in production in minutes.
Read the complete post.
Try 20+ Always Free cloud services, with a 30-day trial for even more.
Explore OCI AI Blueprints and try them out or deploy them in your production tenancy.
See how Oracle enables customers to consistently save on compute, storage, and networking compared with other cloud hyperscalers.
Interested in learning more about Oracle Cloud Infrastructure? Let one of our experts help.