How can you deliver inference requests at scale for your large language model and accelerate your AI deployment? By deploying NVIDIA NIM, an enterprise-ready inference solution, on Oracle Cloud Infrastructure (OCI) Kubernetes Engine (OKE). In this demo, we'll show how to deploy NVIDIA NIM on OKE with the model repository hosted on OCI Object Storage. With a Helm deployment, you can easily scale the number of replicas up or down to match the volume of inference requests, and monitoring comes built in. OCI Object Storage lets you deploy models from anywhere, with support for a variety of model types. Powered by NVIDIA GPUs, NIM helps you achieve maximum throughput and minimum latency for your inference requests.
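As a rough sketch of what sending a request to the deployed service can look like: NIM for LLMs exposes an OpenAI-compatible API, so once the service is reachable (for example through a load balancer or `kubectl port-forward`), a client can post chat completions to it. The endpoint address and model name below are placeholders, not values from this demo:

```python
import requests

# Placeholder endpoint: replace with the address of the NIM service
# exposed by your OKE deployment (e.g., via a LoadBalancer or port-forward).
NIM_ENDPOINT = "http://localhost:8000/v1/chat/completions"

payload = {
    # Placeholder model name; use the model you actually deployed with NIM.
    "model": "meta/llama3-8b-instruct",
    "messages": [
        {"role": "user", "content": "Summarize the benefits of GPU-accelerated inference."}
    ],
    "max_tokens": 128,
}

# Send an inference request to the OpenAI-compatible chat completions route.
response = requests.post(NIM_ENDPOINT, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the API is OpenAI-compatible, existing client libraries and tooling built for that API can generally be pointed at the NIM endpoint without code changes.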