May 31, 2022 | 6 minute read
“Oracle provided us with the artificial intelligence and computation platform we needed to develop research to cure children’s cancer.”
The author would like to thank Prachi Solomon, Principal Solution Engineer, Oracle, for her contributions.
Children's Medical Research Institute (CMRI) is an Australian medical and biological research institute and a registered nonprofit. For more than 60 years, the institute has improved healthcare outcomes for children. It has many firsts to its credit, including creating Australia's first research unit for newborns and advancing microsurgery techniques to help repair blood vessels and organs in infants and children.
Today, CMRI advances healthcare research for children in the areas of childhood cancer, epilepsy, eye disorders, and other genetic diseases. But to analyze data from genomics sequencing, proteomics, high-resolution microscope images, and numerical simulations, and to manage multiple terabytes of data, CMRI knew that it needed better computational resources than its existing solution could offer, along with an advanced data science service to support its machine learning goals.
- Reduce numerical simulation time from 30 days to five with OCI Data Science
- Become 30-50% more efficient with resources
- Save approximately 30% in costs with Oracle Cloud
CMRI’s Cloud Goals
With researchers collaborating across multiple locations and a need for additional compute power, including fast CPUs and GPUs, CMRI knew it was time to change the way it had historically managed data pipelines. The institution had the following goals:
- Optimize its processes to improve performance
- Facilitate seamless collaboration and make the most out of data
The organization evaluated AWS and Google, as well as Oracle. The right cloud solution, CMRI knew, would:
- Enable the institute to provision infrastructure almost instantaneously
- Scale up or down as the needs shifted
- Provide physical data center and virtual network security
It would also provide a machine learning service that would manage the entire model lifecycle, provide access to open-source libraries and tools, and facilitate data scientists who wanted to share and reuse models.
In addition, CMRI sought to make research data shareable, customizable, and reusable for researchers, data scientists, and operations teams. With one local server on-premises, resources were shared among multiple people, slowing down the research process. Once the pandemic necessitated working from home, researchers needed to log into systems externally, which was slower and not as efficient.
Suite of Selected Cloud and Machine Learning Services
CMRI partnered with Oracle to address its immediate challenges in supporting medical research with large-scale data analytics and data coming from high-throughput next-generation sequencing (NGS) technologies.
CMRI chose OCI Data Science for its machine learning capabilities, in addition to a host of other services, including:
- Oracle Cloud Infrastructure: Oracle Cloud is the first public cloud built from the ground up to be a better cloud for every application. By rethinking core engineering and systems design for cloud computing, we created innovations that accelerate migrations, deliver better reliability and performance for all applications, and offer the complete services customers need to build innovative cloud applications.
- OCI Data Science: OCI Data Science is a fully managed and serverless service for data science teams to build, train, and manage machine learning models in the Oracle Cloud Infrastructure. It provides data scientists with a collaborative, project-driven workspace to train models using Python-centric tools, libraries, and packages on Conda environments.
- OCI Object Storage: Oracle Cloud Infrastructure (OCI) Object Storage enables customers to securely store any type of data in its native format. With built-in redundancy, OCI Object Storage is ideal for building modern applications that require scale and flexibility, as it can be used to consolidate multiple data sources for analytics, backup, or archive purposes.
Migration Path to Oracle Cloud
CMRI spent a few months testing various clouds, including AWS, and ultimately chose OCI for its features and usability. The main reasons were:
- Better performance for the features most important to CMRI
- Cost savings
- Improved cloud support, with frequent engagement and meetings to resolve hurdles
- Ability to address the need for upgrades and aging infrastructure complexities
- Integrated AI solution with access to open source libraries and frameworks
Because CMRI had been testing OCI for a few months, it took less than a week for them to officially deploy projects to the cloud. CMRI prioritized resource-intensive jobs that needed large amounts of RAM and CPU. They also prioritized projects that required heavy collaboration, because these were the projects that would benefit most from moving to the cloud.
Some of the architecture choices CMRI made included using block volume drives on multiple instances over a shared virtual cloud network (VCN), which means CMRI no longer has to copy data. This has allowed them to switch from a data science instance to a VM. CMRI uses OCI Object Storage to store archived data sets. In addition, CMRI runs many pipelines written in domain-specific workflow languages, and uses SSH tunneling with local port forwarding to connect to servers over the public internet.
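As a rough illustration of the SSH local forwarding mentioned above, the following minimal sketch builds the standard OpenSSH command for a local port forward. The user name, host name, and port numbers are hypothetical placeholders, not CMRI's actual configuration; only the `ssh` flags `-N` and `-L` are standard OpenSSH options.

```python
import shlex


def build_tunnel_command(user, host, local_port, remote_port, remote_host="localhost"):
    """Return the argv list for an OpenSSH local port forward.

    All argument values are illustrative assumptions supplied by the caller.
    """
    # local_port on this machine is forwarded, through the SSH connection,
    # to remote_host:remote_port as seen from the server.
    forward = f"{local_port}:{remote_host}:{remote_port}"
    # -N: do not run a remote command, only forward ports
    # -L: request local port forwarding
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


if __name__ == "__main__":
    # Hypothetical example: reach a notebook server on port 8888 of a cloud VM.
    cmd = build_tunnel_command("researcher", "compute.example.org", 8888, 8888)
    print(shlex.join(cmd))
```

Returning the command as a list rather than a single string means it can be passed directly to `subprocess.run` without shell quoting concerns.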
AI and Machine Learning Medical Research Results
A typical numerical simulation that once took CMRI about 30 days to perform now takes only five days with OCI Data Science, regardless of the number of simulations being run. As an end-to-end machine learning service, OCI Data Science also lets CMRI give researchers basic templates for different kinds of analyses and data, which data scientists can then customize.
Example uses of OCI Data Science include simulating proteins and measuring what occurs when a mutation is introduced to particular proteins and whether that mutation makes the proteins weaker or stronger, and analyzing data from proteomics projects.
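To put the 30-day and five-day figures in perspective, here is a small back-of-the-envelope calculation. It uses only the two durations quoted in this post; the function names are illustrative.

```python
def speedup_factor(before_days: float, after_days: float) -> float:
    """How many times faster the new run is compared with the old one."""
    return before_days / after_days


def time_saved_percent(before_days: float, after_days: float) -> float:
    """Share of the original wall-clock time that is no longer needed."""
    return 100.0 * (before_days - after_days) / before_days


# Figures quoted in the post: about 30 days before, five days after.
print(f"{speedup_factor(30, 5):.1f}x faster")                    # 6.0x faster
print(f"{time_saved_percent(30, 5):.1f}% less wall-clock time")  # 83.3% less wall-clock time
```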
The move to OCI has helped CMRI gain consistent access to the most up-to-date technology. Galaviz explained, “One thing we had to do prior to purchasing OCI was purchasing our own server, and using our own graphic cards. But the problem with that is the graphics cards are evolving constantly, so our graphics cards became obsolete very quickly. But Oracle constantly updates their graphic cards so we always have up-to-date hardware.”
CMRI uses Oracle-provided GPUs to run intensive molecular dynamics simulations, in addition to a tool from NVIDIA called Parabricks, which helps CMRI match sequences from a genome to a reference genome. CMRI can now test faster and provision resources faster, rather than waiting on data transfers or waiting to get resources. Now, they can deploy and configure new projects within a day.
The move to OCI has helped CMRI take advantage of big data and machine learning capabilities to automate routine database tasks, database consolidation, operational reporting, and batch data processing. It has also made data available much faster to the people who need it.
OCI has also helped CMRI simplify governance and security. Prior to OCI, CMRI had a centralized shared drive with complicated access through VPN and mounted drives. Now, they can share the data from OCI Object Storage. With OCI, CMRI can place more controls on their environments and build specific environments for specific analyses. This makes governance simpler, especially when packages required by different software are not compatible with each other, and it allows CMRI to focus on building a more advanced pipeline for medical research. Today, CMRI can readily share data, code, and resources with people throughout the institute.
Being on a single unified cloud platform also helps the institute create workflows and generate reproducible results. The cloud-enabled modernization is bringing CMRI transformative opportunities, such as performing complex gene therapy research. For projects such as scRNA-Seq and genomics projects, CMRI is now 30-50% more efficient with their resources. And their costs are approximately 30% lower compared to maintaining a server.
Future Machine Learning Medical Research Goals
Eventually, CMRI intends to use OCI Vision and OCI Data Labeling to automate workflows around drawing bounding boxes on cells. CMRI also plans to place an image in object storage and run scripts that pull the image and process it once the network has been trained. Another expansion opportunity that the institution has explored is big data integration. Projects involving proteomics, genomics, and transcriptomics require samples and metadata from different institutes and need to integrate data from different diseases, such as breast cancer and bone cancer. To accomplish this, CMRI will utilize additional computational power on OCI.
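The planned "pull an image from object storage, then process it" step could look roughly like the sketch below. It is not CMRI's script: the function names and the `model` callable are hypothetical, and the bucket, namespace, and object names are placeholders. The only real API assumed is the OCI Python SDK's `ObjectStorageClient.get_object(namespace_name, bucket_name, object_name)`, whose response exposes the payload as `response.data.content`.

```python
from typing import Any, Callable


def fetch_object_bytes(client: Any, namespace: str, bucket: str, object_name: str) -> bytes:
    """Download one object and return its raw bytes.

    `client` is expected to behave like oci.object_storage.ObjectStorageClient.
    """
    response = client.get_object(namespace, bucket, object_name)
    return response.data.content


def process_stored_image(
    client: Any,
    namespace: str,
    bucket: str,
    object_name: str,
    model: Callable[[bytes], Any],
) -> Any:
    """Pull an image from object storage and hand it to a trained model.

    `model` is a hypothetical callable standing in for the trained network.
    """
    image_bytes = fetch_object_bytes(client, namespace, bucket, object_name)
    return model(image_bytes)


# With the OCI SDK installed and credentials configured, a real client could
# be created like this (assumption: default config file and profile):
#
#   import oci
#   client = oci.object_storage.ObjectStorageClient(oci.config.from_file())
#   namespace = client.get_namespace().data
#   result = process_stored_image(client, namespace, "my-bucket", "cells.png", my_model)
```

Passing the client in as a parameter keeps the functions easy to test with a stub and avoids hard-coding credentials in the script.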
CMRI is continually pushing the boundaries of medical research and with Oracle, they can continue to make children’s lives better.
To try Oracle AI, start a free machine learning workshop: