University of Michigan improves AI text summaries of academic journals

Using Oracle Cloud Infrastructure AI, University of Michigan creates summary system for long documents used in research, journalism, and legislation.

United States | Higher and Post Secondary Education

“The 40 GB of memory provided by the NVIDIA A100 GPU running on OCI more than doubles the capacity of our last-generation 16 GB NVIDIA V100 processor. When running experiments with the same configuration, the A100 uses about 25% less time on average. What makes it even better is the smooth process of setting up the machine on Oracle Cloud.”

Shuyang Cao, Graduate Student Research Assistant, University of Michigan

Business challenges

There are an estimated 2.5 million academic journals in the scientific field alone, each publishing hundreds of articles a year. With so much information being produced, key decision-makers and the general public can easily overlook important data. While today’s text summary technology can process short documents, it is usually less effective with longer documents of 10,000 words or more. These systems also frequently include unconfirmed or verifiably false details in their summaries.

Therefore, researchers at the University of Michigan wanted to build a new natural language processing (NLP) system that could summarize longer documents, including research papers, articles, and reports, more accurately.

Why the University of Michigan chose Oracle

The university selected Oracle Cloud Infrastructure AI to support this natural language processing research. In addition to providing performance improvements, the researchers’ Oracle Cloud credits gave them free access to GPUs, bare metal compute, and low-latency cluster networking. With the upgrade, the university could process long documents using advanced neural networks much faster than before.

Results

For this project, the researchers used high-performance virtual machines and remote NVIDIA A100 Tensor Core graphics processing units (GPUs), which proved effective for running the team’s memory-hungry summarization algorithms. The university used BM.GPU4.8, which contains eight GPUs with 40 GB of memory each. These high-performance GPUs sped up the training of the researchers’ models and the validation of their hypotheses.
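
The hardware figures quoted in this story can be checked with a few lines of arithmetic. The short Python sketch below is only an illustration: the constants come from the text (40 GB per A100, 16 GB per V100, 8 GPUs in BM.GPU4.8, about 25% less time), while the function names are hypothetical and the speedup calculation assumes the 25% figure applies uniformly to a run.

```python
# Figures taken from the story; function names are illustrative only.
A100_MEMORY_GB = 40      # memory per NVIDIA A100 GPU
V100_MEMORY_GB = 16      # memory per last-generation NVIDIA V100 GPU
GPUS_IN_BM_GPU4_8 = 8    # GPUs in one BM.GPU4.8 machine
TIME_REDUCTION = 0.25    # "about 25% less time on average"


def memory_ratio(new_gb, old_gb):
    """How many times larger the new GPU's memory is than the old one's."""
    return new_gb / old_gb


def total_gpu_memory_gb(gpu_count, gb_per_gpu):
    """Aggregate GPU memory across all GPUs in one machine."""
    return gpu_count * gb_per_gpu


def speedup_from_time_reduction(fraction_less_time):
    """Convert 'x% less time' into a throughput multiplier.

    Assumes the reduction applies uniformly to the whole run.
    """
    return 1.0 / (1.0 - fraction_less_time)


if __name__ == "__main__":
    print("Memory ratio, A100 vs. V100:",
          memory_ratio(A100_MEMORY_GB, V100_MEMORY_GB))
    print("Total GPU memory in BM.GPU4.8 (GB):",
          total_gpu_memory_gb(GPUS_IN_BM_GPU4_8, A100_MEMORY_GB))
    print("Approximate speedup:",
          round(speedup_from_time_reduction(TIME_REDUCTION), 2))
```

In other words, 40 GB is 2.5 times 16 GB, one BM.GPU4.8 machine offers 320 GB of GPU memory in total, and finishing in about 25% less time corresponds to roughly a 1.33x speedup.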

The university used this computational power to build a new NLP system by improving the efficiency of existing language-generation models and increasing the document length accepted by those models. The ability to consume more words allowed the summarization system to analyze long documents in their entirety, instead of processing a truncated version. An NLP model that can read the full document also improves the accuracy of the facts in the generated summaries, as the system becomes aware of all the information available.
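
To make the cost of truncation concrete, here is a minimal Python sketch. It is not the researchers' system: the 1,024-token input limit is a hypothetical value chosen for illustration, words stand in for tokens (real models usually split text into subword tokens), and the last function assumes standard full self-attention, where every token is compared with every other token, so the number of scores grows with the square of the input length.

```python
def truncate(tokens, max_len):
    """Keep only the first max_len tokens, as a length-limited model would."""
    return tokens[:max_len]


def coverage(num_tokens, max_len):
    """Fraction of the document a length-limited model actually reads."""
    if num_tokens == 0:
        return 1.0
    return min(num_tokens, max_len) / num_tokens


def attention_scores(num_tokens):
    """Entries in one full self-attention score matrix (one head, one layer)."""
    return num_tokens * num_tokens


if __name__ == "__main__":
    # A 10,000-word document, the length the story cites as difficult.
    document = [f"word{i}" for i in range(10_000)]
    limit = 1_024  # hypothetical input limit, not taken from the story

    kept = truncate(document, limit)
    print("Tokens kept after truncation:", len(kept))
    print("Share of document read:", coverage(len(document), limit))
    print("Attention scores, full vs. truncated:",
          attention_scores(len(document)) / attention_scores(limit))
```

Under these assumptions, a truncating model sees only about a tenth of a 10,000-word document, while reading all of it with unmodified full attention would require roughly 95 times as many attention scores. That gap is one reason more GPU memory and more efficient models both matter for long inputs.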

This new software generated summaries that more accurately captured the core content of long documents, supporting new research, journalistic articles, government reports, and more. In their experiments, Michigan researchers found that longer document lengths led to more accurate summaries, as measured by both human assessment and automatic metrics.

Additional resources

Research paper: Efficient attentions for long document summarization (PDF)

Research paper: CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization (PDF)