Cerebras and Abu Dhabi’s M42 made an LLM dedicated to answering medical questions



The applications of artificial intelligence in health care are numerous, but they are largely dominated by older AI technology. Newer approaches such as generative AI and large language models (LLMs) are the craze of the moment, yet they are deemed too risky to be used to any great extent in health care given the sensitive nature of health applications, as ZDNET has recently reported.

Efforts in open-source software could help advance generative AI by making it a little easier to look inside the “black box” of AI than is possible with closed programs such as OpenAI’s ChatGPT.


In that spirit, AI computer maker Cerebras Systems last week announced a joint effort with partner M42, an operator of healthcare facilities in 27 countries, to offer an open-source LLM designed for health applications, to serve as an “assistant” to healthcare providers. 

The program, called Med42, is a refinement of Llama 2, the open-source LLM released by Meta this year, fine-tuned on a special health-related data set compiled by the companies.

“It’s blazing a trail in the use of AI in delivering healthcare,” said Cerebras co-founder and CEO Andrew Feldman in an interview with ZDNET.


Applications of the program as an assistant to physicians include medical question answering, patient record summarization, aiding medical diagnosis, and general health Q&A, according to the companies. It does not include training physicians, they emphasized. 

The Med42 program uses the 70-billion-parameter version of Llama 2. The work of fine-tuning was performed by Cerebras and M42 in conjunction with Core42, a managed services and IT firm that does fundamental AI research. Both M42 and Core42 are owned by Cerebras customer G42, a global conglomerate.

The Med42 neural net was fine-tuned with a data set of 700,000 question-and-answer pairs from publicly available sources, “curated by M42 and reviewed by our team of medical experts,” said M42 in an email to ZDNET. “The dataset included multiple choice questions, medical flashcards, among others,” it said.
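To make the idea of a question-and-answer fine-tuning pair concrete, here is a minimal Python sketch of how a multiple-choice item might be turned into a prompt and response record. The field names, the prompt template, and the sample question are illustrative assumptions; the companies have not published the format of their data set.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MultipleChoiceItem:
    """One multiple-choice question (hypothetical schema, not M42's)."""
    question: str
    options: Dict[str, str]  # option letter -> option text
    answer: str              # letter of the correct option


def to_training_pair(item: MultipleChoiceItem) -> Dict[str, str]:
    """Render an item as a prompt/response pair for supervised fine-tuning."""
    lines = [f"Question: {item.question}"]
    for letter in sorted(item.options):
        lines.append(f"{letter}. {item.options[letter]}")
    lines.append("Answer:")
    prompt = "\n".join(lines)
    response = f"{item.answer}. {item.options[item.answer]}"
    return {"prompt": prompt, "response": response}


# Made-up example item, used only to show the shape of a record.
example = MultipleChoiceItem(
    question="Which vitamin deficiency causes scurvy?",
    options={"A": "Vitamin A", "B": "Vitamin C", "C": "Vitamin D"},
    answer="B",
)
pair = to_training_pair(example)
print(pair["prompt"])
print(pair["response"])
```

A fine-tuning run would see many such records, with the model trained to produce the response given the prompt.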

“Med42 has not been trained using patient data or personally identifiable information,” said M42.

The Med42 code is available now on Hugging Face, along with performance data. The companies plan to release enhancements as they “refine and test the model collaboratively” with health care professionals “to help enhance its capability and performance.” When asked if the data set itself will be released, the companies told ZDNET in an email, “This is still to be determined.”


The fine-tuning was done on Condor Galaxy, a massive AI computer that Cerebras built for G42 this year, which Cerebras calls “the world’s largest supercomputer for AI.” According to Cerebras, “Rapid setup and reduced training time were made possible by the 82 terabytes of memory and the 54 million AI cores in the 64 Cerebras CS-2 systems inside of CG-1.”

“What you have are all these interesting applications being run on top of Condor Galaxy, and that’s unlike any other startup’s hardware,” said Feldman. “We’re really moving the industry forward.”

Feldman noted in a follow-up email that “All parameters [of Llama 2] were fine-tuned, and this was made possible by the vast memory available on Condor Galaxy 1 […] The setup and training for 3 Epochs was accomplished in 5 days, which would have taken months on a large cluster of GPUs.”
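For a rough sense of scale, the figures quoted in this article can be combined into an average throughput. The Python sketch below assumes that each of the 3 epochs covered all 700,000 question-and-answer pairs and that the full 5 days counted as training time. Neither assumption is confirmed by the companies, and since the 5 days included setup, the result is at best a lower bound.

```python
# Figures quoted in the article.
QA_PAIRS = 700_000       # size of the fine-tuning data set
EPOCHS = 3               # passes over the data, per Feldman
WALL_CLOCK_DAYS = 5      # setup plus training, per Feldman

SECONDS_PER_DAY = 24 * 60 * 60


def examples_processed(pairs: int, epochs: int) -> int:
    """Total examples seen, assuming every epoch covers the whole data set."""
    return pairs * epochs


def average_throughput(total_examples: int, days: float) -> float:
    """Average examples per second over the whole wall-clock period."""
    return total_examples / (days * SECONDS_PER_DAY)


total = examples_processed(QA_PAIRS, EPOCHS)
rate = average_throughput(total, WALL_CLOCK_DAYS)
print(f"{total:,} examples, about {rate:.2f} per second on average")
```

Under those assumptions, the run processed 2.1 million examples at an average of just under five per second, with all 70 billion parameters being updated.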

In the performance data on Hugging Face, the companies note that “Med42 achieves competitive performance on various medical benchmarks, including MedQA, MedMCQA, PubMedQA, HeadQA, and Measuring Massive Multitask Language Understanding (MMLU) clinical topics.”


On the sample exam for the US Medical Licensing Examination (USMLE), the program “achieves a 72% accuracy,” according to M42, “surpassing the prior state of the art among openly available medical LLMs.” It also surpassed by a wide margin OpenAI’s closed-source GPT-3.5, which garnered 59.6% accuracy, though Med42 fell short of GPT-4’s 84.3% accuracy.
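Accuracy on a multiple-choice exam is simply the share of questions answered correctly. The Python sketch below shows that calculation on made-up answers, then computes the gaps between the three scores reported above. The function names and the toy answer lists are illustrative assumptions, not the companies' evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Percentage of predictions that match the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must be the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# USMLE sample-exam accuracies as reported in the article (percent).
REPORTED = {"Med42": 72.0, "GPT-3.5": 59.6, "GPT-4": 84.3}


def margin(model_a: str, model_b: str) -> float:
    """Gap between two reported scores, in percentage points."""
    return round(REPORTED[model_a] - REPORTED[model_b], 1)


# Toy answer key: 3 of 4 predictions match.
print(accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"]))  # 75.0
print(margin("Med42", "GPT-3.5"))  # 12.4
print(margin("GPT-4", "Med42"))    # 12.3
```

By these figures, Med42 leads GPT-3.5 by 12.4 percentage points and trails GPT-4 by 12.3.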

“You take a very big pre-trained model like Llama 2 70 billion, and if you bring to it really interesting data sets, pioneering data sets, you can have them do really interesting things, and done at a fraction of the time and the power draw of something like GPT 3.5,” said Feldman.

Cerebras has been especially active in open-source projects of late. In March, the company released several versions of generative AI programs as open source, to use without restriction.

In August, the company unveiled the world’s most powerful Arabic-language LLM, Jais-Chat, as an open-source program. 


For the moment, Med42 is not in production. “Following successful testing, Med42 will be made available for clinical deployment,” the companies said in an email to ZDNET. 

“Importantly, Med42 will have the capability of being deployed on-premise, fully customized to healthcare providers’ needs, using owned data sources and limiting the ability for external intrusions,” they added. “We are prioritizing safe application of the technology over speed to production and are committed to extensive safety evaluation of the model before rolling it out.”

