Key takeaways
- Experts have raised concerns over the high hallucination rates of OpenAI’s Whisper tool, which undermine transcription accuracy.
- Studies have found hallucinations in a large share of Whisper transcriptions, raising concerns for accessibility and healthcare uses.
- The error rates have prompted calls for federal regulation to reduce risks in critical applications.
Experts are alarmed by the high hallucination rates in OpenAI’s Whisper tool and are calling for safeguards, because transcription mistakes in settings such as healthcare can be dangerous.
Software engineers, developers, and academic researchers are extremely concerned about transcription errors generated by OpenAI’s Whisper, according to an Associated Press investigation.
Hallucinations are a well-known problem in generative AI, especially in LLM-driven chatbots such as ChatGPT, but it is surprising that they also affect transcription, a task where the output is expected to closely follow the recorded audio.
Concerns raised over errors in Whisper’s transcriptions
Although developers expect transcription tools to make some mistakes, engineers and researchers have found that Whisper hallucinates more often than any other AI-powered transcription tool they have encountered.
Researchers report that Whisper’s hallucinations are widespread: a University of Michigan study found them in eight out of ten transcriptions examined. A machine learning engineer found hallucinations in about half of the more than 100 hours of transcriptions he reviewed, while another developer found them in nearly all of the 26,000 transcripts he created. A further study identified 187 hallucinations in more than 13,000 clear audio samples, a rate that could translate into thousands of inaccurate transcriptions at scale. Experts who spoke to the AP said the fabricated text in Whisper’s transcripts includes violent language, racial commentary, and even invented medical treatments.
After examining thousands of audio snippets from TalkBank, researchers from Cornell University and the University of Virginia found that nearly 40% of the hallucinations were harmful because they misrepresented or distorted what the speaker actually said.
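For readers curious how fabricated content can be detected at all, the sketch below is a minimal, hypothetical illustration, not the researchers’ actual methodology: it compares an ASR output against a human reference transcript and flags words the model inserted. The transcripts and file contents in the example are invented.

```python
from difflib import SequenceMatcher

def find_insertions(reference: str, hypothesis: str) -> list[str]:
    """Return word runs present in the ASR hypothesis but absent from the
    human reference transcript (i.e., candidate fabricated insertions)."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    matcher = SequenceMatcher(None, ref_words, hyp_words)
    insertions = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # 'insert' opcodes mark hypothesis words with no counterpart in the
        # reference; 'replace' spans can also hide fabrications in practice.
        if tag == "insert":
            insertions.append(" ".join(hyp_words[j1:j2]))
    return insertions

# Invented example: the trailing clause never occurs in the audio.
reference = "the patient described mild chest pain after exercise"
hypothesis = "the patient described mild chest pain after exercise and was prescribed antibiotics"
print(find_insertions(reference, hypothesis))
# -> ['and was prescribed antibiotics']
```

Real evaluations are more involved (they must tolerate legitimate paraphrasing and recognition errors), but the basic idea is the same: fabricated content is text in the output that has no support in the source audio.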
Experts warn that these errors are especially worrisome given Whisper’s wide use for translating and transcribing interviews, generating text in consumer electronics, and creating video subtitles. The tool is also used for closed captioning for the Deaf and hard of hearing, a group particularly vulnerable to faulty transcriptions because they have little way to catch errors buried in the text. And despite OpenAI’s warnings against using Whisper-based tools in “high-risk domains,” experts are alarmed that medical centers are rushing to adopt them for transcribing patient consultations.
Experts Call for Regulation Due to Whisper’s Hallucinations
Whisper is built into OpenAI’s ChatGPT as well as Oracle’s and Microsoft’s cloud platforms, providing transcription and translation services to thousands of businesses. One recent version of Whisper has been downloaded more than 4.2 million times from Hugging Face.
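For context on what that adoption looks like in practice, here is a minimal usage sketch with the open-source openai-whisper Python package; the audio file name is a placeholder, and OpenAI’s guidance against high-risk use applies regardless of how the model is called.

```python
# Minimal sketch using the open-source openai-whisper package
# (pip install -U openai-whisper). "interview.mp3" is a placeholder file.
import whisper

model = whisper.load_model("base")          # smaller checkpoint for a quick test
result = model.transcribe("interview.mp3")  # runs speech-to-text on the audio file

# The returned dict includes the full text plus per-segment timestamps,
# which is what captioning and note-taking tools typically build on.
print(result["text"])
for segment in result["segments"]:
    print(f"[{segment['start']:.1f}s - {segment['end']:.1f}s] {segment['text']}")
```

Nothing in this call path checks the output against the audio, which is why hallucinated passages can flow straight into captions, notes, or records unless a human reviews them.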
A Whisper-based tool developed by Nabla, a company with offices in the US and France, is used by more than 30,000 clinicians across 40 health systems, including the Mankato Clinic and Children’s Hospital Los Angeles. According to Nabla, the tool has been used to transcribe about 7 million medical encounters.
Because of the high frequency of hallucinations, experts and former OpenAI employees have urged the company to address the flaw and have called for government regulation of AI.
An OpenAI spokesperson said the company is studying ways to reduce hallucinations and welcomes the researchers’ findings as input for model updates. The spokesperson also noted that OpenAI advises against using Whisper in “decision-making contexts,” where accuracy flaws could have serious consequences.