ChatGPT Models Surpass Human Benchmark in Neurology Exams

December 11, 2023

Two ChatGPT AI models outperformed human neurology students in a neurology board exam, with one model scoring 85%.
The study, published in JAMA Network Open, did not allow the AI models internet access or specific neurology training.
The findings highlight the potential of AI in medical fields, particularly for tasks involving memory and analysis.

In a study published in JAMA Network Open, two versions of ChatGPT, both large language models (LLMs), outperformed human neurology students on board-style examinations. The result marks a milestone in the application of artificial intelligence (AI) to medicine, particularly neurology.

AI’s stride in neurology exams

Researchers used LLM 1 (ChatGPT version 3.5) and LLM 2 (ChatGPT version 4) to answer questions from the American Board of Psychiatry and Neurology (ABPN) question bank. The study’s key finding was that LLM 2 answered 85% of questions correctly, surpassing the human average of 73.8% by roughly 11 percentage points. Notably, the models achieved this without internet access or neurology-specific tuning.

The study adhered to rigorous scientific protocols, including the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. The comparison with human neurology students involved a range of questions, classified as either lower-order, focusing on basic understanding and memory, or higher-order, requiring application, analysis, and evaluative thinking.

The implications of AI in medical fields

The superior performance of LLM 2, especially in higher-order questions, underscores the rapid advancements in AI and its potential applications in clinical settings. This is particularly relevant as AI continues to cross into domains traditionally reserved for human expertise, such as medicine, the military, education, and research.

The use of AI in clinical neurology has been expanding, with tasks ranging from diagnosis to treatment planning and prognosis. The study highlights how AI, especially transformer-based architectures like ChatGPT, can aid and sometimes replace human roles in these fields.

Balancing AI and human expertise

While the results are promising, they also open up discussions about the balance between AI and human expertise in sensitive fields like medicine. The study’s authors emphasize that AI’s strengths in memory-based tasks, compared to those requiring deep cognition, indicate a complementary role rather than a replacement of human medical experts.

The study’s findings are a testament to the potential of AI in enhancing medical practices and educational tools. However, they also underscore the need for ongoing evaluation and refinement of these AI systems to ensure they augment human expertise effectively.

The study from JAMA Network Open reveals a significant leap in AI capabilities, particularly in the medical field of neurology. The results demonstrate AI’s prowess in complex analytical tasks and open the door to new possibilities in medical education and practice. The future of AI in medicine appears bright, with these technologies poised to play an increasingly supportive role alongside human professionals.

Brenda Kanana

Brenda has 4+ years of experience specializing in cryptocurrency, artificial intelligence, and emerging technologies. She has worked at Zycrypto, Blockchain Reporter, and The Coin Republic, and now makes Cryptopolitan her home. Her Sociology degree from Mombasa Technical University keeps her aligned with her readers’ pulse.
