Research Articles | Challenge Journal of Perioperative Medicine

Evaluation of the content quality of regional anesthesia and postoperative analgesia approaches generated by ChatGPT-4.0 according to surgical incision sites

Müzeyyen Beldağlı


DOI: https://doi.org/10.20528/cjpm.2025.03.005

Abstract


Background: Large language models (LLMs) are increasingly consulted for perioperative decision support, yet their ability to give professional-grade guidance for regional anesthesia and analgesia remains uncertain.

Materials and Methods: In a prospective observational study, we presented eight incision-based figures (Items 2–9) representing common abdominal incisions to ChatGPT-4.0 and requested a regional anesthesia technique and postoperative analgesia plan for each. Five independent anesthesiologists rated each response on Accuracy, Comprehensiveness, and Safety using a 5-point Likert scale. Inter-rater reliability was summarized with Fleiss’ κ. One non-incision item (Item 10) was analyzed descriptively and excluded from pooled statistics. Single-shot prompts were used.
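For readers unfamiliar with the agreement statistic named above, the following is a minimal sketch of how Fleiss' κ can be computed for a fixed number of raters per item. The function name `fleiss_kappa`, the variable names, and the ratings shown are hypothetical illustrations and are not the study's data or the authors' code. The sketch treats the five Likert levels as unordered categories, which is how the standard Fleiss' κ works; the abstract does not state which software or variant was used.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings    -- list of lists; ratings[i] holds one category label per rater
    categories -- iterable of all possible category labels (e.g. 1..5)
    """
    categories = list(categories)
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # Count how many raters chose each category, item by item.
    counts = [Counter(item) for item in ratings]

    # Observed agreement: proportion of agreeing rater pairs per item, averaged.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(c[k] for c in counts) / (n_items * n_raters) for k in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)

    if p_expected == 1.0:
        # All ratings fall in a single category; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 3 items, 5 raters, 1-5 Likert scores (not study data).
hypothetical_ratings = [
    [4, 4, 4, 5, 5],
    [3, 3, 3, 3, 3],
    [5, 4, 5, 5, 4],
]
demo_kappa = fleiss_kappa(hypothetical_ratings, categories=range(1, 6))
print(f"Fleiss' kappa (hypothetical data): {demo_kappa:.2f}")
```

Because κ corrects for chance agreement, a domain whose ratings cluster in one or two adjacent Likert levels can show a high mean score together with a modest κ.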

Results: Mean ratings were high: Accuracy 4.28, Comprehensiveness 4.30, and Safety 4.00 on the 1–5 scale. Inter-rater agreement was substantial for Safety (κ=0.76) and lower for Accuracy (κ=0.33) and Comprehensiveness (κ=0.31). Two consistent low points emerged: the right-lower-quadrant (McBurney/Lanz) incision (Safety mean 3.0) and the suprapubic (Pfannenstiel) incision (Accuracy 3.0, Comprehensiveness 3.4, Safety 3.4). When explicitly asked for postoperative plans, the model rarely proposed neuraxial techniques (e.g., epidural), favoring fascial-plane/peripheral strategies.

Conclusions: An LLM produced clinically usable suggestions for common abdominal incisions with strong safety agreement, but performance was not uniform, and neuraxial options were under-recommended. These tools may serve as a helpful adjunct for education and option-generation, yet they should be used with expert oversight and in line with local protocols. Future work should test repeated sampling, prompt standardization, and model/tier comparisons, and should link recommendations to patient outcomes.


Keywords


postoperative analgesia; regional analgesia; large language models; ChatGPT-4.0
