4.2 - Theme 2. Harnessing Artificial Intelligence, Technology and Digital Innovations in Guideline Development and Implementation
Wednesday, September 17, 2025 | 3:45 PM - 5:00 PM
Speaker
Miss Jiayi Liu
China
Lanzhou University
QUEST-TCM - A Framework for Human Evaluation of Large Language Models in Traditional Chinese Medicine Practice
Abstract
Background: Traditional Chinese Medicine (TCM) has gained international attention while large language models (LLMs) show promise in assisting healthcare. Existing LLM evaluation frameworks focus on Western medicine and cannot accommodate TCM's unique characteristics.
Objective: This study develops a standardized framework for evaluating the performance of LLMs in TCM practice, addressing the lack of TCM-specific evaluation methodologies.
Methods: We conducted a scoping review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines. Literature searches were performed across multiple databases, including PubMed, Embase, Web of Science, CNKI, and Wanfang. We included studies that examined the application of LLMs in tasks related to TCM practice, such as guideline development or clinical practice itself, and reviewed them across multiple dimensions, including accuracy, relevance, comprehensiveness, consistency, safety, and usability. Based on the findings of the review, we developed a TCM-specific evaluation framework for LLMs.
Results: From 1,100 initial records, 41 articles were selected for analysis after screening. The resulting framework, QUEST-TCM, was built on five core evaluation principles: TCM knowledge conformity, diagnostic accuracy, treatment rationality, safety and ethics, and modern integration. The framework provides a structured approach across preparation, execution, and evaluation phases.
Discussion: The QUEST-TCM framework provides a comprehensive, standardized approach for evaluating LLMs in TCM applications. This framework bridges traditional knowledge systems with modern AI capabilities, promoting responsible integration of LLMs into TCM practice while preserving its philosophical foundations.
Paper Number
352
Biography
AI-Driven Evidence Synthesis: Data Extraction of Randomized Controlled Trials with Large Language Models (Accepted by International Journal of Surgery in November)
Enhancing Systematic Reviews with Large Language Models: Data Extraction of Randomized Controlled Trials (Poster, The Global Evidence Summit 2024, Prague)
Miss Jiayi Liu
China
Lanzhou University
Using Large Language Models to Generate Medical Plain-Language Summaries: A Comparative Study
Abstract
Background: Effective translation of medical evidence for lay audiences is crucial for guideline implementation. While generative artificial intelligence (GenAI) increasingly supports healthcare communication, its outputs often exhibit unwarranted optimism that may obscure critical uncertainties (e.g., risks of bias, certainty of evidence) and generate misinformation.
Objective: We aimed to assess gaps in plain language summaries (PLS) of systematic reviews generated by different large language models (LLMs) from standardized prompts, identify reasons for incomplete disclosures, and test how better prompt engineering improves the quality of the PLS.
Methods: We analyzed 50 Cochrane reviews (2018–2023) comparing five PLS versions: 1) manually developed (published), 2) standard GPT-4o, 3) standard Claude-3, 4) GPT-4o refined using Cochrane guidelines, and 5) Claude-3.7 refined. A multidisciplinary panel assessed completeness (16-item checklist), readability (Flesch-Kincaid), and risk communication adequacy (Likert scale), with inter-rater reliability validated (Kappa>0.75).
Results: Standard LLMs omitted 66-71% of limitations from the PLSs (GPT-4o: 68% [95% confidence interval 61–75%], Claude-3: 71% [64–78%]) vs. 12% for manual summaries. Evidence-structured prompts improved the disclosure of limitations 3.8- to 4.5-fold (GPT-4: 4.2-fold; Claude-3: 3.9-fold), achieving parity with humans in conflict-of-interest transparency (Δ≤5%, p>0.05). Claude-3 showed marginally higher lexical diversity (measure of textual lexical diversity = 83.6 vs. GPT-4o: 79.2), while GPT-4 better replicated Cochrane terminology (86% vs. Claude-3: 72%). Both LLMs maintained readability at a level understandable to 8th-9th grade students.
Discussion: Evidence-structured prompts greatly improve the quality of AI-generated plain-language summaries, bridging a critical gap in the communication of medical information.
Keywords: LLMs, evidence translation, plain language summary
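The readability benchmark reported above can be reproduced with the standard Flesch-Kincaid grade-level formula. The following is a minimal illustrative sketch with a crude syllable heuristic, not the tooling the authors used:

```python
# Illustrative Flesch-Kincaid grade-level calculation (not the study's code).
# The syllable counter is a naive vowel-group heuristic; production tools
# rely on pronunciation dictionaries for accurate counts.
import re

def count_syllables(word: str) -> int:
    # Count runs of vowels as syllables; never return zero for a real word.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade(text: str) -> float:
    # FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    n = max(1, len(words))
    return 0.39 * (n / sentences) + 11.8 * (syllables / n) - 15.59
```

A score of roughly 8-9 corresponds to the 8th-9th grade reading level the abstract reports.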
Paper Number
353
Biography
PUBLICATIONS
AI-Driven Evidence Synthesis: Data Extraction of Randomized Controlled Trials with Large Language Models (Accepted by International Journal of Surgery in November)
CONFERENCE PRESENTATIONS
Enhancing Systematic Reviews with Large Language Models: Data Extraction of Randomized Controlled Trials (Poster, The Global Evidence Summit 2024, Prague)
Ms Ye Wang
China
Lanzhou University
The Role of Large Language Models in Guideline Peer Review: Current Adoption, Challenges, and Future Prospects
Abstract
Background: Large language models (LLMs), such as ChatGPT, Claude, and Gemini, have the potential to enhance the peer review process of clinical practice guidelines (CPGs) by identifying methodological issues, reporting gaps, and conflicts of interest. However, their adoption, effectiveness, and challenges in this context remain unclear.
Objective: To evaluate the current adoption, effectiveness, and challenges of using large language models in the peer review process of clinical practice guidelines and explore potential improvements.
Methods: A mixed-methods, cross-sectional study will be conducted in three phases. Phase 1 will involve distributing surveys to guideline developers, journal editors, and reviewers to assess current LLM adoption and acceptance. Phase 2 will evaluate the effectiveness of LLMs using selected guideline documents previously assessed by experts (RIGHT and AGREE II as reference standards), measuring accuracy, sensitivity, specificity, and consistency. Phase 3 will involve interviews with guideline developers, journal editors, reviewers, and AI ethics experts to explore challenges in LLM application and potential improvements in guideline peer review.
Results: Data collection and analysis are ongoing. Comprehensive results will be presented at the upcoming congress.
Discussion: This study will provide critical insights into the practical role of LLMs in guideline peer review, including their strengths, limitations, and areas for future improvement. Findings will inform best practices and recommendations for integrating AI tools into guideline development processes.
The author gratefully acknowledges the support of the K.C. Wong Education Foundation, Hong Kong.
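Phase 2's accuracy, sensitivity, and specificity can be derived from a confusion matrix of LLM-flagged issues against the expert reference assessments. A minimal sketch, assuming item-level binary judgments; the counts here are hypothetical:

```python
# Hypothetical sketch (not the study's code): standard diagnostic
# performance metrics from true/false positive and negative counts,
# where the expert RIGHT/AGREE II assessments serve as the gold standard.
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # true issues caught
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,  # clean items passed
    }
```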
Paper Number
442
Biography
Ye Wang is an MPH student at the School of Public Health, Lanzhou University. Her research focuses on clinical guideline development, AI-accelerated evidence synthesis, and AI-assisted COI management. With a growing interest and experience in guideline development, her research aims to enhance the transparency and consistency of guidelines and promote AI's role in evidence synthesis and conflict of interest management.
Dr Marta Souto Maior
Coordinator
Conitec
Platform for consulting drug recommendation in Clinical Practice Guidelines
Abstract
Background: The National Committee for Health Technology Incorporation (Conitec) advises the Brazilian Ministry of Health (MoH) on the development of Clinical Practice Guidelines (CPGs). These documents establish criteria for diagnosing a disease or health problem; for treatment and clinical control mechanisms; and for the monitoring and verification of therapeutic results, to be overseen by managers of the Unified Health System (SUS).
Objective: To describe the development of a platform for consulting drug recommendations in Clinical Practice Guidelines.
Methods: A descriptive qualitative study of the platform's development.
Results: For each CPG published by the MoH through December 2024, we collected its title, International Classification of Diseases codes, drug recommendations, and publication date. All data were organized in Excel 2010 and later exported to Microsoft Power BI to create the platform. The platform is now being improved so that it can be made available on the Conitec website.
Conclusions: The platform will improve the access of patients, health managers, and health professionals to information on which drugs are recommended in each guideline and which guidelines provide recommendations on the care of each disease.
Paper Number
517
Biography
Pharmacist. MSc and PhD in Public Health. Works at Conitec.
Mr Gregor Wenzel
German Cancer Society
Evaluation of AI-Generated Summaries from Evidence Tables for Evidence-Based Guidelines in the German Guidelines Program in Oncology
Abstract
Objective: The development of S3 guidelines requires meticulous evidence retrieval and synthesis. While comprehensive evidence tables are available, their interpretation is often left to guideline groups, imposing a significant resource burden. This study evaluates AI-generated summaries through a quantitative assessment of endpoint accuracy and a qualitative evaluation of text quality, including plausibility and usability.
Methods: Two AI models, Claude Sonnet 3.5 and OpenAI o3-mini, processed 30 randomly selected evidence tables from 17 clinical guidelines. Two assessors evaluated recognized, erroneous, and hallucinated endpoints quantitatively, as well as plausibility and usability qualitatively on a 3- and 5-point scale, respectively. Summary length was also analyzed for its potential impact on readability and interpretation.
Results: OpenAI o3-mini recognized more endpoints (92.8%) than Claude (53.2%) and had fewer erroneous extractions (0.1% vs. 2.8%). Only Claude hallucinated any endpoints (1.8%). On average, Claude’s summaries were shorter (243.5 vs. 630.4 words) and slightly more plausible (1.32 vs. 1.74). Usability scores were comparable (2.15 vs. 2.26), though differences in summary length may influence qualitative assessments.
Conclusion: OpenAI o3-mini excelled in endpoint recognition with minimal errors, while Claude generated summaries that assessors found slightly more plausible. Both models show promise for aiding evidence interpretation, but refinements are needed to optimize usability for guideline development.
Paper Number
288
Biography
Gregor Wenzel is a theoretical biologist and medical writer. He has been working in the German Guidelines Program in Oncology, assisting in guideline digitalization and serving as a methodologist.
Mr Dianchun Liu
China
Beijing University of Chinese Medicine
AI-based Recommendations Map (RecMap) for Traditional Chinese Medicine (TCM) Treatment of Diabetes: Design, Development and Dissemination
Abstract
Introduction
Approximately 828 million adults worldwide were affected by diabetes in 2022, with a notable trend towards younger age groups and broader prevalence. Because the quality of evidence in the field of traditional Chinese medicine (TCM) varies, inconsistent and conflicting recommendations often occur between guidelines. Our aim is to develop an AI-integrated RecMap for TCM treatment of diabetes covering all stages of disease progression.
Methods
The first phase is planning and investigation: we will explore the needs and expectations of both clinicians and patients, and a steering committee will be set up to specify the scope and obligations of the platform. The second is the development phase: comprehensive and systematic screening will be conducted across a wide range of databases, such as Embase, PubMed, the Chinese Medical Association Guide, and Wanfang. Guidelines will be evaluated using AGREE II and AGREE-REX, and the recommendations extracted from them will be evaluated using the GRADEpro infrastructure. In the AI training stage, the AI system will be integrated into the website to provide accurate responses to questions based on the guidelines and recommendations hosted there. Finally, dissemination: strategies for debugging and testing will be developed collaboratively with stakeholders, and feedback from clinicians and patients will be used to optimize and promote the dual-mode display. In the clinical trial and promotion phase, feedback from patients and clinicians will further optimize and promote the website.
Discussion
The AI-based TCMDia-RecMap will significantly enhance the utilization of reliable guidelines among clinicians, patients, and policymakers, thereby optimizing evidence-based diabetes management.
Paper Number
122
Biography
Dianchun Liu, an undergraduate at Beijing University of Chinese Medicine, is committed to research on diabetes, cancer, and gastrointestinal diseases. Proficient in bioinformatics, artificial intelligence, and evidence-based medicine, Liu has published studies as the first author in journals such as the World Journal of Gastrointestinal Oncology and Chinese General Practice. These works are crucial steps in advancing research in these medical areas.
Yishan Qin
China
Lanzhou University
Application and Exploration of Large Language Models in the Dissemination and Implementation of Infertility Guidelines
Abstract
Infertility is rising among reproductive-age populations and is now a major global public health issue. The dissemination and implementation of clinical practice guidelines is key to promoting medical equity, improving the quality of care, and solving public health problems. This research uses infertility guidelines as an example to study the application of large language models in adapting guideline content and supporting its dissemination and implementation.
We first collected and processed key recommendations from infertility-related guidelines. Then, we used large language models to create explanatory and promotional texts and videos. Next, medical experts and patient representatives checked the accuracy and readability of these materials. The early results show that these models can quickly produce information in many formats. This information is easy for patients from different cultural and educational backgrounds to access, thus boosting guideline accessibility. These models also have the potential to offer personalized patient information and support healthcare providers in low-resource settings.
This new method provides a promising way to enhance the efficiency and coverage of guideline dissemination and implementation, promoting fair access to evidence-based recommendations. However, ensuring the accuracy of generated content and adhering to clinical and legal requirements are crucial for future development and use.
Paper Number
450
Biography
Qin Yishan is a PhD candidate at the School of Basic Medicine, Lanzhou University. Her research interests are traditional medicine and guideline methodology.
Dr Danielle Pollock
Senior Research Fellow
Health Evidence Synthesis, Recommendations And Impact (hesri), School Of Public Health, University Of Adelaide
Can evidence and gap maps improve guideline efficiency?
Abstract
Background: The development of trustworthy guidelines requires extensive resources. There is a need to improve current workflows while maintaining the high standards demanded by clinical practice. A key area for improving efficiency is the conduct and reporting of evidence synthesis. Traditionally, guideline developers conduct individual searches for each prioritized research question; this approach is tedious and inefficient, often resulting in the same study being screened for inclusion multiple times. Our team conducted an evidence and gap map (EGM) to improve workflow efficiencies in the development of the Australian Motor Neurone Disease (MND) Guideline. By conducting one search and categorizing the evidence base, we propose that this can improve guideline workflow.
Objective: To discuss our process of conducting an EGM to assist in question prioritization, evidence searching, screening and conduct.
Methods: This EGM was conducted according to Campbell guidance and JBI guidance for scoping reviews. It was designed with people living with MND, clinicians, and researchers. The EGM was conducted prior to prioritization of research questions.
Results: Our EGM is currently underway and will be completed by GIN 2025. We will discuss the benefits, challenges, feasibility, implications, and recommendations of conducting an EGM.
Discussion: EGMs could provide a foundation for transparent clinical practice guidelines to be developed more efficiently.
Paper Number
126
Biography
Dr Danielle Pollock is a Research Fellow at HESRI (Health Evidence Synthesis, Recommendations and Impact). She developed the JBI Scoping Review Network and is the chair of the JBI Scoping Review methodology group and the GIN ANZ working group.
Ms Yanfang Ma
China
Vincent V.C. Woo Chinese Medicine Clinical Research Institute, School of Chinese Medicine, Hong Kong Baptist University
LLMs-CMPs collaboration for clinical decision-making to promote guidelines implementation in Chinese medicine
Abstract
Background: Traditional Chinese Medicine (TCM), a complementary therapy used widely around the world, offers diverse therapeutic approaches such as acupuncture, herbal medicine, and tuina, rooted in centuries of holistic practice. However, a limitation of TCM is its non-standardized, individualized approach to diagnosis and treatment, which often results in variability in decision-making. With the rapid growth of clinical evidence and evidence-based clinical guidelines, Chinese Medicine Practitioners (CMPs) face challenges in integrating them for consistent care. Artificial intelligence, particularly Large Language Models (LLMs), offers a potential solution: LLMs can self-train on extensive datasets, providing real-time, evidence-based recommendations to support clinical decision-making.
Objective: This study aims to evaluate the consistency between LLMs, CMPs, and LLM-CMP collaboration in clinical decision-making. We seek to understand how LLMs, learning guideline recommendations in real-time, can enhance decision-making consistency and promote the implementation of guidelines in TCM.
Methods: A cross-sectional observational study will compare the diagnosis and treatment decisions made by LLMs, CMPs, and their collaboration on standardized clinical gastroenterology routine cases. LLMs will provide self-trained recommendations, CMPs will apply clinical expertise, and LLM-CMP collaboration will combine both to make final decisions.
Results: Diagnostic and treatment decisions will be assessed for consistency using Cohen’s κ, targeting κ≥0.8 for routine cases. The performances of different LLM models in collaboration with CMPs will also be quantified.
Discussion: The study is ongoing, and the results will be presented at the GIN Conference.
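The planned consistency analysis relies on Cohen's κ. A minimal sketch of the statistic for two decision sequences follows; the labels and data are hypothetical, not from the study:

```python
# Illustrative Cohen's kappa: chance-corrected agreement between two
# raters (here, e.g., an LLM and a CMP) over the same cases.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed proportion of cases where both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[label] * cb[label] for label in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)
```

Values of κ ≥ 0.8, the study's target for routine cases, indicate near-perfect agreement on conventional benchmarks.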
Paper Number
50
Biography
Ms. Ma has focused on evidence-based medicine, systematic reviews, and the development of clinical practice guidelines since 2016. She joined the Chinese EQUATOR Centre at Hong Kong Baptist University in August 2022 and is also interested in the development of reporting guidelines (data sharing and traditional Chinese medicine). Ms. Ma has authored or co-authored over 40 articles in peer-reviewed journals and contributed to more than five books on reporting guidelines, GRADE applications, and evidence-based assessment of Chinese Medicine.
Ms Ye Wang
China
Lanzhou University
Assessing the Effects of Large Language Models on Guideline Quality and Efficiency: An Interrupted Time Series Approach
Abstract
Background: Large language models (LLMs) such as ChatGPT and Claude have emerged as promising tools to enhance guideline development by potentially improving guideline quality and efficiency.
Objective: To evaluate the impact of implementing LLMs on the quality and efficiency of clinical practice guideline (CPG) development.
Methods: We will conduct an interrupted time series (ITS) analysis to compare guideline quality and efficiency before and after the implementation of LLMs in guideline development processes. Guidelines developed between 2020 and 2024 will be collected from databases including G-I-N, MEDLINE, Embase, and Web of Science. We will assess guideline quality using the Reporting Items for Practice Guidelines in Healthcare (RIGHT) checklist and the Appraisal of Guidelines for Research and Evaluation II (AGREE II). Efficiency will be assessed based on development time and resource utilization. Segmented regression analyses will quantify changes attributable to LLM implementation.
Results: Data collection and analysis are ongoing. Results will be presented at the upcoming conference.
Conclusion: This study will provide critical evidence on the role of LLMs in enhancing guideline development, potentially informing best practices in guideline methodology.
The author gratefully acknowledges the support of the K.C. Wong Education Foundation, Hong Kong.
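The segmented regression at the heart of an ITS analysis can be illustrated with a minimal sketch. The quarterly quality scores, interruption point, and effect sizes below are invented for illustration; each segment is fitted with ordinary least squares, and the level and slope changes at the interruption are then read off:

```python
def ols_line(xs, ys):
    """Simple least-squares fit of y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Hypothetical quarterly mean guideline-quality scores, 2020-2024;
# interruption (LLM adoption) at t = 10.
t0 = 10
pre = [(t, 50.0 + 0.5 * t) for t in range(t0)]                            # baseline trend
post = [(t, 50.0 + 0.5 * t + 4.0 + 1.0 * (t - t0)) for t in range(t0, 20)]  # level +4, slope +1

a_pre, b_pre = ols_line(*zip(*pre))
a_post, b_post = ols_line(*zip(*post))

level_change = (a_post + b_post * t0) - (a_pre + b_pre * t0)  # jump at the interruption
slope_change = b_post - b_pre                                  # change in trend
print(level_change, slope_change)
```

In the study itself the segmented regression would be fitted as a single model with pre/post indicator and interaction terms and would include uncertainty estimates; this sketch only shows the quantities of interest.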
Paper Number
440
Biography
Ye Wang is an MPH student at the School of Public Health, Lanzhou University. Her research focuses on clinical guideline development, AI-accelerated evidence synthesis, and AI-assisted COI management. With a growing interest and experience in guideline development, her research aims to enhance the transparency and consistency of guidelines and promote AI's role in evidence synthesis and conflict of interest management.
Dr Natasha Gloeck
South Africa
Senior Scientist
South African Medical Research Council
Promoting efficiency in an evidence response service towards advancing universal health coverage (UHC) in South Africa
Abstract
Background
The Evidence to Decision (E2D) Initiative builds on a decade of engagement with academic and government partners to strengthen healthcare recommendations through evidence synthesis for Universal Health Coverage (UHC) in South Africa. E2D advances the partnership through clear workplans and funding to ensure timely, responsive evidence synthesis and methodological support. A key component of this initiative involves leveraging technology to streamline the evidence-request process and improve overall efficiency.
Aim
To develop a tailored database to meet specific needs of the E2D evidence-response system and enhance review request processes.
Methods
Previously, the service operated through email requests, informal discussions and manual spreadsheet updates, making real-time updates cumbersome and dependent on several people for maintenance. The current approach utilizes Microsoft Forms for request submissions where updates still rely on manual input. However, formalization of the evidence response service through E2D highlighted the need for a more robust platform supporting real-time updates and multi-user accessibility.
Results
We have developed and are piloting a new platform, built on REDCap, to manage requests for evidence from the NDoH. It links databases, streamlines the allocation of available reviewers, and enables real-time updates on review product progress. Further testing is ongoing and includes additional modules such as report generation.
Discussion
Harnessing technology will enhance efficiency, improve reviewer capacity management, and minimize the risk of overlooked requests. We further anticipate this serving as a pilot project to optimize processing for a planned Health Technology Assessment agency, supporting the transition towards UHC in South Africa.
Paper Number
374
Biography
Tasha is a Senior Scientist in the Health Systems Research Unit at the SAMRC. She holds an MBChB (UP), DTM&H (UP), MSCE (UP) and a PG Dip in Health Economics (UCT). She is currently pursuing a PhD in Public Health. Her special interests include evidence-based health care, primary health care, evidence synthesis, evidence-informed decision-making, and clinical practice guideline methodology. Tasha helps to co-ordinate the South African GRADE Network and co-leads Goal 2 of the SAMRC/NDoH E2D project. Tasha is passionate about implementing training and research that positively impacts the lives of the people of South Africa and other low- and middle-income countries.
Ms Kinlabel Okwen Tetamiyaka Tezok
Software Engineer
Effective Basic Services (eBASE) Africa
Harnessing the Transferability Toolkit for Guideline Adaptation in Local Contexts
Abstract
Background
Most solutions developed in the Global North are tailored to their specific contexts. A guideline that is effective in the United Kingdom may not yield the same results in Africa due to diversities in culture, infrastructure, healthcare systems, and socioeconomic factors. This problem highlights the need to adjust guidelines to fit different local contexts to yield better health outcomes.
Objective
This study explores how the education-based transferability toolkit can assess the feasibility of adapting healthcare guidelines across diverse settings using machine learning.
Methods
We apply the transferability toolkit developed by eBASE Africa, leveraging Classification and Regression Trees (CART) and Natural Language Processing (NLP) to predict guideline adaptability. The model evaluates five key variables (relevance, complexity, cost, average importance, and impact) to classify guidelines as highly transferable, moderately transferable, or not transferable in a given context.
Results
A transferability threshold of 69% was identified for educational strands among stakeholders in education. We argued that transferability holds when there is high relevance, low complexity, low cost, and high importance. Building on these results, we expect this tool to be highly applicable to guideline adaptation.
Discussion for scientific abstracts
The study explores how the transferability toolkit can be used to adapt healthcare guidelines to different contexts. This approach supports treating guideline development as living evidence and makes guidelines better suited to context-specific needs. Living guidelines supported by the transferability tool would reduce the effort and cost of repeatedly developing guidelines across the globe.
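As a rough illustration of how the five variables might feed a transferability classification, the rule cascade below is a hand-written stand-in for the trained CART model; the scoring formula, the class boundaries, and the reuse of the reported 69% figure as a threshold are assumptions for illustration, not the actual model:

```python
def classify_transferability(relevance, complexity, cost, importance, impact):
    """Toy classification from the five toolkit variables (each scaled 0-1).

    High relevance, importance and impact raise the score; high complexity
    and cost lower it. Thresholds are illustrative, not the trained CART.
    """
    score = (relevance + importance + impact + (1 - complexity) + (1 - cost)) / 5
    if score >= 0.69:  # threshold reported for educational strands
        return "highly transferable"
    if score >= 0.5:
        return "moderately transferable"
    return "not transferable"

# A guideline with high relevance/importance, low complexity/cost:
print(classify_transferability(0.9, 0.2, 0.3, 0.8, 0.7))
```

A real CART model would learn variable-specific split points from labeled examples rather than averaging the inputs, but the same inputs and output classes apply.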
Paper Number
224
Biography
Kinlabel is a tech professional specializing in mobile development and machine learning in evidence-based practices. She leads the eBASE Connect app development at iCode Abakwa, where she helps shape app design and architecture. Her work includes a paper on improving livelihoods for people with disabilities in Cameroon through evidence-based toolkits. She is currently part of the DESTINY development team, working to integrate the transferability toolkit into their DEST tool. The toolkit predicts the transferability of interventions across different contexts, enhancing adaptive learning and evidence utilization.
Mr Haodong Li
China
Master Candidate
Lanzhou University
Evaluation of Compliance of Methodological Quality Compared with LLM with AMSTAR 2 Tool: A Cross-Sectional Survey
Abstract
With the increasing application of Large Language Models (LLMs) in the medical field, their potential in assessing the methodological quality of systematic reviews has garnered significant attention. This study aims to compare the assessment results of methodological quality between three LLMs (Kimi, DouBao, and DeepSeek) and human evaluation in 73 systematic review articles using the AMSTAR 2 tool. The study is ongoing, with all tests expected to be completed and results presented before the conference.
Background:
The methodological quality of systematic reviews is crucial, and the AMSTAR 2 tool is widely used for evaluation. This study compares the performance of three LLMs (Kimi, DouBao, DeepSeek) and two human evaluators in assessing 73 systematic reviews.
Methods:
Each review is assessed three times by LLMs and humans. Primary indicators include overall consistency score (OCS), OCS for each item, testing time, LLM stability, and intraclass correlation coefficient (ICC).
Results:
The study is ongoing, and complete results will be available before the conference. Preliminary findings show differences in overall OCS between LLMs and humans. Detailed analysis will reveal LLM performance in different dimensions, efficiency differences, and consistency.
Conclusion:
This study will provide empirical evidence on LLMs' strengths and limitations in medical literature evaluation, guiding future research and practice.
The author gratefully acknowledges the support of K.C. Wong Education Foundation, Hong Kong.
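Assuming the overall consistency score (OCS) is read as the proportion of AMSTAR 2 items rated identically by an LLM and a human evaluator (the abstract does not define it precisely), it can be computed as follows; the ratings are hypothetical:

```python
def overall_consistency(llm_ratings, human_ratings):
    """Proportion of items with identical ratings (one reading of OCS)."""
    assert len(llm_ratings) == len(human_ratings)
    return sum(a == b for a, b in zip(llm_ratings, human_ratings)) / len(llm_ratings)

# Hypothetical ratings on the 16 AMSTAR 2 items ("yes"/"partial yes"/"no").
llm = ["yes"] * 10 + ["no"] * 4 + ["partial yes"] * 2
human = ["yes"] * 9 + ["no"] * 5 + ["partial yes"] * 2
print(overall_consistency(llm, human))
```

In the study, such item-level scores would be aggregated over the 73 reviews and the three repeated assessments, alongside the ICC for inter-rater reliability.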
Paper Number
511
Biography
I'm from the School of Public Health, Lanzhou University. My major is Epidemiology and Health Statistics, and my research focuses on evidence-based medicine and chronic epidemiology. Currently, I'm working on a project that combines artificial intelligence and evidence-based medicine.
Dr Xiaomei Yao
Mcmaster University
The Role of COVIDENCE: An AI-Based Tool for Title and Abstract Screening in A Breast Cancer Evidence-Based Clinical Practice Guideline
Abstract
Background: Developing systematic review (SR)-based, high-quality cancer clinical practice guidelines (CPGs) typically requires two years without any assistance from artificial intelligence (AI).
Objective: To compare the performance of a newly introduced AI-assisted title and abstract screening (Stage I) in Covidence with fully manual screening, using retrospective data from an SR supporting an already-published breast-cancer CPG.
Methods: In a SR comprising 8,774 articles, each article was assessed for relevance and final inclusion through manual review. From this dataset, three article subsets (n=500, 1000, and 2000) were randomly selected to run 30 independent Stage I, AI-assisted trials for each subset. The primary outcome of each trial is workload savings (the proportion of articles not requiring manual screening) at AI-assisted identification of 95% and 100% relevant articles, and 100% finally-included articles. The secondary outcome is missed articles (number of finally-included articles missed upon identifying 95% relevant articles).
Results: To date, 10 trials in each of the first two subsets were completed. At the identification of 95%, 100% (relevant) and 100% (finally included) articles, mean [standard deviation] workload savings are 37.1% [14.1%], 25.9% [15.1%], 57.8% [17.9%] (n=500) and 27.6% [18.2%], 13.5% [13.4%], 49.7% [28.6%] (n=1000), respectively. Workload savings differed significantly (p=0.038) between n=500 and n=1000 trials at the identification of 100% relevant articles. One missed article in 1 trial for the subset of n=500 and 2 missed articles in 2 trials for n=1000 were noted.
Discussion: AI assistance in Covidence demonstrates promise in improving the efficiency of Stage I screening. Complete data will be available for discussion by May 2025.
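Workload savings, as defined in the abstract (the proportion of articles not requiring manual screening once the AI-ranked list has surfaced the target share of relevant articles), can be sketched as follows; the ranking is synthetic, not trial data:

```python
import math

def workload_savings(ranked_relevance, recall_target):
    """Fraction of articles that need not be screened manually once the
    AI-ranked prefix has captured `recall_target` of the relevant ones.

    `ranked_relevance` lists articles in the AI's priority order,
    with 1 = relevant and 0 = not relevant (known from manual review).
    """
    total_relevant = sum(ranked_relevance)
    needed = math.ceil(recall_target * total_relevant)
    found = 0
    for i, rel in enumerate(ranked_relevance, start=1):
        found += rel
        if found >= needed:
            return 1 - i / len(ranked_relevance)
    return 0.0

# Hypothetical ranking of 20 articles, relevant ones mostly ranked early.
ranking = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(round(workload_savings(ranking, 0.95), 3))
```

The study's trials repeat this measurement 30 times per subset because the AI's ranking, and hence the savings, varies from run to run.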
Paper Number
117
Biography
Dr. Xiaomei Yao is the Associate Director for Quality and Methods at the Program in Evidence-Based Care (PEBC), Ontario Health (Cancer Care Ontario). She is a part-time faculty member at the Department of Health Research Methods, Evidence, and Impact at McMaster University, Canada, and a former member of the GIN/NA Steering Group. Dr. Yao is the Section Editor of "Epidemiology and Statistics" for Surgical Oncology and an Associate Editor of GIN's journal, CPHG.
Dr Gabriella Facchinetti
Istituto Superiore Di Sanità
Harnessing Artificial Intelligence for guideline question and recommendations development: a mapping review protocol
Abstract
Background: Clinical practice guidelines (CPG) traditionally rely on expert panels to formulate key questions and recommendations. Explainable Artificial Intelligence (XAI) could support this process, making it faster and less susceptible to undue influence.
Objective: To identify and categorize available literature across various disciplines on the use of generative AI in developing CPG and recommendations.
Methods: A systematic mapping review following the PRISMA-ScR guideline will be conducted to identify and categorize available literature across various disciplines on the use of generative AI in developing CPG and to compare it with the traditional expert-driven process.
The inclusion criteria will cover AI-generated clinical questions and recommendations across all healthcare fields. Major CPG databases and additional sources will be searched from 2020 onward.
The primary outcome is the frequency of AI-generated clinical questions and recommendations, highlighting key clinical areas and pathway phases. Study selection will use Rayyan with independent screening and discussion for disagreements. Data extraction includes author, year, country, population, clinical area, AI model, and guideline development phase. Findings will be presented narratively and in tables based on research questions.
Results: This review will estimate AI’s role in CPGs over the past five years, identifying key clinical areas. To our knowledge, this is the first study on Large Language Models in CPG development, laying the groundwork for future research.
Discussion for scientific abstract: This study will inform policy, improve guideline development, and promote inclusive, diverse, evidence-based healthcare practices. AI could increase efficiency, accuracy, and objectivity, reducing biases and ensuring more reliable, evidence-based recommendations.
Paper Number
284
Biography
Gabriella Facchinetti is a nurse, Senior Researcher at the Italian National Institute of Health (Istituto Superiore di Sanità) in the National Center for Clinical Governance and Care Excellence, and a university lecturer. She is an expert in research on clinical governance, home and community-based care for older adults with chronic, degenerative, and incurable diseases. With extensive experience in healthcare research and education, she is committed to improving care models and enhancing the quality of services for vulnerable populations.
Dr Simon Van Cauwenbergh
Methodologist
WOREL / USP
Adapting guideline recommendations on smoking cessation within a cross-border primary care collaboration: Lessons learned
Abstract
Background: The collaboration between WOREL (Belgium) and NHG (Netherlands) was initiated at the GIN 2022 conference in Toronto. This partnership was formalized with a Memorandum of Understanding in 2023. Recently, both primary care organisations were accredited by the Belgian Centre for Evidence-Based Medicine.
Objective: To present facilitators, barriers and practical considerations when exchanging summaries of evidence (including GRADE SoF tables) and evidence to decision formats for guideline recommendations on smoking cessation in primary care.
Methods: The collaboration focuses on the guideline development process on smoking cessation. In 2024, the topic was coincidentally addressed by the other organization, providing a unique opportunity to assess the feasibility of combined cross-border guideline adaptation and adoption (the adolopment procedure). The collaboration involves:
- exchanging methodological details, development processes, search strategies and resources,
- using MagicApp to facilitate the sharing of critically appraised research evidence, rationales and evidence-to-decision frameworks,
- identifying barriers, facilitators, and practical considerations in cross-border guideline development.
Results: During the adolopment process, search strategies, SoF tables and evidence to decision formats are exchanged. Key outcomes will include a bilateral exchange of guideline development methodologies, identification of challenges and facilitators in cross-border collaboration, and documentation of lessons learned.
Discussion: The collaboration will establish a roadmap for the next steps of collaboration, outlining strategies to optimize the exchange and joint development of primary care guidelines in the future.
Paper Number
435
Biography
Ton Kuijpers is an epidemiologist at the Dutch College of General Practitioners and co-chair of the Dutch GRADE Network.
Simon Van Cauwenbergh is a medical doctor and PhD candidate in Physical and Rehabilitation Medicine, and has been working since 2021 for the Belgian Working Group Development of Primary Care Guidelines (WOREL).
Ms Lejla Koco
Guideline Advisor
Stichting PZNL
The potential of AI in assisting palliative care guideline revisions: insights from a pilot study
Abstract
Background:
Artificial Intelligence can potentially assist in different ways during all phases of the guideline development process, such as collection of evidence, formulating recommendations, structuring texts and the writing process. However, integration of AI must align with established guideline development frameworks to ensure transparency and reliability. Despite growing use of AI in medicine, its role in palliative care guideline development remains limited.
Objective:
This study explores how AI can be applied in guideline development, focusing on various prompt engineering techniques. We evaluate multiple AI-generated texts on their content, quality and expected usefulness, and assess the required resources, human effort and potential benefits for guideline developers.
Methods:
We examined various AI applications by using ChatGPT 4o for text generation with several prompt variations for generating new texts. Different prompt structures were tested to optimize the AI-generated output. The content of the generated AI texts was evaluated through text analysis and expert opinions.
Results:
Well-structured prompts significantly improved AI-generated content quality, ensuring coherence and relevance. By providing reference materials for AI, as input, the quality of AI-generated texts improved and is expected to reduce undesired hallucinated output. These AI applications reduced drafting time and enhanced content consistency of guideline recommendations and considerations. However, human checks remained crucial for maintaining methodological rigor and clinical accuracy.
Discussion:
Future research should focus on refining AI applications, integrating them into structured workflows, and ensuring alignment with established guideline development methodologies. Responsible AI implementation will require ongoing evaluation and adaptation to maintain scientific integrity and trustworthiness.
Paper Number
464
Biography
Lejla Koco, MSc, is a guideline advisor specializing in Dutch palliative care guidelines. With expertise in palliative care guideline development, she focuses on enhancing evidence-based practices to improve patient care. Her work involves developing, revising, and implementing guidelines to ensure high-quality palliative care standards in the Netherlands. Passionate about innovation, she explores the role of AI in optimizing guideline processes.
Prof Dr Janine Vetsch
OST
A Systematic Comparison of Data Extractions Using a Large Language Model (Elicit) and Human Reviewers
Abstract
Background: Elicit is an artificial intelligence tool which may automate data extraction for the conduct of evidence synthesis and guidelines. However, the tool’s performance and accuracy are unclear and require an independent assessment.
Objective: We aimed to compare data extractions from randomized controlled trial reports done by Elicit and by human reviewers.
Methods: We sampled 20 randomized controlled trial reports from which data had been extracted manually by a human reviewer. We assessed the variables study objectives, sample characteristics and size, study design, intervention, outcomes measured, and intervention effects, and classified the results as "deviating extractions", "partially equal with less information", or "equal to or more information".
Results: Data extractions by Elicit were equal to human extractions in 49% of all variables across all twenty studies, partially equal in 46%, and deviating in 5%. Across all variables, Elicit extracted information equal to or more than a human reviewer in 1-20 studies (median 11). Only for the variable study design were all extractions (100%) by Elicit equal to those of human reviewers. For the variable intervention effects, extractions by Elicit were equal to human reviewers in only one study (5%).
Discussion for scientific abstract: Elicit extracted data only partly correctly for our predefined variables. Variables like 'intervention effect' or 'intervention' may require a human reviewer to complete the data extraction. Our results suggest that verification by human reviewers is necessary to ensure that all relevant information is captured completely and correctly by Elicit.
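The per-variable classification described in the methods lends itself to a simple tally. The sketch below is illustrative only, assuming one label per extracted variable; the function name and input format are hypothetical, not taken from the study.

```python
from collections import Counter

# The three predefined categories from the abstract's methods.
LABELS = (
    "equal to or more information",
    "partially equal with less information",
    "deviating extractions",
)

def agreement_summary(classifications: list[str]) -> dict[str, float]:
    """Percentage of extracted variables falling into each category."""
    counts = Counter(classifications)
    total = len(classifications)
    return {label: round(100 * counts[label] / total, 1) for label in LABELS}
```

Applied to 100 hypothetical variable-level classifications distributed as in the abstract, the summary would report 49% equal, 46% partially equal, and 5% deviating.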
Paper Number
62
Biography
Magdalena Vogt has been a research associate at the Competence Centre Evidence-based Healthcare (EBHC) at the Institute of Health Sciences since September 2023. She holds a Master's degree in Public Health.
Her activities include service provision, research, and teaching in the field of EBHC. Service and research projects focus on knowledge management and networking in healthcare professions, as well as the transfer of research results into practice to promote EBHC. Her teaching covers research methods and systematic literature searching, along with supervision of bachelor theses. Magdalena and her team have published several articles on these topics.
