Artificial Intelligence in Thyroid Ultrasound: Clinical Performance, Potential Limitations, and Integration into Clinical Practice

Artificial Intelligence for Thyroid Ultrasound: Clinical Performance, Pitfalls, and Practice Integration

Article information

Clin Ultrasound. 2025;10(2):59-68
Publication date (electronic) : 2025 November 30
doi : https://doi.org/10.18525/cu.2025.10.2.59
1Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
2Department of Internal Medicine, Korea Medical Institute, Seoul, Korea
3ThanQ Seoul Center for Thyroid-Head and Neck Surgery & Medicine, Seoul, Korea
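강준석1, 안지현2, 하정훈3
1Graduate School of Engineering and Applied Sciences, Harvard University
2Department of Internal Medicine, KMI Korea Medical Institute
3Department of Otorhinolaryngology, ThanQ Seoul Clinic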
Address for Correspondence: Jeong Hun Hah, M.D., Ph.D. ThanQ Seoul Center for Thyroid-Head and Neck Surgery & Medicine, 408 Teheran-ro, Gangnam-gu, Seoul 06192, Korea Tel: +82-1522-8775, Fax: +82-2-563-5075 E-mail: jhunhah@gmail.com
Received 2025 September 16; Revised 2025 October 27; Accepted 2025 November 13.

Abstract

The use of artificial intelligence (AI) in thyroid ultrasound is changing endocrine imaging by improving diagnostic accuracy and making the assessment of thyroid nodules more consistent. This review examines the current applications, technological approaches, clinical performance, challenges, and future directions of AI-based systems in thyroid ultrasound. Recent studies suggest that AI holds significant potential for automated nodule detection, classification, and risk stratification. Deep learning models, particularly convolutional neural networks, have reported diagnostic accuracies exceeding 90% in distinguishing benign from malignant nodules, often matching or surpassing human radiologist performance. Current applications include Thyroid Imaging Reporting and Data System-based classification, lymph node metastasis prediction, and real-time diagnostic assistance. However, reproducibility concerns, validation and standardization requirements, clinical workflow integration, and regulatory considerations remain significant barriers to widespread adoption. Future developments should focus on multimodal integration, explainable AI systems, and prospective clinical trials to realize the potential of AI in thyroid diagnostics.

INTRODUCTION

Thyroid nodules represent one of the most common endocrine disorders, affecting up to 68% of the general population when detected by high-resolution ultrasound [1]. Accurate differentiation between malignant and benign thyroid nodules is crucial for appropriate patient management, as most are benign while only 7–15% prove malignant [2,3]. With rapid advances in medical imaging, detection rates of thyroid disease and cancer have risen markedly [4,5].

Ultrasound remains the first-line modality because of its real-time capability, safety, and soft-tissue contrast [4,6]. Yet traditional interpretation suffers from inter-observer variability and operator dependency [4]. Inexperienced clinicians are prone to misclassification and unnecessary fine-needle aspiration (FNA) biopsies [4,7].

The growing incidence of thyroid cancer and clinical workload have created an urgent need for objective and efficient diagnostic tools [8]. Artificial intelligence (AI) provides advanced pattern-recognition capabilities for ultrasound imaging [9]. By leveraging deep-learning algorithms, AI serves as a powerful adjunct to enhance diagnostic efficiency and consistency [5,10].

AI applications have evolved from early computer-aided detection or diagnosis (CAD) systems to sophisticated models capable of automated nodule characterization, risk stratification, and decision support [11]. Collectively, these systems improve accuracy and standardization while reducing inter-observer variability [4,6,10].

This review provides an integrated overview of AI technologies, clinical performance, regulatory frameworks, and future opportunities in thyroid ultrasound (Fig. 1).

Figure 1.

Schematic workflow of artificial intelligence-assisted thyroid ultrasound.

AI TECHNOLOGIES IN THYROID ULTRASOUND

Machine-Learning (ML) Models

ML constitutes the foundation of AI applications in thyroid ultrasound. Early studies employed classifiers such as support-vector machines (SVMs) and random forests (RFs) to distinguish malignant from benign nodules [10].

RF models aggregate multiple decision trees, which mainly reduces variance relative to a single tree, and have demonstrated robustness in high-dimensional imaging data [11]. Compared with conventional manual feature assessment, ML-based systems markedly improve diagnostic performance [10,12].
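The aggregation step of a random forest can be illustrated with a minimal sketch. It shows only the majority vote over already-trained trees, not tree training or bootstrap sampling. The three rule-based "trees", the feature names, and the thresholds are hypothetical placeholders chosen for illustration and are not taken from any cited study.

```python
from collections import Counter

def majority_vote(trees, features):
    """Return the class label predicted by most trees in the ensemble."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical single-rule "trees"; real forests learn deeper trees from data.
trees = [
    lambda f: "malignant" if f["margin_irregularity"] > 0.5 else "benign",
    lambda f: "malignant" if f["echogenicity"] < 0.3 else "benign",
    lambda f: "malignant" if f["microcalcification_count"] >= 3 else "benign",
]

# Hypothetical nodule descriptor: two of the three trees vote "malignant".
nodule = {"margin_irregularity": 0.7, "echogenicity": 0.4, "microcalcification_count": 4}
label = majority_vote(trees, nodule)
print(label)
```

Because each tree sees the data differently, an error made by one tree can be outvoted by the others, which is the intuition behind the ensemble's stability.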

Deep-Learning Architectures

Convolutional neural networks (CNNs)

CNNs form the backbone of modern AI for thyroid ultrasound. They extract hierarchical spatial features directly from raw images, reducing the need for handcrafted feature engineering [8,10].

Widely used architectures, including ResNet, DenseNet, VGG, and EfficientNet, offer differing trade-offs between depth and efficiency [13,14]. ResNet-50, for example, achieved an F1-score of 92% in nodule classification [13], and ensemble approaches further improved performance. EfficientNet-B4 excelled in histopathological classification of thyroid carcinomas [14,15].

Image segmentation and feature extraction

Advanced CNNs enable automated region of interest (ROI) segmentation and quantitative feature extraction [16]. They consistently identify hypoechoic patterns and irregular margins as malignant features [17-19]. Training on heterogeneous datasets mitigates variability across scanners and operators, enhancing robustness.

Vision Transformers (ViTs) and Advanced Architectures

Emerging vision-transformer architectures capture global spatial relationships through attention mechanisms [20]. ViTs have achieved performance comparable to CNNs in complex diagnostic tasks.
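The attention mechanism mentioned above can be sketched as scaled dot-product self-attention over image-patch embeddings. This is a simplified illustration under stated assumptions: the patch vectors are toy numbers, and the queries, keys, and values are the raw embeddings themselves, whereas an actual ViT applies learned linear projections and multiple heads.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V, row by row."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted mix of all value vectors (global context).
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical patch embeddings (toy values, not real ultrasound features).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(patches, patches, patches)
```

Every patch attends to every other patch in a single step, which is how the architecture relates distant image regions without stacking many convolutional layers.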

Multi-channel CNNs attain 90.9% accuracy in multi-class thyroid disease classification, while multi-scale detection networks reach 97.5% accuracy in nodule detection [4,20].

AI-Assisted Systems: Real-Time CAD, TI-RADS and K-TIRADS Integration, and Multimodal Approaches

AI-assisted systems combine algorithmic analysis with real-time ultrasound acquisition. CAD platforms analyze live ultrasound feeds and immediately highlight suspicious nodules, improving inexperienced physicians’ diagnostic accuracy from 73.82% to 76.44% (Table 1) [6,21].

Representative AI models and performance in thyroid ultrasound

Thyroid Imaging Reporting and Data System (TI-RADS)-based AI automatically classifies nodules according to American College of Radiology (ACR) TI-RADS criteria, achieving area under the curve (AUC) values around 0.91 and outperforming junior radiologists [18].
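The final step of such a classifier, mapping summed feature points to an ACR TI-RADS level, can be sketched as follows. The point bands and the fine-needle aspiration size thresholds follow the ACR TI-RADS scheme as it is commonly summarized and should be checked against current ACR guidance before any use. Treating a total of 1 point as TR1 is an assumption of this sketch, and the function names are illustrative rather than drawn from any cited system.

```python
def acr_tirads_level(points: int) -> str:
    """Map total ACR TI-RADS feature points to a risk level."""
    if points < 0:
        raise ValueError("points must be non-negative")
    if points <= 1:
        return "TR1"  # 0 points; a total of 1 is treated as TR1 (assumption)
    if points == 2:
        return "TR2"
    if points == 3:
        return "TR3"
    if points <= 6:
        return "TR4"
    return "TR5"      # 7 points or more

# Size thresholds (cm) at or above which FNA is suggested; TR1 and TR2 have none.
FNA_THRESHOLD_CM = {"TR3": 2.5, "TR4": 1.5, "TR5": 1.0}

def fna_recommended(level: str, max_diameter_cm: float) -> bool:
    """Return True when the nodule meets the size threshold for its level."""
    threshold = FNA_THRESHOLD_CM.get(level)
    return threshold is not None and max_diameter_cm >= threshold

level = acr_tirads_level(5)
print(level, fna_recommended(level, 1.6))
```

In an AI pipeline the network would supply the per-feature points (composition, echogenicity, shape, margin, echogenic foci); the deterministic mapping above is what keeps the output aligned with the reporting standard.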

To reflect regional practice, the Korean Thyroid Imaging Reporting and Data System (K-TIRADS) provides analogous stratification standards widely adopted in East Asia [22,23]. Integration of AI with both ACR TI-RADS and K-TIRADS enhances consistency across international centers.

Commercial implementations—Koios DS and See-Mode Technologies—illustrate successful translation of research into clinical products [24,25]. Koios DS employs semi-automated ROI selection and ensemble CNNs, whereas See-Mode offers fully automated detection and reporting (Table 2).

Comparative features of FDA-cleared thyroid-ultrasound AI systems

Multimodal integration further extends diagnostic depth by fusing B-mode, Doppler, and elastography data with clinical variables [26,27]. Collectively, these AI-assisted solutions standardize assessment, reduce inter-reader variability, and improve efficiency in thyroid imaging workflows [28].

CLINICAL APPLICATIONS AND PERFORMANCE

Diagnostic Accuracy and Nodule Classification

AI-assisted ultrasound systems demonstrate substantial gains in diagnostic accuracy, often exceeding 90% and reaching >99% in validation cohorts [6,29,30]. Detection accuracies above 97% and AUCs approaching 0.985 have been reported [4].

Meta-analyses report a pooled sensitivity of 0.88 and a pooled specificity of 0.81 for benign-malignant differentiation [9]. Integration of AI into clinical workflows benefits both junior and senior radiologists by reducing inter-observer variability and false-positive/negative rates [6,10,17].
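To show what the pooled estimates imply for an individual patient, the sketch below derives likelihood ratios and the diagnostic odds ratio from sensitivity 0.88 and specificity 0.81, then converts a pre-test probability into a post-test probability. The 10% pre-test probability is an assumed value within the 7–15% malignancy range quoted in the Introduction. Ratios computed this way from the pooled point estimates can differ slightly from ratios pooled directly in a meta-analysis.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR, diagnostic odds ratio)."""
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    return plr, nlr, plr / nlr

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply Bayes' rule in odds form: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

plr, nlr, dor = likelihood_ratios(0.88, 0.81)

# Assumed pre-test probability of malignancy (illustrative only).
pre_test = 0.10
after_positive = post_test_probability(pre_test, plr)
after_negative = post_test_probability(pre_test, nlr)
print(f"PLR={plr:.2f} NLR={nlr:.2f} DOR={dor:.1f}")
print(f"post-test probability: positive={after_positive:.3f}, negative={after_negative:.3f}")
```

Under these assumptions a positive AI result raises the probability of malignancy from 10% to roughly one in three, while a negative result lowers it to under 2%.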

Lymph Node Metastasis Prediction

Cervical lymph node assessment

In papillary thyroid carcinoma, ultrasound-based AI achieves a pooled sensitivity of 0.80 and a pooled specificity of 0.83 for detecting cervical lymph node metastasis, surpassing physicians’ sensitivity of 0.51 [3,31].

Central compartment evaluation

For central lymph-node metastasis prediction, AI models yield AUC values of 0.84–0.89, supporting surgical planning decisions [31,32]. Such applications extend AI’s role from diagnosis to therapeutic guidance.

Risk Stratification and Therapeutic Decision Support

Radiomic models quantify imaging biomarkers to predict tumor invasiveness and nodal metastasis [6]. AI-assisted TI-RADS/K-TIRADS scoring reduces the rate of unnecessary FNA biopsies from 61.9% to 35.2% while maintaining accuracy [33].
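The size of the reported change in unnecessary FNA biopsies (61.9% to 35.2%) can be expressed both as an absolute and as a relative reduction. The short calculation below is only arithmetic on those two published figures; the variable and function names are illustrative.

```python
def relative_reduction(before, after):
    """Fraction by which a rate falls relative to its starting value."""
    return (before - after) / before

before_pct, after_pct = 61.9, 35.2           # rates quoted in the text [33]
absolute_drop = before_pct - after_pct       # percentage points
relative_drop = relative_reduction(before_pct, after_pct)
print(f"absolute: {absolute_drop:.1f} points, relative: {relative_drop:.1%}")
```

The absolute drop is 26.7 percentage points, which corresponds to a relative reduction of about 43%.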

Clinical decision-support systems integrate imaging and clinical data to generate treatment recommendations aligned with guidelines [34,35]. In one study, AI recommendations matched surgical decisions in 78.9% of cases [30], underscoring its utility in multidisciplinary care.

Real-World Clinical Validation

Large-scale multicenter validations (>20,000 patients) confirm that AI maintains diagnostic accuracy across heterogeneous ultrasound equipment [36].

A prospective study of 1,500 nodules reported a sensitivity of 96% and a specificity of 95%, with strong concordance with expert sonographers [29]. These findings affirm the robustness and clinical readiness of AI systems (Table 3).

Summary of clinical performance across diagnostic, metastatic, and risk-stratification domains

CHALLENGES AND LIMITATIONS

Data Quality and Generalizability

Reproducibility and generalizability remain the foremost barriers to clinical translation. Nearly 90% of published AI studies rely on proprietary, single-center datasets with limited external access, preventing independent replication [30,34].

Performance often deteriorates on unseen data—for instance, ThyNet accuracy dropped from 89.1% to 64% on external validation [4]. Heterogeneity in equipment, operator technique, and patient populations introduces distribution shifts and spectrum bias [37].

Robust AI models require transparent preprocessing pipelines, diverse multicenter datasets, and open-source code sharing to enable true external validation [38,39].

Interpretability and Explainable AI (XAI)

The black-box nature of deep networks limits clinician trust [7,40]. XAI approaches such as Grad-CAM, saliency mapping, and LIME are increasingly adopted to visualize decision-making processes.

These methods enhance interpretability, allow correlation of algorithmic attention with sonographic features (e.g., microcalcifications, irregular margins), and strengthen physician confidence in AI-assisted diagnosis.
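The idea behind saliency-style explanations can be sketched with occlusion sensitivity, a simpler relative of the methods named above and not an implementation of Grad-CAM or LIME. Each image tile is masked in turn, and the resulting drop in the model score is recorded as that tile's importance. The 4x4 "image" and the `toy_score` function are hypothetical stand-ins for an ultrasound frame and a trained classifier.

```python
def occlusion_map(image, score_fn, patch=1, baseline=0.0):
    """Score drop observed when each patch x patch tile is set to `baseline`."""
    height, width = len(image), len(image[0])
    reference = score_fn(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]   # copy, leave input intact
            for r in rows:
                for c in cols:
                    occluded[r][c] = baseline
            drop = reference - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat

def toy_score(image):
    """Hypothetical classifier stand-in: mean brightness of the central 2x2 block."""
    return sum(image[r][c] for r in (1, 2) for c in (1, 2)) / 4.0

image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
heat = occlusion_map(image, toy_score, patch=1)
```

Tiles whose masking lowers the score most are those the model relies on; overlaying such a map on the B-mode image lets a reader check whether the model is attending to the nodule margin or to irrelevant tissue.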

Clinical Workflow Integration and Operator Dependency

Integration with picture archiving and communication systems (PACS) and radiology reporting software remains inconsistent [30,41]. Without seamless connectivity, AI tools may lengthen interpretation time or disrupt established workflows.

Successful adoption requires interoperability with existing ultrasound consoles, minimal user interaction, and real-time processing compatible with clinical throughput [18,42].

Operator dependency persists even with AI: image quality still hinges on acquisition skill [37]. Standardized scanning protocols and structured AI training for clinicians are essential to sustain diagnostic reliability.

Regulatory and Ethical Frameworks

Regulatory approval for AI-based thyroid ultrasound systems necessitates rigorous validation to ensure clinical safety, efficacy, and reproducibility across diverse patient populations. These frameworks are essential to translate algorithmic innovation into safe, reliable clinical practice.

Global regulatory pathways

In the United States, the Food and Drug Administration (FDA) has pioneered the evaluation and clearance of AI-enabled imaging devices. FDA-cleared systems such as Koios DS and See-Mode Technologies underwent extensive premarket review processes confirming diagnostic accuracy, reproducibility, and risk management in accordance with Software as a Medical Device (SaMD) principles [24,25].

Similarly, in the European Union, the Medical Device Regulation (MDR) and Conformité Européenne (CE) marking processes emphasize continuous post-market surveillance, traceability, and transparency throughout the product lifecycle. These frameworks collectively ensure that AI-based diagnostic software meets both safety and ethical standards before widespread deployment.

Korean regulatory framework and Digital Medical Products Act (DMPA)

In South Korea, the regulation of AI-based thyroid ultrasound systems falls under the Ministry of Food and Drug Safety (MFDS), which serves as the primary authority for SaMD [43]. The MFDS is responsible for ensuring clinical safety and efficacy, having issued specialized guidance on both regulatory review and approval [44] and clinical trial design [45] for machine learning-enabled diagnostic devices.

This framework was institutionalized through the DMPA, which entered into force on January 24, 2025 [46]. The DMPA establishes a comprehensive, dedicated legal framework for digital medical products, defining requirements for classification, manufacturing, import authorization, and technical documentation [47].

Furthermore, compliance with the stringent Personal Information Protection Act (PIPA) is mandatory for the handling of sensitive health data [43]. The DMPA also mandates adoption of “security-by-design” principles, requiring manufacturers to conduct proactive cybersecurity risk assessments during software development [47].

This integrated regulatory landscape ensures that AI systems deployed in Korean healthcare institutions comply with global quality benchmarks while addressing local challenges, including PACS interoperability, hospital information system integration, and reimbursement structures. Such harmonization between regulatory oversight and clinical infrastructure facilitates both innovation and patient safety in real-world implementation.

Ethical, legal, and data governance considerations

Ethical and legal frameworks governing AI in healthcare are evolving rapidly. Key issues include liability in cases of AI misdiagnosis, the requirement for informed consent when predictive or automated decision support tools are used, and the delineation of responsibility between human clinicians and algorithmic systems [30,41].

From a governance perspective, data integrity and privacy protection are paramount. AI training and deployment must comply with both local and international data privacy laws, including the Health Insurance Portability and Accountability Act (HIPAA) in the U.S., the European Union General Data Protection Regulation (GDPR), and PIPA/DMPA in Korea. The implementation of secure data pipelines, access control mechanisms, and auditable data logs ensures that patient information remains protected during model development and clinical use [26].

Transparency, fairness, and explainability

Algorithmic transparency and fairness are critical to maintaining clinician trust. The so-called “black box” problem limits interpretability, as deep learning models often yield accurate predictions without explaining their reasoning [7,40]. To address this, current research emphasizes XAI methodologies that provide human-understandable rationales for model outputs.

Equitable AI deployment also requires mitigating algorithmic bias, ensuring balanced performance across sex, age, and ethnic subgroups [26]. Independent validation using multi-institutional, demographically diverse datasets is essential for achieving fairness and inclusivity.

Ultimately, regulatory frameworks must evolve toward continuous oversight models that account for adaptive learning, enabling safe updates of AI algorithms post-approval while preserving accountability and ethical compliance.

FUTURE DIRECTIONS AND OPPORTUNITIES

Technological Advances

Enhanced diagnostic tools

Next-generation systems such as AI-SONIC are expanding training datasets to encompass diverse imaging and histopathologic data, improving differentiation between malignant and benign nodules [41,48].

Multimodal integration

Integrating B-mode, Doppler, and elastography with clinical, laboratory, and molecular data can provide holistic characterization of thyroid pathology [26,27]. Such fusion addresses the multifactorial nature of disease and may support personalized treatment pathways.

Edge computing and mobile deployment

Edge and mobile implementations will permit point-of-care AI on handheld or portable ultrasound devices, democratizing expert-level interpretation in community and low-resource settings [35,40].

Federated learning

Federated learning enables collaborative model training among hospitals without exchanging raw patient data, thus preserving privacy while enhancing generalizability [7].
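One common aggregation rule for federated learning is federated averaging, in which a coordinating server combines locally trained parameters weighted by each site's sample count. The sketch below shows a single aggregation round under that assumption. The parameter vectors and hospital sample sizes are hypothetical, and real deployments add secure communication and repeated rounds.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted mean of per-client parameter vectors."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")
    total = sum(client_sizes)
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical parameters trained locally at three hospitals.
hospital_weights = [[0.20, -1.0], [0.40, -0.5], [0.30, -0.8]]
hospital_sizes = [1000, 3000, 1000]   # local training-set sizes (assumed)

global_weights = federated_average(hospital_weights, hospital_sizes)
print(global_weights)
```

Only the parameter vectors leave each hospital; the ultrasound images and patient records stay on site, which is the privacy property the approach relies on.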

Clinical Research Priorities and Standardization

Prospective clinical trials

Large-scale randomized controlled trials comparing AI-assisted versus conventional workflows are essential to quantify patient-outcome benefits and cost-effectiveness [18,42].

Standardization efforts

Professional societies should establish consensus guidelines on dataset annotation, validation metrics, and TI-RADS/K-TIRADS-aligned reporting to ensure reproducibility across institutions [32]. Uniform preprocessing disclosure will facilitate regulatory review and meta-analysis.

Health-economics evaluation

Health-economics research must assess AI’s impact on biopsy rates, turnaround time, and healthcare expenditure [6,18]. Evidence of cost-effectiveness will underpin sustainable reimbursement models.

Education, Implementation, and Quality Assurance

Clinician training remains pivotal. Educational programs should emphasize AI interpretation, limitations, and bias awareness [37,42]. Continuous quality-assurance (QA) protocols, including periodic validation, drift monitoring, and system recalibration, are needed to maintain accuracy [18].
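Drift monitoring can be as simple as tracking agreement between AI output and a reference standard over a sliding window of recent cases. The sketch below is one possible design, not a validated QA protocol. The class name, window size, baseline accuracy, and tolerance are all assumptions for illustration, and the reference label would in practice come from cytology or histopathology.

```python
from collections import deque

class DriftMonitor:
    """Flags when rolling accuracy falls below baseline minus a tolerance."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)   # oldest cases drop out automatically

    def record(self, prediction, reference):
        """Store whether the AI prediction matched the reference label."""
        self.outcomes.append(prediction == reference)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drifted(self):
        """Only raise a flag once the window is full, to avoid noisy early alerts."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.rolling_accuracy() < self.baseline - self.tolerance

# Hypothetical usage: validated accuracy of 0.90, ten-case window.
monitor = DriftMonitor(baseline_accuracy=0.90, window=10, tolerance=0.05)
for case_index in range(10):
    reference = "benign" if case_index < 7 else "malignant"
    monitor.record("benign", reference)        # 7 of 10 predictions agree
print(monitor.rolling_accuracy(), monitor.drifted())
```

A raised flag would trigger the periodic validation or recalibration step rather than an automatic model change.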

Effective system integration depends on robust information technology (IT) infrastructure and staged deployment with stakeholder engagement across radiology, surgery, and compliance teams [26]. Transparent demonstration of clinical benefit mitigates resistance and supports organizational change management [42].

CONCLUSION

AI is redefining thyroid-ultrasound diagnostics by enhancing accuracy, reproducibility, and workflow efficiency. Deep-learning architectures, particularly CNNs, report diagnostic accuracies exceeding 90% and often rival expert radiologists in nodule classification and lymph-node-metastasis prediction.

Clinical applications have progressed from basic CAD tools to fully integrated decision-support systems aligned with TI-RADS and K-TIRADS, reducing unnecessary FNAs while maintaining diagnostic fidelity. Remaining challenges include dataset transparency, reproducibility, workflow interoperability, and regulatory harmonization.

Future directions point toward multimodal and federated models, prospective multicenter validation, and standardized reporting frameworks linking AI outputs directly to structured clinical decision pathways. The emergence of FDA- and MFDS-cleared systems demonstrates feasibility but underscores the need for unified international guidelines.

Sustained adoption will depend on education, QA, and evidence of economic value. Ultimately, AI should augment—not replace—clinical expertise, transforming thyroid ultrasound from a subjective art into a reproducible, data-driven science.

Notes

ACKNOWLEDGEMENTS

Figure 1 in this manuscript was generated with the assistance of ChatGPT-5 (OpenAI) based on author-provided prompts.

FUNDING

None.

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

AUTHOR CONTRIBUTIONS

J.A. and J.H.H. designed the study. J.A. and J.H.H. were responsible for the data acquisition. J.K., J.A. and J.H.H. analyzed the data. J.K. and J.A. wrote the first draft of the manuscript. J.A. and J.H.H. critically revised the manuscript. J.H.H. supervised the project. All authors read and approved the final manuscript.

References

1. Bhatiya MK, Walke R. A review on artificial intelligence for thyroid nodule ultrasound. J Pharm Negat Results 2022;13:2950–2952.
2. Liang XW, Cai YY, Yu JS, Liao JY, Chen ZY. Update on thyroid ultrasound: a narrative review from diagnostic criteria to artificial intelligence techniques. Chin Med J (Engl) 2019;132:1974–1982.
3. Wang X, Qi Y, Zhang X, Liu F, Li J. Ultrasound-based artificial intelligence for predicting cervical lymph node metastasis in papillary thyroid cancer: a systematic review and meta-analysis. Front Endocrinol (Lausanne) 2025;16:1570811.
4. Cao CL, Li QL, Tong J, et al. Artificial intelligence in thyroid ultrasound. Front Oncol 2023;13:1060702.
5. de Oliveira Andrade LJ, de Oliveira GCM, Carneiro Andrade JC, Bittencourt AMV, de Figueiredo AM, de Oliveira LM. Enhancing diagnostic precision in thyroid nodule classification: a deep learning approach to automated ultrasound image analysis. medRxiv [Preprint] 2025;[cited 2025 Jul 1]. Available from: https://doi.org/10.1101/2025.02.05.25321737.
6. Liu D, Yang K, Zhang C, Xiao D, Zhao Y. Fully-automatic detection and diagnosis system for thyroid nodules based on ultrasound video sequences by artificial intelligence. J Multidiscip Healthc 2024;17:1641–1651.
7. Yazdanpanahi P, Atighi F, Keshtkar A, et al. The current progress of artificial intelligence in approach to thyroid nodules: a narrative review. Shiraz E Med J 2024;25:e148493.
8. Li X, Zhang S, Zhang Q, et al. Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: a retrospective, multicohort, diagnostic study. Lancet Oncol 2019;20:193–201.
9. Xue Y, Zhou Y, Wang T, et al. Accuracy of ultrasound diagnosis of thyroid nodules based on artificial intelligence-assisted diagnostic technology: a systematic review and meta-analysis. Int J Endocrinol 2022;2022:9492056.
10. Xu C, Wang Z, Zhou J, et al. Application research of artificial intelligence software in the analysis of thyroid nodule ultrasound image characteristics. PLoS One 2025;20:e0323343.
11. Lu Q, Wu Y, Chang J, Zhang L, Lv Q, Sun H. Application progress of artificial intelligence in managing thyroid disease. Front Endocrinol (Lausanne) 2025;16:1578455.
12. Cai X, Zhou Y, Ren J, et al. Intelligent diagnosis of thyroid nodules with AI ultrasound assistance and cytology classification. Front Endocrinol (Lausanne) 2025;16:1546983.
13. Amgad N, Haitham H, Alabrak M, Mohammed A. Enhancing thyroid cancer diagnosis through a resilient deep learning ensemble approach. Proceedings of the 2024 6th International Conference on Computing and Informatics (ICCI); 2024 Mar 6-7; New Cairo, Egypt. ICCI, 2024:195–202.
14. Linda B, Kamila K, Ilyas BR, Yamina K, Leila B, Sofiane BM. Enhancing thyroid cancer diagnosis with advanced deep learning methods. Proceedings of the 2024 International Conference on Telecommunications and Intelligent Systems (ICTIS); 2024 Dec 14-15; Djelfa, Algeria. ICTIS, 2024:1–6.
15. Sharma R, Mahanti GK, Panda G, et al. A framework for detecting thyroid cancer from ultrasound and histopathological images using deep learning, meta-heuristics, and MCDM algorithms. J Imaging 2023;9:173.
16. Jassal K, Edwards M, Koohestani A, Brown W, Serpell JW, Lee JC. Beyond genomics: artificial intelligence-powered diagnostics for indeterminate thyroid nodules-a systematic review and meta-analysis. Front Endocrinol (Lausanne) 2025;16:1506729.
17. Kim J, Kim MH, Lim DJ, et al. Deep learning technology for classification of thyroid nodules using multi-view ultrasound images: potential benefits and challenges in clinical application. Endocrinol Metab (Seoul) 2025;40:216–224.
18. Chen Y, Gao Z, He Y, et al. Artificial intelligence model based on ACR TI-RADS characteristics for US diagnosis of thyroid nodules. Radiology 2022;303:613–619.
19. Deng P, Han X, Wei X, Chang L. Automatic classification of thyroid nodules in ultrasound images using a multi-task attention network guided by clinical knowledge. Comput Biol Med 2022;150:106172.
20. Zhang X, Lee VCS, Rong J, Lee JC, Song J, Liu F. A multi-channel deep convolutional neural network for multi-classifying thyroid diseases. Comput Biol Med 2022;148:105961.
21. Kim Y, Shin S, Lee E, Kim D, Kwak JY. Real-time assessment of the beneficial role of computer-aided diagnosis in the diagnosis of thyroid nodules on ultrasound. J Endocr Soc 2023;7(Suppl 1):bvad114.1799.
22. Ha EJ, Shin JH, Na DG, et al. Comparison of the diagnostic performance of the modified Korean Thyroid Imaging Reporting and Data System for thyroid malignancy with three international guidelines. Ultrasonography 2021;40:594–601.
23. Marukatat N, Parklug P, Chanasriyotin C. Comparison of the diagnostic accuracy of K-TIRADS and EU-TIRADS guidelines for detection of thyroid malignancy on ultrasound. Radiography (Lond) 2023;29:862–866.
24. Melville NA. FDA clears AI-powered thyroid ultrasound analysis system [Internet]. Medscape Medical News. c2024 [cited 2025 Aug 4]. Available from: https://www.medscape.com/viewarticle/fda-clears-ai-powered-thyroid-ultrasound-analysis-system-2024a1000h8o.
25. Wildman-Tobriner B, Taghi-Zadeh E, Mazurowski MA. Artificial intelligence (AI) tools for thyroid nodules on ultrasound, from the AJR special series on AI applications. AJR Am J Roentgenol 2022;219:1–8.
26. Chen Z, Chambara N, Lo X, et al. Improving the diagnostic strategy for thyroid nodules: a combination of artificial intelligence-based computer-aided diagnosis system and shear wave elastography. Endocrine 2025;87:744–757.
27. Xia M, Song F, Zhao Y, Xie Y, Wen Y, Zhou P. Ultrasonography-based radiomics and computer-aided diagnosis in thyroid nodule management: performance comparison and clinical strategy optimization. Front Endocrinol (Lausanne) 2023;14:1140816.
28. Nagendra L, Pappachan JM, Fernandez CJ. Artificial intelligence in the diagnosis of thyroid cancer: recent advances and future directions. Artif Intell Cancer 2023;4:1–10.
29. Pignataro F, Guidobaldi L, Gismant M, Nestola M, Nisita C. Validation of an artificial intelligence model for the diagnosis of thyroid nodules: a prospective multicenter study. HSOA J Clin Stud Med Case Rep 2024;11:239.
30. Gudarzi M, Parhizkari M, Abbasi F. Artificial intelligence in thyroid imaging: a review of deep learning techniques and clinical applications. Adv Appl NanoBio Tech 2025;6:66–78.
31. Tang X, Zhou H, Liu Y, Gao S, Zhou Y. Diagnostic performance of the ultrasound-based artificial intelligence diagnostic system in predicting cervical lymph node metastasis in patients with thyroid cancer: a systematic review and meta-analysis. Sci Prog 2025;108:1–17.
32. Han H, Sun H, Zhou C, et al. Development and validation of a machine learning model for central compartmental lymph node metastasis in solitary papillary thyroid microcarcinoma via ultrasound imaging features and clinical parameters. BMC Med Imaging 2025;25:228.
33. Peng S, Liu Y, Lv W, et al. Deep learning-based artificial intelligence model to assist thyroid nodule diagnosis and management: a multicentre diagnostic study. Lancet Digit Health 2021;3:e250–e259.
34. Yao J, Wang Y, Lei Z, et al. Multimodal GPT model for assisting thyroid nodule diagnosis and management. NPJ Digit Med 2025;8:245.
35. Yao J, Wang Y, Lei Z, et al. AI-generated content enhanced computer-aided diagnosis model for thyroid nodules: a ChatGPT-style assistant. arXiv [Preprint]. 2024 [cited 2025 Jul 1]. Available from: https://doi.org/10.48550/arXiv.2402.0240.
36. Ha EJ, Lee JH, Lee DH, et al. Artificial intelligence model assisting thyroid nodule diagnosis and management: a multicenter diagnostic study. J Clin Endocrinol Metab 2024;109:527–535.
37. Park SH. Artificial intelligence for ultrasonography: unique opportunities and challenges. Ultrasonography 2021;40:3–6.
38. Yousefi M, Maleki SF, Jafarizadeh A, et al. Advancements in radiomics and artificial intelligence for thyroid cancer diagnosis. arXiv [Preprint]. 2024 [cited 2025 Jul 1]. Available from: https://doi.org/10.48550/arXiv.2404.07239.
39. Edström AB, Makouei F, Wennervaldt K, et al. Human-AI collaboration for ultrasound diagnosis of thyroid nodules: a clinical trial. Eur Arch Otorhinolaryngol 2025;282:3221–3231.
40. Wang Z, Zhang Z, Traverso A, Dekker A, Qian L, Sun P. Assessing the role of GPT-4 in thyroid ultrasound diagnosis and treatment recommendations: enhancing interpretability with a chain of thought approach. Quant Imaging Med Surg 2024;14:1602–1615.
41. Yang L, Lin N, Wang M, Chen G. Diagnostic efficiency of existing guidelines and the AI-SONIC™ artificial intelligence for ultrasound-based risk assessment of thyroid nodules. Front Endocrinol (Lausanne) 2023;14:1116550.
42. Toro-Tobon D, Loor-Torres R, Duran M, et al. Artificial intelligence in thyroidology: a narrative review of the current applications, associated challenges, and future directions. Thyroid 2023;33:903–917.
43. Ministry of Food and Drug Safety (MFDS). Guidance on the review and approval of artificial intelligence (AI)-based medical devices [Internet]. Osong: MFDS, c2023 [cited 2025 Jul 1]. Available from: https://www.mfds.go.kr/eng/brd/m_40/view.do?seq=72627.
44. Ministry of Food and Drug Safety (MFDS). Guidance on clinical trials design of artificial intelligence (AI)-based medical devices [Internet]. Osong: MFDS, c2023 [cited 2025 Jul 1]. Available from: https://www.mfds.go.kr/eng/brd/m_40/view.do?seq=72628.
45. Ministry of Food and Drug Safety (MFDS). Digital medical products act [Internet]. Osong: MFDS, c2024 [cited 2025 Jul 1]. Available from: https://elaw.klri.re.kr/eng_service/lawView.do?hseq=69456&lang=ENG.
46. Baade A, Noh J. South Korea’s digital medical products act now enforced [Internet]. EMERGO by UL, c2025 [cited 2025 Jul 1]. Available from: www.emergobyul.com/news/south-koreas-digital-medical-products-act-now-enforced.
47. Park SH, Dean G, Ortiz EM, Choi JI. Overview of South Korean guidelines for approval of large language or multimodal models as medical devices: key features and areas for improvement. Korean J Radiol 2025;26:519–523.
48. Xu D, Sui L, Zhang C, et al. The clinical value of artificial intelligence in assisting junior radiologists in thyroid ultrasound: a multicenter prospective study from real clinical practice. BMC Med 2024;22:293.


Table 1.

Representative AI models and performance in thyroid ultrasound

| AI model/system | Core approach or architecture | Sensitivity (%) | Specificity (%) | AUC | Accuracy (%) | Other metrics/notes | Reference |
|---|---|---|---|---|---|---|---|
| ResNet-50 | Deep CNN with residual connections for hierarchical feature extraction | 81.7 | 60.0 | 0.86 | 73.0 | F1-score 92% for nodule classification | [6,13] |
| Multi-channel CNN | Parallel channels processing multiple views for multi-class classification | 89.6 | 99.4 | 0.98 | 90.9 | Precision 94.4; F1-score 91.7 | [20] |
| Multi-scale Detection Network | Multi-resolution feature pyramid for detecting nodules of various sizes | - | - | - | 97.5 | High accuracy in nodule detection | [4] |
| Real-time CAD (inexperienced physicians) | AI analysis of live ultrasound feeds for instant feedback | - | - | 0.716 → 0.740 | 73.82 → 76.44 | Diagnostic improvement with AI assistance | [21] |
| TIRADS-based AI | Automated classification per ACR TI-RADS criteria | 83.0 | 87.0 | 0.91 | 86.0 | Outperformed junior radiologists in AUC | [18] |
| Koios DS | FDA-cleared semi-automated CAD ensemble CNN | 82.3 → 86.5 | 38.3 → 54.8 | 0.776 → 0.817 | - | NPV 94.5 → 96.4; >35% biopsy reduction | [24,25] |
| See-Mode Technologies | Fully automated CAD with detection and reporting modules | - | - | - | - | Up to 30% reduction in scan time; enhanced consistency | [24,25] |

AI, artificial intelligence; AUC/AUROC, area under the receiver operating characteristic curve; CNN, convolutional neural network; CAD, computer-aided detection or diagnosis; TIRADS, Thyroid Imaging Reporting and Data System; ACR, American College of Radiology; FDA, U.S. Food and Drug Administration; NPV, negative predictive value.

Table 2.

Comparative features of FDA-cleared thyroid-ultrasound AI systems

| Feature | Koios DS | See-Mode Technologies |
|---|---|---|
| FDA clearance | Dec 2021 (FDA 510(k) K213875), Nov 2024 (K242130, latest) | Sep-Nov 2024 (FDA 510(k) K240697) |
| Classification type | CAD (diagnostic risk assessment) | CAD (detection and diagnosis, nodule localization/stratification) |
| Database size | >2 million images from 48 sites | Not publicly disclosed |
| Automation level | Semi-automated (user selects region of interest/ROI) | Fully automated (nodule detection and characterization, no ROI input needed) |
| Processing time | ≈2 s per nodule | Up to 30% scan-time reduction in large imaging centers |
| TIRADS integration | ACR TI-RADS + ATA classification | ACR TI-RADS (full lexicon, automatic scoring for all US features) |
| Clinical performance | AUROC 0.776–0.817; Sensitivity 82.3–86.5%; Specificity 38.3–54.8%; NPV 94.5–96.4% | Improved localization and characterization; MRMC study shows improved diagnostic and reader performance; Up to 30% faster workflow |
| Biopsy reduction | >35% reduction in unnecessary biopsies | Not specified |
| Time efficiency | ≈25% reduction in interpretation/analysis time | Up to 30% overall scan time reduction; automated worksheet/report |
| Unique features | Advanced “AI Adapter” risk tool; >17,000 features/image; improved observer agreement; structured report export | Prior-exam comparison; automated report/preliminary worksheet; cloud/browser workflow; PACS/RIS push; full automation |
| Integration | PACS and structured (SR) reporting software | Native integration with major PACS, RIS, radiology reporting systems; web-based and DICOM compatible |
| Global approvals | FDA (USA) | FDA (USA), Canada, Singapore, Australia, New Zealand |

FDA, U.S. Food and Drug Administration; AI, artificial intelligence; CAD, computer-aided detection or diagnosis; ROI, region of interest; ACR, American College of Radiology; TIRADS, Thyroid Imaging Reporting and Data System; ATA, American Thyroid Association; US, ultrasound; AUROC, area under the receiver operating characteristic curve; NPV, negative predictive value; MRMC, multi-reader multicase; PACS, picture archiving and communication system; RIS, radiology information system; SR, structured report; DICOM, digital imaging and communications in medicine.

Table 3.

Summary of clinical performance across diagnostic, metastatic, and risk-stratification domains

| Clinical application | Sensitivity (%) | Specificity (%) | AUC | Accuracy (%) | Other metrics | Reference |
|---|---|---|---|---|---|---|
| Diagnostic accuracy/nodule classification | 89–98 | 84–88 | 0.93–0.98 | 90–99 | Detection accuracy >97% (high-end model) | [4,6,29,30] |
| Meta-analysis (benign vs malignant) | 88 (pooled) | 81 (pooled) | 0.92 | - | PLR 4.5; NLR 0.15; DOR 30 | [9] |
| Cervical LN metastasis prediction (PTC) | 80 (pooled) | 83 (pooled) | 0.84–0.85 | 79–82 | Human physicians: 0.51 sensitivity | [3,31] |
| Central compartment evaluation/surgical planning | 73–78 | 77–82 | 0.84–0.89 | 80–86 | Guides extent of surgery; improves staging accuracy | [31,32] |

AUC/AUROC, area under the receiver operating characteristic curve; PLR, positive likelihood ratio; NLR, negative likelihood ratio; DOR, diagnostic odds ratio; LN, lymph node; PTC, papillary thyroid carcinoma.