
High-Frequency Ultrasound Radiomics Combined with Clinical Features for Detecting OMERACT-Defined Metacarpophalangeal Joint Cartilage Damage in Early Rheumatoid Arthritis

Prometheus Redaktion

Open Access Article

Department of Ultrasound, Peking University People’s Hospital, Beijing 100044, China
* Author to whom correspondence should be addressed.

Diagnostics 2026, 16(12), 1758; https://doi.org/10.3390/diagnostics16121758 (registering DOI)

Submission received: 26 April 2026 / Revised: 27 May 2026 / Accepted: 3 June 2026 / Published: 6 June 2026

Abstract

Background/Objectives: The aim of this study was to develop and validate a high-frequency ultrasound radiomics-based model for quantitative assessment of metacarpophalangeal (MCP) joint cartilage damage in early rheumatoid arthritis (RA).

Methods: 656 MCP joints from 99 early RA patients and 65 healthy controls were prospectively enrolled and graded according to the Outcome Measures in Rheumatology (OMERACT) system. After radiomics feature extraction, five machine learning classifiers were evaluated. Radiomics, clinical, and combined models were constructed and assessed. Radiomics scores were compared among healthy grade 0 joints, early RA grade 0 joints stratified into two risk subgroups, and RA grade ≥ 1 joints. SHapley Additive exPlanations (SHAP) analysis was used for interpretation.

Results: Eight stable radiomics features were selected. Among classifiers, support vector machine achieved the highest cross-validated performance and was selected as the final radiomics classifier (validation AUC = 0.804). The combined model, integrating radiomics features with age, disease duration, and Disease Activity Score in 28 joints, achieved the best diagnostic performance (AUC = 0.855), significantly outperforming both the radiomics and clinical models. Among OMERACT grade 0 joints, the high-risk subgroup demonstrated elevated radiomics-derived scores. SHAP analysis identified original_shape2D_PerimeterSurfaceRatio as the strongest contributor.
Conclusions: High-frequency ultrasound radiomics combined with clinical features demonstrated strong performance in detecting MCP joint cartilage damage in early RA and may provide a quantitative extension to conventional semiquantitative assessment.

Keywords: ultrasound radiomics; rheumatoid arthritis; cartilage damage; metacarpophalangeal joint

1. Introduction

Rheumatoid arthritis (RA) is a chronic inflammatory autoimmune disease characterized by persistent synovitis and progressive joint destruction, affecting about 0.5–1% of the global population [1]. Damage to articular cartilage is a major contributor to functional impairment, reduced quality of life, and disability in RA patients [2]. Early and accurate detection of cartilage damage is essential for timely intervention and improved outcomes.

This study aimed to develop a multimodal model integrating ultrasound radiomics with clinical parameters for detecting cartilage damage in early RA. Beyond model development, we further explored whether the radiomics model could identify latent structural heterogeneity within OMERACT grade 0 joints, where conventional semiquantitative assessment may underestimate early cartilage damage. This approach may offer a reproducible and objective complement to OMERACT ultrasound assessment, supporting earlier and more precise RA cartilage assessment.

2. Materials and Methods

2.1. Study Population

This single-center, prospective cross-sectional study enrolled patients with RA who attended the Department of Rheumatology at Peking University People’s Hospital and underwent high-frequency ultrasound (HFUS) examinations between September 2023 and January 2026.
The inclusion criteria were as follows: fulfillment of the 2010 American College of Rheumatology/European League Against Rheumatism (ACR/EULAR) classification criteria for RA; age between 18 and 75 years; disease duration of less than 2 years; involvement of at least one second or third MCP joint; and availability of complete baseline clinical and ultrasound imaging data. Early RA was defined as a disease duration of less than 2 years. The exclusion criteria were as follows: concomitant inflammatory joint diseases (e.g., psoriatic arthritis and gout); history of trauma or surgery; intra-articular corticosteroid injections within the previous 4 weeks; poor ultrasound image quality; severe systemic diseases or malignant tumors; and pregnancy.

Healthy controls were recruited during the same study period. The inclusion criterion was the absence of any history or symptoms of arthritis. Exclusion criteria included any history or current evidence of joint disease and poor ultrasound image quality.

A total of 99 patients with early RA were ultimately included. Bilateral MCP2–3 images were acquired for each patient, resulting in 396 MCP joints for analysis. Patients were randomly divided into a training set and an independent validation set at the patient level in a 7:3 ratio, yielding 69 patients (276 joints) in the training set and 30 patients (120 joints) in the validation set. This patient-level split was used to ensure that joints from the same patient were not allocated to both sets. In addition, 65 healthy controls (260 joints) were included.

Baseline clinical data were collected for all RA patients, including the Disease Activity Score in 28 joints (DAS28), erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), anti-cyclic citrullinated peptide antibody (anti-CCP), and rheumatoid factor (RF).
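The patient-level 7:3 split described above can be illustrated with a minimal sketch. It assumes scikit-learn's GroupShuffleSplit; the article does not state which tool was used, and the patient identifiers, placeholder feature matrix, and random seed below are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical layout: 99 RA patients with 4 joints each (bilateral MCP2-3).
n_patients, joints_per_patient = 99, 4
patient_ids = np.repeat(np.arange(n_patients), joints_per_patient)
X = np.zeros((patient_ids.size, 1))  # placeholder for the joint-level features

# An integer test_size is the absolute number of held-out groups (patients),
# so 30 patients go to validation and the remaining 69 to training.
splitter = GroupShuffleSplit(n_splits=1, test_size=30, random_state=0)
train_idx, val_idx = next(splitter.split(X, groups=patient_ids))

train_patients = set(patient_ids[train_idx])
val_patients = set(patient_ids[val_idx])

print(len(train_patients), len(train_idx))  # patients and joints in training
print(len(val_patients), len(val_idx))      # patients and joints in validation
```

Splitting on patient identifiers rather than on joints keeps all four joints of a patient in the same set, which is the property the text relies on to avoid leakage between training and validation.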
DAS28 was calculated using four components: the 28-joint tender joint count (TJC28), the 28-joint swollen joint count (SJC28), the ESR in mm/h, and the general health score (GH) assessed by the patient on a 100 mm visual analogue scale (0 = best, 100 = worst). The DAS28-ESR was computed using the following formula:

\[
\mathrm{DAS28} = 0.56 \times \sqrt{\mathrm{TJC28}} + 0.28 \times \sqrt{\mathrm{SJC28}} + 0.70 \times \ln(\mathrm{ESR}) + 0.014 \times \mathrm{GH}
\]

2.2. Ultrasound Assessment

Ultrasound examinations were performed using a Canon Aplio i800 ultrasound scanner (Canon Inc., Tokyo, Japan) equipped with a 24 MHz high-frequency linear transducer. All examinations were performed by two physicians with over three years’ experience in musculoskeletal ultrasound, following the OMERACT Working Group guidelines. Participants were seated with their hands resting naturally on the examination table, and the fingers were flexed to approximately 60° to fully expose the articular surfaces of the metacarpal heads. The transducer beam was positioned strictly perpendicular to the cartilage surface. Grayscale ultrasound images of the dorsal aspects of bilateral MCP2 and MCP3 joints were acquired along the longitudinal axis of the finger. All images were stored in DICOM format.

The OMERACT cartilage score was independently assessed by two senior sonographers who were blinded to the patients’ clinical and laboratory data. In cases of disagreement, a third senior expert made the final decision. Cartilage damage was graded as follows: grade 0, normal thickness and echogenicity of hyaline cartilage; grade 1, abnormal cartilage echogenicity (localized increase or decrease), with or without mild thinning; grade 2, marked reduction in cartilage thickness or complete loss of cartilage (Figure 1).

2.3. Radiomic Features Extraction

2.3.1. Image Segmentation

All ultrasound images were manually segmented by one experienced sonographer, who was blinded to clinical and laboratory data, using 3D Slicer software (version 5.10.0).
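Segmentation reproducibility in this subsection is assessed with the Dice similarity coefficient between two readers' masks. A minimal sketch of that overlap measure follows; the function name and the two toy masks are hypothetical and stand in for ROI masks exported from the segmentation software.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient: 2 * |A and B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

# Toy 4x4 masks: each reader marks 6 pixels, 4 of which overlap.
reader_1 = np.zeros((4, 4), dtype=bool)
reader_2 = np.zeros((4, 4), dtype=bool)
reader_1[1:3, 0:3] = True
reader_2[1:3, 1:4] = True

dice = dice_coefficient(reader_1, reader_2)
print(round(dice, 3))  # 2 * 4 / (6 + 6) = 0.667
```

A value of 1 means identical masks and 0 means no overlap, so the coefficient summarizes how closely two independent delineations of the cartilage ROI agree.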
The ROI was delineated on the longitudinal dorsal grayscale image of each MCP joint, encompassing the visible hyaline cartilage over the metacarpal head. The superficial boundary was defined by the cartilage–synovial interface, and the deep boundary was defined by the cartilage–bone interface (identified as the hyperechoic cortical surface). The ROI included only the visible cartilage layer and excluded adjacent synovium, cortical bone, joint fluid, and regions affected by shadowing or artifacts. To assess reproducibility, 50 joint images were randomly selected and independently segmented by two sonographers. Dice similarity coefficient and intraclass correlation coefficient (ICC) were calculated.

2.3.2. Feature Extraction

Before radiomics feature extraction, ultrasound images and their corresponding masks were resampled to a uniform spacing of 0.0154 × 0.0154 mm using B-spline interpolation. Intensity normalization was performed using z-score standardization, with the normalized values subsequently scaled to a range of 0–100. Gray-level discretization was conducted using a fixed bin width of 16. As all images represented single two-dimensional cross-sectional slices, feature extraction was restricted to the two-dimensional mode throughout.

Radiomics features were extracted from each ROI using the PyRadiomics package (version 3.1.0) in Python (version 3.9.25), in accordance with the Image Biomarker Standardization Initiative (IBSI) recommendations [11]. A total of 939 features were initially extracted, including shape features, gray-level co-occurrence matrix (GLCM) features, gray-level run-length matrix (GLRLM) features, gray-level size zone matrix (GLSZM) features, gray-level dependence matrix (GLDM) features, and features derived from image transformations, including wavelet decomposition (LL, LH, HL, HH), gradient, logarithm, exponential, and square filters.

2.3.3. Feature Selection

Feature selection was performed exclusively within the RA training set to avoid data leakage. The outcome was dichotomized as OMERACT grade 0 versus OMERACT grade ≥ 1. Repeated stratified 5-fold cross-validation was used for feature selection. In each fold, features with unstable repeatability (ICC [...]

[...] 20 U/mL or RF > 20 IU/mL) together with moderate-to-high disease activity (DAS28 > 3.2), as seropositivity and high disease activity are recognized poor prognostic factors [12]; the phenotype-negative group included joints from patients not simultaneously meeting both criteria. Age-adjusted radiomics-derived scores were compared among four groups: grade 0 joints from healthy controls, phenotype-negative early RA grade 0 joints, phenotype-positive early RA grade 0 joints, and grade ≥ 1 joints from RA patients. Overall differences among the four groups were assessed using the Kruskal–Wallis test. Pairwise comparisons between groups were conducted using the Mann–Whitney U test. The distributions of radiomics-derived scores across the four groups were visualized using violin plots.

2.3.5. Construction and Comparison of Radiomics, Clinical, and Combined Models

To further evaluate the combined value of radiomic features and clinical information, three models were constructed within the RA group. The clinical model was constructed using clinical variables including age, disease duration, and DAS28. The radiomics model was constructed using the final stable radiomics feature set. The combined model incorporated the radiomics-derived score together with the clinical variables to construct a multimodal model. Given the limited number of input variables, logistic regression was selected as the classifier for both the clinical and combined models. The three models were evaluated in the independent validation set. ROC curves were plotted, and AUCs were compared using the DeLong test.
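As a rough illustration of the combined model described above, the sketch below fits a logistic regression on a radiomics-derived score plus age, disease duration, and DAS28, then reports the validation AUC. Everything in it is an assumption for demonstration only: the data are synthetic, the helper make_synthetic and the effect sizes are invented, and the standardization step is not stated in the article. The DeLong comparison is not included.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

def make_synthetic(n_joints):
    """Synthetic joints (NOT study data): label 1 stands for OMERACT grade >= 1."""
    y = rng.binomial(1, 0.15, size=n_joints)
    rad_score = rng.normal(loc=1.0 * y, scale=1.0)    # radiomics-derived score
    age = rng.normal(49.0, 14.0, size=n_joints)       # years
    duration = rng.uniform(0.0, 24.0, size=n_joints)  # months
    das28 = rng.normal(4.1 + 0.5 * y, 1.3)            # DAS28-ESR
    X = np.column_stack([rad_score, age, duration, das28])
    return X, y

# Joint counts mirror the training and validation sets reported in the text.
X_train, y_train = make_synthetic(276)
X_val, y_val = make_synthetic(120)

# Standardize inputs, then fit a plain logistic regression on the four inputs.
combined_model = make_pipeline(StandardScaler(), LogisticRegression())
combined_model.fit(X_train, y_train)

val_prob = combined_model.predict_proba(X_val)[:, 1]
combined_auc = roc_auc_score(y_val, val_prob)
print(f"Validation AUC on synthetic data: {combined_auc:.3f}")
```

Feeding the radiomics output in as a single score keeps the combined model at four inputs, which matches the article's stated reason for preferring logistic regression when the number of variables is small.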
To account for within-patient correlation among multiple joints, patient-clustered bootstrap resampling was additionally performed as a sensitivity analysis.

2.3.6. Statistical Analysis

All statistical analyses and visualizations were performed using Python (version 3.9.25) and R (version 4.3.2). Continuous variables were tested for normality using the Shapiro–Wilk test. Normally distributed variables were presented as mean ± standard deviation and compared using the independent-samples t-test. Non-normally distributed variables were presented as median and interquartile range, and compared using the Mann–Whitney U test. Categorical variables were presented as frequencies and proportions (%) and compared using the chi-square test or Fisher’s exact test. For multiple pairwise comparisons, Bonferroni correction was applied where appropriate. All statistical tests were two-sided, with p [...]

[...] 20 U/mL or RF > 20 IU/mL) with DAS28 > 3.2 among RA grade 0 joints with disease duration ≤ 3 months. Phenotype− indicates early RA grade 0 joints not simultaneously meeting both criteria. ns, not significant; * p < 0.05; *** p < 0.001.

Figure 5. Comparison of the diagnostic performance of the clinical, radiomics, and combined models. Receiver operating characteristic (ROC) curves in (a) the training cohort and (b) the validation cohort.

Figure 6. Confusion matrices of the (a) clinical, (b) radiomics, and (c) combined models in the validation cohort.
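The patient-clustered bootstrap mentioned in Section 2.3.5 as a sensitivity analysis could be sketched as follows. This is one common way to do it, not necessarily the authors' implementation: whole patients are resampled with replacement and the AUC is recomputed on all of their joints. The function name, the number of replicates, the percentile interval, and the synthetic scores are assumptions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def clustered_bootstrap_auc(y_true, y_score, patient_ids, n_boot=1000, seed=0):
    """Percentile 95% CI for the AUC, resampling patients rather than joints."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    patient_ids = np.asarray(patient_ids)
    patients = np.unique(patient_ids)
    rows = {p: np.flatnonzero(patient_ids == p) for p in patients}

    aucs = []
    for _ in range(n_boot):
        # Draw patients with replacement; each draw brings all of its joints.
        sampled = rng.choice(patients, size=patients.size, replace=True)
        idx = np.concatenate([rows[p] for p in sampled])
        if np.unique(y_true[idx]).size < 2:
            continue  # AUC is undefined when only one class is present
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    low, high = np.percentile(aucs, [2.5, 97.5])
    return low, high

# Synthetic validation set (NOT study data): 30 patients with 4 joints each.
rng = np.random.default_rng(1)
ids = np.repeat(np.arange(30), 4)
labels = rng.binomial(1, 0.15, size=ids.size)
scores = labels + rng.normal(0.0, 1.0, size=ids.size)

ci_low, ci_high = clustered_bootstrap_auc(labels, scores, ids, n_boot=500)
print(f"Clustered bootstrap 95% CI for AUC: [{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling at the patient level keeps correlated joints together in every replicate, so the resulting interval reflects the number of independent patients rather than the larger number of joints.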
Table 1. Baseline demographics and disease characteristics of patients with RA and HCs.

| Characteristics | RA Patients | HCs | p-Value |
|---|---|---|---|
| Number | 99 | 65 | |
| Age, years, mean (SD) | 48.8 (13.8) | 44.9 (11.8) | 0.052 |
| Female, n (%) | 76 (76.8%) | 55 (84.6%) | 0.239 |
| Disease duration, months, median (IQR) | 6.0 (2.5–13.5) | - | - |
| CRP, mg/L, median (IQR) | 0.8 (0.0–8.6) | - | - |
| ESR, mm/h, median (IQR) | 16.0 (8.0–28.5) | - | - |
| Anti-CCP, IU/mL, median (IQR) | 236.5 (101.7–304.0) | - | - |
| RF, IU/mL, median (IQR) | 53.7 (16.3–178.4) | - | - |
| DAS28-ESR, median (IQR) | 4.07 (3.17–5.24) | - | - |
| Joints grade = 0, n (%) | 340 (85.9%) | 248 (95.4%) | <0.001 |
| Joints grade = 1, n (%) | 48 (12.1%) | 9 (3.5%) | <0.001 |
| Joints grade = 2, n (%) | 8 (2.0%) | 3 (1.2%) | 0.54 |

RA: rheumatoid arthritis; HCs: healthy controls; SD: standard deviation; RF: rheumatoid factor; anti-CCP: anti-citrullinated peptide antibodies; IQR: interquartile range; ESR: erythrocyte sedimentation rate; CRP: C-reactive protein; DAS28-ESR: disease activity score in 28 joints based on ESR.

Table 2. Stable radiomics features and their selection frequencies.

| Features | Frequency |
|---|---|
| logarithm_glrlm_LongRunHighGrayLevelEmphasis | 0.84 |
| original_shape2D_PerimeterSurfaceRatio | 0.64 |
| logarithm_glszm_LargeAreaLowGrayLevelEmphasis | 0.60 |
| wavelet-LH_firstorder_Median | 0.58 |
| gradient_firstorder_Minimum | 0.52 |
| wavelet-HL_glszm_LargeAreaLowGrayLevelEmphasis | 0.48 |
| exponential_gldm_LargeDependenceEmphasis | 0.34 |
| exponential_firstorder_Mean | 0.20 |

GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; GLDM, gray-level dependence matrix. Selection frequency indicates the proportion of times each feature was selected across repeated cross-validation during the feature selection process.

Table 3. Performance comparison of machine learning models in the training and validation cohorts.

| Model | Set | AUC [95% CI] | Accuracy, % | Sensitivity, % | Specificity, % | F1-Score, % |
|---|---|---|---|---|---|---|
| Logistic Regression | Train | 0.779 [0.680–0.868] | 82.6 | 59.0 | 86.5 | 48.9 |
| Logistic Regression | Validation | 0.785 [0.670–0.900] | 79.2 | 52.9 | 83.5 | 41.9 |
| Random Forest | Train | 0.866 [0.807–0.917] | 84.8 | 71.8 | 86.9 | 57.1 |
| Random Forest | Validation | 0.734 [0.593–0.894] | 79.2 | 58.8 | 82.5 | 44.4 |
| Support Vector Machine | Train | 0.813 [0.735–0.880] | 81.9 | 61.5 | 85.2 | 49.0 |
| Support Vector Machine | Validation | 0.804 [0.663–0.929] | 81.7 | 64.7 | 84.5 | 50.0 |
| Gradient Boosting | Train | 0.836 [0.768–0.901] | 84.4 | 69.2 | 86.9 | 55.7 |
| Gradient Boosting | Validation | 0.741 [0.588–0.890] | 79.2 | 58.8 | 82.5 | 44.4 |
| K-Nearest Neighbors | Train | 0.847 [0.796–0.910] | 84.4 | 64.1 | 87.8 | 53.8 |
| K-Nearest Neighbors | Validation | 0.755 [0.613–0.890] | 80.0 | 52.9 | 84.5 | 42.9 |

AUC, area under the receiver operating characteristic curve; CI, confidence interval.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

Share and Cite

Chicago/Turabian Style
Yao, Minghui, Wenxue Li, Yuwei Xin, Diancheng Li, Li Yang, and Jia’an Zhu. 2026. "High-Frequency Ultrasound Radiomics Combined with Clinical Features for Detecting OMERACT-Defined Metacarpophalangeal Joint Cartilage Damage in Early Rheumatoid Arthritis" Diagnostics 16, no. 12: 1758. https://doi.org/10.3390/diagnostics16121758
