Perturbing single images as a surrogate for radiomic feature robustness test-retest experiments

Zwanenburg, A.; Leger, S.; Troost, E.; Richter, C.; Löck, S.

Purpose/Objective: Radiomics is the high-throughput, machine-learning-based analysis of medical images for model-based treatment decisions. It relies on image characteristics (features) that quantify aspects of a volume of interest, such as its mean intensity, volume and texture heterogeneity. Features used for modelling should be robust against perturbations, induced e.g. by patient positioning, image acquisition and contouring; otherwise the resulting radiomics models may not be generalisable. Test-retest imaging is the recommended method for assessing feature robustness, but its results are specific to the tumour phenotype. A test-retest experiment would thus be required for each radiomics study, incurring additional costs in terms of patient preparation, imaging and imaging dose. We therefore assess feature robustness by perturbing single images as a surrogate, and compare the results with those of test-retest imaging.

Methods: Two patient cohorts with test-retest CT imaging were used: a public NSCLC cohort of 31 patients [1] and an HNSCC cohort of 19 patients. For the NSCLC cohort, two separate images were acquired within 15 minutes of each other using the same scanner and protocol. Images in the HNSCC cohort were acquired within 4 days of each other with different scanners and protocols. The gross tumour volume (GTV) was contoured and 5571 features were extracted from the GTV of each image. Image perturbation was used to assess robustness from single images. Images were perturbed by adding noise, by sub-voxel translation, by rotation, and by contour randomisation, in which contour boundaries are altered based on the overlap of supervoxels with the GTV. Feature robustness between test-retest images, and between perturbations of a single image, was measured using the intraclass correlation coefficient (ICC). Features with ICC ≥ 0.85 were considered robust.
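The robustness criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not state which ICC form was used, so the one-way random-effects ICC(1,1) is assumed here, and the function names, feature names and values are hypothetical. Each row holds the repeated measurements of one feature for one patient (test and retest, or several perturbations of a single image).

```python
from statistics import mean

ROBUST_THRESHOLD = 0.85  # cut-off stated in the abstract


def icc_1_1(measurements):
    """One-way random-effects ICC(1,1); an ASSUMED form, not confirmed by the abstract.

    measurements: list of rows, one per patient, each with k repeated values.
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need >= 2 patients, each with the same k >= 2 values")

    grand_mean = mean(v for row in measurements for v in row)
    row_means = [mean(row) for row in measurements]

    # Between-patient and within-patient mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(measurements, row_means) for v in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        return float("nan")  # feature is constant; ICC is undefined
    return (ms_between - ms_within) / denominator


def select_robust(features, threshold=ROBUST_THRESHOLD):
    """Return the names of features whose ICC meets the threshold."""
    return sorted(
        name for name, rows in features.items() if icc_1_1(rows) >= threshold
    )


# Hypothetical values: 3 patients x 2 measurements per feature.
example = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9]],
    "unstable_feature": [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]],
}
print(select_robust(example))  # ['stable_feature']
```

A feature whose ICC is undefined (NaN) fails the comparison and is therefore not reported as robust in this sketch.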

Results: Test-retest imaging identified 3831 and 1123 robust features in the NSCLC and HNSCC cohorts, respectively. Features in the HNSCC cohort were generally less reproducible than those in the NSCLC cohort. Among the perturbations, rotation combined with contour randomisation gave the largest overlap with test-retest imaging in the features identified as non-robust: 96% and 86% for the NSCLC and HNSCC cohorts, respectively.
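The overlap figure can be illustrated with a short Python sketch. The abstract does not define how the overlap was computed; the version below assumes it is the fraction of features that test-retest imaging flags as non-robust (ICC below 0.85) and that the perturbation-based analysis also flags as non-robust. The function name, feature names and ICC values are hypothetical.

```python
def non_robust_overlap(icc_test_retest, icc_perturbation, threshold=0.85):
    """Fraction of test-retest non-robust features also non-robust under perturbation.

    ASSUMED definition; both arguments map feature name -> ICC value.
    A feature counts as non-robust unless its ICC is >= threshold.
    """
    non_robust_tr = {
        name for name, icc in icc_test_retest.items() if not icc >= threshold
    }
    non_robust_pt = {
        name for name, icc in icc_perturbation.items() if not icc >= threshold
    }
    if not non_robust_tr:
        return float("nan")  # nothing to overlap with
    return len(non_robust_tr & non_robust_pt) / len(non_robust_tr)


# Hypothetical ICC values for four features.
test_retest = {"f1": 0.95, "f2": 0.60, "f3": 0.40, "f4": 0.80}
perturbed = {"f1": 0.97, "f2": 0.70, "f3": 0.90, "f4": 0.50}

# f2, f3 and f4 are non-robust in test-retest; f2 and f4 are also
# non-robust under perturbation, so the overlap is 2 of 3.
print(f"{non_robust_overlap(test_retest, perturbed):.0%}")  # 67%
```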

Conclusion: An essential step in radiomic analyses is the selection of features that are insensitive to differences in imaging protocol and equipment and to inter-observer variability. We demonstrated that perturbing single images by rotation combined with random contour alteration provides a suitable alternative to test-retest imaging, and one that is easily available in clinical routine.

  • Lecture (Conference)
    ESTRO 37, 20.-24.04.2018, Barcelona, Spain
  • Abstract in refereed journal (open access)
    Radiotherapy and Oncology 127(2018), S1151-S1152
    DOI: 10.1016/S0167-8140(18)32404-6

Publ.-Id: 26221