Publications Repository - Helmholtz-Zentrum Dresden-Rossendorf

Multitask Learning with Convolutional Neural Networks and Vision Transformers Can Improve Outcome Prediction for Head and Neck Cancer Patients

Starke, S.; Zwanenburg, A.; Leger, K.; Lohaus, F.; Linge, A.; Schreiber, A.; Kalinauskaite, G.; Tinhofer, I.; Guberina, N.; Guberina, M.; Balermpas, P.; von der Grün, J.; Ganswindt, U.; Belka, C.; Peeken, J. C.; Combs, S. E.; Böke, S.; Zips, D.; Richter, C.; Troost, E. G. C.; Krause, M.; Baumann, M.; Löck, S.

Neural-network-based outcome predictions may enable further treatment personalization of patients with head and neck cancer. The development of neural networks can prove challenging when only a limited number of cases is available. Therefore, we investigated whether multitask learning strategies, implemented through the simultaneous optimization of two distinct outcome objectives (multi-outcome) and combined with a tumor segmentation task, can improve the performance of convolutional neural networks (CNN) and vision transformers (ViT). Model training was conducted on two distinct multicenter datasets for the endpoints loco-regional control (LRC) and progression-free survival (PFS), respectively. The first dataset consisted of pre-treatment computed tomography (CT) imaging for 290 patients, and the second dataset contained combined positron emission tomography (PET)/CT for 224 patients. Discriminative performance was assessed by the concordance index (C-index). Risk stratification was evaluated using log-rank tests. Across both datasets, CNN and ViT model ensembles achieved similar results. Multitask approaches showed favorable performance in most investigations. Multi-outcome CNN models trained with segmentation loss were identified as the optimal strategy across cohorts. On the PET/CT dataset, an ensemble of multi-outcome CNNs trained with segmentation loss achieved the best discrimination (C-index: 0.29, 95% confidence interval (CI): 0.22-0.36) and successfully stratified patients into groups with low and high risk of disease progression (p=0.003). On the CT dataset, ensembles of multi-outcome CNNs and of single-outcome ViTs trained with segmentation loss performed best (C-index: 0.26 and 0.26, CI: 0.18-0.34 and 0.18-0.35, respectively), both with significant risk stratification for LRC in independent validation (p=0.002 and p=0.011). Further validation of the developed multitask-learning models is planned within an ongoing prospective study.
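The discriminative performance measure used above, the concordance index, can be illustrated with a minimal pure-Python sketch. This is an illustrative Harrell-type implementation, not the authors' evaluation code; the function name and the pair-handling conventions (ties in risk counted as 0.5) are assumptions for the example:

```python
def concordance_index(times, events, risks):
    """Harrell-type C-index for right-censored survival data.

    times  : observed follow-up times (event or censoring time)
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : model-predicted risk scores (higher = earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # comparable pairs are anchored on an observed event
        for j in range(n):
            if i == j:
                continue
            # (i, j) is comparable if j outlives i's event time, or j is
            # censored at exactly that time
            if times[j] > times[i] or (times[j] == times[i] and not events[j]):
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count half
    return concordant / comparable
```

A value of 1.0 indicates perfect ranking of event times by predicted risk, 0.5 corresponds to random ordering, and 0.0 to perfectly inverted ordering.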

Keywords: survival analysis; vision transformer; convolutional neural network; multitask learning; tumor segmentation; head and neck cancer; Cox proportional hazards; loco-regional control; progression-free survival; discrete-time survival models

Involved research facilities

  • OncoRay

Permalink: https://www.hzdr.de/publications/Publ-37388