Ultra-fast data processing and image reconstruction using parallel processing architectures


Bieberle, A.; Vogt, S.; Wagner, M.; Bieberle, M.; Barthel, F.; Hampel, U.

An ultra-fast electron beam X-ray tomography measuring system (Fischer et al., 2008) was developed at the Helmholtz-Zentrum Dresden-Rossendorf (HZDR). It is primarily applied to fundamental multiphase flow investigations, e.g. in various technical devices, and to the validation of enhanced flow simulation models, e.g. those developed for computational fluid dynamics (CFD) codes. The ultra-fast computed tomography (CT) system delivers contactless cross-sectional material distributions with a spatial resolution of approximately 1 mm and a temporal resolution of up to 8 kHz. Currently, both measurement data transfer and data processing have been identified as the most time-consuming steps, which have to be tackled to ensure optimal use of this worldwide unique CT technique. As a first step, the data reconstruction algorithm (in this case, filtered back projection) is transferred to many-core graphics processing units (GPUs) using the so-called ultra-fast X-ray imaging (UFO) framework. Subsequently, most of the data processing algorithms, originally implemented as sequentially executed code for single-core central processing units (CPUs), are adapted for multi-core CPUs and, eventually, for many-core GPUs. To increase performance further, an advanced performance PC (AP-PC) is assembled with two high-performance graphics processing units operated in parallel (Tesla K20c, NVIDIA®), a six-core processor (Xeon E5-1650 v3, Intel®) and fast, high-capacity memory (DDR4, 2133 MHz, 128 GByte). With this specifically assembled hardware configuration, data processing performance is improved again. The timing results show that optimized multi-core CPU-based code increases the data processing performance by a factor of 40. Moreover, the many-core GPU-optimized code, together with the AP-PC hardware configuration, improves the data processing performance by a factor of 138.
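
For orientation, the filtered back projection step named in the abstract can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not the HZDR or UFO framework implementation: it assumes an idealized parallel-beam geometry and a plain ramp filter, and the function names (`ramp_filter`, `back_project`, `reconstruct_frame`) are hypothetical. Each frame is reconstructed independently of the others, which is the property that makes the task suitable for multi-core CPUs and many-core GPUs.

```python
import numpy as np


def ramp_filter(sinogram):
    """Ramp-filter every projection along the detector axis (axis 1)."""
    n_det = sinogram.shape[1]
    freqs = np.fft.fftfreq(n_det)  # frequencies in cycles per detector sample
    spectrum = np.fft.fft(sinogram, axis=1) * np.abs(freqs)
    return np.real(np.fft.ifft(spectrum, axis=1))


def back_project(filtered, angles, size):
    """Smear each filtered projection back over a size x size pixel grid.

    Assumes parallel-beam geometry with unit detector spacing (illustrative only).
    """
    n_det = filtered.shape[1]
    centre = (n_det - 1) / 2.0
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    det = np.arange(n_det)
    image = np.zeros((size, size))
    for proj, theta in zip(filtered, angles):
        # Detector coordinate hit by the ray through each pixel at this angle.
        t = x * np.cos(theta) + y * np.sin(theta) + centre
        # Linear interpolation; pixels outside the detector range contribute zero.
        image += np.interp(t.ravel(), det, proj, left=0.0, right=0.0).reshape(size, size)
    # Approximate the integral over [0, pi) by a sum with step pi / n_angles.
    return image * np.pi / len(angles)


def reconstruct_frame(sinogram, angles, size):
    """Reconstruct one cross-sectional frame from its sinogram."""
    return back_project(ramp_filter(sinogram), angles, size)
```

Because `reconstruct_frame` depends only on its own sinogram, a stack of frames could be distributed over CPU cores, for example with `multiprocessing.Pool.map`, or the per-pixel accumulation could be mapped to GPU threads. The speed-up factors reported in the abstract refer to the authors' own code and hardware, not to this sketch.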

Keywords: computed tomography; many-core graphics processing units; multi-core central processing units; massive parallel data processing

  • Contribution to proceedings
    7th International Symposium on Process Tomography, 01.-03.09.2015, Dresden, Germany
    Proceedings of the 7th International Symposium on Process Tomography
  • Poster
    7th International Symposium on Process Tomography, 01.-03.09.2015, Dresden, Germany

Permalink: https://www.hzdr.de/publications/Publ-22033
Publ.-Id: 22033