A scalable pipeline for effective forecast of COVID-19 in Germany, Czechia, and Poland



Abdussalam, W.

We develop a software system for the spatial, temporal, and strategic optimization of the use of tests for SARS-CoV-2. We have built an operational data store (ODS) in PostgreSQL to continuously consolidate datasets from multiple sources, support collaborative work, facilitate high-performance data analysis, and trace changes. The ODS stores COVID-19 data from Germany, Czechia, and other areas. Its metadata schema stores the data from those regions in an orderly way and can scale to the entire world. The ODS is populated by batch Extract, Transform, and Load (ETL) jobs, and SQL queries are provided that reduce the need for pre-processing data. The data can then support forecasting with version-controlled ARIMA and Holt-Winters models, as well as other analyses that support decision making. The jobs currently run at a weekly interval, and we plan to upgrade to a daily interval. The results are displayed in a web app at https://www.where2test.de.
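The batch ETL step and the pre-processing done in SQL can be illustrated with a minimal sketch. It is not the authors' implementation: SQLite (Python's built-in sqlite3 module, SQLite 3.25 or newer) stands in for PostgreSQL so that the example is self-contained, and the table name `cases`, its columns, the helper functions, and the sample values are all hypothetical.

```python
import sqlite3

# Hypothetical raw records as an extract step might deliver them:
# (region code, ISO date, cumulative case count as text). Values are made up.
RAW_ROWS = [
    ("DE", "2022-01-03", "100"),
    ("DE", "2022-01-04", "130"),
    ("DE", "2022-01-04", "130"),  # duplicate delivery of the same day
    ("CZ", "2022-01-03", "40"),
    ("CZ", "2022-01-04", "n/a"),  # unparseable value, dropped in transform
    ("CZ", "2022-01-05", "55"),
]


def transform(rows):
    """Keep only rows whose count parses as a non-negative integer."""
    for region, day, count in rows:
        if count.isdigit():
            yield region, day, int(count)


def load(conn, rows):
    """Upsert rows so that re-running a batch job does not duplicate data."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS cases (
               region     TEXT    NOT NULL,
               day        TEXT    NOT NULL,
               cumulative INTEGER NOT NULL,
               PRIMARY KEY (region, day)
           )"""
    )
    conn.executemany(
        """INSERT INTO cases (region, day, cumulative) VALUES (?, ?, ?)
           ON CONFLICT (region, day)
           DO UPDATE SET cumulative = excluded.cumulative""",
        rows,
    )
    conn.commit()


def daily_new_cases(conn, region):
    """Derive new cases per stored day in SQL instead of in analysis code."""
    query = """
        SELECT day,
               cumulative - LAG(cumulative) OVER (ORDER BY day) AS new_cases
        FROM cases
        WHERE region = ?
        ORDER BY day
    """
    return conn.execute(query, (region,)).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, transform(RAW_ROWS))
    for row in daily_new_cases(conn, "DE"):
        print(row)
```

The upsert on the `(region, day)` key makes the load step safe to repeat, which matters when a job moves from a weekly to a daily schedule. The window-function query returns differences between consecutive stored days, so a forecasting model could read the series directly from the store.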

  • Open Access: Lecture (Conference) (Online presentation)
    Software Engineering 2022, 21.-25.02.2022, Potsdam Online, Germany


Permalink: https://www.hzdr.de/publications/Publ-34314