Publications Repository - Helmholtz-Zentrum Dresden-Rossendorf


Introducing natural adversarial observations to a Deep Reinforcement Learning agent for Atari Games

Hanfeld, P.

Deep Learning methods are known to be vulnerable to adversarial attacks. Since Deep Reinforcement Learning agents are based on these methods, they are susceptible to tiny changes in their input data. Three methods for adversarial example generation are introduced and applied to agents trained to play Atari games. The attacks either target single inputs or can be applied universally to all possible inputs of the agents. They successfully shifted the agents’ predictions towards a single action or lowered the agents’ confidence in certain actions, respectively. All proposed methods had a severe impact on the agents’ performance while producing invisible adversarial perturbations. Since natural-looking adversarial observations should be completely hidden from a human evaluator, the negative impact on the agents’ performance should additionally be undetectable. Several variants of the proposed methods were tested to fulfil all posed criteria. Overall, seven generated observations for two of the three Atari games are classified as natural-looking adversarial observations.

Keywords: Reinforcement Learning; Adversarial Attacks; Deep Learning

  • Master's thesis
    Hochschule Mittweida, 2021
    Mentor: Chorowski, Jan; Villmann, Thomas
    75 pages

Permalink: https://www.hzdr.de/publications/Publ-33889
Publ.-Id: 33889