A Parallel Code for Kinetic 3D Lattice MC Simulations of Nucleation, Growth and Ostwald Ripening of Nanocrystals

Schmeißer, Nils; Kunicke, Manfred; Heinig, Karl-Heinz

The continuing exponential increase in computer power, together with recent developments of very efficient numerical procedures, now makes predictive atomic-scale computer simulations feasible for materials science. This holds especially for advanced microelectronic devices, whose functional units increasingly consist of 10^6 atoms or fewer. As a result, the design of new materials and devices is more and more frequently supported by atomic-scale computer simulations.

A kinetic 3D Monte-Carlo code, based on a stochastic probabilistic two-center cellular automaton and using a double bookkeeping technique (one record in the particle vector, the other in the lattice space), was originally written in PASCAL and tested on an INTEL PC, and later in C on HP workstations. This implementation had to be sped up considerably before real scientific simulations could be undertaken. The only way to get the needed speedup is parallelisation.
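The double bookkeeping idea can be illustrated with a minimal Python sketch. It is not the authors' code: the class name `DoubleBookkeeping`, the simple cubic lattice with periodic boundaries, and the rule that every hop onto an empty site is accepted are all assumptions made for illustration. A real kinetic Monte-Carlo step would weight the hop by the local bond configuration. The point shown is only that each move updates the particle vector and the lattice together, so either record can be queried in constant time.

```python
import random

EMPTY = -1  # lattice value for an unoccupied site

# Nearest-neighbour steps on a simple cubic lattice (an assumption of this sketch).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


class DoubleBookkeeping:
    """Keeps particle positions twice: per particle and per lattice site."""

    def __init__(self, size, positions):
        self.size = size
        self.particles = [tuple(p) for p in positions]  # particle vector
        self.lattice = [EMPTY] * size ** 3               # lattice space
        for index, pos in enumerate(self.particles):
            site = self.site(pos)
            if self.lattice[site] != EMPTY:
                raise ValueError("two particles on one site")
            self.lattice[site] = index

    def site(self, pos):
        """Flatten (x, y, z) into an index of the lattice array."""
        x, y, z = pos
        return (x * self.size + y) * self.size + z

    def try_move(self, index, rng):
        """Hop particle `index` to a random neighbour site if it is empty."""
        dx, dy, dz = rng.choice(STEPS)
        x, y, z = self.particles[index]
        n = self.size
        target = ((x + dx) % n, (y + dy) % n, (z + dz) % n)  # periodic boundary
        if self.lattice[self.site(target)] != EMPTY:
            return False
        # Update both records in the same step so they never disagree.
        self.lattice[self.site(self.particles[index])] = EMPTY
        self.lattice[self.site(target)] = index
        self.particles[index] = target
        return True

    def is_consistent(self):
        """Check that the particle vector and the lattice describe the same state."""
        occupied = sum(1 for value in self.lattice if value != EMPTY)
        return occupied == len(self.particles) and all(
            self.lattice[self.site(pos)] == i
            for i, pos in enumerate(self.particles)
        )
```

The particle vector answers "where is particle i?" and the lattice answers "who sits on this site?", which is what makes neighbour checks cheap during a move.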

We started with a look at the given sequential code, translated it into C and ran it on our S-Class server. Translating the code required us to understand many details, e.g. the double bookkeeping technique, so we could take the first steps of scalar optimisation during this process.

The simplest parallel approach is to distribute the lattice across the processors. However, because the physical problem models the growth of clusters of implanted particles in the lattice, such an approach leads to load imbalances and destroys the benefit of parallelisation.
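The load imbalance described above can be made concrete with a small Python sketch. The equal-slab decomposition along one axis and the function names `slab_loads` and `imbalance` are assumptions chosen for illustration; the abstract does not say how the lattice would have been cut.

```python
def slab_loads(x_coords, size, workers):
    """Count particles per worker when the x axis is cut into equal slabs."""
    loads = [0] * workers
    for x in x_coords:
        # Map lattice coordinate x in [0, size) to a slab in [0, workers).
        loads[x * workers // size] += 1
    return loads


def imbalance(loads):
    """Ratio of the busiest worker's load to the mean load (1.0 is ideal)."""
    return max(loads) * len(loads) / sum(loads)
```

For particles spread evenly over the lattice the ratio is 1.0. Once the particles have gathered in a cluster that lies inside a single slab, the ratio equals the number of workers: one processor does all the work and the others idle.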

In another approach we tried to distribute the work done in the particle space instead of the lattice space. We developed a graph-theoretical approach based on skeletons to find a work distribution across 2^n processors that assures a good load balance. The disadvantage of this approach is the work needed to compute the distribution, which is of order O(n^3), compared to the simulation, which is of order O(n).

At this point it turned out that the given model was not suitable for parallelising the algorithm. We therefore went back to the physicist, and together we developed a model of the physical process for which the solving algorithm could be formulated in parallel. The new model allowed us to divide the solution into three main steps, two of which can be executed in parallel and will therefore improve the speed of the algorithm.

We implemented this algorithm as a modularised mixed-language program, using FORTRAN for the computational part and C for the I/O part. The program runs on our S-Class server using a maximum of 8 processors.

In our presentation we show the methodology of creating an optimised parallel code for the Monte-Carlo simulation of Ostwald ripening, the schema of the parallel code, and first real physical simulation results.

Lecture (Conference)
The 13th Annual HPC User Group Conference 1998

Permalink: https://www.hzdr.de/publications/Publ-1106