# Performance¶

## Parallelism¶

Tamaas implements shared-memory parallelism using thrust. The Thrust backend can be controlled with the following values of the backend build option:

omp

Thrust uses its OpenMP backend (the default). The number of threads is controlled by OpenMP.

cpp

Thurst does not run in threads (i.e. sequential). This is the recommanded option if running multiple MPI tasks.

tbb

Thrust uses its TBB backend. Note that this option is not fully supported by Tamaas.

Tip

When using the OpenMP or TBB backend, the number of threads can be manually controlled by the initialize function. When OpenMP is selected for the backend, the environment variable OMP_NUM_THREADS can also be used to set the number of threads.

FFTW has its own system for thread-level parallelism, which can be controlled via the fftw_threads option:

none

threads

FFTW uses POSIX/Win32 threads for parallelism.

omp

FFTW uses OpenMP.

Note

As with the Thrust backend, the number of threads for FFTW can be controlled with initialize.

Finally, the boolean variable use_mpi controls wheter Tamaas is compiled with MPI-parallelism. If yes, Tamaas will be linked against libfftw3_mpi regardless of the thread model.

Important

Users wary of performance should use MPI, as it yields remarkably better scaling properties than the shared memory parallelism models. Care should also be taken when compiling with both OpenMP and MPI support: setting the number of threads to more than one in an MPI context can decrease performance.

## Integration algorithm¶

In its implementation of the volume integral operators necessary for elastic-plastic solutions, Tamaas differenciates two way of computing the intermediate integral along $$z$$ in the partial Fourier domain:

• Cutoff integration: because the intermediate integral involves kernels of the form $$\exp(q(x-y))$$, it is easy to truncate the integral when $$x$$ and $$y$$ are far apart, especially for large values of $$q$$. This changes the complexity of the intermediate integral from $$O(N_1N_2N_3^2)$$ (the naive implementation) to $$O(\sqrt{N_1^2 + N_2^2}N_3^2)$$.

• Linear integration: this method relies on a separation of variables $$\exp(q(x-y)) = \exp(qx)\cdot\exp(-qy)$$. This allows to break the dependency in $$N_3^2$$ of the number of operations, so that the overall complexity of the intermediate integral is $$O(N_1N_2N_3)$$.

Details on both algorithms can be found in 1. Tamaas uses linear integration by default because it is faster in many cases without introducing a truncation error. Unfortunatly, it has a severe drawback when considering systems with a fine surface discretization: due to $$q$$ increasing with the number of points on the surface, the separated terms $$\exp(qx)$$ and $$\exp(-qy)$$ may overflow and underflow respectively. Tamaas will warn if that is the case, and users have two options to remedy the situation:

Both these options negatively affect the performance, and it is up to the user to select the optimal solution for their particular use case.

## Computational methods & Citations¶

Tamaas uses specialized numerical methods to efficiently solve elastic and elastoplastic periodic contact problems. Using a boundary integral formulation and a half-space geometry for the former allow (a) the focus of computational power to the contact interface since the bulk response can be represented exactly, (b) the use of the fast-Fourier transform for the computation of convolution integrals. In conjunction with a boundary integral formulation of the bulk state equations, a conjugate gradient approach is used to solve the contact problem.

Note

The above methods are state-of-the-art in the domain of rough surface contact. Below are selected publications detailing the methods used in elastic contact with and without adhesion:

For elastic-plastic contact, Tamaas uses a similar approach by implementing a volume integral formulation of the bulk equilibrium equations. Thanks to kernel expressions that are directly formulated in the Fourier domain, the method reduces the algorithmic complexity, memory requirements and sampling errors compared to traditional volume integral methods (Frérot, Bonnet, Anciaux and Molinari, Computer Methods in Applied Mechanics and Engineering, 2019, arxiv:1811.11558). The figure below shows a comparison of run times for an elasticity problem (only a single solve step) between Tamaas and Akantu, a high-performance FEM code using the direct solver MUMPS.

Comparison of run times between the volume integral implementation (with cutoff integration) of Tamaas and an FEM solve step with a Cholesky factorization performed by Akantu+MUMPS. $$N$$ is the total number of points.

Further discussion about the elastic-plastic solver implemented in Tamaas can be found in Frérot, Bonnet, Anciaux and Molinari, (Computer Methods in Applied Mechanics and Engineering, 2019, arxiv:1811.11558).

1

L. Frérot, “Bridging scales in wear modeling with volume integral methods for elastic-plastic contact,” École Polytechnique Fédérale de Lausanne, 2020 (Section 2.3.2). 10.5075/epfl-thesis-7640.