# Density-functional theory and beyond – high-throughput screening and big-data analytics, towards exascale computational materials science

## August 26 - September 5

We propose a ten-day "Hands-On" school on first-principles approaches for calculating the electronic structure and related properties of materials. The school targets students and early-career postdocs and ranges from the basics to some of the most advanced aspects of the field. We will discuss the intrinsic and numerical accuracy, efficiency, and reproducibility of the underlying approximations, with a focus on density-functional theory (DFT) but also covering quantum-chemistry methods and many-body perturbation theory. Application examples will include structure sampling for multiscale problems (from molecules and nanoclusters to solids and surfaces), ab initio statistical mechanics, electronic and heat transport, optical properties, and high-throughput materials discovery. We will also focus on screening and search approaches in materials and chemical spaces and on exascale challenge problems.

The morning sessions will feature extended keynote lectures covering the current state of the art in modern electronic-structure theory. In the weekday afternoon sessions and on the intervening weekend, participants will explore selected topics in greater depth in tutored hands-on sessions focused on paradigmatic physical problems that can be addressed by first-principles approaches today. As the school progresses, we will increasingly highlight the growing role of big-data analytics and exascale computing, which are rapidly becoming essential elements of an integrated approach to modern computational materials science.

Flash-presentation sessions for the participants, scheduled early in the program, will allow students, tutors, and lecturers to engage with each other and will seed an open, stimulating environment for the entire school. The proposed format has been highly successful in the past (the school has run roughly every two years since 1994). We expect similarly broad community interest and participant enthusiasm for the proposed 2019 workshop.

The Hands-On school includes about 30 keynote lectures (50 minutes plus 10 minutes of discussion) covering the current state of modern electronic-structure theory and its links to ab initio thermodynamics, statistical mechanics, and materials discovery. This includes the basic concepts, but also advanced topics and techniques that go beyond standard DFT, as well as relevant topics from neighbouring fields, e.g., quantum chemistry and GW many-body theory. The lectures will be given by leading, renowned experts (Alexandre Tkatchenko, Angel Rubio, Weitao Yang, and Stefano Sanvito, among others). The selection of lecturers from different countries reflects the diversity of the field and the global character of the electronic-structure community. Participants will thus not only get an overview of the field but will also be able to build and extend their professional networks to include current and future key players. Most of the proposed lecturers have already indicated their interest in the school and agreed to participate.

* Introduction

To set the scene, the organisers will provide a brief overview of the topics and aims of the school. In keeping with the title of the school, Matthias Scheffler will then open with a motivational lecture describing the role and recent achievements of high-throughput screening and big-data analytics in materials discovery. Here, we will also look ahead to the exciting new frontier of exascale computing and the promise it holds for computational materials science.

* Implementing DFT

Density-functional theory (DFT) [1,2] is undoubtedly the most successful and influential electronic-structure approach in materials science, thanks to the computational efficiency and reasonable accuracy of current density-functional approximations (DFAs) for many purposes. DFAs allow for the prediction of total-energy-based quantities such as structural, elastic, and vibrational properties of solids, often in excellent agreement with experiment. Among the many flavours of DFAs, one finds the local-density approximation (LDA) [2,3], generalized gradient approximations (GGAs) [4-7], meta-GGAs [8-11], optimized effective potential methods [12], van der Waals density-functional theory [13,14], “generalized” schemes such as hybrid [15-17] or double-hybrid functionals [18-20], and many more. While the resulting set of theories is powerful, choosing a DFA “for the right reasons” in practical simulations can pose significant challenges. Lectures by Volker Blum, Weitao Yang, Will Huhn, David Vanderbilt, Sergey Levchenko, and Alexandre Tkatchenko.
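
As a brief illustration of how the DFA families listed above differ, the standard textbook forms of three of them can be written side by side. This is only a sketch: it uses Hartree atomic units and the spin-unpolarized case, the symbol $f$ stands for whichever enhancement form a particular GGA chooses, and the mixing parameter $a$ depends on the functional (for example, $a = 1/4$ in PBE0).

```latex
\begin{align}
E_{x}^{\mathrm{LDA}}[n] &= -\frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3}
    \int n(\mathbf{r})^{4/3}\,\mathrm{d}^3r \\
E_{xc}^{\mathrm{GGA}}[n] &= \int f\bigl(n(\mathbf{r}), \nabla n(\mathbf{r})\bigr)\,\mathrm{d}^3r \\
E_{xc}^{\mathrm{hybrid}} &= a\,E_{x}^{\mathrm{exact}} + (1-a)\,E_{x}^{\mathrm{GGA}} + E_{c}^{\mathrm{GGA}}
\end{align}
```

Each step adds information (the density gradient, then a fraction of exact exchange) at increased computational cost, which is one reason the choice of DFA is problem dependent.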

* Accuracy and reproducibility

Even for the same underlying DFA, a broad ecosystem of implementations of electronic-structure theory exists, each with characteristic strengths and weaknesses. The numerical accuracy, efficiency, and reproducibility of calculations using a given DFA depend more strongly on the numerical implementation than is commonly thought [21,22]. The key choice is the form of the mathematical discretization or basis set, for example plane waves [23], Gaussian-type orbitals [24], linearized augmented plane waves [25], Wannier functions [26-28], numeric atom-centred orbitals [29-31], and many more. Other important physical choices, such as the treatment of core electrons, of the electrostatic potential and its boundary conditions, and of relativistic effects, are tightly coupled to the chosen basis set. If properly implemented, each of these choices should yield the same physical result for the same question. The question of reproducibility is thus rapidly gaining importance, as most prominently evidenced by a recent community-wide effort that compared 15 solid-state codes, using 40 different potentials or basis-set types, by assessing the quality of the GGA equation of state for 71 elemental crystals [21,22]. Benchmark data sets have also been devised to assess the reproducibility and accuracy of a given set of methods. Navigating the different options is a challenge for anyone active in the field, and especially for new researchers. Lectures by Will Huhn, Igor Ying Zhang, and Carsten Baldauf.
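
To make the idea of comparing equations of state between codes concrete, the following minimal sketch computes a root-mean-square energy difference between two fitted E(V) curves over a ±6% volume window. It is written in the spirit of the comparison described above, not as the exact procedure of [21,22]. The function names, the third-order Birch-Murnaghan form, the window width, and the fit parameters for the two "codes" are all assumptions made for illustration; the numbers are not real results.

```python
import math


def birch_murnaghan(volume, v0, b0, b0_prime):
    """Third-order Birch-Murnaghan energy relative to its minimum (E0 = 0)."""
    eta = (v0 / volume) ** (2.0 / 3.0)
    return (9.0 * v0 * b0 / 16.0) * (
        (eta - 1.0) ** 3 * b0_prime + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )


def delta_gauge(eos_a, eos_b, window=0.06, n_points=2001):
    """RMS energy difference between two E(V) curves, in eV/atom.

    eos_a, eos_b: (v0, b0, b0_prime) in A^3/atom, eV/A^3, dimensionless.
    The integral runs over +/- `window` around the mean equilibrium volume
    and is evaluated with the trapezoidal rule.
    """
    v_mid = 0.5 * (eos_a[0] + eos_b[0])
    v_min, v_max = (1.0 - window) * v_mid, (1.0 + window) * v_mid
    step = (v_max - v_min) / (n_points - 1)
    integral = 0.0
    for i in range(n_points):
        volume = v_min + i * step
        diff = birch_murnaghan(volume, *eos_a) - birch_murnaghan(volume, *eos_b)
        weight = 0.5 if i in (0, n_points - 1) else 1.0  # trapezoid end points
        integral += weight * diff * diff * step
    return math.sqrt(integral / (v_max - v_min))


GPA_TO_EV_PER_A3 = 1.0 / 160.21766  # 1 eV/A^3 = 160.21766 GPa

# Hypothetical fit parameters for two codes (illustrative numbers only).
code_a = (20.45, 88.0 * GPA_TO_EV_PER_A3, 4.30)
code_b = (20.52, 89.5 * GPA_TO_EV_PER_A3, 4.25)

if __name__ == "__main__":
    print(f"Delta = {1000.0 * delta_gauge(code_a, code_b):.3f} meV/atom")
```

Aligning both curves at their own energy minimum means that only differences in equilibrium volume, bulk modulus, and its pressure derivative contribute, so two identical fits give exactly zero.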

* Python-ASE

Practitioners of modern computational materials science typically employ a number of different codes, each particularly suited to a certain type of calculation. Running the appropriate codes, retrieving and analysing output data from varied sources, and transferring data between codes can all be handled under the umbrella of the Atomic Simulation Environment (ASE). This powerful, general approach to simulation is based on the user-friendly Python programming language, which is therefore a prerequisite for using ASE. Lectures will be given by ASE developers and expert users on both the Python fundamentals and concrete, useful examples. Lecturers: Ask Larssen, Krystian Thygesen, Christian Carbogno and Bjork Hammer.

* Physical Properties and Electronic Excitations: DFT and beyond

Despite the popularity of DFT, it is well documented that commonly used (semi)local and hybrid exchange-correlation functionals are often insufficient to describe specific fundamental phenomena, including charge transfer, weak dispersion interactions, and so-called strongly correlated systems [32,33]. Recent developments in computational materials science include the introduction of sophisticated quantum-chemistry methods [34-38] and many-body Green’s function theory [39-41] into condensed-matter physics. Furthermore, new-generation DFAs inspired by these fields are emerging very quickly [18-20,42-45]. For many practical problems involving electronic excitations, it is necessary to go beyond ground-state theory. On the condensed-matter physics side, Green’s-function-based many-body techniques are employed: most often G0W0 based on a fixed reference [46-48], self-consistent GW approaches [49-51], and the Bethe-Salpeter equation [52] for neutral excitations (e.g., for optical properties). Such methods will be compared with modern TD-DFT approaches. Angel Rubio and Miguel Marques will present the fundamentals and applications of these methods and will discuss their numerical accuracy and reproducibility using well-established benchmark datasets. David Casanova will provide an overview of spin-flip methods for studying excited states. Xinguo Ren will focus on both basic concepts and recent progress in quantum chemistry and many-body perturbation theory. Gemma Solomon, Christian Carbogno, and Stefano Sanvito will introduce methods for tackling transport properties in complex systems.

* Time and length scales

Molecular dynamics (MD) simulations under realistic conditions (i.e., at finite temperature) are a primary route to predicting the properties of real materials and molecules, for example ensemble properties or vibrational spectra. Born-Oppenheimer or Car-Parrinello MD [53] with classical nuclei and Newton’s equations of motion appears relatively simple but poses a number of numerical challenges. These range from integration artefacts, which result in energy drifts, to the proper derivation of statistical ensemble averages. Furthermore, the approximate (re)introduction of nuclear quantum effects [54], electronic excitations through explicitly time-dependent DFT [55], or even the correlated dynamics of electrons and ions [56] can be essential to achieve physically correct results, for instance for heat transport in solids. As an extension of the more widespread thermodynamic Monte Carlo methods, the kinetic Monte Carlo (kMC) method is a useful coarse-graining tool for simulating the long-time dynamics of processes occurring in nature [57,58]. These methods, together with the basics of ab initio thermodynamics, will be covered in lectures by Mariana Rossi, Luca Ghiringhelli, Sergey Levchenko, Karsten Reuter, and Peter Kratzer.
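
The energy drift caused by integration artefacts can be illustrated with a toy model. The sketch below, which is an assumption-laden example and not taken from any MD code, integrates a one-dimensional harmonic oscillator with the explicit Euler scheme and with velocity Verlet, then compares the final total energy with the initial one. The function names, the unit mass and force constant, and the time step are all chosen only for illustration.

```python
def force(x, k=1.0):
    """Harmonic restoring force F = -k x."""
    return -k * x


def total_energy(x, v, m=1.0, k=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x


def run_euler(x, v, dt, n_steps, m=1.0, k=1.0):
    """Explicit Euler: positions and velocities both use the old state."""
    for _ in range(n_steps):
        a = force(x, k) / m
        x, v = x + dt * v, v + dt * a
    return x, v


def run_velocity_verlet(x, v, dt, n_steps, m=1.0, k=1.0):
    """Velocity Verlet: time-reversible, symplectic second-order scheme."""
    a = force(x, k) / m
    for _ in range(n_steps):
        x = x + dt * v + 0.5 * dt * dt * a
        a_new = force(x, k) / m
        v = v + 0.5 * dt * (a + a_new)
        a = a_new
    return x, v


if __name__ == "__main__":
    x0, v0, dt, n_steps = 1.0, 0.0, 0.05, 2000
    e0 = total_energy(x0, v0)
    for name, integrator in [("Euler", run_euler),
                             ("velocity Verlet", run_velocity_verlet)]:
        x, v = integrator(x0, v0, dt, n_steps)
        print(f"{name}: E_final / E_initial = {total_energy(x, v) / e0:.4f}")
```

With these settings the Euler energy grows steadily, whereas the velocity Verlet energy only fluctuates within a small bound. This is why symplectic integrators are the usual default in MD, although the time step still has to be checked for each system.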

* Navigating materials and compound space and big-data driven materials science

A unique promise of electronic-structure theory is to serve as a fully stand-alone, unbiased tool for predicting new compounds with optimized target properties. To realize this promise, property predictions must be available for a vast space of possible compound compositions and geometries or topologies for solids, clusters [59], and molecules [60-62]. With the rise of the Materials Genome Initiative, high-throughput sampling of large segments of chemical or materials space has become a very active field of research [63]. Electronic-structure theory is of particular importance as a solid foundation for multiscale methods [64,65]. Much effort is being spent on data-driven research, for instance “data mining” [66,67] or “machine learning” [68-70]. Ideally, these approaches are highly automated and rely on enormous numbers of calculations, so they will increasingly require exascale computing. However, these methods are still young and thus subject to pitfalls, for instance outliers due to computational or even unexpected technical errors. The following topics will be covered specifically: sampling of large conformational spaces of molecules, clusters, and solids (Scott Woodley); transition-path investigations (Carsten Baldauf); and machine learning and big data (Stefano Curtarolo), with more general lectures given by Carlos Mera Acosta and Luca Ghiringhelli.

* The exascale frontier

Finally, we take a speculative look at the future of computational materials science, with lectures covering large-scale simulations and exascale computing. The need for increasing realism in materials simulations will require a corresponding increase in computing power. In this respect, exascale computing (i.e., 10^18 operations per second) is widely regarded as the next important frontier in large-scale, detailed materials simulation. The challenges faced by both hardware and software for exascale computing, and the types of simulations that such capabilities will permit, will be covered in a lecture by Nicholas Hine.
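
A rough back-of-the-envelope estimate shows why raw computing power alone does not translate directly into system size. The sketch below assumes an idealized comparison between a petascale machine ($10^{15}$ operations per second) and an exascale machine, perfect parallel efficiency, a fixed time to solution, and a cost that scales as $N^{p}$ with the number of atoms $N$. Here $p = 3$ is the familiar scaling of conventional DFT with dense diagonalization and $p = 1$ is the ideal case of a linear-scaling method; none of these numbers come from the school programme.

```latex
\begin{align}
\frac{N_{\mathrm{exa}}}{N_{\mathrm{peta}}}
  &= \left(\frac{10^{18}}{10^{15}}\right)^{1/p} = 10^{3/p}, \\
p = 3 &: \quad \frac{N_{\mathrm{exa}}}{N_{\mathrm{peta}}} = 10, \qquad
p = 1 : \quad \frac{N_{\mathrm{exa}}}{N_{\mathrm{peta}}} = 1000 .
\end{align}
```

Under these assumptions, a thousandfold increase in computing power buys only a tenfold larger system for a cubically scaling method, so algorithmic and software advances matter as much as hardware.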

In summary, our proposed school will cover the full breadth of electronic-structure-based research, from the fundamental concepts all the way to the latest developments in the field, including practical examples that are paradigmatic for the science and, in principle, agnostic of any specific code. In the organizers’ experience, it is this comprehensive focus that makes the school attractive to a large segment of researchers entering the field. We note that the main workhorse for electronic-structure tutorials in the workshop will be the FHI-aims code, with which all organizers and tutors are familiar, but stress again that this is expressly not intended to be a code-specific workshop. This philosophy is also reflected in the list of invited speakers, which includes active contributors to other density-functional theory codes (Hardy Gross, Angel Rubio, Miguel Marques, David Vanderbilt, David Casanova, Stefano Sanvito, Gemma Solomon, and Nicholas Hine) as well as to force-field codes (Adri van Duin).