Modern Computational Science - Summer School

MCS12: Scientific Program

The first week of the School mainly deals with the fundamentals of Scientific Computing and an introduction to numerical optimization algorithms. More advanced optimization techniques and special topics and applications will be covered in weeks one and two. Hands-on lab exercises will complement the lectures.


Part I: Fundamentals of Scientific Computing and Numerical Optimization

Software Engineering for Scientists


Helmut G. Katzgraber, Texas A&M University and ETH Zurich


Writing complex computer programs to study scientific problems requires careful planning and an in-depth knowledge of programming languages and tools, as well as analysis and visualization programs. In this lecture, the importance of using the right tool for the right problem is emphasized. Furthermore, the core concepts needed to work in a Linux/UNIX environment are presented. Common tools to organize computer programs (cvs and make), as well as to debug (gdb, valgrind) and improve them (gprof, compiler flags) are discussed, followed by simple data reduction strategies and visualization tools. Finally, some scientific libraries such as Boost, GSL, LEDA and Numerical Recipes are outlined.


Introduction to Complexity Theory I + II


Stephan Mertens, Otto-von-Guericke-Universität Magdeburg and Santa Fe Institute, New Mexico


Computational complexity theory aims at understanding the hardness of computational problems that we care about. We will see that quite a few of these problems are infeasible in the sense that their solution requires impossibly long time, even on instances of moderate size. We will see how to establish this kind of infeasibility, and we will learn how to deal with infeasible problems in practice. En passant we will find out what the mysterious "P=NP" problem is all about and why it is a million dollar problem. Our ramble through complexity theory will lead us to problems where we are better off guessing a solution, and to problems for which approximating the solution is as hard as finding the exact answer. And we will also glance into the abyss of well-defined computational problems that can't be solved at all - even if we were willing to spend an unlimited amount of time and space on their solution.

References

  1. C. Moore and S. Mertens, The Nature of Computation, Oxford University Press 2011; see also www.nature-of-computation.org


Pseudo random number generation


Helmut G. Katzgraber, Texas A&M University and ETH Zurich


Random numbers play a crucial role in science and industry. Many numerical methods require the use of random numbers, in particular the Monte Carlo method. Therefore it is of paramount importance to have efficient random number generators. The differences, advantages and disadvantages of true and pseudo random number generators are discussed with an emphasis on the intrinsic details of modern and fast pseudo random number generators. Furthermore, standard tests to verify the quality of the random numbers produced by a given generator are outlined. Finally, standard scientific libraries with built-in generators are presented, as well as different approaches to generate non-uniform random numbers. Potential problems that one might encounter when using large parallel machines are discussed.


Introduction to Monte Carlo Simulations / Advanced Monte Carlo Simulations


Helmut G. Katzgraber, Texas A&M University and ETH Zurich


Monte Carlo methods play an important role in scientific computation, especially when problems have a vast phase space. In this lecture an introduction to the Monte Carlo method is given. Concepts such as Markov chains, detailed balance, critical slowing down, and ergodicity, as well as the Metropolis algorithm are explained. The Monte Carlo method is illustrated by numerically studying the critical behavior of the two-dimensional Ising ferromagnet using finite-size scaling methods.
In the second part of the lecture advanced Monte Carlo methods are described (parallel tempering Monte Carlo) and illustrated with nontrivial models from the physics of glassy systems. Special attention is paid to the equilibration of the simulations, as well as the inclusion of autocorrelation effects.


Statistical data analysis


Oliver Melchert, University of Oldenburg


In a simplified sense, the progression of a scientific project that involves numerical experiments can be subdivided into two basic parts: the implementation of a model system and the data production and analysis. This lecture is concerned with basic data analysis. In this regard, a selection of frequently required tools, such as histograms, bootstrap resampling and the chi-square test will be introduced and illustrated by means of examples. The presented material is intended to be useful in two ways. If you are not (yet) familiar with those tools, the lecture will certainly help you to become a more effective data analyst. If you already have an overview over the presented material and if you are familiar with the subject, it will recap those topics.
The concepts and techniques introduced here are of general interest and are, at best, employed by computational aid. Consequently, an exemplary implementation of the presented techniques using the Python programming language is provided. To this end, the lecture is supplemented by a set of Python scripts that allow you to reproduce virtually all examples and figures shown in the course of the lecture.


Introduction to Numerical Optimization I + II


Thomas Schuster, University of Oldenburg


Minimizing a function over an admissible set of parameters such that certain given constraints are satisfied is not only a fundamental task in numerical mathematics but is also of a practical relevance which hardly can be overestimated. This task serves as adequate model to solve problems in industry, economics, natural sciences, finance or medicine. We start our lectures with some simple examples of optimization problems and categorize them. The main focus lies on linear optimization (linear programming), i.e., the minimization of a linear function subject to constraints which are given by linear (in)equalities.
First we outline fundamental, mathematical properties of linear programs and then we deduce the simplex method in detail which represents the most prominent method to solve linear optimization problems. At the end we summarize some impoortant methods of unconstrained optimization: steepest descent, line search, CG and Gauß - Newton method. The theory is elucidated by some examples and the subsequent hands-on exercises.



Part II: Adanced Optimization Methods

Interval Methods I + II


Martin Fränzle, University of Oldenburg


Since the 1960s, interval arithmetic has been known as a computationally inexpensive means of reasoning about functions operating over sets of real numbers, being able to, e.g., safely overapproximate the image of a real-valued function. Due to its conservative approximation properties within set-based reasoning, it can be exploited for a variety of inherently numeric algorithms with guaranteed properties. Especially its extension to interval constraint propagation, which employs multiple forms of interval arithmetic as a means of pruning candidate solution spaces, has manifold applications in optimization. We will provide an introduction to interval arithmetic, interval constraint propagation, and their applications.


Convex Optimization: Algorithms and Applications I + II


Friedrich Eisenbrand, EPFL Lausanne


Convex optimization problems have a wide range of applications in diverse fields like engineering, production planning, machine learning, or verification, to name just a few. The goal of this course is to introduce the audience to convex optimization problems, to modeling issues and to some small selection of algorithms to solve convex optimization problems. If time permits, we will discuss the relationship of convex optimization (semidefinite programming) and solving systems of polynomial inequalities over the reals.


Nature-inspired Optimization


Oliver Kramer, University of Oldenburg


Evolution strategies are nature-inspired meta-heuristics that solve optimization problems. The idea of translating evolutionary principles into an algorithmic framework for optimization has a long history. First, we introduce the Evolution Strategy (ES), which is a famous variant of evolutionary algorithms for numerical solution spaces, employing recombination and Gaussian mutation. Adaptation of step sizes turns out to be one of the most important tasks. We will introduce and compare various step size adaptation techniques, e.g., self-adaptation that allows adapting to local solution space characteristics automatically.


Design in the presence of uncertainty: The Scenario Approach I + II


Maria Prandini and Simone Garatti, Politecnico di Milano


In these lectures, we shall describe the “scenario approach” methodology to solve design problems in the presence of uncertainty. Motivated by applications in systems and control, we shall focus on design problems that can be reformulated as finite-dimensional optimization problems, where a performance index has to be optimized, possibly subject to some constraints. To account for the uncertain elements entering the optimization, a probabilistic approach is adopted where the optimal performance and constraints satisfaction have to be only achieved over a set of uncertainty instances having probability larger than a given level (chance-constrained optimization). Chance-constrained optimization problems are hard to solve in general, and the scenario approach provides an effective methodology for obtaining an approximate solution via random sampling of the uncertainty instances. Probabilistic guarantees on the scenario solution can be provided, under the assumption that the performance index and the constraints are convex in the optimization variables. The objective of these lectures is to illustrate the scenario approach at a tutorial level, focusing mainly on algorithmic aspects. Its versatility will be pointed out through some examples in systems and control.



Part III: Special Topics and Applications

Minimum weight spanning trees of weighted scale-free networks


Oliver Melchert, University of Oldenburg


In this lecture we will consider the minimum-weight spanning tree (MST) problem, i.e., one of the simplest and most vital combinatorial optimization problems. We will discuss a particular greedy algorithm that allows to compute a MST for undirected weighted graphs, namely Kruskal's algorithm, and we will study the structure of MSTs obtained for weighted scale-free random graphs. This is meant to clarify whether the structure of MSTs is sensitive to correlations between the edge weights and the topology of the underlying scale free graphs.


Combinatorial Optimization in Bioinformatics I + II


Lectures cancelled due to an emergency

Ernst Althaus, Johannes Gutenberg-Universität Mainz


In bioinformatics, lots of combinatorial optimization problems have to be solved and software has been developed over many years now. For many of these problems the data sets are very large and the problems are algorithmically complicated so that only heuristics can be used. On the other hand, discussions with biologists or bioinformaticians often reveal problems that are algorithmically very interesting and not too large to develop methods that are interesting also from a algorithmic side. In the lectures, we will discuss several of these problems.


Ground States in the Hydrophobic-polar Model of Protein Folding I + II


Thomas Prellberg, Queen Mary University of London


We discuss uniform sampling algorithms that are based on stochastic growth methods, using sampling of extreme configurations of polymers in simple lattice models as a motivation.
We shall show how a series of clever enhancements to a fifty-odd year old algorithm, the Rosenbluth method, led to a cutting-edge algorithm capable of uniform sampling of equilibrium statistical mechanical systems of polymers in situations where competing algorithms failed to perform well. Examples range from collapsed homo-polymers near sticky surfaces to models of protein folding.


Phase transitions and clustering properties of optimization problems


Alexander K. Hartmann, University of Oldenburg


One central subject in computer science are NP-complete problems. These are problems which are hard in the sense that no fast algorithms to solve them exist. Interestingly, many practical applications belong to this class. When studying problems on suitably parametrized ensembles, one finds phenomena strongly reminiscent of phase transitions as appearing for physical systems. One also observes regions in phase space where the problem can tyoically be quickly solved using the available algorithms, i.e. where the problem does not typically appear to be hard. One can better understand these phase transitions using concepts and methods from statistical physics. These developments are exemplified considering the Vertex-cover Problem and the Satisfiability Problem.
In particular, we investigate the relationship between the behavior of the ensemble, the typical running time of algorithms and the structure of the solution landscape. We use branch-and-bound type algorithms to obtain exact solutions of these problems for moderate system sizes as well as parallel tempering simulations (MC3).
Using two methods (direct neighborhood-based clustering and hierarchical clustering), we investigate the structure of the solution space. The main result is that the correspondence between solution-space structure and the phase diagrams of the problems is not unique. Namely, for the vertex cover problem we observe a drastic change of the solution space from large single clusters (in the replica-symmetric phase, as obtained by previous analytical studies) to multiple nested levels of clusters (in the replica-symmetry broken phase). For the satisfiability problem, there is no simple relationship, since there is more than one change of the solution landscape as a function of the ensemble parameter and these changes do not always coincide with changes of the typical computational complexity.


Quantum Chemistry


Thorsten Klüner, University of Oldenburg


In this lecture, foundations of Theoretical Chemistry will be introduced. Based on the Born-Oppenheimer approximation [1], which implies a separation of time-scales of electronic and nuclear motion, we discuss strategies to solve the time-independent Schrödinger equation for the motion of electrons in an arbitrary chemical system for a frozen nuclear configuration. After the introduction of general concepts (molecular Hamiltonian, Born-Oppenheimer approximation, Pauli principle), we discuss Hartree-Fock theory and briefly summarize electron correlation methods. These ab initio approaches allow to construct global potential energy surfaces (PES) if the nuclear configuration is varied for all relevant degrees of freedom and the corresponding electronic Schrödinger equations are solved numerically. The PES can then be used as an external potential for nuclear motion which provides the basis for a quantum description of chemical reactions.
Thus, in the second part of this lecture, we will discuss methods to solve the time-dependent Schrödinger equation for the nuclei under the influence of a given potential energy surface. In particular, we will discuss foundations and numerical aspects of wave packet calculations including external laser fields, quantum dissipation, and optimal control.

References

  1. M. Born, R. Oppenheimer, Annalen der Physik 389 (1927), 457


The Automata Zoo I + II


Holger Hermanns, Universität des Saarlandes, Saarbrücken


Modern computer systems are reactive. Your phone, your Wi-Fi, an airbag controller. They interact with and react to stimuli of the environment, and are supposed to do so reliably and in real-time. An airbag controller must fire within milliseconds after a crash, but not after rolling over a speed bump.
Effective analysis methods for reactive systems are rooted in a modern automata theory. This theory is a very practical one, in daily use to build models of complex reactive systems. What to do with these models? Model checkers are software tools used to guarantee properties of these models. This lecture series introduces the basic concepts of this applied automata theory, together with the principles of model checking. We look into the model checking of finite and infinite state systems, into timed systems, probabilistic systems, and discuss various striking application examples. If time permits, we will cover automated testing based on the automata models at hand.