Modern Computational Science - Summer School
The first week of the School mainly deals with the
fundamentals of Scientific Computing
and an introduction to numerical optimization algorithms.
More advanced optimization techniques
and special topics and applications
will be covered in weeks one and two.
Hands-on lab exercises will complement the lectures.
Part I: Fundamentals of Scientific Computing and Numerical Optimization
Software Engineering for Scientists
Helmut G. Katzgraber, Texas A&M University and ETH Zurich
Writing complex computer programs to study scientific problems requires careful planning and an in-depth knowledge of programming languages and tools, as well as analysis and visualization programs. In this lecture, the importance of using the right tool for the right problem is emphasized. Furthermore, the core concepts needed to work in a Linux/UNIX environment are presented. Common tools to organize computer programs (cvs and make), as well as to debug (gdb, valgrind) and improve them (gprof, compiler flags) are discussed, followed by simple data reduction strategies and visualization tools. Finally, some scientific libraries such as Boost, GSL, LEDA and Numerical Recipes are outlined.
Introduction to Complexity Theory I + II
Stephan Mertens, Otto-von-Guericke-Universität Magdeburg and Santa Fe Institute, New Mexico
Computational complexity theory aims at understanding the hardness
of computational problems that we care about. We will see that quite
a few of these problems are infeasible in the sense that their solution
requires impossibly long time, even on instances of moderate size.
We will see how to establish this kind of infeasibility, and we will learn
how to deal with infeasible problems in practice.
En passant we will find out what the mysterious "P=NP" problem
is all about and why it is a million dollar problem.
Our ramble through complexity theory will lead us to problems
where we are better off guessing a solution, and to problems for which
approximating the solution is as hard as finding the exact answer.
And we will also glance into the abyss of well-defined computational
problems that can't be solved at all - even if we were willing to spend
an unlimited amount of time and space on their solution.
References
- C. Moore and S. Mertens, The Nature of Computation, Oxford University Press 2011; see also www.nature-of-computation.org
Pseudo random number generation
Helmut G. Katzgraber, Texas A&M University and ETH Zurich
Random numbers play a crucial role in science and industry. Many numerical methods require the use of random numbers, in particular the Monte Carlo method. Therefore it is of paramount importance to have efficient random number generators. The differences, advantages and disadvantages of true and pseudo random number generators are discussed with an emphasis on the intrinsic details of modern and fast pseudo random number generators. Furthermore, standard tests to verify the quality of the random numbers produced by a given generator are outlined. Finally, standard scientific libraries with built-in generators are presented, as well as different approaches to generate non-uniform random numbers. Potential problems that one might encounter when using large parallel machines are discussed.
Introduction to Monte Carlo Simulations / Advanced Monte Carlo Simulations
Helmut G. Katzgraber, Texas A&M University and ETH Zurich
Monte Carlo methods play an important role in scientific computation,
especially when problems have a vast phase space.
In this lecture an introduction to the Monte Carlo method is given.
Concepts such as Markov chains, detailed balance, critical slowing down,
and ergodicity, as well as the Metropolis algorithm are explained.
The Monte Carlo method is illustrated by numerically studying
the critical behavior of the two-dimensional Ising ferromagnet
using finite-size scaling methods.
In the second part of the lecture advanced Monte Carlo methods
are described (parallel tempering Monte Carlo) and illustrated
with nontrivial models from the physics of glassy systems.
Special attention is paid to the equilibration of the simulations,
as well as the inclusion of autocorrelation effects.
Statistical data analysis
Oliver Melchert, University of Oldenburg
In a simplified sense, the progression of a scientific project that
involves numerical experiments can be subdivided into two basic parts:
the implementation of a model system and the data production and analysis.
This lecture is concerned with basic data analysis. In this regard,
a selection of frequently required tools, such as histograms,
bootstrap resampling and the chi-square test will be introduced
and illustrated by means of examples.
The presented material is intended to be useful in two ways.
If you are not (yet) familiar with those tools, the lecture will certainly
help you to become a more effective data analyst. If you already have
an overview over the presented material and if you are familiar
with the subject, it will recap those topics.
The concepts and techniques introduced here are of general interest
and are, at best, employed by computational aid. Consequently, an exemplary
implementation of the presented techniques using the Python
programming language is provided. To this end, the lecture is supplemented
by a set of Python scripts that allow you to reproduce virtually
all examples and figures shown in the course of the lecture.
Introduction to Numerical Optimization I + II
Thomas Schuster, University of Oldenburg
Minimizing a function over an admissible set of parameters such that
certain given constraints are satisfied is not only a fundamental task
in numerical mathematics but is also of a practical relevance which hardly
can be overestimated. This task serves as adequate model to solve problems
in industry, economics, natural sciences, finance or medicine.
We start our lectures with some simple examples of optimization problems
and categorize them. The main focus lies on linear optimization
(linear programming), i.e., the minimization of a linear function
subject to constraints which are given by linear (in)equalities.
First we outline fundamental, mathematical properties of linear programs
and then we deduce the simplex method in detail which represents
the most prominent method to solve linear optimization problems.
At the end we summarize some impoortant methods of unconstrained
optimization: steepest descent, line search, CG and Gauß - Newton
method.
The theory is elucidated by some examples and the subsequent
hands-on exercises.
Part II: Adanced Optimization Methods
Interval Methods I + II
Martin Fränzle, University of Oldenburg
Since the 1960s, interval arithmetic has been known as a computationally inexpensive means of reasoning about functions operating over sets of real numbers, being able to, e.g., safely overapproximate the image of a real-valued function. Due to its conservative approximation properties within set-based reasoning, it can be exploited for a variety of inherently numeric algorithms with guaranteed properties. Especially its extension to interval constraint propagation, which employs multiple forms of interval arithmetic as a means of pruning candidate solution spaces, has manifold applications in optimization. We will provide an introduction to interval arithmetic, interval constraint propagation, and their applications.
Convex Optimization: Algorithms and Applications I + II
Friedrich Eisenbrand, EPFL Lausanne
Convex optimization problems have a wide range of applications in diverse fields like engineering, production planning, machine learning, or verification, to name just a few. The goal of this course is to introduce the audience to convex optimization problems, to modeling issues and to some small selection of algorithms to solve convex optimization problems. If time permits, we will discuss the relationship of convex optimization (semidefinite programming) and solving systems of polynomial inequalities over the reals.
Nature-inspired Optimization
Oliver Kramer, University of Oldenburg
Evolution strategies are nature-inspired meta-heuristics that solve optimization problems. The idea of translating evolutionary principles into an algorithmic framework for optimization has a long history. First, we introduce the Evolution Strategy (ES), which is a famous variant of evolutionary algorithms for numerical solution spaces, employing recombination and Gaussian mutation. Adaptation of step sizes turns out to be one of the most important tasks. We will introduce and compare various step size adaptation techniques, e.g., self-adaptation that allows adapting to local solution space characteristics automatically.
Design in the presence of uncertainty: The Scenario Approach I + II
Maria Prandini and Simone Garatti, Politecnico di Milano
In these lectures, we shall describe the “scenario approach” methodology to solve design problems in the presence of uncertainty. Motivated by applications in systems and control, we shall focus on design problems that can be reformulated as finite-dimensional optimization problems, where a performance index has to be optimized, possibly subject to some constraints. To account for the uncertain elements entering the optimization, a probabilistic approach is adopted where the optimal performance and constraints satisfaction have to be only achieved over a set of uncertainty instances having probability larger than a given level (chance-constrained optimization). Chance-constrained optimization problems are hard to solve in general, and the scenario approach provides an effective methodology for obtaining an approximate solution via random sampling of the uncertainty instances. Probabilistic guarantees on the scenario solution can be provided, under the assumption that the performance index and the constraints are convex in the optimization variables. The objective of these lectures is to illustrate the scenario approach at a tutorial level, focusing mainly on algorithmic aspects. Its versatility will be pointed out through some examples in systems and control.
Part III: Special Topics and Applications
Minimum weight spanning trees of weighted scale-free networks
Oliver Melchert, University of Oldenburg
In this lecture we will consider the minimum-weight spanning tree (MST) problem, i.e., one of the simplest and most vital combinatorial optimization problems. We will discuss a particular greedy algorithm that allows to compute a MST for undirected weighted graphs, namely Kruskal's algorithm, and we will study the structure of MSTs obtained for weighted scale-free random graphs. This is meant to clarify whether the structure of MSTs is sensitive to correlations between the edge weights and the topology of the underlying scale free graphs.
Combinatorial Optimization in Bioinformatics I + II
Lectures cancelled due to an emergency
Ernst Althaus, Johannes Gutenberg-Universität Mainz
In bioinformatics, lots of combinatorial optimization problems have to be solved and software has been developed over many years now. For many of these problems the data sets are very large and the problems are algorithmically complicated so that only heuristics can be used. On the other hand, discussions with biologists or bioinformaticians often reveal problems that are algorithmically very interesting and not too large to develop methods that are interesting also from a algorithmic side. In the lectures, we will discuss several of these problems.
Ground States in the Hydrophobic-polar Model of Protein Folding I + II
Thomas Prellberg, Queen Mary University of London
We discuss uniform sampling algorithms that are based on stochastic growth
methods, using sampling of extreme configurations of polymers
in simple lattice models as a motivation.
We shall show how a series of clever enhancements to a fifty-odd year old
algorithm, the Rosenbluth method, led to a cutting-edge algorithm capable
of uniform sampling of equilibrium statistical mechanical systems of
polymers in situations where competing algorithms failed to perform well.
Examples range from collapsed homo-polymers near sticky surfaces to models
of protein folding.
Phase transitions and clustering properties of optimization problems
Alexander K. Hartmann, University of Oldenburg
One central subject in computer science are NP-complete problems. These
are problems which are hard in the sense that no fast algorithms to
solve them exist. Interestingly, many practical applications
belong to this class. When studying problems on suitably parametrized
ensembles, one finds phenomena strongly reminiscent of phase transitions
as appearing for physical systems.
One also observes regions in phase space where the problem can tyoically
be quickly solved using the available algorithms, i.e. where
the problem does not typically appear to be hard. One can better understand
these phase transitions using concepts and methods from
statistical physics. These developments are exemplified considering
the Vertex-cover Problem and the Satisfiability Problem.
In particular, we investigate the relationship between the behavior
of the ensemble, the typical running time of algorithms and the structure
of the solution landscape.
We use branch-and-bound type algorithms to obtain exact
solutions of these problems for moderate system sizes as
well as parallel tempering simulations (MC^{3}).
Using two methods (direct neighborhood-based clustering and
hierarchical clustering), we investigate the structure of the solution
space. The main result is that the correspondence between
solution-space structure and the phase diagrams of the problems is not
unique. Namely, for the vertex cover problem we observe a drastic change
of the solution space from large single clusters (in the replica-symmetric
phase, as obtained by previous analytical studies) to multiple
nested levels of clusters (in the replica-symmetry broken phase).
For the satisfiability problem, there is no simple relationship,
since there is more than one change of the solution landscape
as a function of the ensemble parameter and these changes do not always
coincide with changes of the typical computational complexity.
Quantum Chemistry
Thorsten Klüner, University of Oldenburg
In this lecture, foundations of Theoretical Chemistry will be introduced.
Based on the Born-Oppenheimer approximation [1], which implies a separation
of time-scales of electronic and nuclear motion, we discuss strategies
to solve the time-independent Schrödinger equation for the motion
of electrons in an arbitrary chemical system for a frozen nuclear
configuration.
After the introduction of general concepts (molecular Hamiltonian,
Born-Oppenheimer approximation, Pauli principle), we discuss Hartree-Fock
theory and briefly summarize electron correlation methods.
These ab initio approaches allow to construct global potential
energy surfaces (PES) if the nuclear configuration is varied for all
relevant degrees of freedom and the corresponding electronic
Schrödinger equations are solved numerically.
The PES can then be used as an external potential for nuclear motion
which provides the basis for a quantum description of chemical reactions.
Thus, in the second part of this lecture, we will discuss methods
to solve the time-dependent Schrödinger equation for the nuclei
under the influence of a given potential energy surface.
In particular, we will discuss foundations and numerical aspects
of wave packet calculations including external laser fields,
quantum dissipation, and optimal control.
References
- M. Born, R. Oppenheimer, Annalen der Physik 389 (1927), 457
The Automata Zoo I + II
Holger Hermanns, Universität des Saarlandes, Saarbrücken
Modern computer systems are reactive. Your phone, your Wi-Fi,
an airbag controller. They interact with and react to stimuli of the
environment, and are supposed to do so reliably and in real-time. An
airbag controller must fire within milliseconds after a crash, but not
after rolling over a speed bump.
Effective analysis methods for reactive systems are rooted in a modern
automata theory. This theory is a very practical one, in daily use to
build models of complex reactive systems. What to do with these
models? Model checkers are software tools used to guarantee properties
of these models. This lecture series introduces the basic concepts of
this applied automata theory, together with the principles of model
checking. We look into the model checking of finite and infinite state
systems, into timed systems, probabilistic systems, and discuss
various striking application examples. If time permits, we will cover
automated testing based on the automata models at hand.