Modern Computational Science Summer School
The first week of the School will be mainly devoted to
"Fundamentals" (Part I), while
"Special Topics" will be dealt with
in week two (Part II).
Hands-on lab exercises will complement the lectures.
For dates and times, please refer to the
Schedule.
Part I: Fundamentals
Basic Algorithms and Data Structures I-II
Michael Sonnenschein, Ute Vogel, University of Oldenburg
Adequate data structures and efficient algorithms are fundamental for good solutions to computational problems. Therefore, in the first part, we introduce some basic data structures of computer science (lists, trees and graphs) together with their efficient organisation for solving common computational problems. In the second part, we discuss general techniques for designing efficient algorithms and for analysing their complexity and performance. We illustrate these techniques with standard sorting and searching algorithms. Moreover, we will also see examples of computational problems for which no efficient algorithms are known.
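As a small illustration of the searching algorithms mentioned above (a sketch, not part of the course material): binary search finds an element in a sorted list in O(log n) steps, showing how the right data organisation enables an efficient algorithm.

```python
# Binary search: O(log n) lookup in a sorted list.
def binary_search(sorted_items, target):
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid          # index of the target
        elif sorted_items[mid] < target:
            lo = mid + 1        # discard the lower half
        else:
            hi = mid - 1        # discard the upper half
    return -1                   # target not present

data = [2, 3, 5, 7, 11, 13, 17]
print(binary_search(data, 11))  # index 4
print(binary_search(data, 4))   # -1: not in the list
```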
Software Engineering for Scientists
Helmut G. Katzgraber, Texas A&M University and ETH Zürich
Writing complex computer programs to study scientific problems requires careful planning and an in-depth knowledge of programming languages and tools, as well as analysis and visualization programs. In this lecture, the importance of using the right tool for the right problem is emphasized. Furthermore, the core concepts needed to work in a Unix environment are presented. Common tools to organize computer programs (cvs and make), as well as to debug (gdb, valgrind) and improve them (gprof, compiler flags) are discussed, followed by simple data-reduction strategies and visualization tools. Finally, some scientific libraries such as Boost, GSL, LEDA and Numerical Recipes are outlined.
Random number generation
Helmut G. Katzgraber, Texas A&M University and ETH Zürich
Random numbers play a crucial role in science and industry. Many numerical methods require the use of random numbers, in particular the Monte Carlo method. It is therefore of paramount importance to have efficient random number generators. The differences, advantages and disadvantages of true and pseudo random number generators are discussed, with an emphasis on the intrinsic details of modern and fast pseudo random number generators. Furthermore, standard tests to verify the quality of the random numbers produced by a given generator are outlined. Finally, standard scientific libraries with built-in generators are presented, as well as different approaches to generate non-uniform random numbers. Potential problems that one might encounter when using large parallel machines are also discussed.
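A minimal sketch of these ideas using Python's standard library (whose `random` module implements the Mersenne Twister): a seeded pseudo-random generator, plus the inverse-transform method for drawing non-uniform (here exponential) random numbers.

```python
import math
import random

# A seeded instance of Python's built-in Mersenne Twister PRNG;
# seeding makes the stream reproducible.
rng = random.Random(42)

def exponential_variate(rng, rate):
    """Inverse-transform method: if U ~ Uniform(0,1),
    then -ln(1 - U) / rate is exponentially distributed."""
    u = rng.random()
    return -math.log(1.0 - u) / rate

samples = [exponential_variate(rng, rate=2.0) for _ in range(10000)]
mean = sum(samples) / len(samples)
print(round(mean, 3))   # close to the theoretical mean 1/rate = 0.5
```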
Heuristic Optimisation
Michael Sonnenschein, Ute Vogel, University of Oldenburg
To solve an optimisation problem, the best solution in a space of feasible solutions must be determined. Often, the algorithms for computing this best solution are too complex to run in acceptable time. In this case, heuristic algorithms are used, which deliver good solutions in acceptable time but cannot guarantee optimality. We introduce Tabu Search as a heuristic optimisation method, together with possible applications. Additionally, we will take a brief look at Genetic Algorithms and Ant Colony Optimisation.
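A minimal Tabu Search sketch on a toy problem (the objective function and all parameters are illustrative, not from the lecture): the tabu list, a short memory of recently visited solutions, prevents the search from immediately cycling back.

```python
from collections import deque

# Toy problem: maximise f(x) = -(x - 7)^2 over the integers 0..20.
def f(x):
    return -(x - 7) ** 2

def tabu_search(start, iterations=50, tabu_size=5):
    current = best = start
    tabu = deque(maxlen=tabu_size)   # recently visited solutions
    for _ in range(iterations):
        neighbours = [n for n in (current - 1, current + 1)
                      if 0 <= n <= 20 and n not in tabu]
        if not neighbours:
            break
        current = max(neighbours, key=f)   # best admissible neighbour
        tabu.append(current)
        if f(current) > f(best):
            best = current                 # remember the best solution seen
    return best

print(tabu_search(0))    # finds the optimum x = 7
```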
Exploratory Data Analysis I-II
Thomas Kneib, University of Oldenburg
Exploratory data analysis falls between the two classical branches
of statistics: describing observed data without reference
to the underlying data-generating mechanism (descriptive statistics),
and drawing inferential conclusions about that mechanism
(inferential statistics).
While the former does not make use of any stochastic assumptions
about the data, the latter relies on the formulation of statistical models
in terms of suitable probability distributions.
Exploratory data analysis bridges this gap by providing statistical tools
that are often based on descriptive statistics but already hint
at inferential procedures.
In particular, exploratory data analysis is often used to generate
hypotheses about features of the data generating mechanism or on relations
between variables.
In the first lecture, we will introduce methods for describing
the distribution of univariate variables.
These either concentrate on specific features, such as location
or variability, or aim at describing the complete distribution.
The second lecture provides us with possibilities to study interrelations
between variables.
A particular focus will be on graphical methods but we will also sketch
certain association measures such as correlation coefficients.
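As a small example of one such association measure (a sketch, not lecture material), the Pearson correlation coefficient can be computed directly from its definition r = cov(x, y) / (sd(x) * sd(y)).

```python
import math

# Pearson correlation: near +1/-1 indicates a strong linear
# association between the two variables; near 0, none.
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]        # exactly linear in x
print(pearson_r(x, y))      # 1.0
```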
Semiparametric Regression
Thomas Kneib, University of Oldenburg
Regression models form one of the most important model classes of applied statistics when the interest is in analysing the impact of covariates on a response variable. While parametric models rely on pre-specified functional forms and therefore require strong assumptions about the relation between covariates and response, semiparametric extensions allow for a purely data-driven investigation of these relations. In this lecture, we will present some of the basic ideas underlying semiparametric regression in general and penalised spline smoothing in particular. We will also sketch extensions to bivariate semiparametric regression and spatial smoothing.
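To illustrate the data-driven idea, here is a simple Nadaraya-Watson kernel smoother. This is not the penalised-spline method of the lecture, but a related nonparametric smoother; the data and bandwidth below are invented for illustration.

```python
import math

# Nadaraya-Watson kernel smoother: the fitted value at x0 is a
# locally weighted average of the responses, with Gaussian weights
# that decay with the distance of each covariate value from x0.
def kernel_smooth(xs, ys, x0, bandwidth=1.0):
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 0.9, 4.2, 8.8, 16.3, 24.9]   # roughly quadratic data
print(round(kernel_smooth(xs, ys, 2.0, bandwidth=0.5), 2))
```

No functional form was assumed; the bandwidth plays the role that the penalty parameter plays in penalised spline smoothing, controlling the trade-off between smoothness and fidelity to the data.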
Introduction to Monte Carlo Methods I-II
Helmut G. Katzgraber, Texas A&M University and ETH Zürich
Monte Carlo methods play an important role in scientific computation,
especially when problems have a vast phase space.
In this lecture an introduction to the Monte Carlo method is given.
Concepts such as Markov chains, detailed balance, critical slowing down,
and ergodicity, as well as the Metropolis algorithm are explained.
The Monte Carlo method is illustrated by numerically studying
the critical behavior of the two-dimensional Ising ferromagnet
using finite-size scaling methods.
In the second part of the lecture advanced Monte Carlo methods
are described (parallel tempering Monte Carlo) and illustrated
with non-trivial models from the physics of glassy systems.
Special attention is paid to the equilibration of the simulations,
as well as the treatment of autocorrelation effects.
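A minimal Metropolis sketch for the two-dimensional Ising model described above (lattice size, temperature and update count are illustrative; a serious study would add proper equilibration, measurement sweeps and error analysis). A spin flip is accepted with probability min(1, exp(-dE/T)), which satisfies detailed balance with respect to the Boltzmann distribution.

```python
import math
import random

L = 8
T = 1.5                       # well below T_c ~ 2.269: ordered phase
rng = random.Random(1)
spins = [[1] * L for _ in range(L)]    # cold start: all spins up

def local_field(i, j):
    # Sum of the four nearest neighbours (periodic boundaries).
    return (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
            + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])

for _ in range(20000):                 # single-spin Metropolis updates
    i, j = rng.randrange(L), rng.randrange(L)
    dE = 2 * spins[i][j] * local_field(i, j)   # energy cost of flipping
    if dE <= 0 or rng.random() < math.exp(-dE / T):
        spins[i][j] = -spins[i][j]     # accept the flip

m = abs(sum(sum(row) for row in spins)) / L**2
print(round(m, 2))                     # |magnetisation| stays near 1 at low T
```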
Efficient approximation of solutions of nonlinear
partial differential equations using amplitude equations I-II
Hannes Uecker, University of Oldenburg
Despite the steady increase in computing power, many nonlinear partial differential equations still cannot be solved by brute-force numerical methods in acceptable time. However, often the method of multiple scales, resulting in so-called amplitude equations, can be used to approximate physically or technologically interesting solutions with high accuracy. We explain this method using a number of examples, including pattern-forming systems from mathematical biology, nonlinear wave equations used to model pulse propagation in optical fibers, and kinetic equations like the Vlasov-Poisson system from plasma physics.
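As a hedged illustration of the method, consider the standard textbook example of the Swift-Hohenberg equation (the specific equation and coefficients below are this textbook case, not necessarily the systems treated in the lecture):

```latex
% Swift-Hohenberg equation:  u_t = eps*u - (1 + d_x^2)^2 u - u^3.
% The multiple-scales ansatz with slow variables X, T
\[
  u(x,t) \approx \varepsilon^{1/2}\, A(X,T)\, e^{ix} + \text{c.c.},
  \qquad X = \varepsilon^{1/2} x, \quad T = \varepsilon t,
\]
% inserted into the equation and balanced at leading order in eps,
% yields the Ginzburg-Landau amplitude equation
\[
  \partial_T A = A + 4\,\partial_X^2 A - 3\,|A|^2 A ,
\]
% so the pattern near onset is described by the slowly varying
% amplitude A, governed by a much simpler PDE than the full field u.
```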
Part II: Special Topics
Hydro- and Sediment Dynamics I-III
Hans-Jörg Wolff, Karsten Lettmann, University of Oldenburg
Over the last decade the area-specific hydrodynamics and the suspended
particulate matter (SPM) dynamics in the East Frisian Wadden Sea have been
extensively investigated under the umbrella of the DFG Research Group on
BioGeoChemistry of Tidal Flats at the University of Oldenburg.
High-resolution modelling efforts in combination with high-resolution
measurements at the Wadden Sea time-series station and novel theoretical
approaches have revealed various new and exciting insights into this
complex system, thus contributing significantly to the current state
of research.
The available suite of numerical models integrates worldwide experience
with innovative concepts in numerical modelling such as state-of-the-art
parameterizations of physical processes and novel concepts of suspended
particulate matter transport. These models have been widely used
for various research applications in the Wadden Sea and the German Bight
in different horizontal resolutions.
Combined with sophisticated measurements using ADCP and satellite data
these models enable a reliable state estimate for hydrodynamics
and sediment dynamics.
The response of the Wadden Sea system to atmospheric and hydrological
impacts (storms and sea-level rise) will be highlighted in these lectures
along with the basic underlying physics and model details.
A new model approach using a finite-volume method on unstructured grids
is discussed in a descriptive manner, showing the underlying ideas,
the model characteristics, the necessary grid generation and some results
from benchmarking the model on various computer architectures.
Scientific results from an application to the waste water dispersal
close to the city of Wilhelmshaven give first insights
into the model performance in the area of the East Frisian coast.
Foundations of Quantum Chemistry
Rainer Koch, University of Oldenburg
Quantum mechanics and its applications to chemistry have grown into a powerful tool helping a wide variety of scientists understand chemical problems. This introductory lecture on quantum chemistry will cover the theory of quantum mechanics, starting with the molecular Hamiltonian and the Schrödinger equation. Terms such as Born-Oppenheimer approximation, SCF, LCAO, and variation principle are explained on the way to Hartree-Fock theory as the fundamental concept. Additionally, an overview of basis set families and their differences is given. Configuration interaction will be introduced as an approach to treat electron correlation and electronically excited states. A brief overview of various methods used in theoretical chemistry is given at the end of the lecture.
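As a toy illustration of the LCAO idea (the numerical values of the integrals below are invented for illustration, not taken from any specific molecule): for a homonuclear diatomic with one basis function per atom and overlap neglected, the secular problem reduces to a 2x2 symmetric eigenproblem whose solutions are the bonding and antibonding orbital energies alpha +/- beta.

```python
# Coulomb integral <phi_i|H|phi_i> and resonance integral
# <phi_1|H|phi_2>, in eV; both values are purely illustrative.
alpha = -11.0
beta = -2.5

# Eigenvalues of the Hamiltonian matrix [[alpha, beta], [beta, alpha]]:
E_bonding = alpha + beta      # symmetric combination  (phi_1 + phi_2)
E_antibonding = alpha - beta  # antisymmetric combination (phi_1 - phi_2)

print(E_bonding, E_antibonding)   # -13.5 -8.5
```

Since beta is negative, the bonding orbital lies below the antibonding one; the variational principle guarantees that the lower eigenvalue is an upper bound to the true ground-state energy within this basis.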
Surface Photochemistry from First Principles
Thorsten Klüner, University of Oldenburg
Photodesorption of small molecules from surfaces is one of the most
fundamental processes in surface photochemistry.
Despite its apparent simplicity, a microscopic understanding beyond
a qualitative picture still poses a true challenge for theory.
While the dynamics of nuclear motion can be treated on various levels
of sophistication, all approaches suffer from the lack of sufficiently
accurate potential energy surfaces, in particular for electronically
excited states involved in the desorption scenario.
In the last decade, we have developed a systematic and accurate
methodology to reliably calculate accurate ground and excited state
potential energy surfaces (PES) for different adsorbate-substrate systems.
These potential energy surfaces serve as a prerequisite for subsequent
quantum dynamical wave packet calculations, which allow for a direct
simulation of experimentally observable quantities such as velocity
distributions.
In this contribution, I will focus on recent results obtained for
photodesorption of NO and CO from a NiO(100) surface.
In contrast to previous studies, we were able to construct highly accurate
potential energy surfaces based on correlated quantum chemical
calculations (CASPT2/CCSD(T)).
Despite the enormous computational cost, this level of theory turns out
to be crucial, since less sophisticated approaches such as density
functional theory (DFT) cannot even provide a reliable description
of ground state properties, not to mention electronically
excited states [1].
These potential energy surfaces were used in subsequent wave packet
studies which reveal new desorption mechanisms.
In the NO/NiO(100) case, we observed an anti-Antoniewicz scenario
in which the wave packet is initially repelled from the surface
but eventually reaches a turning point before it is backscattered
from the surface.
State-resolved desorption velocity distributions have been calculated,
and the results are in good agreement with experimental findings [2].
In the CO/NiO(100) system, we observe the formation of a genuine covalent
bond upon photoexcitation for the first time. As demonstrated
in the current study, this chemical bond formation is the crucial step
in the desorption mechanism for this system [3,4].
Again, our results are in good agreement with recent experiments.
References
[1] I. Mehdaoui, T. Klüner, J. Phys. Chem. A 111, 13233 (2007):
Bonding of CO and NO to NiO(100): a strategy for obtaining
accurate adsorption energies.
[2] I. Mehdaoui, D. Kröner, M. Pykavy, H.-J. Freund, T. Klüner,
Phys. Chem. Chem. Phys. 8, 1584 (2006): Photoinduced desorption
of NO from NiO(100): calculation of the four-dimensional potential
energy surfaces and systematic wave packet studies.
[3] I. Mehdaoui, T. Klüner, Phys. Rev. Lett. 98, 037601 (2007):
Understanding surface photochemistry from first principles:
the case of CO-NiO(100).
[4] I. Mehdaoui, T. Klüner, Phys. Chem. Chem. Phys. 10, 4559 (2008):
New mechanistic insight into electronically excited CO-NiO(100):
a quantum dynamical analysis.
An introduction to phylogenetic analysis
Olaf R.P. Bininda-Emonds, University of Oldenburg
Biological species do not represent independent units of analysis
because of the pattern of descent with modification through evolution.
Although this fact has been appreciated since at least Darwin,
it is only in the past 15 years that biologists have increasingly
appreciated the need to account for evolutionary relatedness
in their analyses.
However, phylogeny transcends pure biology to play an important role
in related fields.
For instance, alignment programs like Clustal rely on phylogenetic
information in constructing their guide trees.
Unfortunately, finding the optimal phylogeny for a given data set,
like most other interesting problems in computational biology,
is an NP-hard problem because the number of potential solutions ("tree
space") grows super-exponentially with the number of taxa in the analysis.
With the rapid growth of phylogenomic data and the scope of the analyses,
phylogenetic inference has also garnered increasing attention
from computer scientists eager to find fast solutions
to this difficult problem.
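The super-exponential growth of tree space can be made concrete: the number of distinct unrooted binary (fully bifurcating) trees for n taxa is the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5), sketched below.

```python
# Count unrooted binary tree topologies for n taxa via (2n - 5)!!.
def unrooted_tree_count(n_taxa):
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):   # odd factors 3, 5, ..., 2n-5
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, unrooted_tree_count(n))
# Already for 10 taxa there are over two million topologies,
# which is why exhaustive search quickly becomes hopeless.
```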
In this talk, I introduce the problem of phylogenetic inference in general
and discuss several optimization criteria (parsimony, likelihood,
and distance methods) and the strengths and weaknesses
of each for building evolutionary trees.
Molecular homology and sequence alignment
Olaf R.P. Bininda-Emonds, University of Oldenburg
Phylogenetic analysis depends crucially on the data underlying
the analysis, with issues of data quality being but one aspect
of the problem.
A second, slightly less appreciated aspect is that of data comparability
and specifically the evolutionary homology (= derivation from a common
shared ancestry) of the individual characters being examined.
The goal of sequence alignment is to establish the positional homology
of the individual elements (nucleotide bases or amino acid residues)
of two or more molecular sequences (DNA or proteins, respectively).
Like many other problems in computational biology, the simultaneous
alignment of three or more sequences is known to be NP-hard.
Fortunately, many fast and reasonable heuristics for this problem exist.
In this talk, I first examine the concept of molecular homology
before describing efficient algorithms for the alignment of pairs
of sequences.
I then expand on this to discuss potential solutions and strategies
to tackle the problem of multiple sequence alignment.
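A standard dynamic-programming solution for the pairwise case is the Needleman-Wunsch algorithm. The sketch below computes only the optimal global alignment score in O(len(a) * len(b)) time; the scoring scheme (match/mismatch/gap values) is illustrative.

```python
# Needleman-Wunsch global alignment score by dynamic programming:
# each matrix cell holds the best score for aligning the
# corresponding prefixes of the two sequences.
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap              # prefix of a against all gaps
    for j in range(1, cols):
        score[0][j] = j * gap              # prefix of b against all gaps
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            score[i][j] = max(diag,                 # align a[i-1] with b[j-1]
                              score[i-1][j] + gap,  # gap in b
                              score[i][j-1] + gap)  # gap in a
    return score[-1][-1]

print(nw_score("GATTACA", "GATTACA"))   # 7: perfect match
print(nw_score("GATTACA", "GCATGCU"))
```

A full aligner would also record which of the three choices won in each cell and trace back through the matrix to recover the alignment itself.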
Maximum Likelihood Inference of Phylogenetic Trees:
Heuristics, Algorithms, and Parallelism by Example of RAxML
Alexandros Stamatakis, TU München
Following the general introduction to the field by Olaf Bininda-Emonds,
I will address implementation issues and potential solutions using the
example of RAxML, a real-world open-source program for phylogenetic inference.
In the first part of this lecture, I will address the design of search
heuristics, the design of search convergence criteria, methods for fast
computation of support values, and efficient algorithms for computing
the topological distance between trees.
In the second part of the lecture I will describe how to parallelize
tree searches and the Maximum Likelihood function on a variety of parallel
architectures ranging from FPGAs to the IBM BlueGene using, for instance,
Pthreads, OpenMP and MPI.
I will also address solutions for load-balancing problems in the
phylogenetic likelihood function, as well as issues regarding
single- versus double-precision floating-point arithmetic
and SSE3-based vectorization.
PDFs of the papers and the RAxML source code are available at
wwwkramer.in.tum.de/exelixis/.
Intermediate Disturbance Hypothesis - the random way out
Jan Freund, University of Oldenburg
When organisms compete in a nonfluctuating environment, basic models
of population dynamics suggest that superior competitors ultimately drive
inferior competitors to extinction, and the species number declines
- a process called competitive exclusion.
To reconcile the competition-exclusion principle with the observed
coexistence of many species, Connell (1978) proposed the Intermediate
Disturbance Hypothesis (IDH), according to which disturbances
of intermediate strength and frequency act to maintain high levels
of biodiversity.
Since realistic environmental fluctuations will be of random nature,
a model approach to the IDH must involve concepts of stochastic processes.
In this lecture we will give a short introduction to stochastic dynamics
and illustrate the IDH via stochastic simulations of two competing
species.
A generalisation to more species and diverse types of randomness will
be left to a hands-on session.
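A hedged sketch of the kind of simulation meant here: Euler-Maruyama integration of a two-species Lotka-Volterra competition model with multiplicative environmental noise. All parameter values are illustrative, not taken from the lecture.

```python
import math
import random

rng = random.Random(7)

def simulate(steps=5000, dt=0.01, sigma=0.1):
    # Two competing species with growth rates r_i, competition
    # coefficients a_ij, and multiplicative noise of strength sigma.
    n1, n2 = 0.5, 0.5
    r1, r2 = 1.0, 0.9
    a11, a12, a21, a22 = 1.0, 0.8, 0.9, 1.0
    for _ in range(steps):
        dw1 = rng.gauss(0.0, math.sqrt(dt))   # Wiener increments
        dw2 = rng.gauss(0.0, math.sqrt(dt))
        n1 += n1 * (r1 - a11 * n1 - a12 * n2) * dt + sigma * n1 * dw1
        n2 += n2 * (r2 - a21 * n1 - a22 * n2) * dt + sigma * n2 * dw2
        n1, n2 = max(n1, 0.0), max(n2, 0.0)   # populations stay non-negative
    return n1, n2

n1, n2 = simulate()
print(round(n1, 3), round(n2, 3))
```

With these (illustrative) parameters the deterministic limit exhibits competitive exclusion; adding disturbance terms of varying strength and frequency to such a model is the starting point for exploring the IDH numerically.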
References
- Connell, J.H. (1978): Diversity in tropical rain forests and coral reefs.
Science 199: 1302-1310.
- Shea, K., Roxburgh, S.H., and Rauschert, E.S.J. (2004): Moving from
pattern to process: coexistence mechanisms under intermediate disturbance
regimes. Ecology Letters 7: 491-508.
- Mannella, R. (2000): A gentle introduction to the integration of
stochastic differential equations. In: Stochastic Processes in Physics,
Chemistry, and Biology (J.A. Freund, Th. Pöschel, eds.), Lecture Notes
in Physics, vol. 557, pp. 353-364. Springer, Berlin.
Paradox of plankton - the chaotic way out
Ulrike Feudel, University of Oldenburg
The life of species is determined by their growth and death depending
on environmental conditions.
In particular, the availability of nutrients is essential for the survival
of the species.
However, the nutrient availability is not only dependent on what nature
offers in a certain region but is largely determined by the existence
of other species feeding on the same nutrients.
Hence competition of species plays an important role in the time evolution
of a community of species in an ecosystem.
One of the most famous principles in theoretical ecology is
the competition-exclusion principle, which states that, in equilibrium,
the number of species that can coexist cannot be larger than the number
of limiting resources.
This is sometimes called the "survival of the fittest",
because the best adapted species, which is considered to be the fittest,
would win the competition.
However, observations of plankton species in the ocean show that thousands
of species can coexist on a rather low number of limiting resources,
such as nitrogen, phosphorus and light.
This so-called paradox of plankton illustrates that the
competition-exclusion principle of theoretical ecology is not fulfilled
in nature, though it is often observed in laboratory experiments.
There have been several attempts to solve this problem including
spatial and temporal changes in environmental conditions.
We illustrate another way of solving the paradox of plankton by showing
that coexistence of many species competing for a small number of limiting
resources becomes possible if the time evolution of species
does not reach an equilibrium, but species coexist in a chaotic manner,
so that available nutrients can be shared in a more efficient way.
The lecture not only explains the paradox of plankton but also offers
an introduction to the basic principles of chaos theory, in order
to understand its application to an important problem in ecology.
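One of those basic principles, sensitive dependence on initial conditions, can be demonstrated with the logistic map (a standard textbook example, not necessarily the model used in the lecture): two trajectories starting a distance of 1e-10 apart diverge to order one within a few dozen iterations.

```python
# Logistic map x -> r x (1 - x); at r = 4 the dynamics are chaotic.
def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)   # nearly identical start
gap = max(abs(x - y) for x, y in zip(a, b))
print(gap)   # of order 1 despite the tiny initial difference
```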
The unified theory of biodiversity - the neutral way out
Bernd Blasius, University of Oldenburg
The unified theory of biodiversity is a simple mathematical model
for explaining the diversity and relative abundance of species
in "neutral" ecological communities.
Here neutrality refers to the assumption of per capita ecological
equivalence among all individuals of all species, and thus it stands
in striking contrast to the belief of many ecologists that different
species behave in different ways from one another.
Nevertheless, neutral theories have been applied successfully
to describe empirical patterns in diverse ecosystems,
giving rise to a controversial debate of "neutrality vs
the niche" in modern ecology.
This lecture will provide a gentle introduction to neutral models.
Starting from Hubbell's first formulation we will discuss the calculation
of species abundance distributions, the comparison to field data
and several modern developments of the theory.
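A minimal sketch in the spirit of such neutral models (all parameters are illustrative): a zero-sum community in which, at each step, one randomly chosen individual dies and is replaced by the offspring of another randomly chosen individual, so that species identity plays no role and diversity changes by drift alone.

```python
import random

rng = random.Random(3)
community = [i % 10 for i in range(200)]   # 200 individuals, 10 species

def species_count(comm):
    return len(set(comm))

for _ in range(20000):                     # zero-sum birth-death events
    dead = rng.randrange(len(community))
    parent = rng.randrange(len(community))
    community[dead] = community[parent]    # replacement, species-blind

# Without speciation or immigration, drift can only erode diversity.
print(species_count(community))
```

Full neutral models add speciation or immigration from a metacommunity, which balances this loss of diversity and yields the characteristic species abundance distributions compared against field data.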
Computational Fluid Dynamics I-III
Joachim Peinke, University of Oldenburg
In this contribution, we want to discuss some aspects of how to solve
the fundamental equations for fluid mechanics.
The fluid problems are commonly described by the Navier-Stokes equation,
i.e. in terms of a partial differential equation.
Although this equation is itself already an approximation,
solving it poses serious problems.
We want to focus on methods to solve the Navier-Stokes equation
for technical applications.
First we will discuss the principles of different approximations.
Then we will discuss some applications for the wind energy issue.
In addition, we will introduce participants to open-source software
for solving the Navier-Stokes equation, a so-called CFD
(computational fluid dynamics) solver.
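As a hedged taste of what such a solver does internally (not taken from the software used in the course), here is an explicit finite-difference scheme for the one-dimensional diffusion equation u_t = nu * u_xx, one of the simplest building blocks of CFD discretisations.

```python
# Explicit (FTCS) finite differences for u_t = nu * u_xx on a 1D grid
# with fixed zero boundaries; all parameter values are illustrative.
nx, nu, dx, dt, steps = 21, 0.1, 0.05, 0.01, 100
assert nu * dt / dx**2 <= 0.5      # explicit-scheme stability condition

u = [0.0] * nx
u[nx // 2] = 1.0                   # initial spike in the middle

for _ in range(steps):
    un = u[:]                      # copy of the previous time level
    for i in range(1, nx - 1):
        # Central second difference approximates u_xx at node i.
        u[i] = un[i] + nu * dt / dx**2 * (un[i+1] - 2*un[i] + un[i-1])

print(round(max(u), 3))            # the spike has diffused and decayed
```

Real CFD solvers apply the same discretisation ideas to the coupled, nonlinear Navier-Stokes equations, which is where the stability and accuracy questions discussed in the lecture become critical.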