# Modern Computational Science - Summer School

The first week of the School mainly covers the basics
of the field of Extreme Events
and other fundamental methods of Computational Science (Part I).
More specialized topics of current interest
will be covered in week two (Part II).
Hands-on lab exercises will complement the lectures.

For dates and times, please see the Schedule.

## Part I: Fundamentals

### Software Engineering for Scientists

#### Helmut G. Katzgraber,
*Texas A&M University* and *ETH Zurich*

Writing complex computer programs to study scientific problems requires careful planning and in-depth knowledge of programming languages and tools, as well as analysis and visualization programs. In this lecture, the importance of using the right tool for the right problem is emphasized. Furthermore, the core concepts needed to work in a Linux/UNIX environment are presented. Common tools to organize and build computer programs (CVS and make), to debug them (gdb, valgrind), and to improve them (gprof, compiler flags) are discussed, followed by simple data reduction strategies and visualization tools. Finally, some scientific libraries such as Boost, GSL, LEDA and Numerical Recipes are outlined.

### Advanced Programming Techniques I + II

#### Frauke Liers,
*University of Cologne*

### Introduction to rare-event theory: Theory, applications, and simulations

#### Hugo Touchette,
*Queen Mary University of London*

The theory of large deviations deals with the probabilities of rare events
(or fluctuations) that are exponentially small as a function
of some parameter, e.g., the number of random components of a system,
the time over which a stochastic system is observed, the amplitude
of the noise perturbing a dynamical system, or the temperature
of a chemical reaction.
The theory has applications in many different scientific fields,
ranging from queuing theory to statistics and from finance to engineering.
It is also increasingly used in statistical physics for studying
both equilibrium and nonequilibrium systems.
In this context, deep analogies can be made between familiar concepts
of statistical physics, such as the entropy and the free energy,
and concepts of large deviation theory having more technical names,
such as rate functions and scaled cumulant generating functions.

The first part of these lectures will introduce the basic elements
of large deviation theory at a level appropriate for undergraduate
and early graduate students in physics, engineering, chemistry,
and applied mathematics.
The focus there will be on the simple but powerful ideas behind
large deviation theory, as opposed to the formalism,
and on the application of these ideas to simple stochastic processes,
such as sums of independent and identically distributed random variables
and Markov processes.
Some physical applications of these processes will be covered
in exercises.
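As a small numerical illustration of the ideas above (my own sketch, not part of the lecture materials), the rate function governing the exponentially small probability that the empirical mean of i.i.d. Bernoulli coin flips deviates from its expectation can be computed as the Legendre transform of the scaled cumulant generating function, and checked against the closed-form relative-entropy expression:

```python
import math

def scgf(k, p=0.5):
    """Scaled cumulant generating function of a Bernoulli(p) variable."""
    return math.log(p * math.exp(k) + (1.0 - p))

def rate_function(a, p=0.5):
    """Legendre transform I(a) = sup_k [k*a - scgf(k)], via a grid search."""
    ks = [i * 0.001 for i in range(-10000, 10001)]
    return max(k * a - scgf(k, p) for k in ks)

def rate_exact(a, p=0.5):
    """Closed form: relative entropy between Bernoulli(a) and Bernoulli(p)."""
    return a * math.log(a / p) + (1.0 - a) * math.log((1.0 - a) / (1.0 - p))

# The probability that the mean of n fair coin flips exceeds a = 0.7
# decays like exp(-n * I(0.7)); the numerical Legendre transform agrees
# with the closed-form rate function.
print(rate_function(0.7), rate_exact(0.7))
```

The rate function vanishes at the typical value a = p and grows away from it, which is the exponential-decay statement of Cramér's theorem for this simple case.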

In the second part, the problem of numerically evaluating large deviation
probabilities will be treated at a basic level.
The fundamental idea of importance sampling will be introduced together
with its sister idea, the exponential change of measure.
Other numerical methods based on sample means and generating functions,
as well as applications to Markov chains and stochastic differential
equations, will also be covered.
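As a minimal sketch of importance sampling via an exponential change of measure (an illustration of the general idea, not the lecture's own code), one can estimate the Gaussian tail probability P(X > 4) by sampling from a shifted Gaussian and reweighting each sample with the likelihood ratio; the shift value theta = 4 is an arbitrary but natural choice:

```python
import math
import random

def gaussian_tail_is(a=4.0, n=200000, seed=1):
    """Estimate P(X > a) for X ~ N(0,1) by exponential tilting:
    sample from N(theta, 1) with theta = a, so the rare region becomes
    typical, and reweight by the likelihood ratio
    exp(-theta*x + theta**2/2)."""
    rng = random.Random(seed)
    theta = a
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)   # sample under the tilted measure
        if x > a:
            total += math.exp(-theta * x + 0.5 * theta * theta)
    return total / n

estimate = gaussian_tail_is()
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))   # known Gaussian tail
```

Naive sampling would need on the order of 10^5 draws to see even one event of probability ~3e-5; the tilted estimator resolves it with small relative error from the same budget.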

### Pseudo random number generation

#### Helmut G. Katzgraber,
*Texas A&M University* and *ETH Zurich*

Random numbers play a crucial role in science and industry. Many numerical methods require the use of random numbers, in particular the Monte Carlo method. Therefore it is of paramount importance to have efficient random number generators. The differences, advantages and disadvantages of true and pseudo random number generators are discussed with an emphasis on the intrinsic details of modern and fast pseudo random number generators. Furthermore, standard tests to verify the quality of the random numbers produced by a given generator are outlined. Finally, standard scientific libraries with built-in generators are presented, as well as different approaches to generate non-uniform random numbers. Potential problems that one might encounter when using large parallel machines are discussed.
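To give a flavor of how such generators work internally (a textbook sketch for illustration, not a recommendation for production use), here is the classic Park-Miller MINSTD linear congruential generator, together with inverse-transform sampling as one standard way to obtain non-uniform random numbers:

```python
import math

class ParkMillerLCG:
    """Minimal multiplicative linear congruential generator (MINSTD):
    x_{n+1} = 16807 * x_n mod (2**31 - 1). A classic textbook PRNG;
    far too weak for serious simulations, but useful for illustration."""
    M = 2**31 - 1
    A = 16807

    def __init__(self, seed=12345):
        self.state = seed % self.M or 1   # state must stay in [1, M-1]

    def uniform(self):
        """Pseudo-random float in (0, 1)."""
        self.state = (self.A * self.state) % self.M
        return self.state / self.M

    def exponential(self, lam=1.0):
        """Non-uniform variate via inverse-transform sampling."""
        return -math.log(1.0 - self.uniform()) / lam

rng = ParkMillerLCG(seed=42)
u = [rng.uniform() for _ in range(100000)]
e = [rng.exponential(lam=2.0) for _ in range(100000)]
```

A quick sanity check on the sample means is of course no substitute for the rigorous test batteries mentioned in the lecture.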

### Introduction to Monte Carlo Simulations I + II

#### Helmut G. Katzgraber,
*Texas A&M University* and *ETH Zurich*

Monte Carlo methods play an important role in scientific computation,
especially when problems have a vast phase space.
In this lecture an introduction to the Monte Carlo method is given.
Concepts such as Markov chains, detailed balance, critical slowing down,
and ergodicity, as well as the Metropolis algorithm are explained.
The Monte Carlo method is illustrated by numerically studying
the critical behavior of the two-dimensional Ising ferromagnet
using finite-size scaling methods.
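A minimal single-spin-flip Metropolis sketch for the 2D Ising ferromagnet (illustrative only; lattice size, temperature, and sweep count below are arbitrary choices, and an ordered initial state is used to keep the example short):

```python
import math
import random

def metropolis_ising(L=16, T=1.5, sweeps=200, seed=7):
    """Metropolis dynamics for the 2D Ising model (J = 1, periodic
    boundaries), started from the fully ordered state. Returns the
    absolute magnetization per spin after `sweeps` lattice sweeps."""
    rng = random.Random(seed)
    spin = [[1] * L for _ in range(L)]
    for _ in range(sweeps):
        for _ in range(L * L):
            i, j = rng.randrange(L), rng.randrange(L)
            # energy change of flipping spin (i, j)
            nb = (spin[(i + 1) % L][j] + spin[(i - 1) % L][j]
                  + spin[i][(j + 1) % L] + spin[i][(j - 1) % L])
            dE = 2 * spin[i][j] * nb
            # Metropolis acceptance rule; satisfies detailed balance
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spin[i][j] = -spin[i][j]
    m = sum(sum(row) for row in spin) / (L * L)
    return abs(m)
```

Below the critical temperature (T_c ≈ 2.269 for the square lattice) the magnetization stays close to 1; well above T_c it collapses toward 0, which is the qualitative behavior the finite-size scaling analysis in the lecture quantifies.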

In the second part of the lecture advanced Monte Carlo methods
are described (parallel tempering Monte Carlo) and illustrated
with nontrivial models from the physics of glassy systems.
Special attention is paid to the equilibration of the simulations,
as well as the inclusion of autocorrelation effects.

### Basic data analysis

#### Oliver Melchert,
*University of Oldenburg*

In a simplified sense, the progression of a scientific project
that involves numerical experiments can be subdivided into two basic
parts: the implementation of a model system and the data production
and analysis.
The presented lecture is concerned with basic data analysis.
In this regard, a selection of frequently required tools, such
as histograms, bootstrap resampling and the chi-square test will be
introduced and illustrated by means of examples.
The presented material is intended to be useful in two ways:
if you are not yet familiar with these tools, the lecture
will help you become a more effective data analyst;
if you are already familiar with the subject, it will serve
as a recap of those topics.

The concepts and techniques introduced here are of general interest
and are best put to work with computational support.
Consequently, an exemplary implementation of the presented techniques
using the Python programming language is provided.
To this end, the lecture is supplemented by a set of
Python scripts that allow you to reproduce virtually all
examples and figures shown in the course of the lecture.
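In the same spirit as those scripts, here is a minimal bootstrap-resampling sketch (my own illustration, not one of the lecture's supplementary scripts) for estimating the standard error of an arbitrary estimator:

```python
import math
import random

def bootstrap_error(data, estimator, n_boot=2000, seed=3):
    """Bootstrap estimate of the standard error of estimator(data):
    resample the data with replacement n_boot times and take the
    standard deviation of the estimator over the resamples."""
    rng = random.Random(seed)
    n = len(data)
    values = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        values.append(estimator(resample))
    mean = sum(values) / n_boot
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n_boot)

# For the sample mean, the bootstrap error should agree with the
# familiar sigma / sqrt(n) formula.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]
se_boot = bootstrap_error(data, lambda d: sum(d) / len(d))
```

The appeal of the bootstrap is that the same three lines of resampling work unchanged for estimators (medians, fit parameters, correlation lengths) for which no closed-form error formula exists.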

### Introduction to Numerical Methods I + II

#### Thomas Schuster,
*University of Oldenburg*

In the natural sciences and in engineering, one often faces the problem
of approximating a highly nonlinear or even unknown function by functions
that are easy to handle. For instance, if the outcome of an experiment
is a finite number of pairs of certain quantities, then one would describe
a correlation between these quantities by a function which is supposed
to be as simple as possible. Such 'simple' functions are, e.g.,
polynomials. The mathematical problem of computing functions that meet
certain given values is called *interpolation*.
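For illustration (a generic sketch; the node positions are arbitrary), the unique polynomial through a set of given points can be evaluated directly in Lagrange form:

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the unique polynomial of degree < len(xs)
    passing through the points (xs[i], ys[i]) (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # Lagrange basis factor
        total += term
    return total

# Four nodes determine a cubic exactly: interpolating
# f(x) = x**3 - 2x + 1 at x = 0, 1, 2, 3 reproduces f everywhere.
f = lambda x: x**3 - 2*x + 1
nodes = [0.0, 1.0, 2.0, 3.0]
values = [f(x) for x in nodes]
```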

Another fundamental task in mathematics is the evaluation of integrals
which cannot be computed analytically (e.g., by using the fundamental
theorem of calculus). Thus an approximation is needed.
*Numerical integration* means the approximation of given integrals
by methods which only require a finite number of values of the integrand
at some specific points. The simplest way to obtain such methods
is to replace the integrand by its interpolation polynomial
and to integrate the latter one exactly.
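Applying this recipe with a quadratic interpolation polynomial on each pair of subintervals yields the composite Simpson rule, one of the closed Newton-Cotes formulas; a minimal sketch:

```python
def simpson(f, a, b, n=100):
    """Composite Simpson rule: on each pair of subintervals the
    integrand is replaced by its quadratic interpolation polynomial,
    which is then integrated exactly. n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0
```

A pleasant surprise of the construction: although built from quadratic interpolation, Simpson's rule integrates cubic polynomials exactly as well, thanks to a symmetry in the error term.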

The lecture first gives an introduction to polynomial interpolation,
which is then used to derive the *Newton-Cotes formulas*
of numerical integration.

### Extreme eigenvalues of random matrices

#### Bernd Blasius,
*University of Oldenburg*

### Large deviations in thermodynamics

#### Andreas Engel,
*University of Oldenburg*

With the progress in nano-technology and the increasing interest in the physical foundations of microbiology there is a growing need for a theoretical understanding of processes of energy conversion at the mesoscopic scale. In the thermodynamics of small systems variables like work, heat and entropy become fluctuating quantities. When the systems are driven away from equilibrium the tails of the corresponding probability distributions convey valuable information about their thermodynamic properties. The talk gives a short introduction to these problems and discusses a method to determine the large deviation properties of work delivered by a driven Langevin system.
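As a toy illustration of work as a fluctuating quantity in a driven Langevin system (my own sketch, not the specific method of the talk; all parameter values are arbitrary), one can integrate an overdamped particle dragged by a moving harmonic trap and check Jarzynski's equality <exp(-W)> = exp(-dF) numerically, with dF = 0 for a trap translated at constant stiffness:

```python
import math
import random

def dragged_trap_work(v=0.3, t_total=2.0, dt=0.01, seed=11):
    """One trajectory of an overdamped Langevin particle in a harmonic
    trap moving at speed v (k = 1, friction = 1, k_B T = 1):
        dx = -(x - v t) dt + sqrt(2) dW.
    The initial position is drawn from the equilibrium (Boltzmann)
    distribution, as Jarzynski's equality requires. Returns the work
    accumulated along the trajectory, dW_work = -v (x - v t) dt,
    i.e. the explicit time derivative of U = (x - v t)**2 / 2."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)      # equilibrium start, std = sqrt(T/k)
    t, work = 0.0, 0.0
    for _ in range(int(t_total / dt)):
        work += -v * (x - v * t) * dt
        x += -(x - v * t) * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
        t += dt
    return work

works = [dragged_trap_work(seed=s) for s in range(5000)]
jarzynski = sum(math.exp(-w) for w in works) / len(works)
```

The average work is positive (dissipation), yet the exponential average equals one: the rare trajectories with W < 0 in the left tail of the distribution carry the weight, which is precisely why large-deviation properties of the work matter.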

## Part II: Special Topics

### Transition Path Sampling I + II

#### Christoph Dellago,
*University of Vienna*

### Extreme polymer conformations I + II: From Rosenbluth Sampling to PERM - rare event sampling with stochastic growth algorithms

#### Thomas Prellberg,
*Queen Mary University of London*

We discuss uniform sampling algorithms that are based on stochastic growth
methods, using sampling of extreme configurations of polymers
in simple lattice models as a motivation.

We shall show how a series of clever enhancements to a fifty-odd-year-old
algorithm, the Rosenbluth method, led to a cutting-edge algorithm capable
of uniform sampling of equilibrium statistical mechanical systems of
polymers in situations where competing algorithms failed to perform well.
Examples range from collapsed homo-polymers near sticky surfaces to models
of protein folding.
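A minimal sketch of the plain Rosenbluth method on the square lattice (illustrative only; the enrichment and pruning steps that turn this into PERM are omitted):

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def rosenbluth_walk(n, rng):
    """Grow one n-step walk on the square lattice by Rosenbluth
    sampling: at each step, choose uniformly among the k unvisited
    neighbors and multiply the weight by k; a trapped walk gets
    weight 0."""
    x, y = 0, 0
    visited = {(0, 0)}
    weight = 1.0
    for _ in range(n):
        free = [(x + dx, y + dy) for dx, dy in MOVES
                if (x + dx, y + dy) not in visited]
        if not free:
            return 0.0           # dead end: the growing walk is trapped
        weight *= len(free)
        x, y = rng.choice(free)
        visited.add((x, y))
    return weight

def count_saws(n, samples=20000, seed=5):
    """The mean Rosenbluth weight is an unbiased estimator of c_n,
    the number of n-step self-avoiding walks."""
    rng = random.Random(seed)
    return sum(rosenbluth_walk(n, rng) for _ in range(samples)) / samples
```

For short walks this can be checked against exact enumeration (c_2 = 12, c_4 = 100 on the square lattice); for long walks the weights degenerate, which is exactly the problem the enhancements discussed in the lecture address.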

### Extreme weather/stock market events I + II

#### Joachim Peinke,
*University of Oldenburg*

### Analysis of stock market crashes I + II

#### Ryan Woodard,
*ETH Zurich*

### Introduction to sequence alignments

#### Olaf R.P. Bininda-Emonds,
*University of Oldenburg*

Phylogenetic analysis depends crucially on the data underlying
the analysis, with issues of data quality being but one aspect
of the problem.
A second, slightly less appreciated aspect is that of data comparability
and specifically the evolutionary homology (= derivation from a common
shared ancestry) of the individual characters being examined.
The goal of sequence alignment is to establish the positional homology
of the individual elements (nucleotide bases or amino acid residues)
of two or more molecular sequences (DNA or proteins, respectively).
Like many other problems in computational biology, the simultaneous
alignment of three or more sequences is known to be NP-hard.
Fortunately, many fast and reasonable heuristics for this problem do exist.

In this talk, I first examine the concept of molecular homology
before describing efficient algorithms for the alignment of pairs
of sequences.
I then expand on this to discuss potential solutions and strategies
to tackle the problem of multiple sequence alignment.
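A minimal dynamic-programming sketch of pairwise global alignment in the style of Needleman-Wunsch (the scoring values below are arbitrary illustrative choices, not those discussed in the talk):

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment score by dynamic programming
    (Needleman-Wunsch), O(len(a) * len(b)) time and memory."""
    n, m = len(a), len(b)
    # score[i][j]: best score aligning the prefixes a[:i] and b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap            # a-prefix against all gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap            # b-prefix against all gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1]
                                          else mismatch)
            score[i][j] = max(diag,
                              score[i - 1][j] + gap,   # gap in b
                              score[i][j - 1] + gap)   # gap in a
    return score[n][m]
```

Recovering the alignment itself only requires a traceback through the filled matrix; the polynomial cost of this pairwise case is what breaks down when three or more sequences are aligned simultaneously.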

### Sampling of extreme sequence alignments I + II

#### Alexander K. Hartmann,
*University of Oldenburg*

### Simulation of extreme traffic events I + II

#### Christoph Dobler,
*ETH Zurich*

Today, typical techniques in the field of transport planning are designed
to analyze scenarios which describe common situations like an ordinary
working day without remarkable incidents.
A scenario containing exceptional events that occur by chance
increases the complexity of the required model significantly.
A short introduction to the traditional methodology as well as to some
commonly used techniques is given.
Moreover, the benefits of modern approaches are highlighted.
MATSim, a multi-agent traffic flow simulation which implements
those new techniques, is presented.

It is highlighted why both the traditional methodology and
the agent-based simulation approach fail for scenarios
with unexpected events. An improvement of the simulation approach
is presented which overcomes these problems and is therefore
an appropriate instrument for such scenarios.