Modern Computational Science - Summer School

Modern Computational Science 2011 - Abstracts

The first week of the School mainly deals with the basics in the field of Extreme Events and other, fundamental methods of Computational Science (Part I). More specialized topics of current interest will be covered in week two (Part II). Hands-on lab exercises will complement the lectures.

For dates and times, please see the Schedule.

Part I: Fundamentals

Software Engineering for Scientists

Helmut G. Katzgraber, Texas A&M University and ETH Zurich

Writing complex computer programs to study scientific problems requires careful planning and an in-depth knowledge of programming languages and tools, as well as analysis and visualization programs. In this lecture, the importance of using the right tool for the right problem is emphasized. Furthermore, the core concepts needed to work in a Linux/UNIX environment are presented. Common tools to organize computer programs (cvs and make), as well as to debug (gdb, valgrind) and improve them (gprof, compiler flags) are discussed, followed by simple data reduction strategies and visualization tools. Finally, some scientific libraries such as Boost, GSL, LEDA and Numerical Recipes are outlined.

Advanced Programming Techniques I + II

Frauke Liers, University of Cologne

Introduction to rare-event theory: Theory, applications, and simulations

Hugo Touchette, Queen Mary University of London

The theory of large deviations deals with the probabilities of rare events (or fluctuations) that are exponentially small as a function of some parameter, e.g., the number of random components of a system, the time over which a stochastic system is observed, the amplitude of the noise perturbing a dynamical system, or the temperature of a chemical reaction. The theory has applications in many different scientific fields, ranging from queuing theory to statistics and from finance to engineering. It is also increasingly used in statistical physics for studying both equilibrium and nonequilibrium systems. In this context, deep analogies can be made between familiar concepts of statistical physics, such as the entropy and the free energy, and concepts of large deviation theory having more technical names, such as rate functions and scaled cumulant generating functions.
The first part of these lectures will introduce the basic elements of large deviation theory at a level appropriate for undergraduate and early graduate students in physics, engineering, chemistry, and applied mathematics. The focus there will be on the simple but powerful ideas behind large deviation theory, as opposed to the formalism, and on the application of these ideas to simple stochastic processes, such as sums of independent and identically distributed random variables and Markov processes. Some physical applications of these processes will be covered in exercises.
In the second part, the problem of numerically evaluating large deviation probabilities will be treated at a basic level. The fundamental idea of importance sampling will be introduced together with its sister idea, the exponential change of measure. Other numerical methods based on sample means and generating functions as well as applications to Markov chains and stochastic differential equations will also covered.

Pseudo random number generation

Helmut G. Katzgraber, Texas A&M University and ETH Zurich

Random numbers play a crucial role in science and industry. Many numerical methods require the use of random numbers, in particular the Monte Carlo method. Therefore it is of paramount importance to have efficient random number generators. The differences, advantages and disadvantages of true and pseudo random number generators are discussed with an emphasis on the intrinsic details of modern and fast pseudo random number generators. Furthermore, standard tests to verify the quality of the random numbers produced by a given generator are outlined. Finally, standard scientific libraries with built-in generators are presented, as well as different approaches to generate non-uniform random numbers. Potential problems that one might encounter when using large parallel machines are discussed.

Introduction to Monte Carlo Simulations I + II

Helmut G. Katzgraber, Texas A&M University and ETH Zurich

Monte Carlo methods play an important role in scientific computation, especially when problems have a vast phase space. In this lecture an introduction to the Monte Carlo method is given. Concepts such as Markov chains, detailed balance, critical slowing down, and ergodicity, as well as the Metropolis algorithm are explained. The Monte Carlo method is illustrated by numerically studying the critical behavior of the two-dimensional Ising ferromagnet using finite-size scaling methods.
In the second part of the lecture advanced Monte Carlo methods are described (parallel tempering Monte Carlo) and illustrated with nontrivial models from the physics of glassy systems. Special attention is paid to the equilibration of the simulations, as well as the inclusion of autocorrelation effects.

Basic data analysis

Oliver Melchert, University of Oldenburg

In a simplified sense, the progression of a scientific project that involves numerical experiments can be subdivided into two basic parts: the implementation of a model system and the data production and analysis. The presented lecture is concerned with basic data analysis. In this regard, a selection of frequently required tools, such as histograms, bootstrap resampling and the chi-square test will be introduced and illustrated by means of examples. The presented material is intended to be useful in two ways: If you are not (yet) familiar with those tools, the presented lecture will certainly help you to become a more effective data analyst. If you already have an overview over the presented material and if you are familiar with the subject, it will recap those topics.
The concepts and techniques introduced here are of general interest and are, at best, employed by computational aid. Consequently, an exemplary implementation of the presented techniques using the python programming language is provided. To this end, the lecture is supplemented by a set of python scripts that allow you to reproduce virtually all examples and figures shown in the course of the lecture.

Introduction to Numerical Methods I + II

Thomas Schuster, University of Oldenburg

In the natural sciences and in engineering the problem often arises to approximate a highly nonlinear or even unknown function by functions that are easy to handle. For instance, if the outcome of an experiment is a finite number of pairs of certain quantities, then one would describe a correlation between these quantities by a function which is supposed to be as simple as possible. Such 'simple' functions are, e.g., polynomials. The mathematical problem to compute functions that meet certain given values is called interpolation

Another fundamental task in mathematics is the evaluation of integrals which cannot be computed analytically (e.g., by using the fundamental theorem of calculus). Thus an approximation is needed. Numerical integration means the approximation of given integrals by methods which only require a finite number of values of the integrand at some specific points. The simplest technique to gain such methods is to replace the integrand by its interpolation polynomial and to integrate the latter one exactly.

The lecture first gives an introduction to polynomial interpolation which is then used to get the Newton - Cotes formulas of numerical integration.

Extreme eigenvalues of random matrices

Bernd Blasius, University of Oldenburg

Large deviations in thermodynamics

Andreas Engel, University of Oldenburg

With the progress in nano-technology and the increasing interest in the physical foundations of microbiology there is a growing need for a theoretical understanding of processes of energy conversion at the mesoscopic scale. In the thermodynamics of small systems variables like work, heat and entropy become fluctuating quantities. When the systems are driven away from equilibrium the tails of the corresponding probability distributions convey valuable information about their thermodynamic properties. The talk gives a short introduction to these problems and discusses a method to determine the large deviation properties of work delivered by a driven Langevin system.

Part II: Special Topics

Transition Path Sampling I + II

Christoph Dellago, University of Vienna

Extreme polymer conformations I + II:
From Rosenbluth Sampling to PERM - rare event sampling with stochastic growth algorithms

Thomas Prellberg, Queen Mary University of London

We discuss uniform sampling algorithms that are based on stochastic growth methods, using sampling of extreme configurations of polymers in simple lattice models as a motivation.
We shall show how a series of clever enhancements to a fifty-odd year old algorithm, the Rosenbluth method, led to a cutting-edge algorithm capable of uniform sampling of equilibrium statistical mechanical systems of polymers in situations where competing algorithms failed to perform well. Examples range from collapsed homo-polymers near sticky surfaces to models of protein folding.

Extreme weather/stock market events I + II

Joachim Peinke, University of Oldenburg

Analysis of stock market crashes I + II

Ryan Woodard, ETH Zurich

Introduction to sequence alignments

Olaf R.P. Bininda-Emonds, University of Oldenburg

Phylogenetic analysis depends crucially on the data underlying the analysis, with issues of data quality being but one aspect of the problem. A second, slightly less appreciated aspect is that of data comparability and specifically the evolutionary homology (= derivation from a common shared ancestry) of the individual characters being examined. The goal of sequence alignment is to establish the positional homology of the individual elements (nucleotide bases or amino acid residues) of two or more molecular sequences (DNA or proteins, respectively). Like many other problems in computational biology, the simultaneous alignment of three or more sequences is known to be NP-hard. Fortunately, many fast and reasonable heuristics to this problem do exist, however.
In this talk, I first examine the concept of molecular homology before describing efficient algorithms for the alignment of pairs of sequences. I then expand on this to discuss potential solutions and strategies to tackle the problem of multiple sequence alignment.

Sampling of extreme sequence alignments I + II

Alexander K. Hartmann, University of Oldenburg

Simulation of extreme traffic events I + II

Christoph Dobler, ETH Zurich

Today, typical techniques in the field of transport planning are designed to analyze scenarios which describe common situations like an ordinary working day without remarkable incidents. A scenario containing exceptional events that occur by chance increases the complexity of the required model significantly. A short introduction into a traditional methodology as well as some commonly used techniques is given. Moreover, the benefits of modern approaches are highlighted. MATSim, a multi-agent traffic flow simulation which implements those new techniques, is presented.
It is highlighted why, both the traditional methodology as well as the agent-based simulation approach, will fail for scenarios with unexpected events. An improvement of the simulation approach is presented, which overcomes these problems and, therefore, is an appropriate instrument for such scenarios.