Module
1 Wednesday 27 January - Friday 29 January
2010
Application of evolutionary algorithms
to solve complex problems in quantitative genetics and
bioinformatics
Lecturers:
Professor Brian Kinghorn
Dr Cedric Gondro
University of New England, Armidale,
NSW Australia
Practical Evolutionary Algorithms to Solve Problems in
Genetics and Bioinformatics
Summary
This course aims to empower its participants by
sharing some knowledge and skills that have greatly benefitted the presenters.
The development of evolutionary algorithms is not a hard science – there are
tips and tricks, and opportunities for tweaking. Here we discuss the 'art' to
developing and using these algorithms. The course should be of general interest
to both students and researchers who are not familiar with EAs. And to some
extent even those who are, given the practical approach adopted in the course.
Most attendees at some time or other will be involved in problem solving - EAs
are very handy to know given their generality and ease of implementation.
Overview
Given the ever growing quantity of data generated by high-throughput genomic projects and the need to extract information and generate knowledge from these complex datasets, it is ever more important that researchers have access and are familiar with methods that can help unravel this complexity. In this course we will overview evolutionary algorithms (EAs), which are a relatively simple computational approach that uses natural evolutionary processes as an analogy for solving complex problems through, what is essentially, an intelligent search of the solution space. Evolutionary algorithms have been quite extensively adopted in bioinformatics to solve a wide range of problems (e.g. design optimization of microarray experiments, design of SNP chips, haplotype reconstruction, multiple sequence alignment, reconstruction of phylogenies, protein folding, inference of biological pathways/networks, parameterization problems, etc).
To some extend evolutionary algorithms are suitable for any problem that can be represented and in which it is possible to select a superior solution in relation to another one. So, one of the main strengths of evolutionary algorithms is that, differently from hard arm approaches which tend to be very problem specific, EAs are largely problem 'agnostic', meaning that the same method can very easily be applied to an entirely different problem or situation. This makes them suitable for a wide range of genetics and bioinformatics problems and a worthy addition to the toolkit of a geneticist.
For this course our aim is to introduce the general concept of EAs but focus on a couple of specific algorithms that we have used extensively in our work and which work very well for most scenarios. We will discuss the algorithms (just for completeness: Differential Evolution and Gene Expression Programming), show how to implement them in practice and discuss the pros/cons, pitfalls and probably the most important part of the course: the practical tips and tricks that we have learned from using these methods applied to various problems over the years: e.g. speed issues, breaking out of local optima, diagnosing convergence, parameter setting, repeatability/robustness, among others. We will also illustrate all concepts with 'real-world' bioinformatics applications that we have previously worked on.
Topics
A brief overview of Evolutionary Algorithms
Problems not for EC
Problems for EC
Architecture for solving problems
What are heuristics?
Deterministic versus stochastic methods
From random search to simulated annealing
Evolution of evolutionary computation
Types of EC and choosing a method
EC framework
Evolution and population genetics 101
Skeleton of an EA (canonical GA)
Parameters: population size, mating schemes, selection, mutation, recombination
Introduction to fitness and objective functions
Differential Evolution
How does DE work?
Parameter settings
Breaking out of local optima
Genetic Programming
Overview of GP
Gene Expression Programming
Selection and search operators
Fitness and bloat
Evolutionary computation in practice
Problem representation
Moulding the solution space
Managing constraints
The need for constraints
How to apply constraints
Hard and soft constraints
Changing the goal posts
Changing on the fly
Managing change of direction
Diagnosing convergence
Criteria for stopping
Maximum solutions
Examples in bioinformatics, systems biology and
Artificial Life
Multiple sequence alignment
Optimization of microarray experimental designs
Model discovery and parameterization
Alife virtual agents