Armidale Animal Breeding Summer Course 2010

 

Module 1       Wednesday 27 January - Friday 29 January 2010

 

Application of evolutionary algorithms 
to solve complex problems in quantitative genetics and bioinformatics

 

Lecturers:

                                Professor Brian Kinghorn

                                Dr Cedric Gondro

                                University of New England, Armidale, NSW Australia

 

Content:

Practical Evolutionary Algorithms to Solve Problems in Genetics and Bioinformatics

 

Summary

This course aims to empower its participants by sharing some knowledge and skills that have greatly benefitted the presenters. The development of evolutionary algorithms is not a hard science – there are tips and tricks, and opportunities for tweaking. Here we discuss the 'art' to developing and using these algorithms. The course should be of general interest to both students and researchers who are not familiar with EAs. And to some extent even those who are, given the practical approach adopted in the course. Most attendees at some time or other will be involved in problem solving - EAs are very handy to know given their generality and ease of implementation. 

 

Overview

Given the ever growing quantity of data generated by high-throughput genomic projects and the need to extract information and generate knowledge from these complex datasets, it is ever more important that researchers have access and are familiar with methods that can help unravel this complexity. In this course we will overview evolutionary algorithms (EAs), which are a relatively simple computational approach that uses natural evolutionary processes as an analogy for solving complex problems through, what is essentially, an intelligent search of the solution space. Evolutionary algorithms have been quite extensively adopted in bioinformatics to solve a wide range of problems (e.g. design optimization of microarray experiments, design of SNP chips, haplotype reconstruction, multiple sequence alignment, reconstruction of phylogenies, protein folding, inference of biological pathways/networks, parameterization problems, etc).

 

To some extend evolutionary algorithms are suitable for any problem that can be represented and in which it is possible to select a superior solution in relation to another one. So, one of the main strengths of evolutionary algorithms is that, differently from hard arm approaches which tend to be very problem specific, EAs are largely problem 'agnostic', meaning that the same method can very easily be applied to an entirely different problem or situation. This makes them suitable for a wide range of genetics and bioinformatics problems and a worthy addition to the toolkit of a geneticist.

 

For this course our aim is to introduce the general concept of EAs but focus on a couple of specific algorithms that we have used extensively in our work and which work very well for most scenarios. We will discuss the algorithms (just for completeness: Differential Evolution and Gene Expression Programming), show how to implement them in practice and discuss the pros/cons, pitfalls and probably the most important part of the course: the practical tips and tricks that we have learned from using these methods applied to various problems over the years: e.g. speed issues, breaking out of local optima, diagnosing convergence, parameter setting, repeatability/robustness, among others. We will also illustrate all concepts with 'real-world' bioinformatics applications that we have previously worked on.

 

 

 

Topics

A brief overview of Evolutionary Algorithms

Problems not for EC

Problems for EC

Architecture for solving problems

What are heuristics?

Deterministic versus stochastic methods

From random search to simulated annealing

Evolution of evolutionary computation

Types of EC and choosing a method

 

EC framework

Evolution and population genetics 101

Skeleton of an EA (canonical GA)

Parameters: population size, mating schemes, selection, mutation, recombination

Introduction to fitness and objective functions

 

Differential Evolution

How does DE work?

Parameter settings

Breaking out of local optima

 

Genetic Programming

Overview of GP

Gene Expression Programming

Selection and search operators

Fitness and bloat

 

Evolutionary computation in practice

Problem representation

Moulding the solution space

Managing constraints

            The need for constraints

            How to apply constraints

            Hard and soft constraints

Changing the goal posts

            Changing on the fly

            Managing change of direction

Diagnosing convergence

            Criteria for stopping

            Maximum solutions

           

Examples in bioinformatics, systems biology and Artificial Life

Multiple sequence alignment

Optimization of microarray experimental designs

Model discovery and parameterization

Alife virtual agents