Neural Network Lab

Learning to Use Genetic Algorithms and Evolutionary Optimization

Evolutionary optimization (EO) is a type of genetic algorithm that can help minimize the error between computed output values and training data target output values. Use this demo program to learn to the method.

Evolutionary optimization (EO) is a technique for finding approximate solutions to difficult or impossible numeric optimization problems. In particular, EO can be used to train a neural network. EO is loosely based on biological chromosomes and genes, and reproductive mechanisms including selection, chromosome crossover and gene mutation. EO is really a type of genetic algorithm (GA) and implementations of the EO technique are sometimes called real-valued genetic algorithms, or just genetic algorithms.

A neural network is essentially a complex function that accepts numeric inputs and generates numeric outputs. A fully connected neural network with m inputs, h hidden nodes and n outputs has (m * h) + (h * n) + (h + n) weights and biases. Training a neural network is the process of finding values for the weights and biases so that, for a given set of training data with known input and output values, the computed outputs of the network closely match the known outputs. In other words, training a neural network is a numerical optimization problem where the goal is to minimize the error between computed output values and training data target output values.

Among my colleagues, the three most common approaches for training a neural network are using the back-propagation algorithm, using particle swarm optimization, and using evolutionary optimization. In order to use EO to train a neural network you must have a solid grasp of exactly how EO works. Because neural networks are fairly complicated systems, in my opinion it's best to learn about EO by examining a simpler problem than training a neural network. Take a look at the screenshot of a demo program in Figure 1.

[Click on image for larger view.] Figure 1. Evolutionary Optimization Demo Program

The demo program uses EO to find an approximate solution to the dummy problem of identifying the values of x0, x1, x2, x3, x4, and x5 so that the function f(x0,x1,x2,x3,x4,x5)= x0^2 + x1^2 + x2^2 + x3^2 + x4^2 + x5^2 is minimized. This minimization optimization problem has an exact solution of x0 = x1 = x2 = x3 = x4 = x5 = 0.0 and so EO is not really necessary and is used just to demonstrate the algorithm.

Evolutionary Optimization Parameters
In high-level pseudo-code, an EO algorithm works like this:

set up parameters
initialize a population of random solutions
loop until done
  select two good parent solutions
  use them to generate two child solutions
  place children into population
  immigrate a new solution into population
end loop
return best solution found

EO is a meta-heuristic, which means that it's really a set of guidelines that can be used to design a specific algorithm. There are many design choices you must make, such as how to select parent solutions, how to generate children solutions and how to replace old solutions with new solutions.

The demo program in Figure 1 begins by setting up nine parameters that are used by the EO algorithm. Compared to other optimization techniques, EO requires many free parameters (this is considered a weakness of EO).

The first parameter is the number of problem dimensions, or put another way, the number of x-values that must be found. For the demo problem this is 6 because the values for x0, x1, x2, x3, x4, and x5 must be found. When using EO for neural network training, the number of problem dimensions will be the number of network weights and bias values. In EO, a possible solution is represented by a virtual chromosome, which is realized as an array of type double. Each cell in the chromosome array is called a gene, so the problem dimension can also be referred to as the number of genes. There are many variations in genetic algorithm vocabulary. For example, the array that this article calls a chromosome is also called a genotype.

The second parameter used by EO is the population size, which is set to 50 in the demo. EO works with a collection of possible solutions rather than just a single solution. In general, the population size for an EO algorithm is often between 10 and 1,000.

The third and fourth EO parameters are minGene and maxGene, which are set to -10.0 and +10.0 respectively. These values define the range of possible initial values for each x-value (gene) in a solution (chromosome). When using EO to train a neural network, if the input data is normalized, then the vast majority of normalized values will be between -10 and +10, and therefore reasonable weights and bias values will also be between -10 and +10. If you use EO for some problem other than neural network training, you may have to change the values of minGene and maxGene.

The fifth EO parameter is the mutation rate, which is set to 0.20 in the demo. EO is an iterative process. In each iteration, each x-value in the chromosome array is mutated (changed) with probability equal to the mutation rate. EO is often very sensitive to parameter values, especially the mutation rate, and in general you must experiment with different values. That said, based on my experience, EO mutation rates between 0.10 and 0.50 typically work best. Note that mutation rate values for genetic algorithms that use a bit representation for solutions are typically much smaller than those used by EO algorithms that use real-valued solutions.

The sixth EO parameter is the mutation change factor, set to 0.01 in the demo. This value controls the magnitude of change in a gene/x-value if in fact the gene is mutated. Typically, values between 0.001 and 0.1 work well, but again, in many situations you must experiment a bit.

The seventh EO parameter is the selection pressure, often called tau in research literature, which is set to 0.40 in the demo. The selection pressure is used by the part of the algorithm that determines which pair of parent chromosomes will be selected to be used to generate new child solutions. You can think of selection pressure as loosely meaning the likelihood that the two best chromosomes will be selected. Typical effective values for tau range from 0.30 to 0.70.

The eighth and ninth EO parameters control how the EO main processing loop terminates. The max-generations parameter is hard limit on the maximum number of iterations of the main EO processing loop. In the demo max-generations is set to 5000 and was determined by some preliminary experimentation. The early-exit error, set to 0.000010 in the demo, establishes a short-circuit loop exit condition; if at any point in time the best solution found has an error less than the early-exit error, the processing loop breaks immediately.

Overall Demo Program Structure
To create the demo program I launched Visual Studio and created a C# console application named EvolutionaryOptimization. After the template code loaded into the editor, I deleted all references to .NET namespaces except for the one to the top-level System namespace. In the Solution Explorer window I renamed file Program.cs to EvolutionaryProgram.cs and VS automatically renamed class Program for me.

The overall program structure, with some minor edits to save space, is presented in Listing 1. The demo has most normal error-checking removed to keep the size of the code smaller and to keep the main ideas clear.

Listing 1: Demo Program Structure
using System;
namespace EvolutionaryOptimization
{
  class EvolutionaryProgram
  {
    static Random rnd = new Random(0);

    static void Main(string[] args)
    {
      Console.WriteLine("Begin Evolutionary Optimization demo");
      Console.Write("Goal is to minimize f(x0,x1,x2,x3,x4,x5) = ");
      Console.WriteLine("x0^2 + x1^2 + x2^2 + x3^2 + x4^2 + x5^2");
      Console.Write("Known solution is 0.0 at ");
      Console.WriteLine("x0 = x1 = x2 = x3 = x4 = x5 = 0.000000");

      int dim = 6;
      int popSize = 50;
      double minX = -10.0;
      double maxX = 10.0;
      double mutateRate = 0.20;
      double mutateChange = 0.01;
      double tau = 0.40;
      int maxGeneration = 5000;
      double exitError = 0.00001;

      Console.WriteLine("Setting problem dimension = " + dim);
      // display values of other parameters

      double[] bestSolution = Solve(dim, popSize, minX, maxX,
        mutateRate, mutateChange, tau, maxGeneration, exitError);

      Console.WriteLine("Best solution found:\n");
      for (int i = 0; i < bestSolution.Length; ++i)
        Console.Write(bestSolution[i] + " ");

      double error = EvolutionaryProgram.Error(bestSolution);
      Console.Write("(Absolute) error value at best solution = ");
      Console.WriteLine(error.ToString("F6"));

      Console.WriteLine("\nEnd Evolutionary Optimization demo\n");
      Console.ReadLine();
    }

    static double[] Solve(int dim, int popSize, double minX,
      double maxX, double mutateRate, double mutateChange,
      double tau, int maxGeneration, double exitError) { . . }
    
    private static Individual[] Select(int n,
      Individual[] population, double tau) { . . }


    
    private static Individual[] Reproduce(Individual parent1,
      Individual parent2, double minGene, double maxGene,
      double mutateRate, double mutateChange) { . . }
    
    private static void Mutate(Individual child, double maxGene,
      double mutateRate, double mutateChange) { . . }
    
    private static void Place(Individual child1, Individual child2,
      Individual[] population) { . . }
    
    private static void Immigrate(Individual[] population,
      int numGenes, double minGene, double maxGene,
      double mutateRate, double mutateChange) { . . }
    
    public static double Error(double[] x) { . . }
  } 

  public class Individual : IComparable<Individual>
  {
    public double[] chromosome;
    public double error;
    private int numGenes;
    private double minGene;
    private double maxGene;
    private double mutateRate;
    private double mutateChange;

    static Random rnd = new Random(0);

    public Individual(int numGenes, double minGene,
      double maxGene, double mutateRate, double mutateChange) { . . }
    
    public int CompareTo(Individual other) { . . }
  }
}

Most of the work is performed by method Solve. Method Solve calls helper methods Select (to pick two good parents), Reproduce (to generate two new child solutions), Place (to insert the children into the population) and Immigrate (to insert a new random solution into the population). Helper method Reproduce calls sub-helper method Mutate (to alter the new children slightly). Method Solve assumes the existence of a globally accessible class-scope Random object named rnd, an accessible Error function and an Individual class.

The Individual Class
Program-defined class Individual encapsulates a chromosome/solution. The class derives from IComparable so that Individual objects can be automatically sorted from best (smallest error) to worst (largest error). The Individual class definition is presented in Listing 2.

Listing 2: The Individual Class
public class Individual : IComparable<Individual>
{
  public double[] chromosome; // a solution
  public double error;
  private int numGenes; // problem dimension
  private double minGene;
  private double maxGene;
  private double mutateRate;
  private double mutateChange;

  static Random rnd = new Random(0);

  public Individual(int numGenes, double minGene,
    double maxGene, double mutateRate, double mutateChange)
  {
    this.numGenes = numGenes;
    this.minGene = minGene;
    this.maxGene = maxGene;
    this.mutateRate = mutateRate;
    this.mutateChange = mutateChange;
    this.chromosome = new double[numGenes];
    for (int i = 0; i < this.chromosome.Length; ++i)
      this.chromosome[i] = 
       (maxGene - minGene) * rnd.NextDouble() + minGene;
    this.error = EvolutionaryProgram.Error(this.chromosome);
  }

  public int CompareTo(Individual other)
  {
    if (this.error < other.error) return -1;
    else if (this.error > other.error) return 1;
    else return 0;
  }
}

Array member chromosome represents a possible solution to the problem being investigated. Typically, standard genetic algorithms use a form of bit representation, while evolutionary algorithms use real values, but this vocabulary distinction is arbitrary. Member variable error holds the error associated with the object's chromosome/solution. This value is often named fitness but I prefer the name error in situations where the goal is to minimize some value such as neural network training error. For simplicity, members chromosome and error are declared with public scope but you may want to use private scope along with get and set properties.

The Individual class constructor generates a random solution, where each cell/gene of the chromosome array is between minGene and maxGene, using a class-scope Random object. Notice the constructor calls the globally accessible Error function to supply the value for the error member. An alternative, but in my opinion, messier approach, is to pass the error function to the constructor as a delegate.

The CompareTo method is coded so that when a Sort method is called on a collection of Individual objects, the objects will be sorted from smallest error (best) to largest error (worst).

The Error method is defined as:

public static double Error(double[] x)
{
  double trueMin = 0.0;
  double z = 0.0;
  for (int i = 0; i < x.Length; ++i)
    z += (x[i] * x[i]);
  return Math.Abs(trueMin - z);
}

Here the absolute value of the difference between the true, known minimum value of the dummy function (0.0) and the computed value, z, is returned. Common alternatives are to return the squared difference between the true and computed values, or to return the square root of the squared difference. Notice all these approaches ensure that error is always a non-negative number.


comments powered by Disqus

Featured

Subscribe on YouTube