## Incorporating New Ideas from Genetics

Haploid crossover and mutation are only the barest bones of real-world genetic systems. I have discussed some extensions, including diploidy, inversion, gene doubling, and deletion. Other GA researchers have looked at genetics-inspired mechanisms such as dominance, translocation, sexual differentiation (Goldberg 1989a, chapter 5), and introns (Levenick 1991). These all are likely to have important roles in nature, and mechanisms inspired by them could potentially be put to excellent use in...

## Royal Road Functions

The Schema Theorem, by itself, addresses the positive effects of selection (allocating increasing samples of schemas with observed high performance) but only the negative aspects of crossover that is, the extent to which it disrupts schemas. It does not address the question of how crossover works to recombine highly fit schemas, even though this is seen by many as the major source of the search power of genetic algorithms. The Building Block Hypothesis states that crossover combines short,...

## How Do Genetic Algorithms Work

Although genetic algorithms are simple to describe and program, their behavior can be complicated, and many open questions exist about how they work and for what types of problems they are best suited. Much work has been done on the theoretical foundations of GAs (see, e.g., Holland 1975 Goldberg 1989a Rawlins 1991 Whitley 1993b Whitley and Vose 1995). Chapter 4 describes some of this work in detail. Here I give a brief overview of some of the fundamental concepts. The traditional theory of GAs...

## Hosts and Parasites Using GAs to Evolve Sorting Networks

Designing algorithms for efficiently sorting collections of ordered elements is fundamental to computer science. Donald Knuth (1973) devoted more than half of a 700-page volume to this topic in his classic series The Art ofComputer Programming. The goal of sorting is to place the elements in a data structure (e.g., a list or a tree) in some specified order (e.g., numerical or alphabetic) in minimal time. One particular approach to sorting described in Knuth's book is the sorting network, a...

## Steady State Selection

Most GAs described in the literature have been generational at each generation the new population consists entirely of offspring formed by parents in the previous generation (though some of these offspring may be identical to their parents). In some schemes, such as the elitist schemes described above, successive generations overlap to some degree some portion of the previous generation is retained in the new population. The fraction of new individuals at each generation has been called the...

## Finite Population Model

The infinite-population case involves deterministic transitions from to -'i lj and thus from to in an infinite population there are no sampling errors. In contrast, modeling a finite population requires taking account of the stochastic effects of sampling. To address the finite-population case, Nix and Vose (1991) modeled the simple GA as a Markov chain. Markov chains are stochastic processes in which the probability that the process will be in state j at time t depends only on the state i at...

## Randommutation hill climbing RMHC

Call this string best-evaluated Choose a locus at random to flip. If the flip leads to an equal or higher fitness, then set best-evaluated to the resulting string. Go to step 2 until an optimum string has been found or until a maximum number of evaluations have been performed. Return the current value of best-evaluated. (This is similar to a zero-temperature Metropolis method.) We performed 200 runs of each algorithm, each run starting with a different random-number...

## Hitchhiking in the Genetic Algorithm

What caused our GA to perform so badly on R relative to RMHC One reason was hitchhiking once an instance of a higher-order schema is discovered, its high fitness allows the schema to spread quickly in the population, with zeros in other positions in the string hitchhiking along with the ones in the schema's defined positions. This slows the discovery of schemas in the other positions, especially those that are close to the highly fit schema's defined positions. In short, hitchhiking seriously...

## Evolving Lisp Programs

John Koza (1992,1994) has used a form of the genetic algorithm to evolve Lisp programs to perform various tasks. Koza claims that his method genetic programming (GP) has the potential to produce programs of the necessary complexity and robustness for general automatic programming. Programs in Lisp can easily be expressed in the form of a parse tree, the object the GA will work on. As a simple example, consider a program to compute the orbital period P of a planet given its average distance A...

## Computer Exercises

Write a genetic algorithm to replicate Hinton and Nowlan's experiment. Make plots from your results similar to those in figure 3.4, and compare your plots with that figure. Do a run that goes for 2000 generations. At what frequency and at what generation do the question marks reach a steady state Could you roughly predict this frequency ahead of time Run a GA on the fitness function f(X) the number of ones in x, where x is a chromosome of length 20. (See computer exercise 1 in chapter 1 for...

## Formalization of GAs

The mathematicians Michael Vose and Gunar Liepins (1991) developed a formal model based on the following simple GA Start with a random population of binary strings of length l. Calculate the fitness f(x) of each string x in the population. Choose (with replacement) two parents from the current population with probability proportional to each string's relative fitness in the population. Cross over the two parents (at a single randomly chosen point) with probability pc to form two offspring. (If...

## Simple Model of the Baldwin Effect

Genetic assimilation is well known in the evolutionary biology community. Its predecessor, the Baldwin effect, is less well known, though it has recently been picked up by evolutionary computationalists because of an interesting experiment performed by Geoffrey Hinton and Steven Nowlan (1987). Hinton and Nowlan employed a GA in a computer model of the Baldwin effect. Their goal was to demonstrate this effect empirically and to measure its magnitude, using a simplified model. An extremely simple...

## Simple Genetic Algorithm

Given a clearly defined problem to be solved and a bit string representation for candidate solutions, a simple GA works as follows Start with a randomly generated population of n 7-bit chromosomes (candidate solutions to a problem). Calculate the fitness f(x) of each chromosome x in the population. Repeat the following steps until n offspring have been created a. Select a pair of parent chromosomes from the current population, the probability of selection being an increasing function of...

## Examples of Fitness Functions

One common application of GAs is function optimization, where the goal is to find a set of parameter values that maximize, say, a complex multiparameter function. As a simple example, one might want to maximize the real-valued one-dimensional function f(y) y + I sin(32.v) . 0 < y < n (Riolo 1992). Here the candidate solutions are values of y, which can be encoded as bit strings representing real numbers. The fitness calculation translates a given bit string x into a real number y and then...

## Thought Exercises

For the fitness function defined by Equation 4.5, what are the average fitnesses of the schemas (a) 1 ** *, (b) 11 * *, and (c) 1 * 1 * * How many schemas are there in a partition with k defined bits in an 7-bit search space Consider the fitness function f(x number of ones in x, where x is a chromosome of length 4. Suppose the GA has run for three generations, with the following populations Define on-line performance at function evaluation step t as the average fitness of all the individuals...

## The Two Armed Bandit Problem

The tradeoff between exploration and exploitation can be instructively modeled in a simple scenario the Two-Armed Bandit problem. This problem has been studied extensively in the context of statistical decision theory and adaptive control e.g., see Bellman 1961 . Holland 1975 used it as an as a mathematical model of how a GA allocates samples to schemas. The scenario is as follows. A gambler is given N coins with which to play a slot machine having two arms. A conventional slot machine is...

## Predicting Dynamical Systems

Norman Packard 1990 has developed a form of the GA to address this problem and has applied his method to several data analysis and prediction problems. The general problem can be stated as follows A series of observations from some process e.g., a physical system or a formal dynamical system take the form of a set of pairs, where , are independent variables and y is a dependent variable 1didN . For example, in a weather prediction task, the independent variables might be some set of features of...

## Grammatical Encoding

The method of grammatical encoding can be illustrated by the work of Hiroaki Kitano 1990 , who points out that direct-encoding approachs become increasingly difficult to use as the size of the desired network increases. As the network's size grows, the size of the required chromosome increases quickly, which leads to problems both in performance how high a fitness can be obtained and in efficiency how long it takes to obtain high fitness . In addition, since direct-encoding methods explicitly...

## Limitations of Static Schema Analysis

A number of recent papers have questioned the relevance of schema analysis to the understanding of real GAs e.g., Grefenstette 1993 Mason 1993 Peck and Dhawan 1993 . Here I will focus on Grefenstette's critique of the Static Building Block Hypothesis. The following qualitative formulation of the Schema Theorem and the Building Block Hypothesis should now be familiar to the reader The simple GA increases the number of instances of low-order, short-defininglength, high-observed-fitness schemas...

## Rank Selection

Rank selection is an alternative method whose purpose is also to prevent too-quick convergence. In the version proposed by Baker 1985 , the individuals in the population are ranked according to fitness, and the expected value of each individual depends on its rank rather than on its absolute fitness. There is no need to scale fitnesses in this case, since absolute differences in fitness are obscured. This discarding of absolute fitness information can have advantages using absolute fitness can...

## Nextascent hill climbing NAHC

Call this string current-hilltop. For ifrom 1 to l where lis the length of the string , flip bit i if this results in a fitness increase, keep the new string, otherwise flip bit i back. As soon as a fitness increase is found, set current-hilltop to that increased-fitness string without evaluating any more bit flips of the original string. Go to step 2 with the new current-hilltop, but continue mutating the new string starting immediately after the bit position at...

## Genetic Algorithms And Traditional Search Methods

In the preceding sections I used the word search to describe what GAs do. It is important at this point to contrast this meaning of search with its other meanings in computer science. There are at least three overlapping meanings of search Search for stored data Here the problem is to efficiently retrieve information stored in computer memory. Suppose you have a large database of names and addresses stored in some ordered way. What is the best way to search for the record corresponding to a...

## Evolutionary Reinforcement Learning

A second computational demonstration of the Baldwin effect was given by David Ackley and Michael Littman 1992 . Their primary goal was to incorporate reinforcement learning an unsupervised learning method into an evolutionary framework and to see whether evolution could produce individuals that not only behaved appropriately but also could correctly evaluate the situations they encountered as beneficial or dangerous for future survival. In Ackley and Littman's Evolutionary Reinforcement...

## Fitness Proportionate Selection with Roulette Wheel and Stochastic Universal Sampling

Holland's original GA used fitness-proportionate selection, in which the expected value of an individual i.e., the expected number of times an individual will be selected to reproduce is that individual's fitness divided by the average fitness of the population. The most common method for implementing this is roulette wheel sampling, described in chapter 1 each individual is assigned a slice of a circular roulette wheel, the size of the slice being proportional to the individual's fitness. The...