How Do Genetic Algorithms Work
Although genetic algorithms are simple to describe and program, their behavior can be complicated, and many open questions exist about how they work and for what types of problems they are best suited. Much work has been done on the theoretical foundations of GAs (see, e.g., Holland 1975; Goldberg 1989a; Rawlins 1991; Whitley 1993b; Whitley and Vose 1995). Chapter 4 describes some of this work in detail. Here I give a brief overview of some of the fundamental concepts.
The traditional theory of GAs (first formulated in Holland 1975) assumes that, at a very general level of description, GAs work by discovering, emphasizing, and recombining good "building blocks" of solutions in a highly parallel fashion. The idea here is that good solutions tend to be made up of good building blocks—combinations of bit values that confer higher fitness on the strings in which they are present.
Holland (1975) introduced the notion of schemas (or schemata) to formalize the informal notion of "building blocks." A schema is a set of bit strings that can be described by a template made up of ones, zeros, and asterisks, the asterisks representing wild cards (or "don't cares"). For example, the schema H = 1 * * * * 1 represents the set of all 6bit strings that begin and end with 1. (In this section I use Goldberg's (1989a) notation, in which H stands for "hyperplane." H is used to denote schemas because schemas define hyperplanes—"planes" of various dimensions—in the /dimensional space of length/ bit strings.) The strings that fit this template (e.g., 100111 and 110011) are said to beinstances of H.The schema H is said to have two defined bits (nonasterisks) or, equivalently, to be of order 2. Its defining length (the distance between its outermost defined bits) is 5. Here I use the term "schema" to denote both a subset of strings represented by such a template and the template itself. In the following, the term's meaning should be clear from context.
Note that not every possible subset of the set of lengthl bit strings can be described as a schema; in fact, the huge majority cannot. There are 21 possible bit strings of length l, and thus 221 possible subsets of strings, but there are only 3l possible schemas. However, a central tenet of traditional GA theory is that schemas are—implicitly—the building blocks that the GA processes effectively under the operators of selection, mutation, and singlepoint crossover.
How does the GA process schemas? Any given bit string of length 1 is an instance of 21 different schemas. For example, the string 11 is an instance of ** (all four possible bit strings of length 2), *1, 1*, and 11 (the schema that contains only one string, 11). Thus, any given population of n strings contains instances of between 21 and n x 21 different schemas. If all the strings are identical, then there are instances of exactly 21 different schemas; otherwise, the number is less than or equal to n x 21. This means that, at a given generation, while the GA is explicitly evaluating the fitnesses of the n strings in the population, it is actually implicitly estimating the average fitness of a much larger number of schemas, where the average fitness of a schema is defined to be the average fitness of all possible instances of that schema. For example, in a randomly generated population of n strings, on average half the strings will be instances of 1***...* and half will be instances of 0 ***...*. The evaluations of the approximately n/2 strings that are instances of 1***...* give an estimate of the average fitness of that schema (this is an estimate because the instances evaluated in typicalsize population are only a small sample of all possible instances). Just as schemas are not explicitly represented or evaluated by the GA, the estimates of schema average fitnesses are not calculated or stored explicitly by the GA. However, as will be seen below, the GA's behavior, in terms of the increase and decrease in numbers of instances of given schemas in the population, can be described as though it actually were calculating and storing these averages.
We can calculate the approximate dynamics of this increase and decrease in schema instances as follows. Let H be a schema with at least one instance present in the population at time t. Let m(H,t) be the number of instances of H at time t, and let U(H,t) be the observed average fitness of Hat time t (i.e., the average fitness of instances of Hin the population at time t). We want to calculate E(m(H, t + 1)), the expected number of instances of H at time t + 1. Assume that selection is carried out as described earlier: the expected number of offspring of a string x is equal to /(£)/
: where /(x)is the fitness of x and : is the average fitness of the population at time t. Then, assuming x is in the population at time t, letting x I H denote "x is an instance of H," and (for now) ignoring the effects of crossover and mutation, we have
by definition, since U(H, t) = (£xih /(x))/m(H,t)for x in the population at time t. Thus even though the GA does not calculate U(H,t) explicitly, the increases or decreases of schema instances in the population depend on this quantity.
Crossover and mutation can both destroy and create instances of H. For now let us include only the destructive effects of crossover and mutation—those that decrease the number of instances of H. Including these effects, we modify the right side of equation 1.1 to give a lower bound on E(m(H, t + 1)). Let pc be the probability that singlepoint crossover will be applied to a string, and suppose that an instance of schema H is picked to be a parent. Schema H is said to "survive" under singlepoint crossover if one of the offspring is also an instance of schema H. We can give a lower bound on the probability Sc(H) that Hwill survive singlepoint crossover:
where d(H) is the defining length of Hand l is the length of bit strings in the search space. That is, crossovers occurring within the defining length of H can destroy H (i.e., can produce offspring that are not instances of H), so we multiply the fraction of the string that H occupies by the crossover probability to obtain an upper bound on the probability that it will be destroyed. (The value is an upper bound because some crossovers inside a schema's defined positions will not destroy it, e.g., if two identical strings cross with each other.) Subtracting this value from 1 gives a lower bound on the probability of survival Sc(H). In short, the probability of survival under crossover is higher for shorter schemas.
The disruptive effects of mutation can be quantified as follows: Let pm be the probability of any bit being mutated. Then Sm(H), the probability that schema Hwill survive under mutation of an instance of H, is equal to (1 p m)o(H), where o(H) is the order of H(i.e., the number of defined bits in H). That is, for each bit, the probability that the bit will not be mutated is 1 p m, so the probability that no defined bits of schema Hwill be mutated is this quantity multiplied by itself o(H) times. In short, the probability of survival under mutation is higher for lowerorder schemas.
These disruptive effects can be used to amend equation 1.1:
This is known as the Schema Theorem (Holland 1975; see also Goldberg 1989a). It describes the growth of a schema from one generation to the next. The Schema Theorem is often interpreted as implying that short, loworder schemas whose average fitness remains above the mean will receive exponentially increasing numbers of samples (i.e., instances evaluated) over time, since the number of samples of those schemas that are not disrupted and remain above average in fitness increases by a factor of U(H,t)lf(t) at each generation. (There are some caveats on this interpretation; they will be discussed in chapter 4.)
The Schema Theorem as stated in equation 1.2 is a lower bound, since it deals only with the destructive effects of crossover and mutation. However, crossover is believed to be a major source of the GA's power, with the ability to recombine instances of good schemas to form instances of equally good or better higherorder schemas. The supposition that this is the process by which GAs work is known as the Building Block Hypothesis (Goldberg 1989a). (For work on quantifying this "constructive" power of crossover, see Holland 1975, Thierens and Goldberg 1993, and Spears 1993.)
In evaluating a population of n strings, the GA is implicitly estimating the average fitnesses of all schemas that are present in the population, and increasing or decreasing their representation according to the Schema Theorem. This simultaneous implicit evaluation of large numbers of schemas in a population of n strings is known as implicitparalelism (Holland 1975). The effect of selection is to gradually bias the sampling procedure toward instances of schemas whose fitness is estimated to be above average. Over time, the estimate of a schema's average fitness should, in principle, become more and more accurate since the GA is sampling more and more instances of that schema. (Some counterexamples to this notion of increasing accuracy will be discussed in chapter 4.)
The Schema Theorem and the Building Block Hypothesis deal primarily with the roles of selection and crossover in GAs. What is the role of mutation? Holland (1975) proposed that mutation is what prevents the loss of diversity at a given bit position. For example, without mutation, every string in the population might come to have a one at the first bit position, and there would then be no way to obtain a string beginning with a zero. Mutation provides an "insurance policy" against such fixation.
The Schema Theorem given in equation 1.1 applies not only to schemas but to any subset of strings in the search space. The reason for specifically focusing on schemas is that they (in particular, short, highaveragefitness schemas) are a good description of the types of building blocks that are combined effectively by singlepoint crossover. A belief underlying this formulation of the GA is that schemas will be a good description of the relevant building blocks of a good solution. GA researchers have defined other types of crossover operators that deal with different types of building blocks, and have analyzed the generalized "schemas" that a given crossover operator effectively manipulates (Radcliffe 1991; Vose 1991).
The Schema Theorem and some of its purported implications for the behavior of GAs have recently been the subject of much critical discussion in the GA community. These criticisms and the new approaches to GA theory inspired by them will be reviewed in chapter 4.
Fitness Resolution Fortress
Learning About Fitness Resolution Fortress Can Have Amazing Benefits For Your Life And Success! Start Planning To Have Excellent Health And Fitness Today!
Responses

ren19 days ago
 Reply