# The Three Languages of GP

[This is a draft of an introductory chapter of the book; expect some changes as I finish up the first partial release. Also, I note some links and cross-references don’t translate to the blog directly.]

# Three Languages

I like to say that success on a GP project involves working in three languages. Look again at the 3×5 card, and you’ll see hints of them all there. I call them the Language of Answers, the Language of Search, and the Language of the Project.

## The Language of Answers: What?

A GP system itself doesn’t “think”. It’s a system for accelerating the exploration of alternative answers to a formally-stated question. A single project will often shift to explore several different questions: matters of whimsical open-ended curiosity or earnestly dedicated purposive science. But you should pay attention to only one at a time.

To be able to treat any particular question using GP, and explore the vast number of diverse alternative answers, you will need to write code that embodies your interests and goals.

Setting up a GP project obliges you to do some coding work. Usually the majority of your work will be the design and implementation of a domain-specific language. It doesn’t have to be very complicated, but it will need the flexibility and capacity to describe any interesting answer to your project’s question of the moment — the “smart” answers, and also the “dumb” ones. After all: your problem is interesting because you don’t know which answers are smart and which dumb.…

I prefer to call the scripts you write in this domain-specific language — as interpreted in the context of your problem — “Answers”. In the GP literature you’ll see them called “individuals” and “genomes”; those are historically important terms, but they carry a lot of potentially misleading metaphorical baggage. Here we’ll stick with the term Answers, to remind us that they are contingent on the problem you’re considering.

You’ll have seen folks listing some of the potential applications of GP, talking about evolving “programs” and “strategies” and “puzzle solutions” and “molecules” and “controllers” and “robots”… all kinds of complex actual things. As language-using humans, sometimes we mistake our representations of concepts for the concepts themselves. Remember: a script doesn’t do anything until we run it on a particular interpreter or compiler, and even then only with certain variables bound to meaningful values.

A strategy is a meaningless poem until you invoke it in the context in which it was conceived; we cannot meaningfully read a “pure strategy” without knowing what war or game or business it was meant for. A real molecule is not a string of “ACGT”, or even a pretty colored picture of little candy balls on sticks, but nonetheless when we “evolve molecules” we’re evolving balls on sticks or strings of letters… then interpreting those in some molecular simulation. A controller for a robot is only a string until you upload it to a physical robot or a simulation so you can see what might happen when it runs. A plan for trading stocks is meaningless — and risky — without also considering the particular historical context and the specific trade execution system for which it was developed. And so on. An Answer needs both pieces of infrastructure: a statement or script written in a domain-specific language (often one you design), and also a formal setting in which the function embodied in a script can be expressed and explored meaningfully.

If the Answers in your project are simple DNA sequence strings like ACGTCTAGCA..., you’ll also need to obtain (or write) a simulator that translates those strings into proteins, or folds them, or tests them for toxicity, or does whatever a computer needs to do in order to determine the salient aspects of their function. If you want to evolve robot controller scripts, you’ll need a real or a simulated robot that can execute your controller scripts and reveal their function.

This is true even for the simplest and most common application of GP[1], symbolic regression — fitting mathematical functions to training data. The most common approach is to represent these mathematical equations as S-expressions, a form familiar to many Computer Scientists who learned to program in Lisp. For example, ( + x ( / 2 9 ) ) is an S-expression representing the function $y=x+\frac{2}{9}$.

But notice that the S-expression script ( + x ( / 2 9 ) ) is not in itself the mathematical function (unless you happen to be running a Clojure interpreter in your head or something). Even though it’s very close to runnable code, it’s not fully an Answer until you express it by parsing and evaluating it in an interpreter — one in which $x$ has a number assigned to it.[2]
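To make that concrete, here is a minimal sketch of the two pieces an Answer needs: the script itself, and an interpreter that expresses it with a value bound to `x`. This is my own illustration, not the book’s system; `parse`, `evaluate`, and `OPS` are names I’m inventing for the sketch.

```python
# A minimal sketch of parsing and evaluating the S-expression
# ( + x ( / 2 9 ) ) with a value bound to the variable x.

OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b,
       "/": lambda a, b: a / b}

def parse(tokens):
    """Turn a token list like ['(', '+', 'x', ...] into a nested list."""
    token = tokens.pop(0)
    if token == "(":
        subtree = []
        while tokens[0] != ")":
            subtree.append(parse(tokens))
        tokens.pop(0)  # discard the closing ')'
        return subtree
    try:
        return float(token)
    except ValueError:
        return token  # a variable name such as 'x'

def evaluate(tree, bindings):
    """Express the script: apply operators, looking up bound variables."""
    if isinstance(tree, list):
        op, args = tree[0], [evaluate(a, bindings) for a in tree[1:]]
        return OPS[op](*args)
    if isinstance(tree, str):
        return bindings[tree]  # an unbound variable raises KeyError
    return tree

script = "( + x ( / 2 9 ) )"
tree = parse(script.split())
print(evaluate(tree, {"x": 1.0}))  # x + 2/9, with x bound to 1.0
```

Note that the script string alone computes nothing: drop the `bindings` argument and `evaluate` fails on `x`, which is the whole point.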

Even when there’s a “general-purpose” GP-ready full-featured language available — something like Clojush or even a human-readable language like Java — you’ll usually need to expand it with libraries or custom code to include domain-specific vocabulary. And for reasons we’ll discover in the first project, sometimes when you use a full-featured language, you’ll also need to trim back its capacity.

Focus for a moment on the phrase “domain-specific” and how it needs to cut both ways: You don’t typically find for...next loops or set-theoretic operations in symbolic regression projects, because people are asking for arithmetic Answers, and those people rarely see for...next loops in arithmetic. You can fit data algorithmically using loops and Boolean operators and bit-shifting — after all, that’s how computers themselves do it. But you won’t find a shift_right operator in most off-the-shelf symbolic regression packages, because Answers that used it to explore the problem would “feel weird”.

If you’re working on a project where you want to explore string-matching algorithms to classify DNA into genes and introns, your Language of Answers will probably include something about regular expressions. Not a lot about sin() and cos().

If you’re working on a project where you want to explore game-playing algorithms for a text-based dungeon crawl, your Language of Answers will probably include primitives like look and if and fight. And maybe if you’re fancy, you’ll roll in a library for creating decision trees so your adventurer can learn. But again, not a lot of sin() or cos() happening in the ol’ Crypt of Creatures.

And just to prove I’ve got nothing against trigonometry as such: If you’re working on a project where you want to explore the set of plane geometry diagrams which can be constructed using a straight-edge and compass, you will almost certainly want some sin() and cos() floating around in the mix.

### So GP is “automated” how exactly?

No escaping it. In almost every GP project, you will need to hand-code this Language of Answers. Both parts: not just the “scripts” but also the contextualizing system used to interpret scripts and express their functions meaningfully.

Does this seem like a lot of effort? It’s not, when you put it in perspective. Realize that when you explore a problem with GP, you should expect to examine millions of alternative Answers. In traditional approaches to problem-solving, you might (if you’re Ever So Smart) be able to consider a few dozen — the ones you can keep in your head and notebooks. Even if you use algorithmic tools like linear programming, realize they are parametric explorations of different constant assignments… within one Answer at a time.

If you want access to the millions instead of the dozens, you need to put in the up-front work to programmatically represent the structure of Answers, and also hook up the mechanisms needed to express them functionally. That’s the investment you make.

## The Language of Search: How?

“Language of Search” is my catch-all for the innumerable tricks of the GP trade. I count anything that changes the subset of Answers you’re considering, including randomly guessing new ones and scoring them based on their performance in context.

There’s all the familiar biologically-inspired stuff like crossover, mutation, selection, and the more fanciful manipulations. And also the idiomatic tools we use to implement learning or evolving or improving: populations, back-propagation, selection, statistical analysis, 1+1 Evolutionsstrategie.… Basically anything and everything that reduces the amount of personal attention you need to pay to all those alternative Answers.

There is no particular “right way” to use or combine these components. They’re really all design patterns, and they are used differently in different geographical regions and schools; they are most like the mythic martial arts styles you see in movies, and the particular moves one school or Master may teach his students. But just as the martial arts share a purpose (if not an attitude), the many parts of the Language of Search address one question: Based on what you have discovered already, how do you identify new Answers that will be more satisfying?

Every GP project uses selection in one form or another, so let’s look at that more closely. Say we’ve built a GP system with a population of 100 Answers, and we want to design a process to pick “parents” in order to breed a new generation. There are literally hundreds of approaches, but here are four. We might:

• …pick two parents with equal probability and remove them from the population; breed them to produce two or more offspring; keep the two best-performing family members (including, possibly, the parents), and return those winning family members to the population.
• …pick two parents randomly from the population, using a bias towards better-scoring ones; breed those two parents to produce one offspring, and set it aside in a new “generation”; continue (with replacement of parents) until you have as many in the next generation as you did in the last.
• …pick two parents at random from the population, with uniform probability; breed them, and return the parents and the offspring to the population; continue until the population size is doubled; destroy half the population, culling it back down to the size where it started.
• …pick ten different individuals from the population with uniform probability; choose the best one of that tournament to be the first parent; repeat for the second parent; breed, and then… (&c &c)

These are all perfectly reasonable and practical ways of choosing Answers to breed and cull from a population. Three of them even have formal names. Occasionally one may feel “better” than another for a given project, but none is intrinsically better in all situations.

My point in listing them is to highlight the obvious fact that they’re all just recipes in a formal language: the language I’m referring to as the Language of Search. The “primitives” in this language are things you can surely see in my verbal descriptions: evaluation, subsetting and sampling, breeding (itself a whole blanket process that usually refers to “mixing up Answer scripts with one another”)… and of course the basic programming infrastructure of iteration and conditional execution and sorting.

All normal computer programming stuff, though maybe a bit more stochastic than you’re used to. But note that the Language of Search isn’t limited to code: There’s an important class of GP systems known as “user-centric” or “interactive”, in which a real live human being makes conscious decisions as part of the algorithm. This is a valuable tool for exploring matters of aesthetics and subjective judgement. (And we’ll build something like that in a later project.)

The Language of Search is huge, but it’s not onerous. While you almost always need to design and implement your project’s Language of Answers, even the most “advanced” tools in the Language of Search toolkit are simple in comparison. Things like “chop up a string and mix up the parts” or “change a token in a script to a random value” or “assign a score to an Answer by running it in context, given specific input conditions”.
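The first two of those quoted tools really are a line or three of code each. A sketch, under my own assumption that Answer scripts are flat lists of tokens drawn from a small made-up vocabulary (the names and the vocabulary are mine):

```python
import random

VOCABULARY = ["+", "-", "*", "/", "x", "2", "9"]  # a made-up token set

def one_point_crossover(mom, dad):
    """'Chop up a string and mix up the parts.'"""
    cut = random.randrange(1, min(len(mom), len(dad)))
    return mom[:cut] + dad[cut:]

def point_mutation(script, rate=0.1):
    """'Change a token in a script to a random value.'"""
    return [random.choice(VOCABULARY) if random.random() < rate else token
            for token in script]
```

The third tool, scoring, is just “run the script in your interpreter on specific inputs and measure what comes out” — which is exactly why the Language of Answers work has to come first.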

When I keep saying GP is simple, that’s what I mean: the Language of Search is simple. It’s really just a big catalog of small parts you cobble together, and there’s absolutely no reason you should try to learn all the tools anybody has ever tried, or use more than three or four basics in a given project.

And that would be your cue to ask: Why then does GP have a reputation for being so hard?

## The Language of the Project: Why?

Almost all GP writing focuses on the Language of Search, either spelling out new tools and algorithms, or having little benchmarking contests between variations. A bit of the writing — mostly theoretical Computer Science — touches on the Language of Answers under the heading “representation theory”.

As far as I know, very little has been written about this stuff I’m calling the “Language of the Project”. Yet I argue it’s the most important of the three — not least because it’s the deciding factor when it comes to predicting whether a project will succeed or fail.

The Language of the Project is the language we use to talk about ourselves, in our roles as part of the project. It’s the framing we use to express what we want, and why. It’s our expression of the reasons one Answer is more satisfying than another, and our consideration of the possibility that no satisfying Answer exists. It’s the language we use to process the surprises GP inevitably throws our way.

Big chunks of my Language of the Project fall in the realm of well-studied disciplines: “user experience”, “project management” and “domain modeling”. Why do I feel it’s important to concoct a catch-all neologism just to lump together those esteemed fields for this special GP junk? Worse: why is a technical computing book about “artificial intelligence” getting all touchy-feely and psychological?

Simple answer: Because people don’t like being surprised.

That may ring a bell, since when you check you will see that the subtitle of this very book is “The Engineering of Useful Surprises”. And I specifically argued earlier that GP is “a prosthesis for accelerating innovation” — innovation in the sense of surprises.

Yup. And that’s the biggest obstacle in the way of broader adoption of GP, and also the biggest obstacle you personally will have working on your own projects: People don’t like being surprised.

A lot of folks seem to have decided that GP is “automatic”; that it’s used for “automatic search”, or “automatic programming”, or building “invention machines” that spit out inventions that are of “human-competitive” quality. Those folks won’t think my third Language is worth their attention.

To them, GP — and artificial intelligence more generally — is a sort of self-contained box of magic thinking stuff. I wonder if maybe those people have read post hoc reports of successful GP (or AI) projects, without considering all of what happens over the course of an actual project: a lot of non-artificial human thinking, typing, compiling, swearing, whiteboard-scribbling, and conversation… filtered through a series of iterative programming attempts and arguments and writing, until eventually an encouraging result was published. If you don’t count that as part of the project, then of course you shouldn’t think a GP (or AI) system includes the project team rewriting the algorithms, or the planning sketches, or the conversations and reading, or the re-starts with different settings to try to get more consistent results, or the statistical analyses trying to “tune” or “speed up” the thing, or even the story written down in the paper that describes what happened.

And if you’re willing to draw the boundary around the system that way, in a way that leads you to think GP (or AI) is a self-contained magic box of thinking stuff that people stand in front of and pat and hug and eventually coax intelligence out of… well you ought to get started now, because time’s-a-wastin’.

But while you’re occupied in patting and fostering self-organized creative urges, muse about it my way for a minute.

Recall that the Language of Answers is something you will almost always build from scratch. It’s not just domain-specific, it’s often problem-specific. The only time you can get away with using a pre-cooked Language of Answers is when you’ve unconsciously selected a problem that makes it easier to stomach reuse, or reduced the domain-specific qualities to raw numbers and true/false decisions.

Given that reminder: How do you design the constants, variables and operators to use in your project’s Language of Answers? Which instructions will be more helpful in making interesting Answers? Which will be too weird? How do you ensure every Answer will be syntactically correct, or semantically consistent? Or do you have to? How do you know whether your Language of Answers is capable of representing any satisfying Answer at all, let alone an “optimal” one? How do you tell when your GP system is ignoring important tools you want to see it use, and what do you do about it?

Those are questions from the Language of the Project. No matter where you draw your system lines, a person needs to ask and answer these questions. Every time, for every project, for every problem. And a person needs to design and implement the solutions to them, using the other tools at their disposal. None of that is “automatic”.

And you may also recall that the Language of Search is a bulging toolkit, full of literally thousands of design patterns and rules of thumb for manipulating answers in context-dependent useful ways. I can describe sixteen mutation algorithms without breaking a sweat; then you’ve got crossover, and simulated annealing, and steady-state population dynamics, and demes, trivial geography, hill-climbing, initialization biasing, multi-objective sorting, particle swarms, automatically-defined functions, vertical slicing, age-layered populations.… Any riffle through any GP book will give you fifty more.

Given that reminder: How do you pick the mechanisms for search and learning in your project? How do you know which combination may be best or even useful for your problem? What do you even watch in order to decide whether a GP search is “working” or not? Should you let your current search run longer, or start it over again? If you start it over, should you change the parameters a bit, or try a different design pattern? What do you do when it gives you an answer that “solves the problem” in a totally stupid way?[3]

A person needs to mindfully adapt the structure of the project to fit the dynamic context of their wants and knowledge, and manage the system into giving them the answers they will find satisfying.

My “Language of the Project” isn’t identical with user experience, or project management, or domain modeling, or even their union. Those disciplines are admirable, but they are designed for unaccelerated human-powered projects.

### “Excuse me: What just happened?”

You write software. I know this, or you wouldn’t bother reading this far. If a project isn’t giving you satisfying answers — whether it involves GP or not — then you (personally) need to check that it’s implemented correctly. And when you’re convinced it is running as intended, you then (personally) need to reflect and decide whether it’s doing what you want it to. And if you decide that it isn’t, then you (personally) need to either change how it’s written, or change what you think it’s for.

In non-GP projects — software development or financial or home improvement or medical research projects — there’s a reasonable sense that one can “re-start”. But of course in the context of human-powered projects, “re-starting” is never misunderstood to mean “from the same initial conditions”. You (personally, with all the other human beings on your team) “re-start” having learned something useful and helpful. You intend to do something differently the second time around, and you don’t have to concentrate very hard on remembering to change stuff.

This difference between you-before-the-first-try and you-after-the-first-try doesn’t get mentioned, because it’s such a fundamental fact of life that it goes without saying. But notice that you (personally) are understood intuitively to be part of the problem-solving system before and after the “re-start”.

Just the other day I was working on the code for a later section of this book: the part where we will evolve Conway’s Game of Life. I found that the GP system I started with was having a lot of trouble producing interesting answers. I worked a few days, trying to get it to do what I expected.

And then I realized that it had been working the whole time. I mean totally working. It gave me the best possible answer, every time.

Only then did I realize that the question I was asking was super boring. There was only one right answer, and the GP system I built kept giving me that answer. Immediately.

Now if you are one of the folks who want to think GP is a self-contained box of magic thinking stuff, this might seem like a good outcome, and not a problem. Who wouldn’t want an “optimization algorithm” to give them The One Right Answer?

Well, me. And you, I expect.

I would sound like this, if I were on stage at the Amazing Answer Machine Show: “Ladies and Gentlemen, I am thinking of a special algorithm! I have provided this, The Box of Magic Thinking Stuff, with 512 carefully-chosen examples and a collection of useful tools, none of which in itself is the algorithm. By recombining those tools in a very complicated way while I stand over here, The Box will now guess the function I’m thinking of in a matter of mere moments.…”

A card trick. Boring.

What did I do then? I revised my notion of the project’s goals. I “re-started”, and in doing so I changed the story I’d been telling myself, the questions I was asking, and I expanded the Language of Answers accordingly.

The answer my GP system gave me was a surprise. One I wasn’t mentally prepared to understand, not least because it happened in a matter of seconds when I was expecting it to take some time. When I finally parsed what it kept repeating, I had a second surprise: the question I had asked was boring.

If I had been working in a traditional unaccelerated way — with a whiteboard or a yellow legal pad, chewing on the end of a pen and pacing with my hands behind my back like a think-tank caricature — I might have frowned and erased some stuff, or crumpled up a page or two and made a cup of tea.

I wouldn’t have been surprised.

### Mixed blessings

Introspection is hard. Most people, for whatever reason, don’t like to question their assumptions. They like certainties and provable correctness, familiar models and known best practices, mathematical rigor presented on a buoyant comfort-cushion of assumptions.

That’s what I mean when I say they don’t like to be surprised.

Surprises aren’t just pleasant eureka moments, they’re also the oh shit moments. GP can be useful as an “innovation prosthesis” because it shortens the time between those eureka surprises.

GP feels complicated and difficult and annoying because it also shortens the time between oh shit surprises. And it can’t tell the difference.

GP projects often fail because novices run into oh shit surprises before any eureka ones. They’re culturally maladapted to cope with this disorder: they’re often Very Smart Computer Scientists or early-adopter domain experts, and they can pick up some information from the books or the nerds down the street, and they start dabbling in what I’ve called the Languages of Answers and Search.

But nobody ever tells them about these inevitable oh shits.

I’m going to focus on this cobbled-together “Language of the Project” exactly because of those issues. I’ve watched dozens of Very Smart engineery people dive in and (metaphorically) drown in GP. We need to erase the traditional boundary between what you think of as “the project” and you (personally), the “researcher”.

This is not to advance some Agilist social agenda, but rather as a coping mechanism. Your best and most useful habits as a Very Smart Person are based on your experiences thinking very hard and hand-coding solutions to problems one at a time, and considering a few dozen alternatives. Without any thousand-fold enhancement.

I see it often: Smart person downloads some package; writes some code; follows along with a tutorial and builds a GP system and — boom — it starts spitting out ten thousand reasonable-sounding solutions every hour. Already they’re way outside the range of what their habits prepare them for. But they’re Very Smart, and so they look at the answers they have so far, and they fiddle with some things and change some parameters… and — boom — in an hour they have ten thousand completely different answers.

“What just happened?”

When it works, answers emerge from a GP system, in the sense of emergent behavior. Good Answers and bad ones. But realize they can’t emerge from a GP system of the sort I’m teaching you about — the sort that includes you (personally) as one of the components — until you (personally) examine those Answers and eventually decide you’re satisfied. You can’t succeed unless you can cope with the acceleration.

Here’s one of the core questions in GP (and AI) research, a deep and troubling one that many man-years of research have been spent considering: How do you know whether you should (a) keep a GP system running, on the off chance it will get better soon and give you new unexpected answers, or (b) stop it and start over from different initial conditions?

If you think GP (and AI) is a self-contained magic box of thinking stuff: You don’t.

If you realize you’re a core component in the GP system: Pick the one that is more satisfying to you at the moment, and try the other if that doesn’t work out.

And here is a deep-rooted problem affecting all of search and optimization, not just in AI but in all computational approaches: How do you know a priori which search technique will provide reliably better answers for a given problem?

If you think of the program as a self-contained box of optimization tools (and magic thinking stuff), the proven[4] answer is: You can’t.

GP is simple. Regular old human-scale problem-solving is hard enough that people will tell you you’re a Very Smart Person if you demonstrate even occasional competence. But coping with a thousand-fold acceleration will break your model of yourself and what you think you’re doing.

So. Let’s start breaking.

1. So common that the old Wikipedia page for Symbolic Regression now redirects to the one for Genetic Programming. Am I allowed to put a “facepalm” in a book?

2. I worry there’s a bit too much subtlety here: In some projects, an Answer may well be a formal function that is not evaluated with variable assignments — a project involving algebraic transformations, for example. It’s the goal of symbolic regression to fit particular training and test data; assigning those particular values is part of interpreting an Answer in that context.

3. Let me share a symbolic regression result I was given by a system I was testing. I was just putting it through its paces, and so I was looking for functions that fit ten sampled data points from $y=x+6$. It came up with a perfectly reasonable answer that started with $y=(2x - \frac{72x}{32x^2 \div 4x + \dots})$ and went on for four more lines after that. When I simplified it, it meant the same thing as $y=x+6$, although along the way it added seventeen constants together, multiplied them by 166, and divided by a huge number to multiply some extra terms by 0. This was the sort of surprise I mean.

4. This is an important result, and it pisses people off because it challenges some of the same models of self and project that I’m calling into question. It’s called the No Free Lunch Theorem for Search and Optimization. Among other things, it demonstrates that for any performance criterion you can develop, the average performance of any search algorithm — over all problems — is no different from the average performance of any other algorithm.
