Using Core Wars to Simulate Evolution

A "document" by Terry Newton, April 7, 2009


A simulated environment is constructed in which warrior programs for the game of Core War compete with neighboring programs on a grid, replacing the losing members with mutated copies of the winning members. Initially the programs are randomly generated and usually non-functional, but over time natural selection favors changes that produce programs able to outwit their opponents, simulating the process of evolution and demonstrating how “intelligent creatures” can arise from simple beginnings. The grid structure simulates a world where many species of warrior programs co-exist and develop competing strategies, just as in nature the world eventually fills with creatures mostly or completely descended from a single initial form, yet the members diversify and come to consist of significantly different program code. When a mutated program differs significantly from its parent program it is given a new species label, which can be used to color the display and visually track the rise and fall of individual species of warrior programs.

Further information:

I'm calling this a "document" because at 47 pages it's a bit more than a typical paper and it doesn't use traditional academic-style cross-referencing. The first half is structured mostly like a paper and presents an overview of the process, documents three runs of using RedMixer to evolve "tiny" warriors for coresize 800 with some conclusions, and ends with a resources page with links to related material. The second half of the document provides a detailed analysis of the RedMixer program code (hopefully useful for other makers of file-based evolvers) and a listing of the actual code, making it mostly a standalone work requiring only QBasic and pmars to replicate similar experiments. If running the program it's best to get the regular RedMixer zip package (which includes a compiled version and more documentation) but if necessary the program code can be extracted from the document itself.

Download the cwevol.pdf document (1.4 megs)

11/12/09 [edited 11/18/09] - The RedMixer evolver used for the experiments in the document was written in QBasic, which requires a DOS-friendly OS (such as Windows XP or DosEmu/FreeDos running under Linux) and stores each warrior in an individual file, which increases disk activity (thus almost requiring a ram disk). The program was modified to permit compiling with FreeBasic for better Windows compatibility and to allow running natively under Linux, but this doesn't help reduce file system usage. Recently a couple of new evolvers were written (John Metcalf's TEV and Christian Schmidt's Maezumo) that use a more disk-friendly technique - keep the warriors in an array and write out only the warriors doing battle. The temporary filenames are constant so disk head movement should be minimal to none, and the files remain mostly in the disk cache on modern systems, updating only a few temp files every few seconds. While I still recommend always using a ram disk for file-based evolving, at least with this new system it is less of a requirement than when 1500+ files are being written to. Another nice advantage of evolving using an array is that it permits non-linear mutations such as swapping redcode lines, and in general makes mutations easier since they only have to alter array numbers rather than strings within lines of redcode.
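As a sketch of what a non-linear mutation looks like with the array approach, here's roughly how a line swap might be done (a minimal Python illustration, not actual TEV or Maezumo code; the function name and chance parameter are made up for the example):

```python
import random

def swap_lines(warrior, chance=0.05):
    """With the given probability, swap two randomly chosen redcode lines.

    This kind of mutation is awkward when warriors live in files but
    trivial when each warrior is held as a list of instructions."""
    if len(warrior) >= 2 and random.random() < chance:
        i, j = random.sample(range(len(warrior)), 2)
        warrior[i], warrior[j] = warrior[j], warrior[i]
    return warrior

# Example: a 3-line warrior stored as a list of instruction strings
w = ["spl #0, <1", "mov 2, <-1", "jmp -2, 0"]
swap_lines(w, chance=1.0)  # force a swap for demonstration
```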

With these newly learned techniques I re-implemented the main concepts used in RedMixer to make MEVO, a multi-threaded evolver that can take advantage of multi-CPU machines by launching several copies of pmars at once. MEVO's look and parameters are similar to RedMixer but there are a few differences - it doesn't use separate settings for the A and B fields, it does not evolve the end number (warriors start at the first instruction), the crossover settings are minimal and use only "attraction" mode (selecting mates with the most wins), it has automatic benchmarking and saving of better-scoring warriors and can re-insert past warriors back into the soup (valhalla), and it generates a log of the benchmark results which can be plotted graphically to indicate performance over time. MEVO is able to generate warriors that place well on the nano hill and readily generates "paper" warriors for larger core sizes although these warriors aren't competitive with strong hand-coded warriors - human cleverness gains the edge as the search space becomes larger. MEVO's performance is mostly equivalent to RedMixer but so far top scores for larger core sizes are not quite as high, probably due to some of the omitted parameters. This could be the subject of further research.

4/12/09 - Evaluating evolver performance (take 2.5). On the nano hills (coresize 80, maximum of 5 instructions) evolved warriors do quite well, capturing top slots on the SAL Nano Hill, with several examples placing near the tops of the Koenigstuhl Nano Hill and the Corewar Infinite Nano Hill. As coresize (and the search space) increases the performance of evolved warriors drops dramatically. Some evolved warriors are present on the SAL Tiny Hill; being a closed-source hill it's hard to tell how many are evolved, but it's a fairly tough hill to crack (had to use an old evolved rock :-). Presently by my count there are 11 evolved warriors in the top 100 slots (about the top third) of the Koenigstuhl Tiny Hill (coresize 800, maximum of 20 instructions), representing several evolvers. Performance of evolved warriors on coresize 8000 hills is mostly dismal; by my count there is only one non-tweaked evolved warrior present in the top third of the Koenigstuhl 94-Draft Hill, with only 4 evolved examples in the top two-thirds of the hill (the top 755 slots out of 1133 entries total). These linked text files summarize the performance of the top 20 evolved warriors (again by my count) for coresize 800 and coresize 8000 presently inhabiting the Koenigstuhl hills.

So what does it all mean? Well that depends on what aspect is being analysed. In an overall sense it shows that at least for smaller core sizes evolved warriors can outwit a fairly large fraction of hand-coded warriors. However, the strongest evolving programs such as CCAI and MicroGP impart human knowledge to the code they are evolving. This is fine for playing the hills, it's core "war" after all... pretty much anything goes, even changing minor bits of other warriors to improve performance is often acceptable (if it worsens performance it's just embarrassing). CCAI and MicroGP are quite impressive, but one of the reasons they're so strong is that they partially specify what the instruction sequences should be, based on the analysis of code found in human-written warriors. To me anyway, it is more satisfying when reasonably strong warriors can be produced using simpler techniques that do not involve contact with human-written code (except of course for testing the output), even if the results aren't as strong as the warriors produced by the super-evolvers. Not to mention being approachable by a simple-minded programmer such as myself. I consider evolvers such as RedRace, YACE, RedMixer, REBS, RedMaker, Fizzle, Sys4 and one of the first evolvers, GA_WAR, as closer to being demonstrations of the principles of evolution and natural selection, but this may be my own personal bias. There really isn't a "best" when it comes to corewar evolvers; each is a tool that in the right hands can produce viable results even if tweaks have to be done to the code... one of the strongest evolvers - RedRace - is descended from GA_WAR, which is weak by today's standards but was like magic when I first encountered it.

Perhaps the most satisfying way to play the evolving game is to write one's own evolver, hopefully my document gives hints about how to approach the problem, or at least one way to do it. A few more hints...

Make changes throughout the warrior based on random chance(s), so that there's a chance multiple things might be changed and also a chance that nothing at all is changed. For example, when inserting or deleting a line it helps if there's a chance some of the numbers might get incremented or decremented to preserve function. When a strong form does arise, it helps if there's a chance it might replicate as-is so that the form is not erased with the next generation (but if the change chance is too low the soup just fills with copies of the same code). It's best to have variables for the chances of various changes so workable settings can be found without having to redo the code.
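The chance-driven approach might be sketched like this (a hypothetical Python illustration rather than RedMixer's actual QBasic; the chance values and the `make_random_line` helper are placeholders, not real RedMixer parameters):

```python
import random

def mutate(warrior, make_random_line, change=0.1, insert=0.02, delete=0.02):
    """Apply independent per-line mutation chances.

    Because every chance is rolled separately, one pass may change
    several lines, or leave the warrior untouched so that a strong
    form can replicate as-is."""
    out = []
    for line in warrior:
        if random.random() < delete:
            continue                      # drop this line entirely
        if random.random() < insert:
            out.append(make_random_line())
        if random.random() < change:
            line = make_random_line()     # rewrite the line
        out.append(line)
    return out
```

Keeping the chances as named parameters makes it easy to hunt for workable settings without touching the mutation code itself.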

All that's needed to drive evolution is battle scores and propagating the higher-scoring code (replacing the lower-scoring code) while making random changes. In the event of a tie just pick one using whatever means desired (RedMixer simply considers the first warrior the winner if the scores are the same). If the basics don't work then something is wrong, and adding crossover or other esoteric techniques probably won't help. Consider putting enable switches on the fancy features so their effectiveness can be evaluated; quite often complexity makes things worse.
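A bare-bones evolution step along these lines might look like the following (a Python sketch; `score` stands in for running pmars and `mutate` for whatever mutation routine is in use - only the tie rule comes from RedMixer):

```python
import random

def battle_step(soup, score, mutate):
    """One evolution step: pick two warriors, battle them, and replace
    the loser with a mutated copy of the winner.

    score(a, b) stands in for a pmars run and returns the two battle
    scores; mutate is the mutation routine."""
    i, j = random.sample(range(len(soup)), 2)
    sa, sb = score(soup[i], soup[j])
    # On a tie simply treat the first warrior as the winner,
    # the same rule RedMixer uses.
    if sa >= sb:
        winner, loser = i, j
    else:
        winner, loser = j, i
    soup[loser] = mutate(soup[winner])
```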

The evaluation process shouldn't be too strict, as it often takes several changes to transform a warrior form into a stronger form, often with an intermediate weak phase. Grid-based evolvers side-step this to a degree by permitting strong and weak code to coexist and by using a random selection scheme that's just as likely to battle a warrior against a weaker opponent as a stronger one, giving even weak code a chance to mutate into something stronger. Experiments show that the number of battle rounds makes a difference: fewer rounds provide a certain chance that the evaluation might be "wrong" and let the weaker warrior survive, and when the number of rounds is set too high in the early stages of evolution there is a tendency for the initial stone-like warriors to not transform themselves into replicators. In advanced stages of evolution better selection (and higher scores) might be achieved by using a higher number of battle rounds; this effect needs further study. The grid topology is just one method of avoiding early stagnation; there are other ways, but perhaps avoid very strict evaluation methods like benchmarking every warrior then deleting all warriors that score below a certain number.
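The local-opponent selection used by grid evolvers is only a few lines. Here's one way to pick a random neighbor on a wrap-around grid (a Python sketch; the 8-neighbor toroidal layout is just one possible topology):

```python
import random

def random_neighbor(x, y, width, height):
    """Pick one of the 8 surrounding cells at random, wrapping at the
    grid edges (a torus), so each warrior battles only local opponents
    and weak code elsewhere on the grid keeps a chance to develop."""
    dx, dy = random.choice([(ax, ay) for ax in (-1, 0, 1)
                            for ay in (-1, 0, 1) if (ax, ay) != (0, 0)])
    return (x + dx) % width, (y + dy) % height
```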

4/9/09 - I set up a few informal off-line hills to help study the behavior of evolved warriors and the mixing of evolved and hand-coded warriors. The size 800 and size 8000 evolved-only hills have each been seeded with the 20 strongest published evolved warriors from the Koenigstuhl hills, and I put what I want on the mixed size 8000 hill. Some interesting results... despite having a strong Wilkies score, Machines Will Rule got 3rd place behind a couple of RedRace warriors. On the size 800 evolved-only hill White Noise takes top spot, that's no surprise, but 084 which I had considered fairly strong placed dead last; wasn't expecting that but it makes sense considering all of the evolved papers. [4/13/09 - the mixed hill is a work in progress... the idea is to not permit hand-coded warriors with a Wilkies score greater than that of the highest-scoring evolved warrior on the hill. This sounds like an unfair advantage but isn't; it's perfectly possible for the warrior in the top slot to have a relatively low or even the lowest benchmark score, it all depends on the mix of warriors on the hill. The score rule does however keep the mixed hill on a "beginner" level and ensures at least some mixing of evolved and hand-coded warriors.]

4/7/09 - The first draft was pretty good but contained a few minor issues. The following changes were made...

Corrected a few grammatical errors and missing words.
Added a brief definition for the term "soup" (the collection of warriors).
Added a clarification that usually the soup fills with warriors using the same overall strategy; rather than referring to new stronger species as "competing strategies" (which, as usually defined in broad core war terms, is rare but occasionally happens), the phrase "competing forms" is used instead. A single overall strategy can be optimized and recoded by evolution in a variety of ways to form warriors able to prevail over the existing soup members.
Cleaned up the performance charts by removing the truncated portions of the benchmark warrior names.

[extra ramblings removed]

3/26/09 - Initial release.