MEVO - a Multi-threaded Evolver

This evolver is designed to refine a few things...

Able to launch multiple pmars battles for better speed on multi-core systems
In-memory soup for less disk activity than a warrior-file-based evolver
Better compatibility with Windows and Linux systems (not dos or script)
Supports the "swap lines" mutation for evolving nano without length change
Optional selective crossover with adjustable granularity and proportion
Soup wrap-around is optional, can be enabled for X and/or Y directions
Automatic benchmarking and saving with optional re-insertion (Valhalla)
RedMixer-like interface for exploring the soup from within the program
Written in native FreeBasic for better code structure


Evolve starts evolution, pressing Esc returns to the interface.
The cursor is moved using the cursor keys to select a warrior.
List lists the warrior code, Run runs the warrior in pmarsv.
The 1 to 9 keys (on the numeric keypad) battle the warrior with a neighbor.
Test performs a benchmark test on the warrior.
Upon exit it saves all soup warriors and writes a soup.dat file for restarting.

Here is the MEVO source code, last modified 12/3/09.
Here is the documentation for MEVO, last modified 12/4/09.
Here is MEVO compiled for x86 Linux, code version 12/3/09, last modified 12/4/09.
Here is MEVO compiled for Windows, code version 12/3/09, last modified 12/4/09.

[these are for the original MEVO, a new version is now available... see below]

Notes - It often takes millions of battles and several days to produce competition-strength nano warriors; for larger core sizes don't expect anything beyond beginner-level strength. The included INI files have not been fully optimized. MEVO is a processor-intensive program and requires a computer and disk system that can operate continuously without overheating or other issues. Don't run MEVO from a thumbdrive or other limited-write "flash" disk such as those found in some netbooks. A ram disk is recommended - see the readme file in the package for instructions.

The Linux-compiled archive includes pmars and pmarsv binaries and benchmark sets, suitable for running from an Ubuntu live-CD or other x86 Linux systems. The included binaries are the slower but more compatible "N2" versions compiled using gcc 3.3; somewhat faster "N1" binaries are in the pmars092N.tar.gz archive. The Windows-compiled archive includes a version of "N2" pmars compiled using MinGW (gcc 3.2), however for size and other reasons the included pmarsv is the old dos 286 version - I recommend replacing it with the SDL "pmarsw" version.

The source can be compiled using the FreeBasic compiler; edit it first to indicate the platform (for Windows set both usewin and useuni to 0). Compile using the -mt option to specify the thread-safe library; to enable bounds checking for debugging also include the -exx option (in addition to -mt). The Linux version must be run from a terminal (xterm, konsole, gnome-terminal etc) - it checks to make sure TERM is set to something. The Windows exe can be run by double-clicking, or for better performance run it from a batch file that sets comspec to the binrun.exe utility (included in the package, tested using WinXP). All settings are controlled by an INI file, which is created if it doesn't exist. If soup.dat exists it's reloaded to continue a run; don't change instructions, soup size, pmars coresize etc during a run as the warrior code is saved in numerical form. To restart a run, remove the soup.dat file, the soup directory, and if present the save directory and the mevo.log file.
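
For reference, the compile commands come down to something like this (assuming the source file is named mevo.bas; fbc is the FreeBasic compiler):

fbc -mt mevo.bas
fbc -mt -exx mevo.bas     (debug build with bounds checking)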

More software, benchmarks and corewar info are at the corewar.co.uk and www.koth.org web sites.

MEVO2

It's been a while since I played around... but got the bug and dragged out the old code to try some new things...

Added an option to evolve the core mutation rates for each individual warrior
Added an option to periodically increase the mutation rates to simulate "storms"
Added support for multiple bench sets to try to find warriors that are generally strong
Added an option to initialize warriors to a predefined instruction sequence
Added an option to only reinsert warriors that scored over a percentage of top score
Added an option to adjust local references after insert, delete and swap operations (see the sketch below)
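
To illustrate the last item, here's a minimal sketch of the insert case in FreeBasic (hypothetical names, not the actual MEVO2 code) - a field is only touched if it points inside the warrior body, and the adjustment keeps it pointing at the same instruction after a new line is inserted at slot p:

' call for each field array (a and b) before inserting a new line at slot p
Sub AdjustForInsert(fld() As Integer, nlines As Integer, p As Integer)
    For i As Integer = 0 To nlines - 1
        Dim As Integer t = i + fld(i)              ' instruction this field points at
        If t < 0 Or t >= nlines Then Continue For  ' not a local reference, leave it
        If i < p And t >= p Then fld(i) += 1       ' target shifts down, source doesn't
        If i >= p And t < p Then fld(i) -= 1       ' source shifts down, target doesn't
    Next
End Sub

Delete is the mirror image (subtract where this adds), and a swap needs a similar fix-up for references pointing at or across the two swapped lines.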

...not that these things have really helped so far :-) so leaving the original MEVO alone.

Other new features include...

Most filenames and directories can be specified in the INI file
Can point tempdir to a ramdisk without having to copy the whole thing
Can now restore directly from the soup directory, the .dat files are now optional
Option to periodically save the soup to avoid losing a run if something crashes or the lights go out
Now uses a hash function to detect and avoid saving dups rather than relying on an identical bench score
 (which only worked for nano set to 142 rounds to trigger the -P option anyway; a sketch of the hash idea follows this list)
Somewhat better parsing when loading soup and save warriors, now understands pmars "load" format
Can now seed the soup directory with other evolved or human-made warriors
Reinsert no longer requires the saved warriors to be in sequence (but they still have to be numbered)
Added some debugging options - one option lists the last source/destination/cross code for each thread
Modified the plotlog utility so it displays stats and automatically refreshes the display
Added the (Linux-specific) blassic scripts I use for benchmarking to the readme file
Added more nano test sets - nanoht2 and nanoht3
Added bash shell scripts for listing the top warrior and last saved warrior, with automatic refresh
Added a bash script for archiving a run and deleting files to start over
 (have to do that a lot, most runs end up being dead ends)
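
About that hash - something as simple as a 32-bit FNV-1a digest over the code lines is enough to catch exact duplicates. A minimal sketch of the idea (not necessarily the exact hash MEVO2 uses):

' 32-bit FNV-1a digest of a warrior's instruction lines; warriors with
' matching digests are treated as dups and only saved once
Function WarriorHash(codelines() As String, nlines As Integer) As ULong
    Dim As ULong h = &h811C9DC5                    ' FNV offset basis
    For i As Integer = 0 To nlines - 1
        For j As Integer = 1 To Len(codelines(i))
            h = (h Xor Asc(codelines(i), j)) * &h01000193   ' FNV prime
        Next
    Next
    Return h
End Function

A collision would discard a genuinely new warrior, but at 32 bits that's rare enough not to matter here.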

Here's MEVO2 for 32-bit/x86 Linux - code version 3/2/15, last archive update 3/5/15

For 64-bit Linux enable multiarch and install a few i386 libraries.
For Windows, recompile with FreeBasic and use the pmars/pmarsv/binrun exe's from the original package.


3/15/15 - Updated the mevo2 archive - fixed typos in readme.txt and mevo_nano.ini. NanoCracker made KOTH on the Koenigstuhl Nano hill - wasn't expecting that but I'll take it. Possibly because it was selected using my benchallnano script (in the readme of the new package), which tests a directory of warriors against several bench sets and sorts them by average score to find warriors that score well against all of the bench sets - usually the top.red warrior is not the overall best, it just happens to score well against one bench set. BTW the funny case in NanoCracker wasn't just to look funny - dups in the instruction and modifier lists were spelled with different cases so I could see which mutations happened to pick the same instruction.

3/3/15 - MEVO2! Transfer Function got knocked from KOTH.. not a "this is war" moment (it had a very good run!) but it surprised me how strong bvowk's new evolved warrior was on the hill - around 160! He's doing something right. Time to experiment. One of the more interesting mods I made was adding an array so it could track mutation rates for each warrior and evolve them along with the warrior, using methods similar to those used to evolve the warrior data values.. gradual increases/decreases and sometimes pick something else. I noticed that the warriors prefer lower rates than I usually set the evolver for.
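
Boiled down to code, the rate-evolution idea looks something like this sketch (hypothetical name and constants - each rate in the per-warrior array gets this treatment when a winner is copied over a loser):

' copy a parent's mutation rate to the child: usually keep it, sometimes
' drift it a little, occasionally replace it outright
Function EvolveRate(ByVal rate As Double) As Double
    Dim As Double r = Rnd
    If r < 0.05 Then
        rate = Rnd * 0.05                  ' sometimes pick something else
    ElseIf r < 0.5 Then
        rate *= 1 + (Rnd - 0.5) * 0.2      ' gradual increase/decrease
    End If
    If rate < 0.0005 Then rate = 0.0005    ' clamp so a rate never dies out
    If rate > 0.1 Then rate = 0.1          ' ...or runs away
    Return rate
End Function

Since the child keeps the parent's rates most of the time, rate settings that win battles propagate along with the code that won with them.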

Here's a new warrior I submitted to Sal Nano (made with this INI file)...

;redcode-nano
;name NanoCracker
;author Terry Newton
;strategy evolved by mevo2 on 2015-02-21
;strategy orig name 30_1/2048.red, gen 3896
;strategy nanoht2:152 nanobm07:154 nanob2:154
;generation 3896
;species 808
;origin 7_21
;assert CORESIZE==80
spL.i #-38,>-9
mOV.i <-1,{-1
MoV.i {-2,<-2
mOv.i >-34,{-3
dJN.I $-2,{-4
end
;soupwins 11
;instrate 0.01045615
;modrate 0.007155502
;moderate 0.005
;datarate 0.0051
;insrate 0.005797357
;delrate 0.01031488
;swaprate 0.0063132
;benchscore 152.87 (nanoht2 142 rounds)

And the bench scores against my new nanoht3 50-warrior test set...  (142 rounds, score, wins, ties)

02_08.red                 148.59   68    7  **************
0535.red                  156.33   71    9  ***************
b1f3ad11-a6073e-f711ff69  191.54   88    8  *******************
b69e6908-a5be0bba-ab4445  142.25   66    4  **************
rdrc: Breakaway Carte     142.25   59   25  **************
c82f15b5-85011fd8-5a969d  165.49   78    1  ****************
90ae3a01-58e6f8d2-f52b2a  185.91   84   12  ******************
[RS] Cefaexive Motween    163.38   74   10  ****************
Clear-2                   145.77   65   12  **************
Cosmic Uncertainity       162.67   75    6  ****************
Creamed Corn              156.33   73    3  ***************
Crimson Climber           145.77   68    3  **************
[RS] C-Strobia            164.78   75    9  ****************
Dodecadence               166.90   73   18  ****************
eerie glow                135.91   58   19  *************
e6843-5724-xt430-4-nano-  117.60   41   44  ***********
[RS] EvoTrick II          164.08   76    5  ****************
[RS] Existephall Apris    143.66   59   27  **************
Fragor Calx               166.90   76    9  ****************
Frothering Foam           133.09   58   15  *************
Furry Critter             142.95   65    8  **************
girl from the underworld  128.87   60    3  ************
hemlock                   174.64   76   20  *****************
Hot Soup                  157.04   70   13  ***************
JB268                     145.07   67    5  **************
Left Alone                159.15   74    4  ***************
looking glass             166.19   76    8  ****************
Man&Machine               176.05   82    4  *****************
Military Grade Nano       149.29   69    5  **************
a slice of moonbeam pie   169.01   78    6  ****************
Muddy Mouse (RBv1.6r1.1.  154.22   71    6  ***************
Obsidian peasoup          117.60   51   14  ***********
Pacler Deux               135.91   58   19  *************
Plasmodium vivax          184.50   84   10  ******************
Little Red Rat            155.63   72    5  ***************
ripples in space-time     159.85   70   17  ***************
Rocket propelled monkey   154.22   71    6  ***************
rumpelstiltskin           128.87   52   27  ************
Science Abuse             153.52   70    8  ***************
Shutting Down Evolver No  155.63   72    5  ***************
Sleepy Lepus              157.74   70   14  ***************
rdrc: Snapback Sprite     174.64   81    5  *****************
the spiders crept         151.40   63   26  ***************
Steaming Pebbles          146.47   67    7  **************
Stegodon Aurorae          148.59   67   10  **************
Subterranean Homesick Al  150.00   63   24  **************
Transfer Function         140.14   62   13  **************
war of the snowflakes     131.69   51   34  *************
White Moon                161.97   74    8  ****************
wrath of the machines     187.32   84   14  ******************
Score = 154.35

And the results of a simulated hill using the same warriors...

 1  156.91  b69e6908-a5be0bba-ab4445f7.rc by bvowk
 2  154.22  NanoCracker by Terry Newton
 3  152.62  [RS] C-Strobia by inversed
 4  152.25  Transfer Function by Terry Newton
 5  151.46  eerie glow by John Metcalf
 6  150.86  Hot Soup by Terry Newton
 7  148.61  Crimson Climber by Terry Newton
 8  148.37  Frothering Foam by Terry Newton
 9  147.94  Little Red Rat by Terry Newton
10  147.81  Steaming Pebbles by Terry Newton
11  147.59  02_08.red by Terry Newton
12  147.55  Sleepy Lepus by Terry Newton
13  147.39  Pacler Deux by Roy van Rijn
14  147.00  [RS] Existephall Apris by inversed
15  146.86  ripples in space-time by S.Fernandes
16  146.81  Fragor Calx by Terry Newton
17  146.10  Creamed Corn by Terry Newton
18  145.38  Science Abuse by Roy van Rijn
19  145.09  Shutting Down Evolver Now.. by Roy van Rijn
20  145.04  hemlock by John Metcalf
21  144.86  Obsidian peasoup by Miz
22  144.82  White Moon by Fluffy
23  144.53  Furry Critter by Terry Newton
24  144.53  a slice of moonbeam pie by John Metcalf
25  144.15  the spiders crept by hwm
26  143.78  Cosmic Uncertainity by Terry Newton
27  143.68  0535.red by Terry Newton
28  143.33  [RS] EvoTrick II by inversed
29  143.33  Military Grade Nano by Ken Hubbard
30  143.27  Left Alone by Fluffy
31  142.77  [RS] Cefaexive Motween by inversed
32  142.65  Muddy Mouse (RBv1.6r1.1.231) by The MicroGP Corewars Collective
33  142.58  Man&Machine by Roy van Rijn
34  142.55  wrath of the machines by John Metcalf
35  140.90  Dodecadence by G.Labarga
36  140.04  JB268 by Terry Newton
37  139.94  c82f15b5-85011fd8-5a969d25.rc by bvowk
38  138.63  Rocket propelled monkey II by G.Labarga
39  136.99  rumpelstiltskin by gnik
40  136.35  b1f3ad11-a6073e-f711ff69.rc by bvowk
41  136.31  looking glass by John Metcalf
42  136.20  Stegodon Aurorae by S.Fernandes
43  135.52  e6843-5724-xt430-4-nano-eve78 by bvowk
44  135.30  girl from the underworld by John Metcalf
45  134.75  Clear-2 by Roy van Rijn
46  132.42  rdrc: Snapback Sprite by Dave Hillis
47  131.01  rdrc: Breakaway Carte by Dave Hillis
48  129.17  Plasmodium vivax by Fluffy
49  127.83  war of the snowflakes by John Metcalf
50  126.11  90ae3a01-58e6f8d2-f52b2a83 by caelian
51  125.46  Subterranean Homesick Alien by Roy van Rijn

On the actual Sal Nano hill NanoCracker placed 3rd... this set doesn't exactly replicate hill conditions.

12/10/11 - if running MEVO under a 64-bit version of GNU/Linux make sure the libncurses5:i386 package is installed.
12/26/11 - libxpm4:i386 is also required to use the plotlog program.

12/5/09 - a nano MEVO warrior called "Transfer Function" got KOTH at the SAL Nano Hill. I guess it works :-)

I've been using the nanobm07 and nanob2 benchmarks to pre-test warriors, but lately they've been a poor predictor of hill performance. So I put together a custom 30-warrior benchmark (in nanoht1.zip) - not perfect but it seemed to hit the hill scores within a few points (past tense since it's a moving target). I started with a bench set of all the current hill warriors I could find, then for each warrior if the bench score was too high compared to the hill score I added warrior(s) that it did poorly against, and if too low I added warrior(s) that it did well against, which also pulls down the scores of other warriors that were reading too high. Here's the resulting (dressed up) warrior...

;redcode-nano
;name Transfer Function
;author Terry Newton
;strategy evolved as 20_7 by MEVO, generation 696
;strategy selected using a custom benchmark
;strategy designed to mimic hill performance, scored 154
;strategy nanobm07 148, nanob2 153
;generation 696
;species 6523
;origin 68_15
;assert CORESIZE==80
spl.i #23,}2
mov.i >10,}1
mov.i >-34,}-2
mov.i #32,}-2
djn.f $-4,<-36
end
;benchscore 154.06 (nanoht1 142 rounds)

...and here's a link to the mevo.ini file I used to make it. Around 1.7 million iterations in, I set reinsertmode and reinsertinterval both to 1 and ran for a short time to concentrate the soup - by then the top score was already around 154, this bumped it up to 155. Put back to normal, ran a bit longer then settled on one of the new variants for its relatively high nanobm07 score. Here's the plotlog chart of the run...


12/4/09 - Updated the source, docs and packages for the 12/3/09 code, which adds a couple of subtle options to control selection of crossover mates - anychance (default 0) controls the chances of crossing with any warrior rather than just warriors of the same size and species, and bestchance (default 1) controls the chances of choosing a warrior with the most wins. Usually crossing dissimilar warriors results in broken code, but a certain amount of that can result in overall improvement by providing additional chances for weaker warriors to win battles and reproduce, and always choosing the warrior with the most wins might be limiting. Whether or not these changes improve the scores (I don't know yet), it feels better if warriors are free to select any warrior as the odds dictate. The Linux version contains an updated version of the plotlog utility that better auto-detects parameters from the log file and draws the average score curve using lines instead of points. The plotlog utility has not been updated in the Windows version; I really need a different approach there, as the present code can only run full-screen and to make up for that dumps to a bitmap file (using somewhat scary code). I would much prefer a solution that can display graphics in a window, so that if a copy is needed it can be obtained using normal copy/paste functions.
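
As a sketch, the mate selection reduces to something like this (hypothetical arrays - nbr() holds the surrounding warrior indices, wins() their win counts, similar() whether each one matches in size and species):

Function PickMate(nbr() As Integer, wins() As Integer, similar() As Integer, _
                  n As Integer, anychance As Double, bestchance As Double) As Integer
    Dim pool(0 To 7) As Integer                ' up to 8 surrounding warriors
    Dim As Integer np = 0
    Dim As Integer anyOk = (Rnd < anychance)   ' allow dissimilar mates this time?
    For i As Integer = 0 To n - 1
        If anyOk Or similar(i) Then pool(np) = i : np += 1
    Next
    If np = 0 Then Return -1                   ' no suitable mate, just mutate
    If Rnd < bestchance Then                   ' usually take the most wins...
        Dim As Integer best = pool(0)
        For i As Integer = 1 To np - 1
            If wins(pool(i)) > wins(best) Then best = pool(i)
        Next
        Return nbr(best)
    End If
    Return nbr(pool(Int(Rnd * np)))            ' ...otherwise any qualifier will do
End Function

With anychance 0 and bestchance 1 this collapses to the old behavior - same size and species only, always the most wins.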

11/29/09 - Updated the packages to the 11/27/09 code, and also rewrote a lot of the docs (was going for less but ended up with more, oh well). Part of the doc update clarified the disk-write paranoia section to include a rough formula for calculating the lifetime of a solid-state disk with MEVO running on it - for a cheap MLC-type disk like in some netbooks it worked out to about 100 days before failure (under good conditions). That's why I include a warning (MEVO doesn't produce all that much disk activity but those things are very fragile). Then I computed the lifetime of a real server-grade SSD with an endurance of a million writes per block and 5GB free for leveling, and it came out to 900 years. I don't think there's much concern about wearing those things out.

11/27/09 - One of the things I like about RedMixer is it can record a run as a sequence of ANSI soup dumps which can be played back at high speed, compressing hours and days into seconds. ANSI has drawbacks though - in raw form it's hard to control the playback speed unless a custom ANSI viewer program is used, and it's difficult to convert the output file to a standard animated GIF (the one app that did that obscure thing no longer works on my system). This time I took a different approach: rather than modifying the evolver to save snapshots, I wrote a script that uses ImageMagick's import utility to save periodic snapshots to a sequence of GIF files which can then easily be combined to produce an animation. Here's the imgdump.sh script I'm using. Once the individual GIFs were obtained (about 2.5 hours of run time) they were combined into a 2.8 meg animated GIF using ImageMagick's convert utility (command line: convert -delay 15 -loop 0 dump*.gif mevo800u_1.gif). The main drawback of this approach is the evolver window has to be displayed and unobstructed to record (I used the wmctrl utility to focus before each dump), which makes it difficult to use the computer for anything else while recording a run.

I added end number (starting location) mutation to the "test" source and docs - this was the last major thing RedMixer had that MEVO didn't. Don't know yet if it will help or hurt or make no difference, but that's what research is all about. I'd been putting off this change due to its perceived complexity and because it required a change in the soup.dat format, but it ended up being easy and compatible with previous data files. The hard part is dialing in the settings... [sample INI's in the docs are guesses]
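
The mutation itself is trivial - a sketch with hypothetical names (startrate would sit alongside the other rate settings):

Dim As Integer start = 0, nlines = 5
Dim As Double startrate = 0.01
Randomize Timer
If Rnd < startrate Then start = Int(Rnd * nlines)   ' occasionally move the entry point
Print "end "; start                                 ' goes on the END line when saved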

11/25/09 - Minor update to the packages - added an altini directory under the benchtools directory so other core sizes can be benchmarked without editing the INI files; instead I can run use_tiny, use_standard etc to copy over an appropriate set of INI files for the core size being used. The primary mevo.ini for evolving must still be manually configured by deleting the existing mevo.ini and copying one of the examples to mevo.ini, but that's easy enough, and I'm not satisfied with the INI's for larger core sizes (so I don't want to give the illusion that they work :-). MEVO is a code-evolution research project and a big part of the ongoing research is determining optimal evolving parameters and also determining what parameters can be omitted. It usually takes about 2 days to perform a run to "saturation", and each set of parameters takes at least 3 tries for the results to be statistically meaningful. I'm gathering a lot of data but so far most of it only indicates what doesn't work all that well. Evolving strong nano warriors is fairly easy but it is quite difficult to produce even half-competitive warriors for larger core sizes. Of course that's what makes it interesting to try. For these experiments absolute strength is only an indicator, not the goal (good thing!). There are other evolvers (Maezumo, CCAI, Species, etc) that use more direct methods for producing stronger warriors, however my goal is to not use direct knowledge of corewar techniques, but rather to study the creation of code where the optimum code sequences are not specified. Here is a weak but interesting warrior that was produced during one of the tiny runs...

;redcode                                                                       
;name 22_16
;author mevo
;strategy evolved
;generation 63
;species 141
;origin 40_7
;assert 1
spl.a #17,<-372
spl.b {124,#-285
mov.i >-5,}-1
end

11/21/09 - Updated the packaged Linux and Windows versions to the 11/19/09 code. The new options seem to improve things slightly (at least for nano, more testing is needed with larger coresizes) and the changes are fairly minimal so going for it. If there are any flub-ups with the new packages, here are the previous Linux and Windows versions that use the 10/02/09 code.

11/19/09 - I've noticed that MEVO isn't quite as strong as RedMixer, especially for larger coresizes. It's very close but some of the things I omitted to simplify might have an effect so I'm experimenting with adding a few things. The "test" source and docs now include new INI options xwrap and ywrap to enable wrap-around, and a flipstart option to select the odds of starting the crossed warrior with the warrior that just won the battle as opposed to the mate (previously it was 50/50 but it might work better if it starts with the winner more often).

11/1/09 - Mostly fixed the (lack of) speed issue with Windows by making my own "fast" command interpreter, written in FreeBasic at that. The 1000 shell test now turns in 18 seconds, compared to 47 seconds when using the default "cmd" interpreter. The "binrun" program also works with Maezumo and should be adaptable to almost anything that shells binaries - just wrap it in a batch that sets comspec to the binrun.exe file. At the moment it has only been tested with Windows XP. The fallback feature assumes that the normal cmd.exe interpreter is in %windir%\System32; if not, it won't process anything except for binary shells.

The Windows version of MEVO now comes with binrun. It's still not as fast as the Linux version when evolving nano warriors but good enough for now, possibly approaching the limits of the platform. It can be made a bit faster by setting pmars in the INI to bin/pmars.exe instead of just pmars so it won't have to search the path for it.

10/31/09 - Updated both the Linux and Windows versions to include mods to the bench tools - the "findbest" program that cross-references two benchmark reports can now read settings from a file to permit scripting without having to type in filenames, so now it works by just double-clicking after running the "bench1" and "bench2" tests.

I'm homing in on the shell speed issue - I found that on my WinXP system if I modify the startup batch file to include the line "set comspec=%windir%\SYSTEM32\COMMAND.COM" along with a "set mode lines=25" to avoid screen flicker, the shells are over 50% faster... when applied to the shell test program it turns in about 30 seconds for 1000 shells, down from 47 seconds. This isn't exactly a desirable solution but it is faster, despite having to shift to 16 bits then back to 32 bits with every shell (that seems to be what causes the flicker and surely is unnecessary overhead). I'm now searching for a 32-bit cmd replacement that can perform faster shells but so far not having luck - it seems that the priority with add-on shells is the GUI, more commands and other interactive features; I see no evidence that much thought is given to how fast a simple "%comspec% /c program > file" can execute. What I really need for shell-based evolvers like MEVO, Maezumo, etc is a very simple command interpreter that only knows how to (quickly) run a program and redirect its console output to a file.
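
Conceptually such an interpreter only has to do one thing, so here's a sketch of one way it could work (this is the idea, not the actual binrun source - the parsing is simplistic, and the redirection goes through the C runtime's freopen so the launched program inherits it):

#include "crt.bi"                 ' for freopen and stdout

' expect a command line of the form:  /c program args > outfile
Dim As String cl = Trim(Command)
Dim As Integer gt = InStr(cl, ">")
If Left(cl, 2) = "/c" And gt > 0 Then
    Dim As String cmd = Trim(Mid(cl, 3, gt - 3))     ' program plus its arguments
    Dim As String outfile = Trim(Mid(cl, gt + 1))
    Dim As String prog = cmd, args = ""
    Dim As Integer sp = InStr(cmd, " ")
    If sp > 0 Then prog = Left(cmd, sp - 1) : args = Mid(cmd, sp + 1)
    Dim As FILE Ptr f = freopen(outfile, "w", stdout)   ' child inherits the redirect
    Dim As Long ret = Exec(prog, args)                  ' run it directly, no shell
End If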

10/27/09 - To try to get to the bottom of the shell slowness problem when evolving nano warriors under Windows I wrote a little test program...

100 print "Shelling pmars nano 1000 times..."
110 print TIME$
120 for i = 1 to 1000
130 shell "bin\pmars -bks 80 -p 80 -c 800 -l 5 -d 5 -r 100 top.red > temp.fil"
140 next i
150 print TIME$
160 kill "temp.fil"
170 shell "pause"
180 system

I ran this under Windows using Blassic, FreeBasic and QBasic, and under Linux using Blassic and (patched) FreeBasic, all on my 1.4GHz AMD system. Results...

Windows Blassic: 48 seconds or about 21 shells per second
Windows FreeBasic: 48 seconds or about 21 shells per second
Windows QBasic: 23 seconds or about 43 shells per second
Ubuntu Blassic: 7 seconds or about 143 shells per second
Ubuntu patched FreeBasic: 7 seconds or about 143 shells per second

The most significant thing about this test is that the problem is NOT with FreeBasic - Blassic turned in identical results. QBasic does better but not by as much as I thought. I think what's going on is that shell has to load the command interpreter (cmd in the case of Win32, command for QBasic), and this appears to dominate the run time. Linux is based on programs shelling programs; the time it takes to load an sh shell is insignificant, so it's clearly the OS of choice for shell-based evolving. Open Pipe in native FreeBasic is probably slightly faster since it doesn't have to write a temp file for scores.
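
For reference, the pipe version of the test's shell line would look something like this (the "scores in the last line" assumption is illustrative):

' read pmars output through a pipe instead of redirecting to temp.fil
Dim As String ln, lastline
Open Pipe "bin/pmars -bks 80 -p 80 -c 800 -l 5 -d 5 -r 100 top.red" For Input As #1
Do Until EOF(1)
    Line Input #1, ln
    If Len(ln) > 0 Then lastline = ln    ' the results are in the final line(s)
Loop
Close #1
Print lastline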


So... [edited 10/28] the only practical code-based solution I can think of for faster nano evolving with MEVO under Windows is to modify pmars or exmars so that it writes a file, eliminating the need to load a command shell. Should be possible to execute the binary directly. Theoretically. Another option used by some evolvers is to embed the mars simulator in the program itself, totally eliminating an external binary. FreeBasic permits linking external libraries into the binary, but such a solution would not work for MEVO - the whole idea behind MEVO's multi-threaded approach is to tell the OS to run pmars as a separate task so it can be executed on another core of a multi-core machine. This is real multi-tasking, the way I understand it; multiple threads in the same task still only run on one core, at best using "HyperThreading" to speed up execution.

[but... since cmd.exe seems to be the holdup, a potential environment-based solution is to replace it with something faster rather than trying to change the program code]

10/13/09 - I packaged MEVO for Windows. Results are mixed - nano evolving is about 7 times slower under Windows XP on my hardware, but it's probably OK for casual use and larger coresizes aren't impacted nearly as much. To keep the size down the included pmarsv and plotlog binaries are dos-based. The pmarsv binary can be easily replaced with the much better SDL version, but I had no luck making a true Windows version of the plotlog utility - could not get FreeBasic graphics to work (will probably take using OpenGL or other libraries, not QB methods). The dos version of plotlog (assuming it runs at all - works under XP) can only run full-screen, making it difficult to save the graphics output, so I added a bit of code to dump the graphics to a BMP file. Well... now I know what the limitations are. For obscure software like this it doesn't matter that much, but occasionally I get called on to (quickly) write Windows programs, so knowing what works and what doesn't and how to work around the issues is very useful for future reference.

10/5/09 - In further tests a tiny run without crossover has reached a Franz score of 175 by almost 2M iterations (mostly stones), whereas a run with crossover enabled is only up to a score of 136 by about 1.5M iterations (mostly papers). This is in contrast with previous results. Side by side nano runs didn't show a whole lot of difference; both produced warriors scoring around 150, with the crossover-enabled run doing a bit better against the nanobm07 benchmark and a bit worse against the nanob2 benchmark. A pair of coresize 8000 runs produced mostly stones (usually I get more papers); the run without crossover reached a Wilkies score of 84 and the run with crossover reached a score of 93. It appears that random forces have more of an effect on the quality of the evolved code than whether or not crossover is enabled (at least with the stock settings), confirming previous observations with other evolvers that this style of redcode crossover doesn't make a whole lot of difference. It'll take many more runs to draw statistically valid conclusions but it's fairly clear that it's not a magic bullet, just another kind of mutation to try that might or might not help.

10/3/09 - The "test" source and docs have been updated to the version with crossover [...]. The crossover scheme is fairly simple: it picks another surrounding warrior of the same size and species, and of the available candidates randomly selects the one with the most wins. The warrior that just lost is excluded from selection. Starting source is random. It only flips the source on lines, controlled by two rate variables. If no suitable mate is found it just mutates. Hmm... tiny evolving is much better now - in an overnight run it reached a Franz score of over 160, versus about 155 without crossover after a much longer period of evolving [...] The main tar.gz package is now updated with new code, docs and tools. If there are any problems, the previous 9/22/09 compiled package (without crossover and tools but fairly well-tested) is here.

10/2/09 - One easy way to make a standalone benchmark solution is to just compile the Blassic code with FreeBasic (the INI-based version similar to the one in Maezumo; obviously if compiled the source can't be edited to specify parameters). This provides the best of both worlds - standalone operation, or to save space delete the binaries and run the programs interpreted. [...]

I'm not exactly thrilled with tiny warrior evolving performance... have to try really hard to get even 150 Franz points, only in the 90's against tinybm04. Not good. With RedMixer I had at least one run that produced warriors scoring in the 170 Franz range. So the question still remains - does crossover produce a tangible benefit? Or was it just luck? I'd say for nano it doesn't matter that much, but there might be an effect for larger coresizes. I am considering adding crossover to MEVO.

In RedMixer the only crossover mode that seemed to work well was crossing with the same species and choosing a surrounding warrior with the highest accumulated score; otherwise too much broken code was produced, making it worse than no crossover at all. If crossover is added to MEVO I want it to be simplified, avoiding options known (now) not to work well, and also to keep from cluttering the INI too much. RedMixer only permitted 50/50 (on average) mixing; I'd like to be able to control the proportions. Keeping track of accumulated score is probably unnecessary - for selecting the mate it's probably enough to simply track the number of wins and pick the surrounding mate of the same species with the most wins; a high accumulated score often just means the warrior is surrounded by weak code. No real need to save the wins information; when restarting, crossover might not be as effective for the first few cycles but I doubt that has a significant effect, at least not as much of an effect as changing the file formats, which I'd like to avoid (INI doesn't count since if the settings are missing it'd just default to no crossover). RedMixer could cross both lines and individual items but the latter is dubious - numbers usually depend on the instruction using them - so I could go with lines only. So... might be able to implement crossover using only settings for enablecross (yes/no), [edit] flipmaterate (chance of changing to mate) and flipbackrate (chance of flipping back).
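
Line-only crossover with those two settings would reduce to something like this sketch (hypothetical names; same-size warriors, so one nlines covers both):

' build the child line by line, starting from the winner and flipping
' the source according to the two rates
Sub CrossLines(winner() As String, mate() As String, child() As String, _
               nlines As Integer, flipmaterate As Double, flipbackrate As Double)
    Dim As Integer fromMate = 0                    ' start with the winner
    For i As Integer = 0 To nlines - 1
        If fromMate = 0 Then
            If Rnd < flipmaterate Then fromMate = 1
        Else
            If Rnd < flipbackrate Then fromMate = 0
        End If
        If fromMate Then child(i) = mate(i) Else child(i) = winner(i)
    Next
End Sub

Together the two rates control both the average proportion of mate code in the child and how long the copied runs are, which is exactly the adjustability missing from the 50/50 scheme.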

9/29/09 - Updated plotlog with better error checking and an option to autodetect the X iteration and Y score settings from the log file rather than entering numbers (defaults used for the other variables) [updated again 9/30/09(B) with better default operation and to fix an issue with FB - doing SCREEN 0 then exiting was leaving a hung task so took out that part]. It's slightly more refined now but still crude and rude - some inputs will result in a messy (or no) display. Not a problem, proper inputs look fine. As far as MEVO itself, I'm fairly happy with its performance and have no immediate plans for changing anything in the program code itself (unless I find a bug). It could be a bit stronger (especially for tiny warriors), but there's a lot of experimenting left to do with the INI settings to try and improve strength before considering hacking in more stuff. There will probably be updates to the pre-compiled Linux archive to add in the log plotter, hopefully better INI files, and some kind of benchmarking solution. The Blassic-based nanotest package works (for nano) but I'd prefer a self-contained compiled menu-driven solution - something that can automatically run a dual benchmark and report the warriors that do well against both test sets, and doesn't require another download to use.

9/28/09 - Here is plotlog.bas for graphing a MEVO log file. It's written in QBasic syntax but can be compiled using FreeBasic's -lang qb -e options [for Windows it has to be run in QBasic, full-screen only]. The Linux-compiled version doesn't require a terminal; on my Ubuntu system I can double-click the binary and the current directory gets set right, but this isn't true of all versions of Linux so a script with cd `dirname $0` etc might be needed.

The following charts show the results from a couple of runs...



The X axis shows millions of iterations, the Y axis shows the benchmark score. The blue line shows the average peak score, the brown line holds the top score, the red dots show the average score (note that it oscillates), the purple dots show individual score results. The first run was sampled every 1000 iterations, with averages computed over 100 samples (100,000 iterations), the second run was sampled every 500 iterations with averages computed over 50 samples (25,000 iterations). The first run used the nanob2 test set (the top-20 Koenigstuhl warriors), the second run used the nanobm07 test set.

The program prompts for the details when starting...


Nothing fancy - just press enter for the default or enter something to change it. X iterations and Y score define the scale maximums, averaging size is the number of data points to integrate for the averages, and the increms determine the chart lines and label points. [updated 9/30]

Here are the top-scoring warriors at the ends of those runs...

;redcode                                  ;redcode
;name 3_20                                ;name 45_19
;author mevo                              ;author mevo
;strategy evolved                         ;strategy evolved
;generation 516                           ;generation 532
;species 4659                             ;species 1719
;origin 37_15                             ;origin 32_12
;assert 1                                 ;assert 1
mov.f <-7,$-36                            mov >23,$-36
spl.i #-42,>18                            spl.ab #-5,<-40
mov.i *-3,{-2                             mov.i >23,{-1
mov.i {-3,{-2                             mov.i >-33,<-2
djn.i $-2,$-3                             djn.i $-2,@-2
end                                       end
;benchscore 156.37 (nanob2 142 rounds)    ;benchscore 149.71 (nanobm07 142 rounds)

Both of these warriors show "focus" from re-insertion, scoring in the low 140's against the other test set. Not so great, but it represents typical results - it usually takes several runs to produce a warrior able to place near the top of the SAL nano hill. Also, the top.red warrior is often not the overall best; in these runs other warriors in the save directory scoring slightly less against the evolving test set scored much better against the other test set. The findtop.bas program in the nanotest package (see 9/23 entry) compares scores from two benchmarks to find the "best" warriors. These look like they would do a little better (modified to add extra info to comments)...

;redcode                                  ;redcode
;name 7_3                                 ;name 57_6
;author mevo                              ;author mevo
;strategy evolved                         ;strategy evolved
;generation 456                           ;generation 368
;species 2960                             ;species 4926
;origin 37_15                             ;origin 32_12
;assert 1                                 ;assert 1
mov.i }-9,$-33                            mov <-1,$-33
spl.ba #-42,>18                           spl.a #-6,<-40
mov.i @24,{-2                             mov.i *19,{-1
mov.i {-3,{-2                             mov.i >-28,<-2
djn.i $-2,$-3                             djn.f $-2,$-3
end                                       end
;09-23-2009,iter 1147286,save/181.red     ;09-16-2009, iter 862022,save/162.red
;benchscore 154.01 (nanob2 142 rounds)    ;benchscore 148.32 (nanobm07 142 rounds)
;otherscore 149.36 (nanobm07)             ;otherscore 152.78 (nanob2)

The INI files used to make these are similar to the 9/15 nano INI's; the first run had 3 .i modifiers instead of a space replacing one of them, and benchinterval set to 1000. The second run had rounds set to 150, modrate to .04 and swaprate to .01.

9/23/09 - Here's a possible benchmarking/selection solution for nano warriors - mevo_nanotest.tar.gz contains adaptations of the warbench program (requires Blassic) for testing the soup and save directories against the nanobm07 and nanob2 test sets. Another pair of scripts reads the bench reports and generates new reports to find warriors that score well on both test sets. One possible (but simple) way to do it. Although configured for nano warriors and MEVO's directories, it would be trivial to adapt these to other coresizes and apps.

9/22/09 - The 9/20/09 code seems pretty stable, no more major features are planned in the near term, so archived a pre-compiled Linux version of MEVO (or at least one way the program can be packaged). The archived directory tree contains pmars and pmarsv binaries and benchmark sets plus a couple of run scripts, one for running in xterm and another that tries other terminals. MEVO should work (perhaps with a bit of script editing) on most Linux systems but how easily depends on the distribution. Ironically the most complex aspect when trying some live-CD's was merely extracting the tar.gz file and creating a new tar.gz file from the run directory to copy the evolved results back to a thumbdrive. Other than archiver/OS oddities MEVO ran under Fedora, Mandriva and Slax, but did not work right under Puppy due to problems with color support in the terminal. I recommend Ubuntu which has a very easy-to-use archiver and doesn't mind files on the desktop. Recent Ubuntu live CD's don't work on my odd hardware but MEVO runs fine using an old Ubuntu 5.10 live CD. Not as fast as on my installed Ubuntu 8.04 OS but still by my measurements evolves nano warriors almost 3 times faster than under WinXP.

9/20/09 - This update improves random selection when picking battle pairs - edges are no longer always in the 2nd position, and it works with very small soups or a single row. For Linux, improved the visual battle function (keys 1-9); it doesn't prompt to press a key to continue unless the battle is completed and scores are displayed. Added a displayopts setting to permit adjusting the minimum text display lines and the colors for text and the soup border, and to optionally disable/enable unicode border characters.

I did some Windows testing - confirmed that the nice unicode border does not work under Windows (set both usewin and useuni to 0 when compiling for Windows, at least this is the case with XP). Evolving nano and tiny warriors under Windows is much slower than in Linux - under Linux I get over 1000 nano battles every 8 seconds, while it takes about a minute to do the same thing under Windows, about 7 times slower. The difference seems to be a few dozen extra milliseconds when shelling pmars and opening the results file [or possibly also updating the display]. When evolving for coresize 8000, where battles might take a quarter of a second or more, it doesn't make much difference, but when [millions of] nano battles are needed to make a strong warrior then a few extra milliseconds per shell is a big deal. QBasic is faster in this situation (assuming the OS can run a dos app), but requires spaghetti error handling and can't access enough memory to use an array soup except for very small evolvers using a relatively small soup - larger QBasic-based evolvers have to use individual warrior files [further benchmarking shows that FreeBasic or Blassic under Ubuntu is still about 3 times faster than QBasic under WinXP when shelling pmars].

9/19/09 - Had previously added a testdir INI option so a different bench set can be used for single-warrior testing via the interface, but forgot to add the code to actually use it. Fixed.

9/18/09 - Wrote up some docs, it's preliminary and might have typos etc but explains things in a lot more detail. Possibly too much detail but I'm compelled to explain things like why the simple pair-battle algorithm works reasonably well (and very well for nano) and why strict evaluation schemes usually don't work so well. In addition to documenting the program itself. But there are still many things to learn, new techniques to discover.  Or rediscover. I suspect that there are major improvements to be had not with more code but just better parameters.

Then there is crossover... although MEVO is mostly an improved version of RedMixer written to more modern standards, I didn't copy over the "mix" part as my previous implementation produced only modest improvement. Perhaps I eventually will (every little bit helps) but I'd rather see a better implementation of crossover that produces a more obvious improvement in strength. But before doing that, it probably would be wise to try to get as much performance as possible from the simple mutation scheme. Otherwise it is difficult to tell if improvement from adding crossover is really from crossover, or simply because it added more mutation.

Instead of worrying about fancy stuff, for now concentrating on refining the evolving framework itself. Time permitting of course - slam time approaches once again so I don't want to get too deep into it. Tonight's update is for a couple of simple nice things... now when starting and scanning the saved warriors it parses the filenames to determine the next number to save to, even if some of the warriors are missing; it ignores warriors that aren't numbered, so if I want I can name and dress warriors in the save directory without interfering; and it saves the file number of the top-scoring warrior so the top score can be displayed as soon as the evolve process is started.

Here's a fairly strong nano warrior made using the 9/15/09 settings...

;redcode-nano
;name Creamed Corn
;author Terry Newton
;strategy Evolved by MEVO, gen 2706
;strategy NanoBM07=153.9 Nanob2=156
;generation 2706
;species 8922
;origin 58_12
;assert CORESIZE==80
mov >22,$10
spl.a #-6,>-38
mov.i <-1,{-1
mov.i {-2,<-2
djn.x $-2,<-34
end
;benchscore 153.94 (nanobm07 142 rounds)

...took a couple of days but got 3rd place on SAL Nano.


9/16/09 - Added logging - set enablelog to yes to write start and stop times and all benchmark results to a mevo.log file in the program directory. Start and stop actions record date, time and iteration count. Bench actions record the score, top score, generation, date, time, iteration count and, if saved, the filename saved to. This data can be useful in a couple of ways - to more accurately gauge performance, and to analyse strength increase over the course of a run. For performance, my Ubuntu desktop does 1100 nano battles (1000 iterations plus a 50-warrior bench every 500 iterations) about every 8 seconds with one thread; my HP 110 Mi does the same number of nano battles every 9 seconds with 4 threads, 10 to 11 seconds with a single thread. Pushing it to 8 threads gets it down to about 8 seconds. Hmm... feels faster but I guess not. Still, that's over 100 nano battles per second. Haven't measured it yet but my little Asus 701SD seems to perform at about 1/2 to 2/3 that speed. Don't expect anything like this kind of speed with MEVO under Windows (the old QBasic evolvers can come close, but FreeBasic for Windows has a fairly slow shell command I've noticed). For larger coresizes the speed of Linux vs Windows will probably be closer to the same, less limited by the speed of the shell command.

Alternate nano settings that seem to be doing a bit better on my Asus 701SD...

rounds: 200
instrate: 0.02
modrate: 0.03
moderate: 0.03
datarate: 0.07
insrate: 0.01
delrate: 0.01
swaprate: 0.01
dupline: 0.3
incdec: 0.3
bignum: 0.5
instructions: mov mov mov mov mov spl spl spl djn djn
instructions: add sne seq slt jmn jmz mod
modifiers: .i .i .i .a .b .f .x .ab.ba
modes: $#@*<>{}

After a couple days of churning it's up to a nanobm07 score of over 157 and a nanob2 (Koenigstuhl top-20) score of around 150, whereas the other settings are up to 154 or so. Could be just statistical differences. I notice some focusing from using reinsertion to guide the soup; in other evolving experiments free-running evolution tends to produce warriors that score about the same on those two benchmarks.


9/15/09 - Here is a mevo.ini file for evolving nano warriors...

;mevo.ini file for nano (9/15/09) 
xsize: 77 ;width of soup
ysize: 21 ;height of soup
maxsize: 5 ;max evolved length
;pmars parameters...
coresize: 80 ;size of core array
processes:80 ;max processes
cycles: 800 ;cycles before tie
maxlen: 5 ;maximum warrior length
mindist: 5 ;minimum separation
rounds: 250 ;# of battle rounds
pmarsbin: pmars ;path/name of pmars binary
pmarsvbin: pmarsv ;path/name of pmarsv binary
pmarsvopt: -v 564 ;pmarsv view options etc
;mutation parameters...
instrate: 0.02 ;chance of instruction change
modrate: 0.03 ;chance of modifier change
moderate: 0.03 ;chance of address mode change
datarate: 0.06 ;chance of field value change
insrate: 0.01 ;chance of line insert
delrate: 0.01 ;chance of line delete
swaprate: 0.02 ;chance of line swap
dupline: 0 ;if insert, chance of dup line
incdec: 0.3 ;if data, chance of inc or dec
bignum: 0.5 ;if data, chance of big number
;code generation...
instructions: mov mov mov mov mov spl spl spl djn djn
instructions: add sne seq slt jmn jmz mod
modifiers: .i .i .a .b .f .x .ab.ba
modes: $$#@*<>{}
;bench parameters...
enablebench: yes ;yes to enable auto-bench/re-ins
benchrounds: 142 ;rounds used for benchmarking
savethresh: 95 ;percent of top score to save
reinsertmode: 2 ;0=none 1=top.red 2=from save
benchinterval: 500 ;bench every n iterations (0=none)
reinsertinterval: 5000 ;re-insert every n iterations (0=none)
benchdir: nanobm07 ;directory containing benchmark warriors
savedir: save ;directory to save warriors to
testdir: ;directory for single-test warriors (def.benchdir)
;other parameters...
initmode: 1 ;start warriors 0=first inst, 1=random
spthresh: 5 ;percent change before different color
infoline: evolved ;added to strategy line
threads: 4 ;# of processing threads
threadsleep: 0 ;# ms to sleep between threads
;end of ini file

Here's a warrior made with these settings... (not super-strong but it's early)

;redcode
;name 11_16
;author mevo
;strategy evolved
;generation 556
;species 9940
;origin 58_12
;assert 1
spl.x #-21,>-31
mov >17,{-1
mov.i >10,<-2
mov <-3,{-4
djn.f $-2,<-16
end
;benchscore 151.97 (nanobm07 142 rounds)

Most of the INI settings should be obvious if familiar with corewar and evolvers. The mutation rate options permit specifying the chances of changing various code elements and performing code rearranging operations. If auto-bench is enabled, it scans the save directory to determine the best score so far, and saves warriors within a percentage of the top score specified by savethresh (with savethresh 95 and a top score of 150, anything scoring 142.5 or better is saved). The reinsertmode option enables and specifies periodic re-insertion of either the top-scoring warrior or a randomly-selected saved warrior back into the soup. The interval settings specify the number of soup battles (approximately) between benchmarks and re-insertions. The initmode setting determines how the warriors start - 1 for random, or 0 for zeroed arrays, which with the above instruction settings starts new warriors with lines of nop.i $0,$0. Warrior length is variable; the maximum evolved length is set by maxsize. Warriors always begin executing at the first line. The spthresh setting determines the percentage of instructions that can change before picking a new display color; threads determines how many copies of pmars it attempts to run at once; threadsleep determines the delay between each thread launch and can be used to dial back CPU usage for better cooling or system response while evolving.

The evolving method used by MEVO is similar to my other evolvers...

initialize
do
   randomly pick two nearby warriors
   battle them in pmars and collect scores
   copy the winner over the loser while making random changes
loop until stopped

The multi-threaded version just launches multiple copies of the key sequence with additional code to make sure they don't pick the same warriors or otherwise interfere with each other. This is a very simple way to evolve but it resembles how evolution occurs in nature and in my experience produces superior results compared to other relatively simple methods. With a grid-based soup, "nearby" means adjacent. When battling only the winner matters; in the case of a tie the program considers the first warrior to be the loser. When making random changes it's important not to change too much - even allow some perfect copies - but provide a chance of making many changes at once for more complex advancements in a single generation. The algorithm doesn't impose strict evaluation, so weak code has a chance to survive even if surrounded by stronger warriors. This is important as new forms (like papers) usually start out weak and usually require a few generations to be able to defeat simpler but initially stronger warriors (like stones). But if the form remains weak it will be consumed. Islands of similar code develop as warriors replicate, where they can work out strategies while other areas of the soup are working out different strategies. Eventually the soup will mostly fill with a dominant strategy but this algorithm improves diversity by providing an opportunity for different strategies to develop.
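
Reduced to code, one iteration looks something like this sketch (the pmars options match the nano INI above; the file names and the Results parsing are illustrative):

' battle the two picked warriors (already written out as t1.red/t2.red)
' and decide who gets copied over whom
Dim As String ln
Dim As Integer s1, s2
Open Pipe "pmars -bks 80 -p 80 -c 800 -l 5 -d 5 -r 250 t1.red t2.red" For Input As #1
Do Until EOF(1)
    Line Input #1, ln
    If Left(ln, 8) = "Results:" Then               ' e.g. "Results: 121 105 24"
        Dim As String rest = Trim(Mid(ln, 9))
        s1 = Val(rest)                             ' first warrior's wins
        s2 = Val(Mid(rest, InStr(rest, " ") + 1))  ' second warrior's wins
    End If
Loop
Close #1
If s1 > s2 Then
    ' copy warrior 1 over warrior 2, mutating during the copy
Else
    ' copy warrior 2 over warrior 1 (a tie counts against the first warrior)
End If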

9/14/09 - whew! I had to pull the previous packages after discovering several unnoticed bugs... the settings in the included nano INI worked but other options, including the coresize 800 defaults, were broken. Hopefully it's straightened out now - seems to be better than ever with the new interface, but I don't dare package it yet.


Terry Newton (wtn90125@yahoo.com)