a Sub-1K Nano Warrior Evolver for Corewar
  =========================================

This little evolver has the following features...

Uses the standard pmars program to battle the warriors.
Uses a "fuzzy ring" topology with adjustable maximum distance.
Configurable strings to define instructions and address modes.
Adjustable additive mutation rates for instructions, modes and values.
Warriors are initialized to the first instruction and mode, value 0.
Field values are weighted to favor smaller offsets.
Can be stopped and restarted as needed.
Less than 1K-byte for the evolving code.
Runs under Blassic for Windows or Linux.
Easily modified for QBasic or GWBASIC.
Doesn't work all that well :-)

-------------- begin subk.bas (8/22/09 version) --------------------------
1 N=400:DIM A(N,5,5):O$="$#@*<>{}":RANDOMIZE'TIMER
2 OPEN "s" FOR INPUT AS #1:WHILE NOT EOF(1):W=W+1:FOR I=1 TO 5
3 FOR J=1 TO 5:INPUT #1,A(W,I,J):NEXT:NEXT:WEND:CLOSE:WHILE INKEY$=""
4 X=INT(RND*N)+1:Y=(X+(INT(RND*2)*2-1)*INT(RND*3+1))MOD N
5 Y=Y-(Y<0)*N:W=X:F$="a":GOSUB 14:W=Y:F$="b":GOSUB 14
6 SHELL "pmars -bks 80 -c 80 -p 800 -l 5 -d 5 -r 120 a b>c"
7 OPEN "c" FOR INPUT AS #1:INPUT #1,A$:A=VAL(A$)
8 INPUT #1,A$:B=VAL(A$):CLOSE:IF B>=A THEN T=X:X=Y:Y=T
9 FOR I=1 TO 5:FOR J=1 TO 5:Z=.02-.02*(J>1)-.03*(J>3):W=INT((RND-RND)*40)
10 T=-(RND<Z):A(Y,I,J)=(1-T)*A(X,I,J)+T*(-(W>0)*W-(W<0)*(W+80))
11 NEXT:NEXT:WEND:OPEN "s" FOR OUTPUT AS #2:FOR W=1 TO N
12 F$=MID$(STR$(-W),2)+".red":GOSUB 14:FOR I=1 TO 5
13 FOR J=1 TO 5:? #2,A(W,I,J):NEXT:NEXT:NEXT:CLOSE:SYSTEM
14 OPEN F$ FOR OUTPUT AS #1:? #1,";assert 1":FOR I=1 TO 5
15 ? #1,MID$("nop.amov.ispl.bdjn.f",(A(W,I,1)MOD 4)*5+1,5);
16 ? #1," ";MID$(O$,(A(W,I,2)MOD 8)+1,1);A(W,I,4);",";
17 ? #1,MID$(O$,(A(W,I,3)MOD 8)+1,1);A(W,I,5):NEXT:CLOSE #1:RETURN
-------------- end subk.bas ----------------------------------------------

Note... if copy/pasting files using Windows Notepad, put quotes around
the filename when saving to keep it from adding a .txt extension.

The program is written to run under Blassic: http://blassic.org/
To run under QBasic or GWBASIC replace the ' in line 1 with a space.
Requires pmars: http://www.koth.org/pmars/ or http://www.corewar.co.uk/pmars/
For Windows the SDL pmars-server.exe version works well (rename to pmars.exe).
For Linux my x86 Linux binaries are at: http://newton.freehostia.com/cw/
(rename the pmars binary to "pmars" and put in a path directory).
For Windows the pmars and blassic programs can be in the run directory
but it's best to put them in a path directory (C:\Windows for example)
to avoid wasted space duplicating for each run or having to edit the
programs to specify the full path.

To run manually an empty "s" file must be created using the command: >> s
then run blassic subk.bas to start the program (for QBasic: qbasic /run subk).
After awhile press any key to write 400 soup warriors to individual .red files
and write the soup to the s file. Run again to continue evolving.
Delete the "s" file to start over.

To make it easier to run, a starting script can be used that makes sure
the s file exists and starts the evolver in the BASIC interpreter.

For Windows...
-------------- begin subk.bat -------------------
@echo off
>> s
blassic subk.bas
-------------- end subk.bat ---------------------

For Linux...
-------------- begin subk.sh --------------------
#!/bin/bash
cd `dirname $0`
>> s
xterm -e blassic subk.bas
-------------- end subk.sh ----------------------

Note... Linux scripts must be in "unix" format, with the properties set
to allow executing. To make sure in the proper format, install the flip
package, very handy when copy/pasting scripts. To convert: flip -u filename

Also you'll need a benchmarking program that can test a directory of redcode
to find strong warriors in the soup. Possibilities that I've made include...

TEST (Dos/Windows XP): http://newton.freehostia.com/cw/test.zip
benchnano.bas (Windows/Linux): http://newton.freehostia.com/cw/warbench.txt

For best research results put the master copy of the subk.bas and run script
in a directory along with the benchmark program and test set directories, then
make a work directory under that for the evolving experiment, typically with a
constant name to keep from having to constantly edit the benchmark setup.
Copy subk.bas and subk.bat (or subk.sh if Linux) to the work directory,
edit subk.bas as needed to specify parameters. Rename the work directory
to preserve runs, making a new work directory for the next try.


About the subk.bas program
--------------------------

Many tricks were used to get it less than 1024 bytes. Variables were
eliminated from NEXT, options were combined in the pmars command line,
The ? shortcut was used instead of PRINT, and whenever possible CLOSE
was used instead of CLOSE #file. The save more space to permit adding
number weighting and separate mutation rates, conditional logic was used
to eliminate extra IF THEN expressions. In BASIC, if computed an expression
like (W<0) returns -1 if W is less than 0 or 0 if it isn't, this can be
used to put computations for multiple conditions in a single math equation.
For example in line 9 Z=.02-.02*(J>1)-.03*(J>3) is equivalent to
Z=.02:IF J>1 THEN Z=Z+.02:IF J>3 THEN Z=Z+.03 - a savings of 20 bytes.
In line 10, T=-(RND<Z):A(Y,I,J)=(1-T)*A(X,I,J)+T*(-(W>0)*W-(W<0)*(W+80))
(60 bytes) would otherwise require 10 more bytes (ln is a line number):
IF W<0 THEN W=W+80
ln IF RND<Z THEN A(Y,I,J)=W ELSE A(Y,I,J)=A(X,I,J)

The listed subk.bas program probably isn't optimal, for better results
experiment with various settings. No matter how it's set it likely won't
make as good of warriors as big evolvers like Maezumo, or even small
evolvers that use just a few more K of code, but who knows, might do ok.

In line 1, change N=400 to specify the number of soup warriors, when exiting
this many files will be created so don't go crazy, if misused you can clog
up your computer with warriors. Larger soups, up to 2000 or so, tend to
produce stronger code but also take longer to evolve and benchmark.
The string after O$ specifies the address modes, add duplicates to select
certain strings more often (like extra $). Make sure that the numbers
after MOD in lines 16 and 17 equal the length of the address mode string.

In line 4, the 3 in the segment RND*3+1 determines the maximum distance
between warriors doing battle, i.e. 3 means select warriors that are
separated from each other by 1 to 3 positions. Lower numbers produce more
grouping of similar code into islands, larger numbers permit more rapid
spread of strong warriors throughout the soup, but if too much spread
early on the soup will likely converge to a weaker form and prevent
stronger forms from appearing.

Line 4 can be modified to produce other topologies, the following
produces a grid-shaped topology with an X dimension of 40...
4 X=INT(RND*N)+1:Y=X:WHILE Y=X:Y=(X+INT(RND*3-2)+INT(RND*3-2)*40)MOD N:WEND
This puts the program over 1K, so if that matters change these lines...
6 SHELL "pmars -bks 80 -c 80 -p 800 -l 5 -d 5 -r 99 a b>c"
9 FOR I=1 TO 5:FOR J=1 TO 5:Z=.03-.05*(J>3):W=INT(RND*80)
...this produces flat number weighting and a separate mutation rate
only for the field values, instructions and modes use the same rate.

In line 6, the 70 in the segment -r 120 determines the number of rounds
used when battling the warriors. Lower numbers are less accurate and permit
weaker code to win more often, which can increase diversity, larger numbers
result in more accurate battles and often stronger code. For perfect battle
scoring (with pmars 0.9.2 or above) replace -r 120 with -P to activate the
permutate option. Hint... start a run with a fairly low setting (say 70),
then once a desirable form like SPL MOV MOV MOV DJN appears, increase
the number of rounds (to say 200) or use -P to strengthen the warriors.
It might take a few tries to get the better form, usually the dominant
form appears after about half an hour or so of evolving.

In line 8, the stock IF B>=A segment causes warriors that tie to be counted
as the loser, replace with IF B>A to count warriors that tie as winners.

In line 9, the floating point numbers after Z= determine the mutation rates
for instructions, address modes and values. The numbers add, so with the
stock settings of .02 .02 and .03 the instruction rate is .02 (2%), the
address mode rate is .04, and the value rate is .07. To mutate all elements
with the same rate, set the first .02 to the desired rate (say .05) and
change the 2nd and 3rd floating point values to 0.

Also in line 9, the expression after W= determines the number weighting.
W=INT((RND-RND)*40) produces +/- numbers from -40 to 39, prefering numbers
centered around 0 (all negative numbers get 80 added when writing to wrap).
For plain flat weighting replace with W=INT(RND*80). The "40" number makes
a big difference, see below for more exploration of number weighting.

The instructions are specified by the string in line 15, make sure the
number after MOD equals the number of instructions in the string, and
the two 5's equal the length of each instruction field in the string.
For example for modifier-less instructions weighted towards mov change to:
15 ? #1,MID$("nopmovmovspldjn",(A(W,I,1)MOD 5)*3+1,3);


Fun with number weighting
-------------------------

Various formulas can be used in line 9 to affect number weighting,
a general form to use is W=INT((RND^power1-RND^power2)*multiplier+offset).
Omit the ^powers if 1. Multiplier should be between coresize/2 and coresize
and determines the occurence of numbers around coresize/2. Offset is usually
negative rather than positive as shown to encourage the formation of loops.
In this simple evolver the same numbers are also used for instruction and
address mode selection so that should also be taken into account.

Here's a program that simulates and charts various formula parameters...

100 PRINT "calculate INT((RND^power1-RND^power2)*multiplier+offset)"
110 INPUT "no.of core locations: ",nc
120 INPUT "no.of instructions: ",ni
130 INPUT "no.of modes: ",nm
140 INPUT "power1: ",p1
150 INPUT "power2: ",p2
160 INPUT "multiplier: ",m
170 INPUT "offset: ",ofs
180 DIM n(nc), in(ni), mo(nm)
190 FOR i=1 TO 1000000
200 z=INT((RND^p1-RND^p2)*m+ofs)
210 IF z < 0 THEN z = z + nc
220 n(z+1) = n(z+1) + 1
230 in((z MOD ni)+1) = in((z MOD ni)+1) + 1
240 mo((z MOD nm)+1) = mo((z MOD nm)+1) + 1
250 NEXT i
260 ma=0:FOR i=1 TO nc:IF n(i)>ma THEN ma=n(i)
270 NEXT i
280 FOR i = 1 TO nc
290 c$="":cn=INT((n(i)/ma)*50):FOR j=1 TO cn:c$=c$+"*":NEXT j
300 PRINT i-1,n(i),c$
310 NEXT i
320 PRINT
330 ma=0:FOR i=1 TO ni:IF in(i)>ma THEN ma=in(i)
340 NEXT i
350 FOR i = 1 TO ni
360 c$="":cn=INT((in(i)/ma)*50):FOR j=1 TO cn:c$=c$+"*":NEXT j
370 PRINT i,in(i),c$
380 NEXT i
390 PRINT
400 ma=0:FOR i=1 TO nm:IF mo(i)>ma THEN ma=mo(i)
410 NEXT i
420 FOR i = 1 TO nm
430 c$="":cn=INT((mo(i)/ma)*50):FOR j=1 TO cn:c$=c$+"*":NEXT j
440 PRINT i,mo(i),c$
450 NEXT i

Example run corresponding to W=INT((RND^1.7-RND^1.7)*60-2)...

calculate INT((RND^power1-RND^power2)*multiplier+offset)
no.of core locations: 80
no.of instructions: 4
no.of modes: 8
power1: 1.7
power2: 1.7
multiplier: 60
offset: -2
0       17438   ***************************************
1       16833   **************************************
2       15848   ************************************
3       15222   **********************************
4       14726   *********************************
5       14298   ********************************
6       13568   ******************************
7       13312   ******************************
8       13098   *****************************
9       12640   ****************************
10      12008   ***************************
11      11879   ***************************
12      11497   **************************
13      11140   *************************
14      10805   ************************
15      10509   ***********************
16      10365   ***********************
17      10093   **********************
18      10378   ***********************
19      10715   ************************
20      10867   ************************
21      11018   *************************
22      11034   *************************
23      11203   *************************
24      11355   *************************
25      11345   *************************
26      11236   *************************
27      11433   **************************
28      11444   **************************
29      11388   *************************
30      11328   *************************
31      11353   *************************
32      11202   *************************
33      11420   **************************
34      11279   *************************
35      11342   *************************
36      11557   **************************
37      11355   *************************
38      11298   *************************
39      11349   *************************
40      11460   **************************
41      11262   *************************
42      11486   **************************
43      11261   *************************
44      11451   **************************
45      11440   **************************
46      11270   *************************
47      11177   *************************
48      11388   *************************
49      11346   *************************
50      11328   *************************
51      11118   *************************
52      10985   *************************
53      11092   *************************
54      10787   ************************
55      11086   *************************
56      10721   ************************
57      10339   ***********************
58      10031   **********************
59      10415   ***********************
60      10653   ************************
61      10931   ************************
62      11244   *************************
63      11535   **************************
64      11842   **************************
65      12202   ***************************
66      12556   ****************************
67      12812   *****************************
68      13069   *****************************
69      13748   *******************************
70      14190   ********************************
71      14615   *********************************
72      15287   **********************************
73      15935   ************************************
74      16684   **************************************
75      17442   ***************************************
76      19023   *******************************************
77      21640   *************************************************
78      21951   **************************************************
79      19020   *******************************************

1       249428  *************************************************
2       251465  **************************************************
3       250309  *************************************************
4       248798  *************************************************

1       124156  ************************************************
2       123415  ************************************************
3       122834  ***********************************************
4       123639  ************************************************
5       125272  ************************************************
6       128050  **************************************************
7       127475  *************************************************
8       125159  ************************************************

This formula favors smaller offsets by about 2 to 1, especially the
values 77 and 78 corresponding to -3 and -2. Selection of instructions
and address modes remain mostly flat. Here's a chart of the stock
W=INT((RND-RND)*40) formula...

calculate INT((RND^power1-RND^power2)*multiplier+offset)
no.of core locations: 80
no.of instructions: 4
no.of modes: 8
power1: 1
power2: 1
multiplier: 40
offset: 0
0       24703   *************************************************
1       23859   ***********************************************
2       23422   ***********************************************
3       22979   **********************************************
4       22295   ********************************************
5       21535   *******************************************
6       20960   ******************************************
7       20068   ****************************************
8       19641   ***************************************
9       19236   **************************************
10      18518   *************************************
11      17947   ************************************
12      17244   **********************************
13      16633   *********************************
14      15869   *******************************
15      15293   ******************************
16      14712   *****************************
17      14116   ****************************
18      13384   **************************
19      12796   *************************
20      12015   ************************
21      11573   ***********************
22      10929   *********************
23      10343   ********************
24      9595    *******************
25      8954    *****************
26      8323    ****************
27      7858    ***************
28      6954    *************
29      6584    *************
30      5766    ***********
31      5321    **********
32      4713    *********
33      4047    ********
34      3528    *******
35      2817    *****
36      2217    ****
37      1559    ***
38      961     *
39      309     *
40      346     *
41      933     *
42      1604    ***
43      2227    ****
44      2797    *****
45      3459    ******
46      4032    ********
47      4669    *********
48      5216    **********
49      5880    ***********
50      6671    *************
51      7216    **************
52      7861    ***************
53      8415    ****************
54      9121    ******************
55      9613    *******************
56      10286   ********************
57      11130   **********************
58      11885   ***********************
59      12186   ************************
60      12658   *************************
61      13424   **************************
62      14187   ****************************
63      14625   *****************************
64      15439   *******************************
65      16031   ********************************
66      16509   *********************************
67      16898   *********************************
68      17871   ***********************************
69      18231   ************************************
70      19020   **************************************
71      19683   ***************************************
72      20216   ****************************************
73      20849   *****************************************
74      21675   *******************************************
75      22114   ********************************************
76      22750   *********************************************
77      23494   ***********************************************
78      24309   ************************************************
79      24894   **************************************************

1       249529  *************************************************
2       249942  *************************************************
3       250673  **************************************************
4       249856  *************************************************

1       124867  *************************************************
2       125035  *************************************************
3       125519  **************************************************
4       125038  *************************************************
5       124662  *************************************************
6       124907  *************************************************
7       125154  *************************************************
8       124818  *************************************************

Despite the extreme reduction of numbers around coresize/2 the
instruction and mode selections remain flat to within 1%.

Traditionally in most of my evolving experiments rather than using a
single formula, I used a parameter that determines the chance of
small numbers from +/- length or big numbers covering the whole core,
then used a flat distribution within each group. That method seems to
work but is too much code for a sub-K evolver. The single-formula method
might actually be more versatile but so far this research hasn't made
the sub-K evolver produce better code, it seems to top out at around
145 NanoBM07 points (and worse on other test sets) no matter how
carefully I pick the numbers. The real question is why? Once a viable
form such as SPL MOV MOV MOV DJN has been found, it should be equivalent
to other bigger evolvers even though it lacks mutations that rearrange
code, but it is weaker and I'm not sure why. Could be a bug but if it is
I'm not seeing it. One theory is the increased production of rearranged
code in more sophisticated evolvers gives the warriors more "practice"
defeating other kinds warriors so the warriors evolve to be less focused
on defeating the dominant form, thus becoming stronger in general.


Modified version of SubK
------------------------

I've been playing around with the code in an attempt to figure out why
the populations of nano warriors evolved by SubK aren't as strong as
warriors evolved by "fancier" programs - once a preferred form is found
then inserts, deletes, swaps etc should not be needed. SubK has no problems
finding the potentially strong SPL MOV MOV MOV DJN form but for some reason
it seems to have trouble taking the form to competitive strength. Why?

One theory relates to the "headless chicken" genetic phenomena, which is
essentially the production of broken code. Official name, look it up.
Normally it happens during crossover when one of the parents is random
or consists of an incompatible code sequence. Why it improves the overall
strength of a population is unclear but my theory is a certain amount of
broken code gives weaker members a chance to mutate into stronger members
rather than being suppressed by locally stronger members that aren't
necessarily superior. It is the same reason why when evolving for larger
core sizes, fewer rounds for less accurate battles results in paper-type
warriors whereas more accuracy results in boring stone-type warriors
taking over and suppressing the production of more interesting warriors.
New strategies tend to start out weak, for them to take hold in the
soup if that is their destiny, there needs to be a certain amount of
forgiveness in the process. Having a few semi-broken warriors hanging
around might provide "practice" for newly-formed strategies, allowing
them to replicate and gain a foothold in the soup. Theoretically.

There isn't a whole lot that can be done with just 1K of code, but
if the idea is to break code, well that shouldn't be too hard to do
in this context, just introduce non-halting "bugs". What used to be the
bane of evolving (when the flawed version works better than the fixed
version) can be turned around to provide a benefit. One such opportunity
presents itself in the grid topology code - omit the check for self then
about 1 out of 9 battles will be against self, resulting in a warrior
mutating in place rather than making a mutated copy. This doesn't always
produce a broken warrior but at least it puts a limit on how long a
locally strong warrior can spew forth copies of itself, in effect it
puts a lifetime limit on any particular warrior.

Here is the code I'm currently experimenting with...
(11/17/09 version)

1 N=1600:DIM A(N,5,5):O$="$#@*<>{}":RANDOMIZE'TIMER
2 OPEN "s" FOR INPUT AS #1:WHILE NOT EOF(1):W=W+1:FOR I=1 TO 5
3 FOR J=1 TO 5:INPUT #1,A(W,I,J):NEXT:NEXT:WEND:CLOSE:WHILE INKEY$=""
4 X=INT(RND*N)+1:Y=(X+INT(RND*3-2)+INT(RND*3-2)*80)MOD N
5 Y=Y-(Y<0)*N:W=X:F$="a":GOSUB 14:W=Y:F$="b":GOSUB 14
6 SHELL "pmars -bks 80 -c 80 -p 800 -l 5 -d 5 -r 99 a b>c"
7 OPEN "c" FOR INPUT AS #1:INPUT #1,A$:A=VAL(A$)
8 INPUT #1,A$:B=VAL(A$):CLOSE:IF B>=A THEN T=X:X=Y:Y=T
9 FOR I=1 TO 5:FOR J=1 TO 5:Z=.02-.03*(J>3):W=INT((RND^2-RND^2)*70-2)
10 T=-(RND<Z):A(Y,I,J)=(1-T)*A(X,I,J)+T*(-(W>0)*W-(W<0)*(W+80))
11 NEXT:NEXT:WEND:OPEN "s" FOR OUTPUT AS #2:FOR W=1 TO N
12 F$=MID$(STR$(-W),2)+".red":GOSUB 14:FOR I=1 TO 5
13 FOR J=1 TO 5:? #2,A(W,I,J):NEXT:NEXT:NEXT:CLOSE:SYSTEM
14 OPEN F$ FOR OUTPUT AS #1:? #1,";assert 1":FOR I=1 TO 5
15 ? #1,MID$("nop.amov.imov.ispl.bdjn.f",(A(W,I,1)MOD 5)*5+1,5);
16 ? #1," ";MID$(O$,(A(W,I,2)MOD 8)+1,1);A(W,I,4);",";
17 ? #1,MID$(O$,(A(W,I,3)MOD 8)+1,1);A(W,I,5):NEXT:CLOSE #1:RETURN

Like the original the "s" file must be initially 0 bytes, created
using the >> s command manually or in the startup script.
This code is set to a fairly large soup so it'll only run under
Blassic, set N to 400 or less to run under QBasic or GWBasic
(also uncomment 'TIMER to properly randomize).

Other changes include adding another MOV to the instruction string,
a modified number weighting formula, and removing the extra separate
mutation rate for the address mode to keep the program under 1024 bytes.

The modified version doesn't necessarily peak any higher than some
runs of the original SubK, but it seems to be more reliable at producing
soups containing members scoring over 140 points against typical benchmark
test sets with less or no degeneration once the benchmark score peaks.
The original tends to climb to about 135 points or so then quickly go
downhill as inferior warriors (that happen to do well in the soup)
displace the higher-scoring warriors (that lose to other soup warriors).

The improved stability should make it easier to fine-tune the parameters.
Items of interest are soup size (N in line 1), X size (the last number in
line 4), battle rounds (the number after -r in line 6), mutation rates and
the number weighting formula (line 9), and the instruction selection string
(line 15 with the number after MOD reflecting the number of entries).

With N=1600, X size=80, rounds=99, mutation rates Z=.02-.03*(J>3)
number weight formula W=INT((RND^2-RND^2)*70-2), and instructions
"nop.amov.imov.ispl.bdjn.f" (the original SubK v2 settings),
the following warrior was produced after about 2 hours...

;assert 1
spl.b #54,>13
mov.i >62,{79
mov.i <73,{41
mov.i @26,{79
djn.f $78,<28

Scores about 138 against the nanobm07 test set and 144 against the
top 20 warriors from the 6/09 Koenigstuhl infinite nano hill (nanob2).

Here's another sample after about another hour of evolution...

;assert 1
spl.b #50,>15
mov.i >63,{79
mov.i >75,{33
mov.i @56,{79
djn.f $78,{38

This warrior scores 146 against the nanobm07 test set and 139 against
the nanob2 test set, which is close to the strength needed to compete
on the SAL nano hill. Need to improve the scores by about 5%.

--------------------------------------------------------------------------

Credit... Many of the techniques used in the Sub-K evolver were inspired
by John Metcalf's TEV program. In particular, the (RND-RND)*constant method
to produce weighted random numbers, but also to a degree the coding style.
It was TEV that got me back into primitive line-numbered BASIC programs.

Legal... The software presented here is public domain, do whatever you
want to do with it. This software is provided as-is and without warranty.

--------------------------------------------------------------------------
Last modified 11/17/09 by Terry Newton (wtn90125@yahoo.com)