Newton's Off-Line Core War Hills

Hill size is limited to 25 entries each.
Coresize 8000 hills run with 100 rounds fixed sequence.
Coresize 800 evolved hill runs with 250 rounds fixed sequence.
Coresize 8000 benchmarks use the Wilkies benchmark run with 250 rounds fixed sequence.
Coresize 800 benchmarks use the tinybm04 test set (available here), 1000 rounds fixed sequence.

If you'd like a warrior added (or your warrior removed) send email to wtn90125@yahoo.com
Specify which hill(s). Multiple warriors in one email are OK; if there are more than a few, zip/rar/etc. them.

For reference, here are an evolved 800 hill (benchmarks and listings) and an evolved 8000 hill (benchmarks and listings), based on the top 20 evolved warriors on each of the Koenigstuhl tiny and 94nop hills.

Procedures...

The evolved hills are for the output of evolving programs. No hand-tweaking except for comments.

Hills are updated by adding all new entries (up to 10 warriors at a time), running the hill batch, deleting warriors below 25th place, then re-running the hill batch to rank the surviving warriors. In other words the hill temporarily grows to accommodate all new warriors, then is trimmed back to include only the top 25 warriors.
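The update procedure above can be sketched in Python. This is only an illustration, not the actual batch files: run_batch is a hypothetical stand-in for running the hill batch, assumed here to take a list of warrior names and return a score for each.

```python
from typing import Callable, Dict, List

HILL_SIZE = 25        # hill size limit stated in the rules
MAX_NEW_ENTRIES = 10  # at most this many warriors added per update

# Hypothetical stand-in for the hill batch: names in, {name: score} out.
RunBatch = Callable[[List[str]], Dict[str, float]]


def update_hill(current: List[str], new_entries: List[str],
                run_batch: RunBatch) -> List[str]:
    """Grow the hill with the new entries, trim to the top 25, then re-rank."""
    if len(new_entries) > MAX_NEW_ENTRIES:
        raise ValueError("add at most 10 warriors at a time")

    # Step 1: the hill temporarily grows to hold every new warrior.
    candidates = list(current) + [w for w in new_entries if w not in current]

    # Step 2: run the batch and drop everything below 25th place.
    first_pass = run_batch(candidates)
    survivors = sorted(candidates, key=lambda w: first_pass[w],
                       reverse=True)[:HILL_SIZE]

    # Step 3: re-run the batch so the ranking reflects only the survivors.
    final_pass = run_batch(survivors)
    return sorted(survivors, key=lambda w: final_pass[w], reverse=True)
```

The second run matters because a warrior's score depends on who else is on the hill, so the ranking can shift once the trimmed warriors are gone.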

In the event of a mistake a hill can be rolled back (erasing results and code) if caught before the next battle. "Kill" requests only remove the warrior(s) from the current hill, leaving results and code in the archives.

These are not automated hills; they're updated manually when I receive entries or find something interesting to add. If a response is desired, indicate so in the email. It might take a day or so to update the results (longer if I'm away).

Procedures are subject to change.

This is an informal "fun" thing, mostly for my own personal research and amusement (so I've seeded the hills with warriors I selected from published, publicly available or my own warriors to beat up or get beat up by) but all are welcome to play by sending me warrior(s) or links to warrior(s).

Hill files generated using the following batch files...
http://newton.freehostia.com/cw/testbats.txt

The report listings are just as the batch spits them out, including truncating benchmark warrior names in the performance charts (I widened the name field for the hill result charts showing who beat who, but the benchmark reports use a shorter name format to fit within 80 characters). The hill results files are edited to look nicer and include the pmars command line and run date.

The above was last updated 5/24/10.

- Terry


Notes...

5/24/10 - Now that Maezumo is around, the mixed hill's previous max-score rule of 130 Wilkies doesn't make much sense. It needs a new rule, something like anything goes so long as it remains a mixed hill. Ran the hill with a new entry along with M84a from the evolved 8000 hill.

7/9/09 - Maezumo has been released. I played a couple of warriors I made with it while testing on the evolved 8000 hill, very strong results! Maezumo is a totally different kind of evolver that uses hints to automate the production of scanner, paper and stone warriors. Not exactly a pure way to evolve but loads of fun! The hints aren't simply a fixed fill-in-the-blanks framework, rather they generate a wide variety of warriors within each category using code that (sort of) emulates a human coder's thought process, then uses a hill to select the strongest warriors while running a more conventional evolver in parallel to further mutate the warriors. This kind of changes things... pure evolvers never were very strong for anything besides nano, requiring separate evolved-only or strength-controlled hills to turn in a half-decent showing for larger coresizes. Now they need to pump up the strength to compete even on their own contrived hills.

5/13/09 - Restarting the notes - thoughts change as new ways of doing things emerge that uncover potential issues in previous methods. To recap - the goal is to find procedures designed to ensure that the mixed hill for evolved and hand-coded warriors actually remains mixed. The original idea was to use the Wilkies benchmark and let the highest-scoring evolved warrior set the maximum permitted score on the hill. Nice idea, since hand-coded warriors generally do better on a hill than an equivalent-scoring evolved warrior. The issue was that some evolved warriors incorporate hand-written seed code and score quite well. Simulations suggested that if such warriors were permitted on the hill then the hill could become almost completely hand-coded except for a couple of exceptionally strong warriors. So the procedure was changed to permit only "pure" evolved warriors to set the max. That didn't last long. The issue becomes how to define what that means: purity is a matter of opinion and there's no practical way to tell the difference between a warrior evolved from hand-tuned code and one that arises purely from chance. Then there's the "problem" of how to consider a warrior produced by using pure evolution to re-evolve a warrior produced by a framework-based redcode generator. Probably not pure, but after hundreds of generations there might be little left of the original seed warrior. Another general issue of having any hill warrior set the maximum score is what to do if that warrior gets pushed from the hill. So the original scheme was an interesting thought but not really practical.

The strongest purely evolved warrior I know of (v4_260) scores 128 Wilkies, so the new procedure is simply to permit on the mixed hill only warriors that score 130 Wilkies or less, as benchmarked by the stock pmars program set to 250 rounds fixed sequence (-r 250 -f). Much better... it accomplishes the same thing but without ambiguity. Note that this doesn't mean the hill will become composed of warriors scoring only near the maximum; Wilkies performance does not determine hill ranking! Right now the strongest-benching warrior on the hill (Boom) scores about 100 Wilkies, the current king of the hill benchmarks at 72 Wilkies, 2nd place gets 63 Wilkies, a scanner getting 18 Wilkies is presently in 9th place, and evolved papers that score over 88 Wilkies are near the bottom of the hill.
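As a rough illustration of the 130-Wilkies rule described in this note, here is a small Python sketch. It assumes the usual Core War scoring convention of 3 points per win and 1 per tie, scaled to a percentage, and it assumes a Wilkies score is the average of those percentages over the benchmark warriors. Neither detail is spelled out above, and the function names are made up for the example.

```python
from typing import List, Tuple

WILKIES_CAP = 130.0  # maximum benchmark score permitted by the rule in this note


def percent_score(wins: int, ties: int, rounds: int) -> float:
    """ASSUMPTION: 3 points per win, 1 per tie, scaled so all-ties gives 100."""
    return (3 * wins + ties) * 100.0 / rounds


def benchmark_average(results: List[Tuple[int, int]], rounds: int = 250) -> float:
    """Average percent score over (wins, ties) pairs, one per benchmark warrior."""
    scores = [percent_score(wins, ties, rounds) for wins, ties in results]
    return sum(scores) / len(scores)


def allowed_on_mixed_hill(wilkies_score: float) -> bool:
    """A warrior qualifies if it benchmarks at 130 or less."""
    return wilkies_score <= WILKIES_CAP
```

For example, a warrior with (100 wins, 50 ties) against one benchmark warrior and (50 wins, 100 ties) against another, at 250 rounds each, averages 120 under these assumptions and would qualify.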

Performance of a warrior on a hill can be predicted using the TEST benchmark program and the current hill contents - set TEST to coresize 8000, 100 rounds, fixed sequence, self battles enabled. The resulting score numbers should closely correspond with the posted hill scores. TEST is designed to benchmark a directory of warriors against a benchmark set in another directory, making it easy to test several warriors at once to see which ones will do the best on the hill. To determine Wilkies scores, test using 250 rounds fixed sequence, self battles disabled, using the Wilkies warriors as the benchmark set.

To exactly predict performance, put the selected warrior(s) in the same directory as the current hill warriors, run hill8000.bat (from testbats.txt), rename or remove the warrior(s) past 25th place and rerun hill8000.bat. This is how I do it but other methods can be used to achieve similar results.

For a more point-and-click experience CoreWin can be used - set up the parameters (coresize 8000 etc), add all the warriors to the list and run a "round-robin" tournament. Rounds should be set to 100 or more. CoreWin doesn't support fixed sequence, so the results will differ somewhat from what I come up with and will vary from run to run, but should be fairly close.
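For anyone scripting this instead of using TEST or the batch files, a minimal Python sketch that builds the pmars command lines might look like the following. The -r and -f options are the ones quoted above; -s is the pmars core size option. The function names and file names are invented for the example, every other pmars setting is left at its default, and this is not what hill8000.bat actually contains.

```python
from typing import List


def pmars_command(warrior_a: str, warrior_b: str, rounds: int,
                  coresize: int = 8000, pmars: str = "pmars") -> List[str]:
    """Build one two-warrior pmars invocation with fixed sequence (-f)."""
    return [pmars, "-s", str(coresize), "-r", str(rounds), "-f",
            warrior_a, warrior_b]


def benchmark_commands(candidate: str, benchmark_set: List[str],
                       rounds: int, coresize: int = 8000) -> List[List[str]]:
    """One command per benchmark warrior: the candidate fights each in turn."""
    return [pmars_command(candidate, opponent, rounds, coresize)
            for opponent in benchmark_set]


# Example: Wilkies-style run at 250 rounds (file names are placeholders).
for cmd in benchmark_commands("mywarrior.red", ["bm1.red", "bm2.red"], rounds=250):
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run to launch the battle; parsing the results out of the pmars output is left out here.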