I'd like to propose rulesets for the COG and AIIDE StarCraft AI tournaments, in service of these goals:
- Differentiate the two tournaments
- Improve bots towards beating top human players
- Produce legible artifacts for communicating our efforts
- Involve more and new participants
This proposal assumes the existing AIIDE tournament process and rules as the starting point.
- COG: Emphasizing participation and outreach
- AIIDE: Emphasizing continuity and competition
COG would fulfill some of the role that SSCAIT used to play in public legibility and onboarding new developers.
AIIDE would continue largely as it has been, with some competitive changes that people have been interested in seeing.
Human tournament play is oriented around winning series: best of 1, 3, 5, or 7 games. This produces different competitive incentives than maximizing a win percentage over many games. It may also affect balance, since some matchups (ZvZ most notably) are considered inherently higher variance on a per-game basis than others.
To align bot incentives with human competition, we will score the round robin by the number of head-to-head matchups won. A bot that wins 51% of its games against every opponent would win the round robin. For each set of bots tied on head-to-head matchup wins, the tiebreaker is win percentage in games between the tied bots.
Some advantages of this scoring system include:
- Increases motivation to expand the ceiling of a bot's capabilities against top competition, rather than to eke out additional wins against weaker competition
- Reduces the motivation for "seal clubbing" strategies, which often carry over to games against humans and reduce perceptions of bot strength
- Encourages closer head-to-head matchups, preserving motivation for less experienced authors
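The matchup-win scoring and tiebreak described above can be sketched in Python. This is an illustrative sketch, not official tournament code: the bot names, game counts, and drawn-matchup handling are all assumptions.

```python
from collections import defaultdict

def score_round_robin(results):
    """results: dict mapping (bot_a, bot_b) -> (wins_a, wins_b),
    with each unordered pair of bots appearing exactly once."""
    matchup_wins = defaultdict(int)
    bots = set()
    for (a, b), (wa, wb) in results.items():
        bots.update((a, b))
        if wa > wb:
            matchup_wins[a] += 1
        elif wb > wa:
            matchup_wins[b] += 1
        # A drawn matchup awards neither bot (this handling is an assumption).

    def pct_within(bot, tied):
        # Win percentage counting only games between the tied bots.
        wins = games = 0
        for (a, b), (wa, wb) in results.items():
            if a == bot and b in tied:
                wins, games = wins + wa, games + wa + wb
            elif b == bot and a in tied:
                wins, games = wins + wb, games + wa + wb
        return wins / games if games else 0.0

    # Rank by matchup wins; break ties by win percentage within the tied group.
    groups = defaultdict(list)
    for bot in bots:
        groups[matchup_wins[bot]].append(bot)
    standings = []
    for count in sorted(groups, reverse=True):
        tied = set(groups[count])
        standings.extend(sorted(tied, key=lambda b: -pct_within(b, tied)))
    return standings

# A bot that wins 51% against every opponent tops the table on matchup wins:
results = {
    ("Steady", "Rush"): (51, 49),
    ("Steady", "Turtle"): (51, 49),
    ("Rush", "Turtle"): (90, 10),
}
print(score_round_robin(results))  # ['Steady', 'Rush', 'Turtle']
```

Note that the tiebreak only counts games among the tied bots, so a bot cannot climb a tie by padding its record against bots outside the tied group.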
What constitutes good StarCraft play is strongly conditioned on map features, and some aspects of modern professional play depend on features that became expected and near-universal only in recent years. Limiting the map pool to these newer maps lets authors optimize bots for features that can usually be expected in modern competition, and spend less time preparing for rarely seen map characteristics.
The map pool for COG should be selected exclusively from maps that were published or used in ASL/KSL since [year TBD].
Select a random pool of six maps, including:
- Two 2-player maps
- One 3-player map
- Three 4-player maps
to be announced two months before the submission deadline.
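The draw above can be sketched as a small Python procedure. The candidate lists here are placeholder classic maps grouped by player count, not the real candidates, which would be ASL/KSL maps published since the chosen cutoff year; the seeding scheme is also an assumption.

```python
import random

CANDIDATES = {
    2: ["Heartbreak Ridge", "Destination", "Blue Storm"],
    3: ["Longinus", "Tau Cross", "Aztec"],
    4: ["Fighting Spirit", "Circuit Breaker", "Python", "Luna"],
}
QUOTAS = {2: 2, 3: 1, 4: 3}  # two 2-player, one 3-player, three 4-player

def draw_pool(candidates, quotas, seed=None):
    rng = random.Random(seed)  # fixed seed allows a reproducible, auditable draw
    pool = []
    for players, count in quotas.items():
        # Draw `count` distinct maps from the candidates with this player count.
        pool.extend(rng.sample(candidates[players], count))
    return pool

print(draw_pool(CANDIDATES, QUOTAS, seed=2025))
```

Publishing the seed alongside the candidate lists would let anyone verify that the announced pool matches the draw.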
In a round robin tournament, each game is largely inconsequential. This makes it difficult to cast games with an exciting frame, or tell a story about the outcome from a human-watchable number of games.
After the round robin event, the top two bots will play a best-of-7 series to determine the overall champion. Read directories will be carried over from the round robin.
The vast majority of bots have public or open source code, either voluntarily by the authors or as part of AIIDE participation requirements. But some authors are unable to publish source code or prefer not to.
Omitting the public source code requirement from COG would potentially expand the field of participants.
Historically, the same few authors have monopolized the top spots in competition. Competing against their long-established bots is daunting, which likely discourages less established authors from participating or from maximizing their efforts.
The Rising Star award acknowledges the highest-ranked author in the round robin event who has not previously had a top-three placement in the COG or AIIDE tournaments. Participants affirm their eligibility on the registration form.
The Student award acknowledges the highest-ranked author in the round robin event who is a current student in primary, secondary, undergraduate, or postgraduate education. Participants affirm their eligibility on the registration form.
At the start of the round robin, replace each bot's name with a random string. This discourages opponent-specific pre-training and hardcoding builds, incentivizing better generalization.
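For concreteness, the renaming step might look like the minimal sketch below. The scheme (random hex strings via the stdlib `secrets` module) and the function name are illustrative assumptions, not part of the proposal.

```python
import secrets

def anonymize(bot_names):
    """Map each real bot name to a random identifier for the round robin."""
    # token_hex(8) yields a 16-character hex string, effectively collision-free
    # at tournament scale.
    return {name: secrets.token_hex(8) for name in bot_names}

print(anonymize(["SteadyBot", "RushBot"]))
```

The organizer would keep the mapping private until the event concludes, then publish it so results can be attributed.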
Encourage authors to generalize over map features by including oddball modern professional maps in the pool. Some examples of oddball maps include Inner Coven, Monty Hall, Gold Rush, and Roaring Currents.
A third of the maps played should be oddball maps.
Best-of-ones against a totally unknown opponent incentivize you to use your one best strategy every time. That's how I always played ladder, and I suspect many others do too.
But if you do that a thousand times against an opponent doing the same, you get an uninteresting series. I think it would diminish the event. Ladders are not the measure we usually use to determine who the best human players are. Series are.
In AIIDE it would still not quite be best-of-ones, because bots do have access to learned data. I think that creates a tension where you are incentivized to use that learned data in very gamey ways, like trying to doxx your opponent. The initial randomization incentivizes that too, but not as strongly: without pretraining you can still learn what to do fairly quickly, whereas with fully randomized names you would need to doxx to learn anything.