The Ur-Quan Masters Discussion Forum
Topic: OCD
« on: June 11, 2004, 03:46:59 am »


Out of a sense of sheer boredom, obsessiveness, and curiosity, I have run a detailed analysis of which computer-controlled ship is best, and which is worst.

Ideally this information could be used in "Enhanced" versions of UQM to improve the point scale, improve the computer's strategic AI (i.e., which ship to send against which ship in Super Melee), and possibly to make for a more challenging AI.


Two fleets of ships were sent against each other for each match.  Each fleet contained 14 ships of a single species.  Both fleets were controlled by an "Awesome Computer" player.  Version 0.3 of UQM was used, running on an 1100 MHz Windows 2000 computer.

Each species would fight all other species, for a total of 600 matches.  Each match would contain between 14 and 27 battles.

I recorded for each match the winner, and how many ships the winner still had surviving at the end of the match.  Fractional ships were only used in the cases where the winner still had all fourteen ships alive.  Fractions were calculated by the ratio of crew alive to crew available to the ship.
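For anyone who wants to reproduce the bookkeeping, here is a minimal sketch of the scoring rule as I read it from the paragraph above. The function name and the example crew size of 42 are illustrative assumptions, not values from the recorded data, and the sketch ignores the special cases listed in the key notes.

```python
FLEET_SIZE = 14  # ships per fleet, as described in the post


def winner_score(ships_alive, crew_alive=None, crew_max=None):
    """Winner's surviving-ship count for one match.

    A fraction is only used when all 14 ships survived; the one ship
    that fought then counts as crew_alive / crew_max of a ship.
    """
    if ships_alive == FLEET_SIZE and crew_alive is not None:
        return (FLEET_SIZE - 1) + crew_alive / crew_max
    return float(ships_alive)


# All 14 ships alive, lead ship down to 1 of a hypothetical 42 crew.
print(winner_score(14, crew_alive=1, crew_max=42))
# Fewer than 14 ships alive: whole ships only.
print(winner_score(9))
```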

A few key notes:

With the Shofixti, if the computer won the battle by self-destruction, the Shofixti score would not include the ship that self-destructed, even though UQM displays that ship as being alive.

When a Pkunk fleet wins a match with all of its ships intact but one ship damaged, it is not clear that a fractional score should be used.  Fortunately this case never happened.

When a Mycon or a Syreen fleet wins a match with all 14 ships intact, fractional scores were not counted, since in both cases the ships have the capability to regain crew.

All matches were fought and recorded "as-is", and no match was re-fought because of bad luck (planet crashes, bad initial placement, etc.).

(Data omitted.  Suggestions for posting?)

Data observations:
No ship won all of its matches.  The Utwig won all but one, losing to the Yehat, who had two ships to spare.  The Chmmr and the Ur-Quan had two and three defeats respectively.

No ship lost all of its matches.  The ZoqFotPik only beat the Umgah, with four ships to spare.  The Umgah and the Thraddash had two and three victories respectively.

There are some oddities in the data.  For example, in the Syreen match against the Chmmr, the Chmmr won with all of its ships intact, and one crew alive on its first ship.  The Syreen got the Chmmr to this point after only 6 or so ships.  The Syreen was never able to damage the Chmmr with its primary cannon.

It was my original hope that I could come up with a clear-cut way to assign new scores to all ships such that, with two computer-controlled fleets of equal score, the end score would on average be zero.  I have yet to come up with a satisfactory method for this.

It is fairly easy to calculate between any two species how many of one kind of ship it would take, on average, to defeat another:

If (won) then (14 - # of own ships left) / 14
else 14 / (14 - # of opponent's ships left)

There is a bit of difficulty with ships that won with all ships intact and no damage sustained.  Obviously, as near as the data can tell, it would take an infinite number of ships to defeat the one ship.

To get around this, the number of ships was weighted to 14.1 in the above formula.
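As a rough sketch, the formula with the 14.1 workaround could look like the following. The function name is mine, and applying 14.1 to every match (rather than only to flawless wins) is an assumption about how the weighting was done.

```python
FLEET = 14       # ships per fleet
WEIGHTED = 14.1  # stand-in for 14 so that flawless wins stay finite


def ships_needed(won, survivors):
    """Average number of our ships needed to defeat one opposing ship.

    survivors is the number of ships the match winner had left
    (ours if we won, the opponent's if we lost).
    """
    if won:
        return (WEIGHTED - survivors) / FLEET
    return FLEET / (WEIGHTED - survivors)


# A flawless win no longer yields exactly zero, and the matching
# flawless loss no longer divides by zero.
print(ships_needed(True, 14))
print(ships_needed(False, 14))
```

The two branches are reciprocals of each other, so the same match gives consistent numbers whichever side you look at it from.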

Simply averaging the numbers resulted in some oddities... the Syreen, an overall average ship, came out as the best because it defeats the Pkunk, the Supox, and the ZoqFotPik consistently without damage.

Instead I went with the geometric mean.  The results were then multiplied by a scaling factor such that the sum of all the points came to the same total as before, 437 points.  This results in the following point scale:

Mmrnmhrm     12.31306437
Mycon        12.75849905
Orz          12.77911675
Slylandro    14.05374101
Arilou       14.67790547
Androsynth   16.24530828
Chenjesu     21.53414373
Kohr-Ah      29.14688345
Ur-Quan      31.17697022
Yehat        41.5607365
Utwig        54.05230958
Chmmr        83.31581938
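To show why the geometric mean tames the outliers, and how the rescaling to 437 works, here is a small sketch. The three ships and their per-opponent ratios are made up for illustration, and treating each ratio as "opposing ships defeated per own ship" is an assumption about the direction of the numbers.

```python
import math


def geometric_mean(values):
    """n-th root of the product, computed via logs for stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def rescale(raw, total=437.0):
    """Scale raw scores so that they sum to the given total."""
    factor = total / sum(raw.values())
    return {name: value * factor for name, value in raw.items()}


# Hypothetical ratios: ship A crushes one opponent but is weak otherwise.
ratios = {
    "A": [140.0, 0.5, 0.5],
    "B": [2.0, 2.0, 2.0],
    "C": [1.0, 0.5, 2.0],
}

raw = {name: geometric_mean(vals) for name, vals in ratios.items()}
points = rescale(raw)

for name in ratios:
    arithmetic = sum(ratios[name]) / len(ratios[name])
    print(name, round(arithmetic, 2), round(raw[name], 2), round(points[name], 2))
```

With a plain average, ship A's single lopsided result dominates its score; with the geometric mean it only counts as one factor among three.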

If I then declare that 30 is the highest possible score, and redistribute the points so that the total is still 437, I get:

New score:Old Score:
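The post does not spell out how the redistribution was done. One possible approach, sketched here purely as an assumption, is to clamp anything above the cap and rescale the remaining ships to absorb the leftover points, repeating until nothing exceeds the cap. The ship names and numbers in the example are hypothetical.

```python
def cap_and_redistribute(scores, cap=30.0, total=437.0):
    """Clamp scores at cap while keeping the overall sum at total.

    Ships pushed over the cap are pinned to it; the rest are scaled
    proportionally to use up the remaining points.
    """
    if cap * len(scores) < total:
        raise ValueError("cap is too low to reach the requested total")
    if any(value <= 0 for value in scores.values()):
        raise ValueError("scores must be positive")

    capped = set()
    while True:
        free = [name for name in scores if name not in capped]
        budget = total - cap * len(capped)
        factor = budget / sum(scores[name] for name in free)
        over = [name for name in free if scores[name] * factor > cap]
        if not over:
            break
        capped.update(over)

    result = {name: scores[name] * factor for name in free}
    result.update({name: cap for name in capped})
    return result


# Hypothetical four-ship scale with a total of 100 points.
example = {"A": 60.0, "B": 20.0, "C": 12.0, "D": 8.0}
print(cap_and_redistribute(example, cap=30.0, total=100.0))
```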

Error analysis:
There is insufficient data to do error analysis on this experiment.  Some matches, such as Shofixti vs Arilou (both fleets annihilated), are likely to have very low or insignificant standard deviation, whereas others may be very high.  I would estimate that between 5 and 20 matches would have to be run for each pair to understand the cumulative error.
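If each pairing were replayed several times, the spread could be summarized along these lines. This is only a sketch using Python's standard statistics module, and the survivor counts in the example are invented, not measured.

```python
import statistics


def summarize(survivors):
    """Mean, sample standard deviation, and standard error of the mean
    for the winner's surviving-ship counts over repeated matches."""
    mean = statistics.mean(survivors)
    sd = statistics.stdev(survivors) if len(survivors) > 1 else 0.0
    sem = sd / len(survivors) ** 0.5
    return mean, sd, sem


# Hypothetical repeats of two pairings: one stable, one noisy.
print(summarize([4, 4, 4, 4]))
print(summarize([2, 4, 6]))
```

The standard error shrinks with the square root of the number of repeats, which is one way to judge how many matches per pair are enough.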

Not counting fractional ships below 13 will cause some anomalies.  If I were to re-do the experiment, I would record fractional ships down to 10, or even 7.  Below 7 I believe is noise.

My data shows that most of the cheaper ships are extremely over-valued by the current point scale, with notable exceptions of species like the Shofixti and the Ilwrath.

The computer's tactics leave much to be desired.  For the ships at the top of the scale, it is clear that they are adequate, while ships at the bottom should probably be carefully examined.

(Comments about what is wrong with the AI are reserved for another thread.)

« Last Edit: June 11, 2004, 04:06:44 am by wafath »
« Reply #1 on: June 11, 2004, 05:54:38 am »

Dude, you've got some time on your hands.

What sound does a penguin make?
« Reply #2 on: June 11, 2004, 06:54:21 am »

I wonder how the pointcosts would change if you used a broad sample of human players.  A few predictions:

- Inhuman-reaction-time ships would lose a bunch of points.  (I'm looking at you, Yehat.  And Arilou.  And Utwig.)

- Ships where the AI is openly suicidal would gain a bunch of points.  (Mainly the Thraddash, but the Supox and Earthling would get a few points too.)

- I'd say that ships the AI simply doesn't know how to deal with would lose some points, but the Avatar is so far ahead of the pack it probably wouldn't change much :)
