
An Antic Disposition

Thinking the unthinkable, pondering the imponderable, effing the ineffable and scruting the inscrutable

Analysis of World Chess Champion Opening Repertoires

Wed, 2015-02-25 14:00

A quick test run of the FactoMineR package for R. This package focuses on multivariate exploratory data analysis, such as Principal Component Analysis (for numerical data) and Correspondence Analysis (for categorical data).

In an earlier blog post I took a look at a large collection of chess games and tried to quantify the “first move” advantage in chess, in terms of ratings.   This time I’ll use the same large database of chess games, and look at opening repertoires.  A chess opening is a set of moves that a player uses at the start of the game in an attempt to steer the game to positions familiar to the player, and which align with that player’s style and preferences.  Such openings have descriptive, often colorful names, like King’s Gambit, Sicilian Poisoned Pawn, or Nimzo-Indian Defense, as well as a standard code, from the Encyclopedia of Chess Openings, like B07, C44 and E80.   There are 500 such “ECO” codes, from A00 to E99.

I extracted the games of all World Chess Champions, from Steinitz (1866) to Carlsen (2014), and calculated the percentage of each player’s games that fall under each ECO code. So each player’s opening repertoire is represented as a vector of 500 weights, summing to 1.0. I then used FactoMineR’s PCA() function to extract principal components from this dataset. Together, the first two components account for around 42% of the total variance.
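To make the repertoire vector concrete, here is a minimal sketch in Python (not the R code used for this post). The function name `repertoire_vector` and the four-game sample are hypothetical, chosen only for illustration. It assumes each game has already been reduced to its ECO code.

```python
from collections import Counter

# All 500 ECO codes in order, A00 through E99.
ECO_CODES = [f"{letter}{number:02d}" for letter in "ABCDE" for number in range(100)]


def repertoire_vector(eco_codes_played):
    """Return 500 weights (fraction of games per ECO code) that sum to 1.0."""
    if not eco_codes_played:
        raise ValueError("need at least one game")
    counts = Counter(eco_codes_played)
    unknown = set(counts) - set(ECO_CODES)
    if unknown:
        raise ValueError(f"not valid ECO codes: {sorted(unknown)}")
    total = len(eco_codes_played)
    return [counts[code] / total for code in ECO_CODES]


# Hypothetical toy data, not taken from the actual game database.
sample_games = ["C44", "C44", "B07", "E80"]
sample_weights = repertoire_vector(sample_games)
```

Stacking one such vector per champion gives the players-by-openings table that a PCA routine takes as input.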

Plotting the Champions against these two dimensions shows some intriguing patterns, bringing together players by era:

[Figure: the World Chess Champions plotted against the first two principal components]

Further insights can be gleaned by plotting how these two components weight the various openings.   To make it easier to read I grouped some of the ECO codes and used descriptive names for the better-known openings.   From this we see that the first component appears to distinguish the player’s use of open games (1.e4 e5) in the positive direction versus semi-open and closed games in the  negative direction.   I’m having a harder time reading a real-world meaning into the second component.  Maybe a reader sees something here?

[Figure: weights of the openings on the first two principal components]
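To illustrate what "how a component weights the openings" means, the sketch below finds the first principal component of a tiny table by power iteration on the covariance matrix. It is a swapped-in, standard-library technique for illustration, not what FactoMineR does internally, and it does not standardize the columns. The function name `first_component`, the three opening groups, and the four rows of numbers are all hypothetical.

```python
import math


def first_component(rows, iterations=200):
    """Return (weights, explained_ratio) for the first principal component.

    rows: one list per player, one column per opening group.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[j][j] for j in range(p))
    if total_variance == 0:
        raise ValueError("data has no variance")
    # Start from a non-uniform vector: when every row sums to 1.0, the
    # all-equal vector is mapped to zero by the covariance matrix.
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance along the component.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    # The sign is arbitrary; make the largest-magnitude weight positive.
    if max(v, key=abs) < 0:
        v = [-x for x in v]
    return v, eigenvalue / total_variance


# Hypothetical shares of (open games, semi-open games, closed games).
toy_rows = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
]
component, explained = first_component(toy_rows)
```

In this toy table the open-games column gets a positive weight and the other two get negative weights, which is the same kind of contrast described for the first component above. The sign of a principal component is arbitrary, so only the opposition between the groups carries meaning.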

Something to remember in all of this is that the choice of opening in a game is a result of the moves of both players. Players try to influence the opening, steering the game toward their own strengths and preparation and away from those of their opponents. But neither player has 100% control over the opening, aside from some fringe moves like 1. h4. However, players, especially world-class players, do specialize in certain opening systems, and it is fair to speak of their repertoires.
