INTRODUCTION & SUMMARY

In this post, I try to measure recent team over/underperformance using FIFA World Rankings and The Guardian’s List of the World’s 100 Best Players. I do this by 1) calculating each team’s level of “superstardom,” and 2) correlating this variable with FIFA World Rank measures. From this analysis, I investigate whether each World Cup team with at least one player in the Top 100 has exceeded or failed to meet expectations with its endowed talent base. A few highlights:

  • Even with a team full of superstars, Spain has exceeded expectations
  • Uruguay, Italy, and the Netherlands have performed in-line with expectations
  • France has underperformed relative to both expectations and teams with comparable talent
  • England seems to have (surprisingly) exceeded expectations

I find the last bullet point particularly interesting, especially considering the scrutiny that the media tends to place on the English national team. Perhaps these findings suggest that there is a disconnect between “expectations” defined by objectively measured standards and those based on public sentiment.

 

THE DATA

For my analysis, I draw on the Guardian’s List of the World’s Best Players and the most recent FIFA World Rankings (released in December 2013). Because I rely on interval (i.e., based on a point tally) rather than ordinal measures (i.e., integer rank), it’s worth taking a look at how both measures are calculated by The Guardian/FIFA:

 

THE ANALYSIS

Using the Guardian’s 100 Best Players list, I start by defining each national team’s “Superstar Index” as the aggregate point tally from players of that particular nationality. Figures 1a and 1b illustrate the Superstar Index (in descending order), along with the Player Average (plotted on the secondary y-axis), for the World Cup teams with at least one player in the Top 100:

Figure1a_SSIbyCountry1

Figure1b_SSIbyCountry2
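As a rough illustration of how the Superstar Index and the Player Average can be computed, here is a minimal Python sketch. The player rows and point values are made up for illustration and are not taken from the Guardian's dataset; the function name `superstar_index` is also my own.

```python
from collections import defaultdict

# Hypothetical rows of (player, nationality, points); NOT the Guardian's real tallies.
players = [
    ("Player A", "ESP", 300),
    ("Player B", "ESP", 250),
    ("Player C", "POR", 400),
    ("Player D", "ENG", 120),
]

def superstar_index(rows):
    """Return {nationality: (index, player_average)} from (name, nationality, points) rows."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for _name, nationality, points in rows:
        totals[nationality] += points
        counts[nationality] += 1
    return {nat: (totals[nat], totals[nat] / counts[nat]) for nat in totals}

index_by_team = superstar_index(players)

# Sort in descending order of Superstar Index, as in Figures 1a and 1b.
ranked = sorted(index_by_team.items(), key=lambda item: item[1][0], reverse=True)
for nat, (index, average) in ranked:
    print(f"{nat}: index={index}, player average={average:.1f}")
```

The index is a plain sum, so a team with many mid-ranked players can outscore a team with a single top-ranked player; the Player Average is reported alongside it for that reason.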

On their own, these figures may be interesting, but they don’t seem to mean much (I found myself asking the “So What” Question immediately after putting the charts together).

The Superstar Index, however, becomes useful if we want to know if a team has successfully translated its stock of superstardom into desired match results, and ultimately, a high World Rank. Because the FIFA ranking system is based on recent match results, the natural next step is to correlate each team’s Superstar Index with its corresponding World Rank Point Tally (see Figure 2). This ultimately allows us to determine whether a team has over- or underperformed given its endowed talent base.

Figure2_FIFAWRbySSI

Looking at Figure 2, we find a positive correlation between Superstar Index and FIFA World Rank. In other words, teams with a higher stock of talent tend to be ranked higher. No surprises there.

The chart, however, also suggests that the Superstar Index alone does not fully explain the variation in World Rank among the countries listed. If it did, we would expect to see the bubbles align perfectly with the dotted regression line. To be more precise, we can refer to the R-squared statistic. That is, with an R-squared statistic of 0.633, the Superstar Index explains 63% of the variation in World Rank Points. Note that at this point, we can only speculate on what drives the remaining 37%–possible factors include management, coaching, and team chemistry.

The dotted regression line becomes extremely useful as it illustrates the expected number of World Rank Points for a given Superstar Index. For instance, if a team had a Superstar Index of 1,400, we would expect it to have a World Rank Point Tally of 1,300.

Taking this concept further, we can argue that the teams with bubbles lying above the dotted line have accrued more World Rank Points than the number forecasted by the Superstar Index, suggesting that these teams have overperformed relative to their respective talent bases. On the other hand, teams with bubbles lying below the dotted line have accumulated fewer World Rank Points than expected, suggesting that these teams have underperformed. With these explanations in mind, we can generate the following insights:

  • Spain has surpassed expectations… Even with a team brimming with talent, Spain is rated higher than its Superstar Index would suggest. This makes sense, however, if we keep in mind that the team has both a top-notch manager and excellent team chemistry (a majority of members play together at the club level–think Barcelona and Real Madrid)
  • …and so has Portugal. While management and team chemistry may have played a role, a more compelling argument seems to be that Cristiano Ronaldo has been the driving force behind Portugal exceeding expectations. Notice how Portugal’s World Rank Tally is in line with Argentina’s, even though Argentina has far more attacking options in Messi, Aguero, Higuain, di Maria, and Tevez
  • Uruguay, Italy, and the Netherlands have performed in-line with expectations. While these teams have their fair share of superstar talent, it’s worth noting that a lot of it is skewed towards their offense (Suarez-Forlan for Uruguay, Balotelli-Rossi for Italy, and van Persie-Robben for Holland). As a result, while these teams may receive a great deal of public attention, they don’t seem to be as well-rounded as teams such as Spain and Germany
  • France has failed to meet expectations. It’s hard to fault the French for feeling let down by their national team. According to the Superstar Index, France has a talent base comparable to that of Uruguay, Italy, and the Netherlands, but has a ranking in-line with that of Bosnia-Herzegovina and the Ivory Coast. It looks like Didier Deschamps will need to find a way to get more out of his players in 2014
  • England has surprisingly exceeded expectations. This finding is surprising if we consider just how much scrutiny the public places on the national team. If we are to take the Guardian’s Top 100 list as an objective measure of team talent, we find that England’s talent base is far from impressive. Perhaps these findings suggest that there is a disconnect between “expectations” defined by objectively measured standards and those based on public sentiment
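For readers who want to reproduce this kind of reading, the over/underperformance logic boils down to fitting a line and looking at the residuals. The sketch below uses an ordinary least squares fit written by hand; the four teams and their numbers are hypothetical and are not the values behind Figure 2.

```python
# Hypothetical (Superstar Index, FIFA World Rank points) pairs; NOT the real figures.
teams = {
    "Team A": (2000.0, 1500.0),
    "Team B": (1000.0, 1100.0),
    "Team C": (1500.0, 1000.0),
    "Team D": (500.0, 850.0),
}

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [x for x, _ in teams.values()]
ys = [y for _, y in teams.values()]
intercept, slope = fit_line(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)

# Positive residual: bubble above the line (overperformed).
# Negative residual: bubble below the line (underperformed).
residuals = {name: y - (intercept + slope * x) for name, (x, y) in teams.items()}
for name, resid in residuals.items():
    verdict = "over" if resid > 0 else "under"
    print(f"{name}: residual={resid:+.1f} ({verdict}performed)")
print(f"R-squared = {r2:.3f}")
```

The expected World Rank Point Tally for any Superstar Index is simply `intercept + slope * index`, which is what the dotted line in the chart represents.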

 

LIMITATIONS

Of course, there are inherent limitations in the measures used in this analysis. I elaborate on these limitations below:

  • FIFA World Rankings. Point weightings by confederation and “match significance” can be considered arbitrary or outdated, especially if there has been a recent structural shift in the relative quality of play by variable (e.g., the AFC may have become more competitive in recent years)
  • Superstar Index. That the index only accounts for players in the Guardian’s Best 100 makes it impossible to factor in the relative quality of unranked players by team. Assuming that all non-ranked players are the same seems unrealistic

INTRODUCTION & SUMMARY

In this post, I take a closer look at The Guardian’s recently posted list of the 100 best footballers (i.e., soccer players) in the world. By classifying each player by his position and nationality, I offer a few predictions on what to expect from some of the top teams next year. A few takeaways:

  • Spain will continue to dominate the midfield
  • Brazil will emerge as an elite defensive team
  • Italy will redefine itself as an attacking team
  • Belgium will rely on its defenders to win matches

Note that Brazil has historically been characterized as an attacking team, while the Italian team has long prided itself on its defense. Yet with their latest composition of superstars, these two teams should have the chance to reinvent themselves for the 2014 World Cup.

 

THE DATA

I refer to the raw dataset provided by The Guardian for my analysis (link below):

https://docs.google.com/spreadsheet/ccc?key=0AonYZs4MzlZbdFFMUkFIZ0NmZUo2djJDR2ppbmJyUmc&usp=sharing#gid=0

Other Guardian links worth noting:

 

THE ANALYSIS

Figure1_T100byPosCountry

Figure 1 illustrates the 100 Best Footballers based on two variables–position and nationality (note that countries with fewer than three players in the Top 100 are classified under “Other”). The figure above can be broken down into three components:

  • The column labels segment the data by position (GK = Goalkeeper, DEF = Defender, MID = Midfielder, FWD = Forward)
  • The column widths represent the share of players by position. For instance, with 44 in the Top 100, midfielders make up 44% of the list
  • The stacked bars illustrate each country’s share of players by position. For example, we see that Brazilian defenders represent 29% of all players for that particular position
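The two kinds of shares shown in Figure 1 (column widths and stacked bars) can be computed with a couple of counters. This is only a sketch: the ten (position, country) pairs below are placeholders, not the actual Top 100.

```python
from collections import Counter

# Hypothetical (position, country) pairs standing in for the Top 100 list.
players = [
    ("MID", "ESP"), ("MID", "ESP"), ("MID", "GER"), ("MID", "BEL"),
    ("DEF", "BRA"), ("DEF", "BEL"),
    ("FWD", "ARG"), ("FWD", "ITA"), ("FWD", "ARG"),
    ("GK", "GER"),
]

position_counts = Counter(position for position, _ in players)
pair_counts = Counter(players)
total = len(players)

# Column width: each position's share of all ranked players.
position_share = {pos: count / total for pos, count in position_counts.items()}

# Stacked bar: each country's share of the players at a given position.
country_share = {
    (pos, country): count / position_counts[pos]
    for (pos, country), count in pair_counts.items()
}

print(f"MID share of list: {position_share['MID']:.0%}")
print(f"ESP share of MID:  {country_share[('MID', 'ESP')]:.0%}")
```

Note that the second share is conditional on position, so the country shares within each column sum to 100% regardless of how wide the column is.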

A few notes about each team:

  • Spain (ESP) | Findings Expected. No surprises here. With Spain characterized by its dominance in midfield since Euro 2008, it shouldn’t be a shock to see that Spanish midfielders make up a quarter of all ranked players for that position
  • Brazil (BRA) | Findings Unexpected. Though Brazil has historically been known as an offensive team, the team’s defenders have received the most attention in the Top 100. With Brazilian defenders making up 29% of all players for that position, expect to see Brazil emerge as an elite defensive team
  • Germany (GER) | Findings Expected. Like Spain, Germany has an enviable distribution of ranked players across all positions. At the same time, with less dominance in midfield and attack, Germany can be considered a “Spain-lite”
  • Argentina (ARG) | Findings Expected. With a forward line consisting of Carlos Tevez, Gonzalo Higuain, Sergio Aguero, and of course, Lionel Messi (15% of all ranked forwards), Argentina remains a team defined by its offensive capabilities. Expect a high goals per game ratio from this team in 2014
  • Italy (ITA) | Findings Unexpected. While the Italian national team has long prided itself on its defense, it finds itself with no defenders in the Top 100. Instead, with 12% of all ranked forwards, Italy has the chance to reinvent itself as an attacking team in Brazil
  • France (FRA) | Findings Expected. The modern day French national team has always had a fair share of superstars, but has regularly failed to transfer its talent base into results. Note how France looks a lot like Argentina, but with fewer ranked forwards
  • Belgium (BEL) | Findings Expected. Home to 14% of all ranked defenders, Belgium will enter the 2014 World Cup as a “defense first” team. A dearth of ranked forwards, however, is cause for concern, suggesting that the team’s young, superstar midfielders (i.e., Eden Hazard, Axel Witsel) will need to step up during the tournament
  • Netherlands (NED) | Findings Expected. Even in the 2010 World Cup, the Dutch were characterized as a scrappy team. Expect little to change in 2014, but note that the team still has two of the most talented superstars in Robin van Persie and Arjen Robben

INTRODUCTION & SUMMARY

Though my previous posts have drawn from recent FIFA World Rankings to assess the degree of competition in each 2014 World Cup group, it’s worth noting that the analyses were founded on the following, underlying assumption:

Higher ranked teams are “better” than lower ranked teams.

In this post, I test this assumption within the context of Group Stage results from the past five World Cups (1994-2010). More specifically, I investigate whether pre-tournament rankings published by FIFA did, in fact, accurately predict whether a team advanced past the Group Stage. Two takeaways worth noting:

  • FIFA rankings are useful predictors among the higher ranked teams. Unsurprisingly, teams with a pre-tournament ranking between 1 and 10 had a higher chance of advancing than those at 11-20 (and also scored more points)
  • FIFA rankings are poor predictors among lower ranked teams. Teams ranked between 21 and 30 did not have a higher chance of advancing than those at 31-40 or 41+. In fact, teams in the lower ranked buckets actually advanced at a higher rate

What does this all mean? If we define the likelihood of advancing past the Group Stage as a metric for team quality, I argue that the FIFA ranking system (1) is relatively useful for separating the elite teams from the good teams, but (2) fails to accurately rank all other teams in a meaningful order. This should be encouraging news for fans supporting some of the “underdog” teams in 2014.

 

THE DATA

I refer to Group Stage results from the last 5 World Cups (1994, 1998, 2002, 2006, 2010), specifically gathering the following information:

  • Pre-Tournament Rank: Equivalent to the FIFA World Rankings released in May of the tournament year (e.g., May 2010 for the 2010 World Cup)
  • Group Stage Point Tally: The total number of points accrued by each team in the Group Stage (note that the two teams with the highest number of points advance to the Round of 16)
  • Intra-Group Rank: The relative rankings within a group, based on the pre-tournament rank–each group has a Highest Ranked, Second Highest Ranked, Second Lowest Ranked, and Lowest Ranked team

From the five World Cups, the dataset consisted of results from 228 games, 152 teams, and 38 total groups.

 

THE ANALYSIS

Figure1_LoAbyPTRank

Figure 1 illustrates the historical rate of group stage advancement based on pre-tournament ranking ranges. When comparing teams in the 1-10 and 11-20 buckets, what we find is unsurprising–teams with a pre-tournament rank between 1-10 score more points, and are more likely to advance as either the group leader or runner-up than those at 11-20.

Though we find a similar relationship between the 11-20 and 21-30 buckets, we don’t observe the same trend among the 21-30, 31-40, and 41+ teams. If the FIFA ranking system was meant to be an accurate predictor of group stage advancement, we would expect to see shorter bars as we go from left to right. Instead, we find that historically, teams ranked at 41+ actually had a higher chance of advancing than those ranked at 21-30 and 31-40, even with fewer points!
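The bucketing behind Figure 1 is straightforward to sketch in Python. The records below are invented for illustration and are not the 1994-2010 results; the helper names are my own.

```python
from collections import defaultdict

# Hypothetical (pre-tournament rank, advanced?) records; NOT the real 1994-2010 results.
records = [(3, True), (8, True), (15, True), (18, False),
           (25, False), (28, True), (35, False), (44, True), (59, False)]

def rank_bucket(rank):
    """Map a pre-tournament rank to the ranges used in Figure 1."""
    if rank > 40:
        return "41+"
    low = (rank - 1) // 10 * 10 + 1
    return f"{low}-{low + 9}"

def advancement_rates(rows):
    """Return {bucket: share of teams that advanced past the Group Stage}."""
    advanced = defaultdict(int)
    total = defaultdict(int)
    for rank, did_advance in rows:
        bucket = rank_bucket(rank)
        total[bucket] += 1
        advanced[bucket] += int(did_advance)
    return {bucket: advanced[bucket] / total[bucket] for bucket in total}

rates = advancement_rates(records)
for bucket in ["1-10", "11-20", "21-30", "31-40", "41+"]:
    print(bucket, f"{rates[bucket]:.0%}")
```

If the rankings predicted advancement well, the printed rates would fall steadily from the first bucket to the last.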

Figure 2 provides a more granular look at the Group Stage precedents, breaking down the likelihood of advancing by point tally:

Figure2_PointDistbyPTRank

Let’s start by focusing on the “7+” column. We find that in the past five World Cups, 39% of teams coming in with a pre-tournament rank of 1-10 accrued 7 or more points in the Group Stage. It’s also important to note that among all teams with 7 or more points, all of them topped their respective groups, while 93% progressed as runners-up. Comparing the 1-10 teams with the 11-20 teams, it’s no surprise that the 1-10 teams accrued 7 or more points at nearly twice the rate of 11-20 teams. Again, the FIFA ranking system seems to do a decent job of predicting group stage advancement rates among the relatively high-ranked teams.

The “5-6” column, however, tells a different story. With a 94% chance of advancing, and a 31% chance of advancing as the group leader, teams accruing 5 or 6 points in the Group Stage should consider their campaign a success. What’s strange, then, is that teams ranked at 41 or worse (41+) were twice as likely as teams at 11-20, and over 50% more likely than teams at 21-30, to gain 5 or 6 points. Even when adding in the likelihood that teams finished with 7 or more points to these percentages, the 41+ teams fared better than their higher ranked counterparts.

And finally, to account for the fact that the distribution of ranking ranges might not have been equal for all groups (e.g., one group might have had two 1-10 teams and two 11-20 teams, while another had teams at 1-10, 11-20, 21-30, and 41+), I analyzed the likelihood of advancing based on intra-group rank–essentially a ranking order within each group. With the lowest ranked teams advancing at a higher rate than the second lowest ranked teams, Figure 3 suggests that this trend still holds.

Figure3_LoAbyIGRank
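The intra-group rank used in Figure 3 is just a within-group ordering of pre-tournament ranks. A minimal sketch follows, using the 2014 Group G ranks quoted elsewhere on this blog as the example; the function name is my own, and ties are not handled.

```python
LABELS = ["Highest Ranked", "Second Highest Ranked",
          "Second Lowest Ranked", "Lowest Ranked"]

def intra_group_labels(group):
    """Label the four teams of a group by pre-tournament rank (1 is best)."""
    ordered = sorted(group, key=group.get)
    return dict(zip(ordered, LABELS))

# Group G of the 2014 World Cup, with the FIFA ranks quoted on this blog.
group_g = {"Germany": 2, "Portugal": 5, "USA": 14, "Ghana": 24}
labels = intra_group_labels(group_g)
for team, label in labels.items():
    print(f"{team}: {label}")
```

Because the labels depend only on the ordering within a group, a team ranked 24 and a team ranked 59 can both be the Lowest Ranked team, which is what makes this view a useful check on the bucket analysis.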

 

CONCLUSION 

For critics of the FIFA ranking system, these findings shouldn’t be a surprise. Sure, it shouldn’t be too much of a challenge to conclude that Spain (1) is a better team than, say, Greece (12), but the comparison gets harder once we start comparing countries such as Costa Rica (31) and Japan (47). A big part of the confusion may stem from comparing teams in different confederations, since there are so few inter-confederation matches outside of the World Cup and non-competitive friendlies.

INTRODUCTION

When assessing how exactly a soccer fan should feel about the results from the World Cup Draw, s/he usually asks the following question:

How good/bad does my team have it relative to those in other groups?

In order to answer this question with confidence, I argue that a fan should consider two variables:

  • Group Quality. This variable denotes the average team quality in a group. For instance, Group G (Germany, Portugal, USA, and Ghana) has a higher group quality than Group H (Belgium, Russia, Algeria, South Korea)
  • Group Predictability. I define predictability as the power concentration within a group. A group with high power concentration usually has one team that is considerably better than the others, while a group with low power concentration usually has four teams of comparable quality. Considering that the results from a group with low power concentration are generally harder to predict, I use power concentration as a proxy for predictability

To quantify these variables, I refer to the point values used in the FIFA World Rankings to calculate the average number of points in each group, and draw on the concept of the Herfindahl Index, a measure commonly used in economics, to measure power concentration.

Given the attention that the media has placed on the issue of travel in the 2014 World Cup, I also include the average number of miles that teams in each group have to travel, which in turn allows fans to make inter-group comparisons.

 

THE DATA

In order to conduct my analysis, I’ve collected/calculated the following information:

2x2_Dataset

I’ve also included a brief description of each variable below:

  • Group Point Average (GPA): This column captures the average team quality in each group. The values were calculated by taking the average of the FIFA World Ranking points for each group
  • Herfindahl Index (HI): A measure used by economists to calculate market concentration (i.e., how much market power/dominance one firm has over others in a defined market), the Herfindahl Index allows us to calculate the power concentration within each group (see http://www.investopedia.com/terms/h/hhi.asp for more information). In my analysis, a low HI suggests that there is an even distribution of team quality in a group, while a high HI suggests that the quality gap is relatively large
  • Average Group Travel: Considering the high amount of travel that some teams will face, analysts predict that teams with less travel will have the edge. To include this variable in our analysis, I took the average number of miles that each team will need to travel (via plane) for each group
  • %∆ GPA: This value is calculated by taking each group’s GPA, and then calculating the percent difference between each of those values and the overall GPA for all teams in the World Cup
  • %∆ Herf. Index: This value is calculated by taking each group’s HI, and then calculating the percent difference between each of those values and the overall HI for all groups
  • %∆ Average Team Travel: This value is calculated by taking each group’s average travel value, and then calculating the percent difference between each of those values and the average travel for all teams in the World Cup
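To make the variables above concrete, here is a small Python sketch. Two things are assumptions on my part: each team's "share" in the Herfindahl Index is taken to be its slice of the group's total FIFA points, and the "overall HI" baseline is taken to be the simple mean of the group HIs. The point values for the two groups are hypothetical.

```python
def group_point_average(points):
    """Average FIFA World Ranking points of the teams in a group."""
    return sum(points) / len(points)

def herfindahl_index(points):
    """Sum of squared shares, where a share is a team's slice of the
    group's total points (an assumption about how shares are defined)."""
    total = sum(points)
    return sum((p / total) ** 2 for p in points)

def percent_difference(value, baseline):
    """Percent difference of value relative to baseline."""
    return (value - baseline) / baseline * 100

# Hypothetical groups; NOT real FIFA point tallies.
groups = {
    "X": [1500, 900, 800, 800],      # one clearly dominant team
    "Y": [1100, 1100, 1100, 1100],   # four evenly matched teams
}

all_points = [p for pts in groups.values() for p in pts]
overall_gpa = group_point_average(all_points)
hi_by_group = {name: herfindahl_index(pts) for name, pts in groups.items()}
overall_hi = sum(hi_by_group.values()) / len(hi_by_group)  # assumed baseline

for name, pts in groups.items():
    gpa = group_point_average(pts)
    print(name,
          f"GPA={gpa:.0f}",
          f"HI={hi_by_group[name]:.4f}",
          f"%dGPA={percent_difference(gpa, overall_gpa):+.1f}",
          f"%dHI={percent_difference(hi_by_group[name], overall_hi):+.1f}")
```

With four teams, the index bottoms out at 0.25 when all teams have identical points and rises as one team pulls away from the rest, which is why a higher HI reads as a more predictable group.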

 

THE ANALYSIS

We can visualize the dataset above in the form of a bubble chart:

2x2_Exhibit

The x-axis denotes relative average group quality, which increases from left to right. The y-axis, on the other hand, denotes power concentration, as measured by the Herfindahl Index, which increases from bottom to top. And finally, the color of the circle denotes average group travel, with a green circle representing travel less than the average amount for all teams, and a red circle representing travel greater than the average amount. The size and shade of the circle denote the magnitude of relative travel (e.g., a darker, larger green bubble suggests less group travel than a lighter, smaller green bubble).

Taking our analysis further, we can divide the chart into four quadrants:

  • Top-Right (High Quality, High Predictability). Though the average team quality is high, a high Herfindahl Index suggests that there is one clear leader in the group that falls in this quadrant. That there is a dominant team suggests that the results should be relatively predictable. Group B (Spain, Netherlands, Chile, Australia) is in this quadrant
  • Top-Left (Low Quality, High Predictability). The average quality of teams in this quadrant is relatively low, but our analysis suggests that results should be relatively predictable. Group F (Argentina, Nigeria, Iran, Bosnia-Herzegovina) is in this quadrant
  • Bottom-Right (High Quality, Low Predictability). Consisting of very strong, evenly matched teams, results will be hard to predict for groups in this quadrant. It’s no surprise that Groups D (Uruguay, Italy, England, Costa Rica) and G (Germany, Portugal, USA, Ghana), the two Groups of Death, fall in this quadrant
  • Bottom-Left (Low Quality, Low Predictability). The group in this quadrant consists of teams that are of relatively low quality. Moreover, compared to other groups, there is relatively little that separates these teams. Group H (Belgium, Russia, Algeria, Korea) is in this quadrant
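The quadrant assignment can be written as a tiny function of the two percent-difference columns. This sketch assumes the axes cross at a 0% difference on both variables, which is my reading of the chart rather than something stated outright; the input values are hypothetical.

```python
def quadrant(pct_delta_gpa, pct_delta_hi):
    """Place a group on the 2x2 chart; the axes are assumed to cross at 0%."""
    horizontal = "Right" if pct_delta_gpa > 0 else "Left"
    vertical = "Top" if pct_delta_hi > 0 else "Bottom"
    quality = "High Quality" if pct_delta_gpa > 0 else "Low Quality"
    predictability = ("High Predictability" if pct_delta_hi > 0
                      else "Low Predictability")
    return f"{vertical}-{horizontal} ({quality}, {predictability})"

# Hypothetical (%∆ GPA, %∆ Herf. Index) pairs; NOT the values in the dataset.
print(quadrant(5.0, 10.0))
print(quadrant(-3.0, -2.0))
```

A hard cutoff like this will always assign a quadrant, so groups sitting close to either axis deserve a second look before being labeled.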

Note that Groups A, C and H don’t clearly fall into one quadrant. Considering that the FIFA ranking system tends to underestimate the quality of host countries, it may make sense to consider Group A (Brazil, Cameroon, Mexico, Croatia) to have a higher average team quality and power concentration.

From our analysis, we can echo some of the insights from my previous post:

  • Spain and Argentina have it easy. Despite the existence of high quality teams in its group, Spain is blessed with 1) a team that is simply much better than the other teams in its group, and 2) a relatively light travel schedule compared to teams in other groups. Argentina is even more fortunate in that it faces less threatening teams in its group
  • US fans have a right to be sadder than England fans. That is, Group G is more of a “Group of Death” than Group D. While both Groups are characterized by a low Herfindahl Index and thus less predictable results, Group G consists of better quality teams than Group D. This should make it even more difficult for teams such as the US to progress past the Group Stage

INTRODUCTION & SUMMARY

Having had nearly two weeks to think about the World Cup Draw (which took place on December 6), soccer fans have started to think about the challenges, opportunities, and threats that await teams in each group. In this post, I introduce one way of assessing the dynamics both within and between groups, and suggest that this may be a useful framework when predicting how your team will fare in the Group Stage. A quick summary:

  • In each group, the four teams were assigned one of four names–the “Highest Ranked,” “Second Highest Ranked,” “Second Lowest Ranked,” or “Lowest Ranked” team
  • Assignments were made based on FIFA World Ranking data, presented via both ordinal (based on simple rank) and interval (based on the point values that lead to rankings) variables
  • Among the Highest Ranked Teams in each group, there seem to be three tiers based on team quality, with Spain the sole occupant of Tier 1, Germany, Argentina, and Colombia in Tier 2, and Uruguay, Switzerland, and Belgium in Tier 3 (we leave out Brazil due to shortcomings in the FIFA ranking system for host countries)
  • As observed by many, there appear to be two Groups of Death in 2014, which include Group G (Germany, Portugal, USA, Ghana), and Group D

 

THE DATA

Each month, FIFA updates its ranking of national teams. Though they provide the information ordinally (i.e., provide simple integer ranks), they come up with the rankings through interval data, which means that they calculate rankings based on a point system (see http://www.fifa.com/worldranking/procedureandschedule/menprocedure/). By assigning each team within a group as the “Highest Ranked,” “Second Highest Ranked,” “Second Lowest Ranked,” or “Lowest Ranked” team, and then referencing this information against the FIFA World Rankings (in both ordinal and interval form), I’ve created the following exhibits:

Rankings_Exhibit1a

Rankings_Exhibit1b

 

THE ANALYSIS

Exhibits 1a and 1b allow us to compare the dynamics both within groups (i.e., intra-group) and between groups (i.e., inter-group). For those of us with a favorite team, this allows us to answer the following questions:

  • Does my team have a shot of making it past the Group Stage?
  • How good/bad does my team have it relative to those in other groups?

To answer the first question, we need to focus on the dotted lines that connect teams in each group. A steeper line between two teams suggests that there’s a larger quality gap between these teams, while a flatter line suggests that the teams are close in quality. And by drawing a line between the Highest Ranked and Lowest Ranked teams in each group, we can get a better sense of the overall quality gap. For my analysis, I refer to Exhibit 1b (I assume that the point system is a viable way to calculate the magnitude of quality that separates teams), and come up with the following insights:

  • Spain (FIFA Rank 1) should have no trouble making it to the Round of 16. While the Netherlands (9) and Chile (15) are formidable foes, that Spain was drawn with the worst ranked team in the entire tournament–Australia (59)–should make the task relatively easy. You’ll also find that the quality gap between Spain and the Netherlands is rather large
  • Group G can be considered the Group of Death. Just by observing how flat the lines between teams are in this group, you can see that there’s very little separating Germany (2), Portugal (5), USA (14), and Ghana (24)
  • Group D can be considered another Group of Death. What makes this group particularly tough is that 1) Uruguay (6) and Italy (7) have nearly the same number of points, and 2) England (13) is the Second Lowest Ranked Team in the group
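The "steepness" of the dotted lines is simply the point gap between adjacent teams in a group, and the overall gap is the distance from the Highest Ranked to the Lowest Ranked team. A short sketch, with hypothetical point tallies rather than the real FIFA values:

```python
def quality_gaps(points_by_team):
    """Return point gaps between adjacent teams and the top-to-bottom gap."""
    ordered = sorted(points_by_team.items(), key=lambda item: item[1], reverse=True)
    adjacent = [
        (higher, lower, round(higher_pts - lower_pts, 1))
        for (higher, higher_pts), (lower, lower_pts) in zip(ordered, ordered[1:])
    ]
    overall = round(ordered[0][1] - ordered[-1][1], 1)
    return adjacent, overall

# Hypothetical point tallies; NOT the real FIFA values.
group = {"Team A": 1500.0, "Team B": 1100.0, "Team C": 1050.0, "Team D": 600.0}
adjacent, overall = quality_gaps(group)

for higher, lower, gap in adjacent:
    print(f"{higher} -> {lower}: {gap} points")
print(f"Highest to Lowest Ranked: {overall} points")
```

In this made-up group, the small gap between Team B and Team C would show up as a nearly flat segment, while the large gaps on either side would show up as steep ones.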

Answering the second question requires us to 1) compare the slopes between teams across groups, and 2) assess the relative quality of teams within each ranking classification (i.e., intra-rank). A few highlights from this analysis:

  • Not all Highest Ranked Teams are alike. Exhibit 1b suggests that there are essentially three tiers within the Highest Ranked Team classification. Spain (1) seems to be leaps and bounds above the other Highest Ranked Teams, and is the sole occupant of Tier 1. Germany (2), Argentina (3), and Colombia can be classified as Tier 2 teams, while Uruguay (6), Switzerland (8), and Belgium (11) occupy Tier 3. We do not include Brazil in our analysis as host countries tend to be underestimated in FIFA rankings (this is because host nations do not play qualifying matches, and qualifying matches are weighted more than friendly matches when calculating points)
  • England fans should be happier than US fans. Though both are the Second Lowest Ranked Teams in their respective groups, England is blessed with a much narrower quality gap between the top three teams in its group. That is, just 90.1 points separate England and the highest ranked team in the group (Uruguay), while that value is 298.8 points for the US. That Ghana (24), the Lowest Ranked Team in USA’s group, could easily slot into the Second Highest Ranked Team position in a few other groups highlights the challenge that awaits the US in 2014
  • Korea fans should rejoice. As a South Korea (54) fan, I could not have asked for a better draw. Having been drawn with Belgium (11), Korea is drawn with the worst Highest Ranked Team across all groups. Korea has also been drawn with the worst Second Highest Ranked Team in Russia (22), and the second-worst Second Lowest Ranked Team in Algeria (26). Though Korea is coming into the tournament with a very low rank, an upset against Belgium or Russia should be enough to make it through the Group Stage