Friday, February 12, 2021

Football Predictions with Poisson Distribution

In a previous article (Football Predictions Overview), we discussed how to use past data to predict the outcome of football (soccer) games, including how to adjust outcome probabilities and calculate goal estimates. In this article, we will learn to compute scoring probabilities and improve those predictions with the Poisson model.

The stats below show the number of goals scored and allowed in the Spanish league Primera Division (La Liga) during the 2019-2020 season (source data: Football Scores in Excel). Home teams scored a total of 546 goals (1.44 per game) and away teams scored 396 goals (1.04 per game). The chart underneath shows the goal distribution as the frequency of each goal outcome. Home teams failed to score in 88 games, scored 1 goal in 132 games, 2 goals in 99, 3 in 38, etc. Away teams failed to score in 136 games, scored 1 goal in 134, 2 in 81, etc. The goal distribution has a characteristic shape that is known to follow the Poisson distribution very closely.


Poisson Distribution

The Poisson distribution is a discrete probability distribution that describes the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event (from Wikipedia). In simple words and applied to football, it shows the probability for each goal outcome when knowing the average number of goals scored. The chart below shows the Poisson distribution for home goals in the Spanish league Primera Division 2019-2020 season (based on the stats above). The vertical axis displays the probability of scoring 0, 1, 2, 3, 4, and 5 goals (horizontal axis).
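To make the definition concrete, here is a minimal Python sketch of the Poisson probability mass function, compared against the home goal counts quoted earlier. The function name poisson_pmf and the figure of 380 games are assumptions for illustration (380 is what 546 goals at 1.44 per game implies); they do not come from the source data.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the mean rate is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

GAMES = 380                      # assumed number of league games (546 / 1.44 is about 380)
HOME_MEAN = 546 / GAMES          # average home goals per game, about 1.44
OBSERVED_HOME = {0: 88, 1: 132, 2: 99, 3: 38}   # games per goal count, from the text

for goals, observed in OBSERVED_HOME.items():
    # Expected number of games with this many home goals under the Poisson model
    expected = GAMES * poisson_pmf(goals, HOME_MEAN)
    print(f"{goals} goals: observed {observed} games, Poisson expects {expected:.1f}")
```

If the model fits well, the expected counts should land close to the observed ones for each goal outcome.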

The Poisson model requires the number of expected goals as input and computes the probability for each goal outcome as output. The number of expected goals is commonly determined as a combination of the average number of goals scored and allowed by each team, and the average league goals scored and allowed. Let’s have a look at how that is calculated in more detail.


Goal Expectancy

As we have seen in a previous article (Football Predictions Overview), Real Madrid scored an average of 2.11 goals per game and allowed 0.58 when playing at home in the 2019-2020 Spanish Primera Division (La Liga). Barcelona scored 1.79 per game and allowed 1.16 when playing away. With those figures, we can determine what is generally known as the attack and defensive strength of a team: the average number of goals the team scored or allowed, relative to the league average for the season. In 2019-2020, home teams scored an average of 1.44 goals per game and allowed 1.04 (see stats above); for away teams the figures are reversed. Thus, Real Madrid’s attack and defensive strength playing at home would be:

Real Madrid’s attack strength = 2.11 / 1.44 = 1.47

Real Madrid’s defensive strength = 0.58 / 1.04 = 0.56

 

Similarly, we can calculate Barcelona’s attack and defensive strength playing away as follows:

Barcelona’s attack strength = 1.79 / 1.04 = 1.72

Barcelona’s defensive strength = 1.16 / 1.44 = 0.81
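The four strength figures above can be reproduced with a short Python sketch. The helper name strength and the constant names are hypothetical; the inputs are the averages quoted in this article.

```python
# League averages per game for 2019-2020, as quoted in the article
LEAGUE_HOME_GOALS = 1.44   # scored by home teams (equivalently, allowed by away teams)
LEAGUE_AWAY_GOALS = 1.04   # scored by away teams (equivalently, allowed by home teams)

def strength(team_avg: float, league_avg: float) -> float:
    """Team average relative to the league average (1.0 means league-average)."""
    return team_avg / league_avg

# Real Madrid at home: 2.11 scored, 0.58 allowed
real_attack = strength(2.11, LEAGUE_HOME_GOALS)
real_defence = strength(0.58, LEAGUE_AWAY_GOALS)

# Barcelona away: 1.79 scored, 1.16 allowed
barca_attack = strength(1.79, LEAGUE_AWAY_GOALS)
barca_defence = strength(1.16, LEAGUE_HOME_GOALS)

print(f"Real Madrid: attack {real_attack:.2f}, defence {real_defence:.2f}")
print(f"Barcelona:   attack {barca_attack:.2f}, defence {barca_defence:.2f}")
```

An attack strength above 1 means the team scores more than the league average; a defensive strength below 1 means it concedes less.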

 

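Now we can calculate the goal expectancy for Real Madrid (home) vs Barcelona (away) as the product of a team’s attack strength, the opponent’s defensive strength, and the matching league average (goals scored by home teams for the home side, goals allowed by home teams for the away side).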

Home goals = Home attack strength * Away defensive strength * Average season goals scored

Away goals = Away attack strength * Home defensive strength * Average season goals allowed

 

Real Madrid’s goal expectancy = 1.47 * 0.81 * 1.44 = 1.70 (computed from the unrounded strengths; the rounded factors shown give 1.71)

Barcelona’s goal expectancy = 1.72 * 0.56 * 1.04 = 1.00
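As a rough sketch, the goal expectancy formulas translate to Python as follows. The function name goal_expectancy is hypothetical. The strengths are kept unrounded here, which is what reproduces the 1.70 and 1.00 figures.

```python
def goal_expectancy(attack: float, opponent_defence: float, league_avg: float) -> float:
    """Expected goals = attack strength * opponent's defensive strength * league average."""
    return attack * opponent_defence * league_avg

LEAGUE_HOME_GOALS = 1.44   # league average goals scored by home teams
LEAGUE_AWAY_GOALS = 1.04   # league average goals allowed by home teams

# Real Madrid (home) attack vs Barcelona (away) defence
home_xg = goal_expectancy(2.11 / LEAGUE_HOME_GOALS, 1.16 / LEAGUE_HOME_GOALS, LEAGUE_HOME_GOALS)

# Barcelona (away) attack vs Real Madrid (home) defence
away_xg = goal_expectancy(1.79 / LEAGUE_AWAY_GOALS, 0.58 / LEAGUE_AWAY_GOALS, LEAGUE_AWAY_GOALS)

print(f"Real Madrid goal expectancy: {home_xg:.2f}")
print(f"Barcelona goal expectancy:   {away_xg:.2f}")
```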

 

These goal estimates give a rough idea of the expected number of goals and the possible final score, but they don’t provide much further information by themselves. However, they are the key input to the Poisson equation, which in turn gives probability estimates for different goal outcomes (see the next section).

It is also important to highlight that additional variables are often added to the calculations to account for the impact of external factors on the score, and ultimately to get more accurate Poisson probabilities and better predictions. We have discussed some of those factors in a previous article (Football Predictions Overview). For the sake of this article though, and in order to explain how the Poisson distribution works, we will stick with the goal expectancy estimates calculated above as input to the Poisson model.

 

Poisson Probabilities

We can easily calculate Poisson scoring probabilities in Microsoft Excel using the built-in POISSON.DIST function. The function accepts three arguments: the goal outcome we are looking for, the goal expectancy estimate (calculated earlier), and a logical flag that selects the calculation method (cumulative probability when TRUE, probability mass function when FALSE). Here’s how we calculate in Excel the probability of Real Madrid scoring 0, 1, 2, 3, etc. goals against Barcelona:

=POISSON.DIST(0, 1.70, FALSE)    ' probability of scoring 0 goals, equals 0.18

=POISSON.DIST(1, 1.70, FALSE)    ' probability of scoring 1 goal, equals 0.31

=POISSON.DIST(2, 1.70, FALSE)    ' probability of scoring 2 goals, equals 0.26

=POISSON.DIST(3, 1.70, FALSE)    ' probability of scoring 3 goals, equals 0.15

And here’s how we get the probability of Barcelona scoring x number of goals against Real:

=POISSON.DIST(0, 1.00, FALSE)    ' probability of scoring 0 goals, equals 0.37

=POISSON.DIST(1, 1.00, FALSE)    ' probability of scoring 1 goal, equals 0.37

=POISSON.DIST(2, 1.00, FALSE)    ' probability of scoring 2 goals, equals 0.18
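For readers working outside Excel, the same figures can be obtained with a small Python function. This is only a sketch that mirrors the three arguments described above; the name poisson_dist is hypothetical and is not an Excel or library API.

```python
import math

def poisson_dist(x: int, mean: float, cumulative: bool) -> float:
    """Mimics the three-argument Excel call: P(X = x), or P(X <= x) when cumulative."""
    def pmf(k: int) -> float:
        return math.exp(-mean) * mean**k / math.factorial(k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

# Real Madrid (goal expectancy 1.70) scoring 0 to 3 goals
for goals in range(4):
    print(f"Real Madrid scores {goals}: {poisson_dist(goals, 1.70, False):.2f}")

# Barcelona (goal expectancy 1.00) scoring 0 to 2 goals
for goals in range(3):
    print(f"Barcelona scores {goals}: {poisson_dist(goals, 1.00, False):.2f}")
```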

Now we can get the probability for any specific goal outcome by multiplying the calculated Poisson probabilities for each team. For example, the probability for Real Madrid (home) vs Barcelona (away) to end 2-1 would be:

Poisson probability Real Madrid scores 2 goals = 0.26 (26%)

Poisson probability Barcelona scores 1 goal = 0.37 (37%)

Poisson probability for Real-Barcelona 2-1 = 0.26 * 0.37 = 0.097 (9.7%, computed from the unrounded probabilities 0.264 and 0.368)

 

Similarly, we can build the following table to calculate the probability of each score combination for Real Madrid (vertical values) vs Barcelona (horizontal values). We include probabilities from 0 to 5 goals, as more than 5 goals for either team is very unlikely, but the table can be expanded as much as needed.
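A table of this kind can be generated with a few lines of Python. The sketch below uses the rounded goal expectancies 1.70 and 1.00, so individual cells may differ from the article’s figures in the fourth decimal place. The function names are hypothetical.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals given goal expectancy lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def score_matrix(home_xg: float, away_xg: float, max_goals: int = 5):
    """matrix[h][a] = P(home scores h) * P(away scores a), treating the two as independent."""
    home = [poisson_pmf(h, home_xg) for h in range(max_goals + 1)]
    away = [poisson_pmf(a, away_xg) for a in range(max_goals + 1)]
    return [[ph * pa for pa in away] for ph in home]

matrix = score_matrix(1.70, 1.00)

# Rows: Real Madrid goals; columns: Barcelona goals
print("      " + "".join(f"{a:>8}" for a in range(6)))
for h, row in enumerate(matrix):
    print(f"{h:>6}" + "".join(f"{p:8.4f}" for p in row))
```

Each cell is simply the product of the two teams’ scoring probabilities, which is the same multiplication used for the 2-1 example.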


According to the Poisson model, Real Madrid is most likely to score 1 goal (31%) or 2 goals (26%) against Barcelona, while Barcelona is most likely to score 1 goal (37%) or no goals at all (37%). The most probable score would be 1-0 (11.5%), then 1-1 (11.4%), and then 2-0 (9.7%).

We can also calculate the probability of a draw by adding the Poisson probabilities for 0-0, 1-1, 2-2, 3-3, 4-4, 5-5, etc., or the probability of a home win by adding 1-0, 2-0, 2-1, 3-0, 3-1, 3-2, etc. We can calculate the over 2.5 (O2.5) or under 2.5 (U2.5) goals probabilities and any other betting combination the same way.
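As an illustration of the over/under case, the sketch below sums every score with at most 2 total goals to get the under 2.5 probability, then takes the complement for over 2.5. The function names are hypothetical and the goal expectancies are the rounded 1.70 and 1.00 from earlier.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals given goal expectancy lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def under_probability(home_xg: float, away_xg: float, line: float = 2.5) -> float:
    """P(total goals < line): add up every score whose total stays below the line."""
    max_total = math.ceil(line) - 1          # largest total still under the line (2 for 2.5)
    total = 0.0
    for h in range(max_total + 1):
        for a in range(max_total - h + 1):   # keep h + a <= max_total
            total += poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
    return total

under = under_probability(1.70, 1.00)
over = 1.0 - under                           # complement, so no truncation at 5 goals

print(f"Under 2.5: {under:.3f}, Over 2.5: {over:.3f}")
```

Taking the complement for the over side avoids having to sum an open-ended list of high scores.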

Let’s calculate outcome probabilities for Real Madrid (home) vs Barcelona (away) based on Poisson figures:

 

Real Madrid win probability = 0.1145 + 0.0973 + 0.0971 + 0.0551 + 0.0550 + … = 0.5303 (53%)

Real-Barcelona draw probability = 0.0673 + 0.1143 + 0.0485 + 0.0091 + … = 0.2402 (24%)

Barcelona win probability = 0.0672 + 0.0336 + 0.0570 + 0.0112 + 0.0190 + … = 0.2209 (22%)
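The three sums above can be reproduced by looping over the 0 to 5 score grid. In this sketch the goal expectancies are computed from the unrounded team averages (an assumption that appears to match the four-decimal figures above); the function names are hypothetical.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals given goal expectancy lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def match_outcomes(home_xg: float, away_xg: float, max_goals: int = 5):
    """Return (home win, draw, away win) summed over scores from 0-0 to max_goals-max_goals."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win

home_xg = 2.11 * 1.16 / 1.44   # Real Madrid, about 1.70
away_xg = 1.79 * 0.58 / 1.04   # Barcelona, about 1.00

home_win, draw, away_win = match_outcomes(home_xg, away_xg)
print(f"Home win {home_win:.4f}, draw {draw:.4f}, away win {away_win:.4f}")
```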

 

Note that the sum of the probabilities is 99% and not 100%. That’s because we did not calculate probabilities for scores in which either team scores 6 or more goals; those make up the remaining 1%.
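A quick way to check this is to measure how much probability the score grid covers as it grows. The sketch below is illustrative only; the function name grid_mass is hypothetical and the goal expectancies are the rounded 1.70 and 1.00.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals given goal expectancy lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def grid_mass(home_xg: float, away_xg: float, max_goals: int) -> float:
    """Total probability of all scores from 0-0 up to max_goals-max_goals.

    Because the two teams are treated as independent, the sum over the grid
    equals the product of each team's probability of scoring at most max_goals.
    """
    home = sum(poisson_pmf(h, home_xg) for h in range(max_goals + 1))
    away = sum(poisson_pmf(a, away_xg) for a in range(max_goals + 1))
    return home * away

for max_goals in (5, 7, 10):
    print(f"Scores up to {max_goals} goals cover {grid_mass(1.70, 1.00, max_goals):.4f}")
```

Expanding the table pushes the covered probability towards 1, which is why the missing share shrinks as more goal outcomes are included.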

It is important to highlight that the model has some limitations that need to be taken into consideration. The Poisson model assumes that events (goals) occur independently, which is usually not the case, as teams can change strategy or playing style after scoring or conceding. Another assumption of the Poisson distribution is that events occur at a known constant mean rate, here the expected number of goals. As explained earlier, that rate can change due to various factors, which should be included in the calculations when possible. Despite these limitations, the Poisson model is a powerful tool that can help make better predictions and improve betting strategies in football.

