Predictive Modeling: Who Will Win UEFA 2024?

Artificial Intelligence
Jun 26, 2024
Predictive Modeling: Who Will Win UEFA 2024?

At Luzmo, we love regularly taking out our toolbox and hacking together data-driven apps. And rumor has it that the Luzmo team shares another passion besides data… We loooove football (or soccer - for the Americans). ⚽️

With the UEFA 2024 championship happening this year, of course we weren’t going to pass up the opportunity to build something fun on top of all that football data. What started as “a fun idea for a social media campaign” quickly escalated to our founders coding an app together.

The result is a Euro2024 AI pundit that predicts the final score for each upcoming game, winning odds for each team in every stage of the championship, based on historical data and performance. All of that rich data is visualized in interactive charts and commented upon by our witty resident octopus.

We didn’t expect the app to explode, with thousands of visitors in less than a week, and coverage on multiple press outlets. So today, we’re giving you a closer look behind the scenes, and show you how we built the predictive model and its resulting app.

A closer look at UEFA 2024 through data visualization

For this app, we used 3 different data sources:

  • Performance of each team and its players during the qualifier rounds (through UEFA)
  • Market value estimations for each player in the squads (through Transfermarkt)
  • EA Sports FC24 performance ratings of all players in the squads (through EA)

We added a simple calendar interface of all games, which lets you browse through a bunch of statistics related to the championship, updated in real time.

UEFA 2024 app calendar

Let’s have a closer look at the insights you can find here.

Head to head

In the “head to head”, you can compare how two competing teams stack up against each other. Compare their win probability, market value, player rankings and more to determine which team has the best scorecard to win the game.

Embedded data visualizations on UEFA 2024 championship

You can also analyze more specific metrics regarding attack, defense and goalkeeping performance by browsing the different tabs. We used Luzmo to create the data visualizations and embed them in our Euro 2024 application.

To top it off, each head-to-head has a prediction for the final score, proclaimed by the all-seeing eye of our AI octopus, but more on that later!

Individual player stats

While browsing the head-to-head comparison, you can further drill down into player stats. For example, let’s say you’re looking at goal stats for France. You can click on Mbappé (or any other player) to open up a separate dashboard with individual stats for that player.

Kylian Mbappé player stats, dashboard visualized in Luzmo

Winning predictions

In the above visualization, we presented the facts: historical data about player and team performance. But the beauty of a football tournament is in the speculation. Football is a sport with high variability, because goals are relatively rare (~2.8 goals per match during UEFA 2020). Even clearly advantaged teams regularly lose against a lesser side.

To predict who will win the championship, we used historical data in a predictive model that estimates the odds for each team to advance in each stage of the championship. 

Predictive modeling on winning odds for UEFA 2024

Besides data from the qualifiers, we’re updating the data and model every game day, which means the odds will shift as the tournament progresses. With a slider at the top, you can see how the odds for your favorite team are progressing during the tournament. For example:

  • Belgium’s odds of winning the group rounds reduced drastically when they lost their first game on June 17
  • By June 23, Belgium’s convincing performance and Ukraine’s win on Slovakia put Belgium in the top spot once again

With each game, we get more information, which will shift the odds for any team. Let’s have a closer look at the mechanics behind this predictive model.

Predictive modeling for UEFA 2024

Our predictive model relies on two main data points:

This means our model is heavily weighted towards recent performances, which causes a few surprises. For example, it initially hurt winning probabilities for fan favorites like England and Italy, as they didn’t perform great during the qualifiers. 

Match predictions

For each team, we use the play-by-play data to calculate normalized defensive, midfield, offensive and keeper ratings. We then calculate an expected goals (xG) Poisson distribution of a team against every other team.

Each position rating affects every other position, but to varying degrees. For example:

  • Defense affects offense greatly
  • Offense only has a small effect on opposite offense: if one side dominates, the other one likely has to assist in defensive tasks

We then calculate win/draw/loss and per-scoreline chances from the 2 Poisson distributions. Finally, we also apply a correction factor to align the overall average goals per game and draw percentage with historic data.

Tournament predictions

While we can use raw probability distributions for match probabilities, this is far less evident for tournament progression odds, because of the complex progression rules.

Instead, we use Monte Carlo simulation to play out the entire tournament 1 million times and tabulate how far each team progresses. For example, while France wins in 22 out of every 100 simulations, Georgia only ended up taking the win in 7 of the million simulations.

Monte Carlo methods are computationally less efficient, but lead to a richer 'data bycatch'. For example, we can also tabulate:

  • the most likely opponents
  • how likely a team is to reach the round of 16 as one of the best thirds
  • or even how many goals it's expected to score in the group stages conditional on becoming Champions

It's also easier to model effects like fatigue, and measure the sensitivity of the model to these. In our model, a team that has to play highly rated teams does build up some fatigue that lowers their ratings in later games. This effect is smaller if the team has a high-quality bench, as they can more easily cycle through players to rest them. The tournament is well-balanced though, as this only shifted chances by up to 2 percent.

Data analysis: using LLMs for pundit commentary

We now have our probability calculations, but of course we wanted to present them in a fun and engaging way. That’s where large language models come in.

AI pundit by Luzmo predicting UEFA 2024 scores

We used GPT-4-Turbo to interpret the position ratings and probability distribution. As we feed that data into the LLM, it understands the relative weaknesses and strengths of the various teams, and can identify surprising or outlier results. We chose to use GPT-4-Turbo because it proved to be superior at following instructions to GPT-4o.

We use the LLM to deliver a pundit’s commentary in a witty, humorous tone of voice. It’s a great example of how you can leverage generative AI to summarize insights from your data.

Wrapping up

Our Euro2024 app is just an example of how you could use predictive modeling, embedded analytics and generative AI to build a next-gen data product in only a few days.

Besides placing some sports bets with friends, you likely won’t use this particular app for any life-changing decisions. However, any developer could take the same building blocks, and build AI-powered insights for their software users. To name a few examples:

  • Build sales forecasting charts into your CRM application
  • Visualizing user behavior data and making churn predictions in a customer success platform
  • Predict energy demand based on historical usage in energy management software

In the context of your software application, these insights will empower your customers to make important business decisions faster. And more value for your customers means more loyal, happy users for your software.

Interested in adding AI-powered analytics to your software application? Look no further than Luzmo for stunning, interactive data visualizations. Hook up our platform to any AI model of your choice, and start building the analytics experience your customers desire.

Book a demo with our product experts today to learn more!

Haroen Vermylen

Haroen Vermylen

CTO and Founder

Haroen Vermylen is CTO and Founder at Luzmo. With over 12 years of experience and a degree in artificial intelligence, he's a rooted expert in data visualization and business intelligence. Passionate about technology and data, he loves building new data products and writes about them on the Luzmo blog.

Build your first embedded dashboard in less than 15 min

Experience the power of Luzmo. Talk to our product experts for a guided demo  or get your hands dirty with a free 10-day trial.