Dun & Bradstreet Group Analytics has joined forces with local analytics teams to see if data science and machine learning can be used to predict which national team will win football’s biggest competition. By leveraging all historical data about national team games over the past 4 years, Dun & Bradstreet has developed a model that estimates the probability of a win, draw or loss, as well as goal difference for any future game between the national teams involved in the competition.
“Our first objective was to develop a model able to predict the result of a single game, based on the teams’ characteristics,” says Pierre Deville, Head of Data Science and Analytics at Group Analytics. “The second objective was to determine the most likely scenario for the competition and other derived statistics by running large-scale simulations, considering the specifics of the tournament.”
The first model was based on historical match data, including the type of tournament, the location of the game and the score. Team ratings were fed into an advanced machine learning model that uses a technique known as eXtreme Gradient Boosting to calculate the probable outcomes of each game. Using this predictive model, Dun & Bradstreet Group Analytics then ran simulations of the games actually scheduled in the competition.
“By generating millions of simulations, we derived probabilistic information associated with any team reaching any stage of the tournament,” says Goran Loncar, current Director of Group Analytics. “Our approach not only allowed us to estimate the probability of a team reaching a certain stage, but it also provided the most likely scenario for the entire tournament, as well as the overall chance for each team to lift the cup.”
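The simulation idea can be illustrated with a minimal Python sketch. It is not Dun & Bradstreet's implementation. The team names and ratings are invented, the `win_probability` function is a simple Elo-style stand-in for the trained gradient boosting model, and the tournament is reduced to a single-elimination bracket with no group stage or draws. The point is only to show how repeating a random tournament many times turns single-game probabilities into an estimate of each team's chance of lifting the cup.

```python
import random
from collections import Counter

# Hypothetical teams and ratings, invented purely for illustration.
RATINGS = {"Team A": 1900, "Team B": 1850, "Team C": 1800, "Team D": 1750}


def win_probability(rating_a, rating_b):
    """Stand-in for the trained model: an Elo-style logistic curve
    giving the probability that team A beats team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def play_knockout(teams, ratings, rng):
    """Simulate one single-elimination bracket and return the champion.

    Assumes the number of teams is a power of two and that adjacent
    teams in the list meet each other in each round.
    """
    remaining = list(teams)
    while len(remaining) > 1:
        next_round = []
        for i in range(0, len(remaining), 2):
            a, b = remaining[i], remaining[i + 1]
            p = win_probability(ratings[a], ratings[b])
            next_round.append(a if rng.random() < p else b)
        remaining = next_round
    return remaining[0]


def estimate_title_chances(ratings, n_simulations=10_000, seed=0):
    """Estimate each team's chance of winning by repeated simulation."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    teams = list(ratings)
    wins = Counter(
        play_knockout(teams, ratings, rng) for _ in range(n_simulations)
    )
    return {team: wins[team] / n_simulations for team in teams}


if __name__ == "__main__":
    chances = estimate_title_chances(RATINGS)
    for team, p in sorted(chances.items(), key=lambda kv: -kv[1]):
        print(f"{team}: {p:.1%}")
```

The same loop could count how often each team reaches the semi-final or final rather than only the title, which is the kind of per-stage probability described in the quote above.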
Dun & Bradstreet used machine learning, the subfield of computer science that “gives computers the ability to learn without being explicitly programmed”. Having evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning uses algorithms that can learn from and make predictions based on data.
Dun & Bradstreet used data mining, an analysis technique that focuses on detecting patterns and trends in data, to feed the predictive model and rank the teams.