A project by E. Deusebio, L. Johnnes, A. Marchini, M. Menichinelli
About the dataset
The dataset consists mainly of raw data collected during a single football match. In particular, it includes:
/ Output of the Optical Tracking System used;
/ Sampled positions of players, ball and referees, time-stamped to the millisecond;
/ About 15 samples per second;
/ About 180k samples per match.
In addition, some relational databases were provided, containing information about the match refined by the Data Sponsor (ball touches, match whistles, …).
Challenge provided by the Data Sponsor
The Data Sponsor is very interested in data analysis and is aware of the large potential still hidden in the collected data. Some of the challenges proposed to the Divers are:
/ Automatic Match analysis;
/ Real-time predictive analysis;
/ Pattern recognition;
/ Quality check;
/ Performance analysis and index;
/ Match “energy level”;
/ “Data Noise” filtering;
/ Triggering alerts on statistics.
Execution by Divers
After an initial brainstorming session, the Divers team defined two sub-projects:
A) The first was a form of predictive analysis. In particular, two models were investigated:
a. A predictive model that aims to anticipate the highlights of the football match.
b. A model that would help the tracking system prevent errors in tracking the ball or the players.
B) The second sub-project (developed considerably further than the first) was a study of the match as a social network. After all, football is a team sport, so interactions among the players are crucial.
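As an illustration of the kind of check that sub-project A(b) points at, the following is a minimal sketch, not the model the team investigated. It flags position samples that would imply a physically implausible speed between consecutive readings. The sample layout `(t_ms, x_m, y_m)`, the function name and the 12 m/s threshold are all assumptions made for this example.

```python
from math import hypot

# Assumed upper bound on a player's speed, in metres per second (hypothetical value).
MAX_PLAYER_SPEED = 12.0


def flag_implausible_samples(samples, max_speed=MAX_PLAYER_SPEED):
    """Return the indices of samples that imply a speed above max_speed.

    samples: list of (t_ms, x_m, y_m) tuples for one tracked object,
    sorted by timestamp (hypothetical layout, not the sponsor's schema).
    """
    flagged = []
    if not samples:
        return flagged
    last_good = samples[0]  # the first sample is trusted by construction
    for i in range(1, len(samples)):
        t, x, y = samples[i]
        t0, x0, y0 = last_good
        dt = (t - t0) / 1000.0  # milliseconds to seconds
        if dt <= 0:
            # Duplicate or out-of-order timestamp: treat as suspicious.
            flagged.append(i)
            continue
        speed = hypot(x - x0, y - y0) / dt
        if speed > max_speed:
            flagged.append(i)
        else:
            # Only plausible samples become the new reference point.
            last_good = samples[i]
    return flagged


# Roughly 15 samples per second, with one jump of about 30 m in 66 ms.
track = [(0, 0.0, 0.0), (67, 0.5, 0.0), (133, 30.0, 0.0), (200, 1.5, 0.0)]
suspicious = flag_implausible_samples(track)
```

Comparing each sample with the last plausible one, rather than with its immediate predecessor, keeps a single outlier from also tainting the sample that follows it.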
A web dashboard was developed as the final output. This dashboard might help managers or coaches understand in more depth why one of the teams was so superior to the other in the match studied. Some interesting features of this dashboard were:
/ the chance to select part (time window) of the match;
/ the chance to see the position of the players on the field;
/ the chance to see the most common patterns (1 or more passes);
/ the chance to see the most common patterns from the perspective of one particular player;
/ the chance to see how strongly the players are interconnected with each other;
/ the chance to run a kind of “community detection” inside the team;
/ the chance to see the quality of a player (based on the number of balls he gains versus the number of balls he loses);
/ the chance to see the areas of the field where most interactions occur.
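Several of the features above (time-window selection, common passing patterns, balls gained versus lost) can be derived from a time-ordered list of ball touches. The sketch below shows one possible way to do it and is not the dashboard's actual implementation. The touch layout `(t_ms, team, player)`, the function names, and the assumption that player identifiers are unique across both teams are all hypothetical.

```python
from collections import Counter


def filter_window(touches, start_ms, end_ms):
    """Keep only the touches whose timestamp lies in [start_ms, end_ms)."""
    return [t for t in touches if start_ms <= t[0] < end_ms]


def pass_network(touches):
    """Count passes as consecutive touches by two different team-mates.

    touches: list of (t_ms, team, player) tuples sorted by time
    (hypothetical layout; player ids are assumed unique across teams).
    Returns a Counter mapping (passer, receiver) to the number of passes.
    """
    passes = Counter()
    for (_, team_a, p_a), (_, team_b, p_b) in zip(touches, touches[1:]):
        if team_a == team_b and p_a != p_b:
            passes[(p_a, p_b)] += 1
    return passes


def gained_lost(touches):
    """Count, per player, balls gained from and lost to the other team."""
    gained, lost = Counter(), Counter()
    for (_, team_a, p_a), (_, team_b, p_b) in zip(touches, touches[1:]):
        if team_a != team_b:
            lost[p_a] += 1    # previous toucher lost possession
            gained[p_b] += 1  # next toucher won it
    return gained, lost


# Made-up touches for illustration only.
touches = [
    (0, "H", "H1"), (1000, "H", "H2"), (2000, "H", "H1"),
    (3000, "H", "H2"), (4000, "A", "A5"), (5000, "H", "H1"),
]
network = pass_network(touches)
top_pattern = network.most_common(1)
gained, lost = gained_lost(touches)
first_half_network = pass_network(filter_window(touches, 0, 2500))
```

The resulting counter is a weighted directed edge list, so it could be passed to a graph library for interconnection measures or community detection.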
/ Unfortunately, the match was not very representative because of the final score (very unbalanced between the two teams).
/ Predictive analysis has high potential, but the results obtained during the project need more time and work before they can be used.
/ Social network analysis provided a lot of interesting information. Of course, a larger dataset covering different matches could help in obtaining more representative results.
A global leader for the sport business: digital, mobile, social, broadcast, results, content and professional services.
/ ENRICO DEUSEBIO
/ LIONEL JOHNNES
/ ANDREA MARCHINI
/ MASSIMO MENICHINELLI