Against all odds: What led to the most surprising results in the Bundesliga
This workflow analyzes data from the 2024/2025 season of the German soccer league (Bundesliga) and examines which factors might contribute to an unexpected match result.
The data used for this analysis are:
Betting market predictions and match statistics (provided by https://www.football-data.co.uk/)
The total market values of the teams at each match day (retrieved from https://www.transfermarkt.com/)
In the workflow, the data is first read and pre-processed, and then various new features are created (feature engineering). Finally, a logistic regression is fitted to identify the factors that influence the target column "upset" (= whether a match ended in the least likely outcome).
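To make the "upset" target more concrete, here is a minimal Python sketch of how such a column could be derived from betting odds. It is not taken from the workflow itself. It assumes decimal odds in the columns B365H, B365D and B365A and the full-time result in FTR (H, D or A), as in the football-data.co.uk CSV files. The proportional normalization of the odds, the function names and the two example matches are illustrative assumptions.

```python
def implied_probabilities(odds_home, odds_draw, odds_away):
    """Convert decimal odds into probabilities that sum to 1.

    Assumption: the bookmaker margin is removed by simple proportional
    normalization; the workflow may handle it differently.
    """
    raw = {"H": 1 / odds_home, "D": 1 / odds_draw, "A": 1 / odds_away}
    total = sum(raw.values())
    return {outcome: p / total for outcome, p in raw.items()}


def is_upset(odds_home, odds_draw, odds_away, result):
    """Return True if the actual result was the least likely outcome."""
    probs = implied_probabilities(odds_home, odds_draw, odds_away)
    # On an exact tie, min() keeps the first outcome in H, D, A order.
    least_likely = min(probs, key=probs.get)
    return result == least_likely


# Made-up example rows, not real Bundesliga data.
matches = [
    {"B365H": 1.30, "B365D": 5.50, "B365A": 9.00, "FTR": "A"},
    {"B365H": 2.10, "B365D": 3.40, "B365A": 3.60, "FTR": "H"},
]

labels = [
    is_upset(m["B365H"], m["B365D"], m["B365A"], m["FTR"]) for m in matches
]
print(labels)
```

In the first row the away win has the highest odds and actually occurred, so it counts as an upset. In the second row the home side won although the away win was the least likely outcome, so it does not. A binary column built this way could then serve as the target of a logistic regression.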
This project was carried out by Marta Couto, who completed a 3-month internship at KNIME as part of the KNIME Students Challenge program.