Abstract
Objective This paper presents an exploratory case study focusing on the applicability and value of process mining in a professional sports healthcare setting. We explore whether process mining can be retrospectively applied to readily available data at a professional sports club (Football Club Barcelona) and whether it can be used to obtain insights related to care flows.
Design Our study used discovery process mining to detect patterns and trends in athletes’ Post-Pre-Participation Medical Evaluation injury route, encompassing five phases for analysis and interpretation.
Results We examined preprocessed data in event log format to determine the injury status of athletes in respective baseline groups (healthy or pathological). Our analysis found a link between thigh muscle injuries and later ankle joint problems. The process model found three loops with recurring injuries, the most common of which were thigh muscle injuries. There were no differences in injury rates or the median number of days to return to play between the healthy and pathological groups.
Conclusions This study explored the applicability and value of process mining in a professional sports healthcare setting. We established that process mining can be retrospectively applied to readily available data at a professional sports club and that this approach can be used to obtain insights related to sports healthcare flows.
- Sporting injuries
- Data Science
- Recurrent
- Sports & exercise medicine
Data availability statement
The data supporting the conclusions of this article can be requested from the corresponding author upon reasonable request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Delivering athlete healthcare in professional sports has proven challenging because the setting is a dynamic, multidisciplinary environment that poses specific challenges. Process mining has emerged in healthcare as an effective tool for interpreting and improving complicated care processes. Its application has led to improvements in, among other things, the quality of care, patient safety, patient satisfaction and resource optimisation.
WHAT THIS STUDY ADDS
Our study demonstrates the feasibility of employing exploratory process mining to extract valuable insights from readily available injury data in a professional sports setting.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
Understanding injury pathways could provide insights into where, to whom, and when we should focus resources. In doing so, process mining might help navigate some of the known implementation barriers in sports.
Introduction
In professional sports, injuries constitute a major concern, with severe repercussions for athlete performance, career longevity and team success.1–3 Hence, professional sports organisations’ capacity to prevent and respond to injuries proactively is of great significance. Over the years, research has provided a vast library of evidence to support and protect athletes’ health.4 However, using this evidence effectively in professional sports has proven challenging,5–7 in part because the setting is a dynamic, multidisciplinary environment that poses specific challenges.8–10
Recent studies have explored the role of sports medicine professionals as members of a high-performance support team.11–14 This work has outlined that healthcare in professional sports is complex and requires flexible athlete care processes. This makes ‘process mining’ an interesting opportunity to explore in professional sports healthcare. Process mining aims to improve operational processes (ie, processes with repeated execution of activities to deliver products or services) through a data-driven systematic analysis of event data.15 The outcome is a detailed identification of trends, patterns and bottlenecks of processes in practice.15–17 Process mining methods can encompass retrospective analyses, such as identifying the origins of a bottleneck in a process, and prospective analyses, such as forecasting the remaining processing time of an ongoing case or offering suggestions to reduce the incidence of failures. Both retrospective and prospective analyses have the potential to prompt interventions, such as implementing measures to resolve performance or compliance issues.
Process mining has been successfully applied in many fields, including healthcare, where it has emerged as an effective tool for interpreting and improving complicated care processes.18–20 These outcomes have improved, among other things, the quality of care, patient safety, patient satisfaction and resource optimisation.19–22 Given such positive results, exploring the applicability and value of process mining in a professional sports healthcare setting is only reasonable. Understanding injury pathways could provide insights into where, to whom, and when we should focus resources. In doing so, process mining might help navigate some of the known implementation barriers in sports.8 23–25
In this paper, we present an exploratory case study focusing on the applicability and value of process mining in a professional sports healthcare setting. We explore whether process mining can be retrospectively applied to readily available data at a professional sports club (Football Club Barcelona; FCB) and whether it can be used to obtain insights related to care flows. Specifically, we were interested in investigating the relationship between the presence of clinical antecedents when players entered the club and subsequent injury risk, injury severity and return to play (RTP).
Methods
We used ‘discovery process mining’ for our study. This technique uses data mining to discover patterns, trends and insights about an existing process; in our study, this process is the longitudinal injury pathway of athletes after a Pre-Participation Medical Evaluation (PPME). Our discovery process mining entailed five phases. Instead of providing an overarching methods section, we describe the methodology for each phase below. The first four phases are graphically summarised in figure 1. The fifth phase concerns analysing and interpreting the discovered model, leading to suggestions to improve care processes.
Phase 1—data collection
In this first phase, we collected relevant data from the FCB Medical Services athlete management system, which covers 12 different sports (COR Database V.1.0 FCB. Spain).26 This system electronically records all PPME data, periodic check-ups and pathological episodes of illnesses and injuries. The OSICS V.10 code was used to code all episodes in the system.27 Past encodings in ICD-9 and ICD-10 were transferred to OSICS V.10 to standardise the data. Our current analysis used data from all 2574 athletes over 25 months: from 1 January 2008 to 1 March 2020 (the start of the COVID-19 pandemic) (table 1). The findings of this observational study have been reported in conformity with the Declaration of Helsinki.28 Before entering the club, all athletes completed a consent form.
Outcome measures
The athlete’s health status at the PPME is the independent variable in our further analyses. Athletes were considered healthy when no clinical antecedents were detected in the anamnesis at the PPME (‘Healthy’ group; n=1020). Athletes were assigned to the ‘Pathology’ group when they had a clinical history at the time of club entrance (n=1534).26 The dependent variable in the analysis was future injury.
We collected clinical information and data related to the type of injury, time loss (TL) or medical attention (MA) and the time to RTP. These metrics were obtained based on consensus definitions and data collection procedures suggested by the Union of European Football Associations.29 30 TL injuries included any injury during a training session or match that caused the player to be absent for at least the next training session or match. In contrast, MA injuries did not result in TL from training sessions and scheduled matches. We calculated RTP as the recovery time (in days) from the day of the injury until the player returned to training or competition after TL. Finally, we defined recurrent injury as any injury of the same type and at the same anatomic location as a previous injury in the same individual within 2 months after RTP.
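As an illustration of these definitions, the sketch below computes RTP in days and flags recurrent injuries for a single athlete. It is a minimal sketch under stated assumptions, not the procedure used in the study: the field names (`location`, `injury_type`, `injury_date`, `return_date`) are hypothetical, the episodes are invented, and ‘within 2 months’ is approximated as 60 days because the text gives no day count.

```python
from datetime import date, timedelta

# Assumption: "within 2 months after RTP" is approximated as 60 days.
RECURRENCE_WINDOW = timedelta(days=60)


def rtp_days(injury_date, return_date):
    """Recovery time in days from the injury date until return to play."""
    return (return_date - injury_date).days


def flag_recurrences(episodes):
    """Flag recurrent injuries for ONE athlete.

    An episode is recurrent when an earlier episode of the same type at the
    same location ended (RTP) no more than RECURRENCE_WINDOW before it began.
    Returns one boolean per episode, in chronological order.
    """
    ordered = sorted(episodes, key=lambda e: e["injury_date"])
    last_return = {}  # (location, injury_type) -> most recent return date
    flags = []
    for ep in ordered:
        key = (ep["location"], ep["injury_type"])
        previous_return = last_return.get(key)
        gap_ok = (
            previous_return is not None
            and timedelta(0) <= ep["injury_date"] - previous_return <= RECURRENCE_WINDOW
        )
        flags.append(gap_ok)
        last_return[key] = ep["return_date"]
    return flags


# Invented episodes for one hypothetical athlete.
episodes = [
    {"location": "Thigh", "injury_type": "muscle",
     "injury_date": date(2019, 1, 10), "return_date": date(2019, 1, 31)},
    {"location": "Thigh", "injury_type": "muscle",
     "injury_date": date(2019, 2, 20), "return_date": date(2019, 3, 15)},
    {"location": "Ankle", "injury_type": "joint",
     "injury_date": date(2019, 6, 1), "return_date": date(2019, 6, 10)},
]

flags = flag_recurrences(episodes)
first_rtp = rtp_days(episodes[0]["injury_date"], episodes[0]["return_date"])
```

In this invented example, the second thigh muscle injury starts 20 days after the first RTP and is therefore flagged as recurrent, whereas the ankle injury is not.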
Phase 2—preprocessing
Before further analysing the data, the database had to be preprocessed to ensure its quality and usability. As we found 21 981 records for 2559 individual pathological cases in our database, it was necessary to filter the data to derive useful findings. There were 1100 different OSICS V.10 codes among all accessible cases. We therefore concentrated on the four most common OSICS code categories of location (Thigh, Knee, Foot and Ankle) combined with the four most common injury types (muscle, tendon, haematoma and joint). This yielded a data set for analysis of 1439 athletes (815 in a Pathology Group; PG, and 624 in a Healthy Group; HG).
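The filtering step can be sketched as follows. This is an illustrative sketch only: it assumes each record has already been decoded from its OSICS V.10 code into hypothetical `location` and `injury_type` fields (the code-to-category mapping is not given in the text), and the sample records are invented.

```python
# Categories retained for analysis, as named in the text.
LOCATIONS = {"Thigh", "Knee", "Foot", "Ankle"}
INJURY_TYPES = {"muscle", "tendon", "haematoma", "joint"}


def keep_record(record):
    """Keep a record only if both its location and injury type are retained."""
    return record["location"] in LOCATIONS and record["injury_type"] in INJURY_TYPES


# Invented records with hypothetical field names.
records = [
    {"athlete_id": "A1", "location": "Thigh", "injury_type": "muscle"},
    {"athlete_id": "A2", "location": "Shoulder", "injury_type": "joint"},
    {"athlete_id": "A3", "location": "Knee", "injury_type": "fracture"},
    {"athlete_id": "A4", "location": "Ankle", "injury_type": "joint"},
]

filtered = [r for r in records if keep_record(r)]
kept_ids = [r["athlete_id"] for r in filtered]
```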
Phase 3—event log creation
In this step, we transformed the preprocessed data into an event log format for further analysis. An event log reflects how a process has been executed and describes the sequence of activities that were performed, when they were executed, and by whom and for whom.31 Specifically, for our study, this step described the injury status and severity of athletes from their designated baseline groups (HG or PG) through follow-up. Both groups had similar dynamics, distribution and data flow during follow-up (figure 2).
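To make the event log format concrete, the sketch below turns injury records into events with a case identifier (the athlete), an activity (the injury category) and a timestamp, and then groups them into one chronological trace per athlete. It is a minimal sketch under assumptions: the field names and records are hypothetical, and the actual log was prepared for Disco rather than with this code.

```python
from collections import defaultdict
from datetime import date


def build_event_log(records):
    """Convert injury records into events sorted by case and timestamp."""
    events = [
        {
            "case_id": r["athlete_id"],
            "activity": f'{r["location"]} {r["injury_type"]}',
            "timestamp": r["injury_date"],
            "group": r["group"],  # baseline group: "HG" or "PG"
        }
        for r in records
    ]
    events.sort(key=lambda e: (e["case_id"], e["timestamp"]))
    return events


def build_traces(event_log):
    """Group a sorted event log into one ordered activity list per case."""
    traces = defaultdict(list)
    for event in event_log:
        traces[event["case_id"]].append(event["activity"])
    return dict(traces)


# Invented records for two hypothetical athletes.
records = [
    {"athlete_id": "A1", "group": "HG", "location": "Ankle",
     "injury_type": "joint", "injury_date": date(2015, 3, 1)},
    {"athlete_id": "A1", "group": "HG", "location": "Thigh",
     "injury_type": "muscle", "injury_date": date(2014, 5, 1)},
    {"athlete_id": "A2", "group": "PG", "location": "Knee",
     "injury_type": "joint", "injury_date": date(2016, 9, 12)},
]

event_log = build_event_log(records)
traces = build_traces(event_log)
```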
Healthy group
In the HG, we registered 2511 injury episodes in 624 athletes. Injury cases were open for an average of 25.2 (SD±12.1) days. Of these episodes, 1447 injuries in 542 athletes resulted in an average of 21.2 (SD±9.3) days of TL.
Pathology group
The PG included 2877 injury episodes in 815 athletes. Injury cases were open for an average of 24.9 (SD±10.6) days. Of these episodes, 1636 injuries in 689 athletes led to an average of 19.5 (SD±10.7) days of TL.
Phase 4—process model discovery
Using the process mining software Disco by Fluxicon,32 33 we analysed the event log to discover our process model. Using its mining algorithm, we filtered data progressively to establish injury patterns. The outcome of this process is graphically presented in a Directly-Follows Graph in figure 3. The figure is primitive in that it presents all athletes’ data, regardless of their baseline PPME outcome. We can, as such, only observe pathways of injury. We also note that these outcomes are representative only of our club, as these pathways arise under the influence of the care we provide to the athletes. Consequently, these results cannot be extrapolated to other settings or situations outside our club or study.
Figure 3 highlights a pathway between thigh muscle injuries and subsequent ankle joint injuries. From the 1944 thigh muscle injuries recorded as the primary injury, we distinguished two main ‘fluxes’: one back to thigh muscle injury (n=821) and one to ankle joint injury (n=239). That is, just less than half of all thigh muscle injuries (821/1944, about 42%) were linked to recurrent thigh muscle injuries, and about one-eighth (239/1944, about 12%) to subsequent ankle joint injury. Similarly, for ankle joint injuries, out of 1271 cases, 236 were linked to subsequent thigh muscle injury and 322 to ankle joint injury. No fluxes were established from knee joint injuries. We must reiterate that these fluxes do not imply causal relationships; they only show the injury pathway of athletes.
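The counts on the arcs of a Directly-Follows Graph can be illustrated with a few lines of code: for every athlete, each pair of consecutive injuries adds one to the count for that pair. The sketch below is a generic illustration of this idea on invented traces; it is not the algorithm implemented in Disco, and the athlete identifiers are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            counts[(a, b)] += 1
    return counts


# Invented traces: one chronological injury sequence per hypothetical athlete.
toy_traces = {
    "A1": ["Thigh muscle", "Thigh muscle", "Ankle joint"],
    "A2": ["Ankle joint", "Thigh muscle"],
    "A3": ["Thigh muscle", "Ankle joint", "Ankle joint"],
}

dfg = directly_follows(toy_traces)
for (source, target), n in sorted(dfg.items()):
    print(f"{source} -> {target}: {n}")
```

A self-loop such as `Thigh muscle -> Thigh muscle` corresponds to a flux back to the same injury, while `Thigh muscle -> Ankle joint` corresponds to a flux towards a different injury. As in the text, such counts describe sequence only and say nothing about causation.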
Our process model identified three loops with recurrent injuries that require further attention (table 2). The most commonly recurring injuries were thigh muscle injuries (362 cases; rate 6.7%), followed by ankle joint injuries (279 cases; rate 5.2%) and knee joint injuries (102 cases; rate 1.9%). For none of these injuries did we find a difference between the HG and PG in injury rates or in the median number of days to RTP. Our process model revealed a difference in the cumulative duration of thigh muscle injuries (HG 22.3 years vs PG 33.1 years), but this is because the number of thigh injuries in the PG was larger than in the HG (362 vs 167). Such differences were not found for the ankle and knee.
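The reported recurrence rates can be reproduced with simple arithmetic. The denominator is not stated explicitly in the text; the sketch below assumes it is the total number of injury episodes in both groups (2511 in the HG plus 2877 in the PG), an assumption that happens to reproduce the reported percentages.

```python
# Assumption: rates are expressed relative to all injury episodes (HG + PG).
total_episodes = 2511 + 2877  # = 5388

recurrent_cases = {
    "Thigh muscle": 362,
    "Ankle joint": 279,
    "Knee joint": 102,
}

# Recurrence rate per injury category, in percent, rounded to one decimal.
rates = {
    injury: round(100 * cases / total_episodes, 1)
    for injury, cases in recurrent_cases.items()
}

for injury, rate in rates.items():
    print(f"{injury}: {rate}%")
```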
Phase 5—optimisation
The main goal of process mining is to unravel processes’ performance, reveal bottlenecks and find areas for improvement.15 18 21 31 As mentioned, for our study, we were interested in the injury pathway of athletes from entry at FCB through their career at our club. We were particularly interested in whether the pathways of athletes with clinical antecedents, as assessed through our PPME, differed from those of other athletes. On this point, we can be brief: during their stay at our club, the injury pathways of the HG and PG did not differ. Hence, current care processes appear suited to both groups of athletes.
The high recurrence rates for specific injuries are of interest; in our process, these were related to the thigh, ankle and knee. Our results also showed relationships and trends between different injuries, with injuries frequently followed by injuries in other anatomical regions. For example, we found a bidirectional pattern between posterior thigh muscle injuries and ankle joint injuries. Following these results, the central actions in optimising our care process would focus on avoiding injury recurrences and preventing subsequent injuries.34–36
Discussion
This study explored the applicability and value of process mining in a professional sports healthcare setting. Process mining has proven to be a valuable tool for extracting meaningful insights from large and seemingly chaotic datasets, including in healthcare settings.18 We established that process mining can be retrospectively applied to readily available data at a professional sports club and that this approach can be used to obtain insights related to sports healthcare flows.
We should emphasise that these outcomes only say something about the process, not the relationships established in this process. This holds especially true for athletic injury and health data like ours, where previous research has established certain risk factors, like previous injury, that predispose to future injury.37–41 For example, our study focused on the relationship between clinical antecedents when players entered the club and subsequent injury risk, injury severity and RTP. We did not find any such relationships in our processes. That is not to say these athletes do not have an increased risk of injury; we can only conclude that this relationship was not found in the healthcare structure of FCB. Process mining is an approach that evaluates an operational process in which decisions are made and, in our case, healthcare is provided. A multidisciplinary philosophy strongly guides FCB Medical Services’ medical care for our athletes. Once an athlete entered FCB, individualised care was provided when the PPME detected the presence of clinical antecedents. As a result, the medical team, physiotherapists, physical trainers and coaches may have played a critical role in preserving athlete health and reducing injury risks. This could explain why the injury rates in our study did not differ between the HG and PG. Consequently, we concluded that no optimisation of our processes is needed at the PPME. At the same time, this highlights that our study is limited to FCB’s specific situation and population, subject to specific monitoring conditions and prevention and maintenance mechanisms that differ in other contexts. Hence, our findings cannot be generalised to other clubs or populations; they can only be used to demonstrate the value of process mining in improving clinical care processes in a professional sports setting.
In our study, we employed process discovery as our process mining approach. This approach entails a discovery technique that produces a process model based on event log data without using additional information.15 This rudimentary approach only provides a descriptive outcome, that is, what happens in practice. Our outcomes do not indicate what is going well in the healthcare process at FCB and what elements demand improvement. This implies that process discovery is not a one-stop shop. We must continue to study the trends and patterns we discovered and understand why these exist at FCB so that we can implement improvements. Then, after processes are revised, we must monitor whether any changes improve these processes and do not create bottlenecks elsewhere.
Process discovery is only one of four described basic process mining types. The others are conformance checking, process re-engineering and operational support. Conformance checking aims to detect differences and commonalities between a predefined process model and actual event logs, for example, to see if athlete injury pathways in practice follow the pathway (model) we expect.42 Process re-engineering resembles conformance checking. However, the goal is to change the process model instead of diagnosing differences, for example, how can we improve our understanding to better reflect the observed athlete’s injury pathway? Operational support aims to directly influence a process by providing warnings, predictions or recommendations, arguably coming close to health monitoring in sports healthcare.15 At first sight, these additional types of process mining hold value for a sports healthcare setting. However, this should be further explored and validated in various contexts.
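The idea behind conformance checking can be illustrated with a deliberately simplified sketch: the expected pathway is reduced to a set of allowed directly-follows steps, and each observed trace is scanned for steps outside that set. This is not an established conformance technique such as token replay or alignments, and the pathway, activity names and traces below are hypothetical and not taken from the study.

```python
def deviations(trace, allowed_steps):
    """Return the consecutive steps in a trace that the model does not allow."""
    return [
        (a, b)
        for a, b in zip(trace, trace[1:])
        if (a, b) not in allowed_steps
    ]


# Hypothetical expected care pathway, used here as the predefined model.
expected_path = ["Injury", "Treatment", "Rehabilitation", "Return to play"]
allowed_steps = set(zip(expected_path, expected_path[1:]))

# Invented observed traces for two hypothetical athletes.
observed = {
    "A1": ["Injury", "Treatment", "Rehabilitation", "Return to play"],
    "A2": ["Injury", "Treatment", "Return to play"],  # rehabilitation skipped
}

report = {case: deviations(trace, allowed_steps) for case, trace in observed.items()}
```

Here the trace of athlete `A2` deviates from the expected pathway at the step from `Treatment` directly to `Return to play`, whereas the trace of `A1` conforms.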
Even though further work is needed, we want to draw attention to ‘operational support’ process mining. Other process mining types work on event data, that is, they analyse events that have already occurred and only provide us with post hoc insights. Even though these insights are important, our understanding of care processes remains static when our insights are limited to only two points: the beginning (eg, screening) and the end (eg, injury). We are unaware of the dynamic processes that connect these moments and the data’s similarities and differences. This is important as the complex and dynamic aetiology of injury has been well described.9 43 44 This complex aetiology consists of visible athlete-related factors that can be measured and ‘soft’ factors such as quality of care, leadership and communication strategies.10 13 45–47 This dictates that proactive measures to protect athletes’ health demand multidisciplinary prevention plans across a professional sports organisation to mitigate multiple risk factors effectively.7 10 48 49 In current professional sports, numerous data sources are regularly updated in (near) real time, and there is ample computing power to analyse events promptly.50–53 Consequently, ‘operational support’ process mining might be a powerful approach to provide the sports organisation with information on which athlete may benefit from what intervention, where in the process this intervention should be provided, and by whom.
Conclusion
This paper explored the applicability and value of process mining in a professional sports healthcare context. We presented a case study involving FCB PPME and injury data to gain insights into athletes’ injury pathways after they entered the club. Our study has demonstrated the feasibility of extracting valuable insights from readily available data in a professional sports setting.
Ethics statements
Patient consent for publication
References
Footnotes
X @ramonpi32, @evertverhagen
Contributors RP and GR developed the rationale behind this study. RP prepared the initial drafts of this manuscript. MB and EV supported the analyses and provided critical feedback and input on manuscript drafts. GR is the guarantor of the study.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests Ramon Pi-Rusiñol and Gil Rodas work at the FC Barcelona Medical Department. Evert Verhagen is the Editor-in-Chief of BMJ Open Sport & Exercise Medicine.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer-reviewed.