This is the first study exploring the potential for non-linearity in the relationship between training load and injury risk for football and handball. We found a J-shaped relationship between training load measured as the sRPE and probability of an injury on the same day in an elite youth handball cohort (figure 2A).
We also found that three methods, the quadratic model, FPs and RCSs, were able to model the non-linear relationships between training load and injury explored in this paper, accurately recreating all simulated risk shapes (figure 4).
Evidence of non-linearity in training load and injury risk relationship research
All modelled relationships between training load and injury risk were either flat (no relationship) or non-linear. The results showed that the strength and direction of the relationship varied between training load and injury definitions in the handball population, while no relationships were found in the two football populations.
If we had assumed linearity and modelled the data accordingly, we would not have discovered these relationships. More seriously, we would have concluded there was no relationship between training load and injury risk for elite youth handball players for injury on the same day (linear model, p=0.24, type II error), when it was, in fact, a strong U-shaped parabola (RCS model, p<0.001, figure 2A). This may happen when a relationship is not only non-linear but non-monotonic. In monotonic relationships, the response variable Y (injury probability) moves in only one direction as X (training load) increases, while in non-monotonic relationships, Y sometimes increases and sometimes decreases as X increases.9
In 2013, Gamble5 theorised a U-shaped relationship between training load and risk of injury. Data presented by Blanch and Gabbett6 suggested a J-shaped relationship between ACWR and injury, although the methodology and interpretation of this finding have recently been questioned.7 Here, we reproduced a J shape between sRPE and injury occurring on the same day for elite youth handballers but not for the relative training load described by the ACWR in the same cohort. In Lathlean et al,29 a U shape was discovered between training load and the risk of future injury in an Australian football cohort. These findings might suggest that the training load and injury relationship is different for different sports and populations. Since non-linearity is possible in a training load and injury context, we recommend assuming the data have an unknown, non-linear relationship when conducting statistical analyses.
Methods for addressing non-linear relationships
As expected, standard logistic regression could not model the U and J shapes, as it assumes linearity. For the U shape, the RMSE was threefold higher for the linear model than for all other models (RMSE=2.9 vs RMSE≈0.95, figure 4A), showing that violating the linearity assumption causes major bias and can substantially alter conclusions based on the results. Misleadingly, the linear model had a high C-statistic (>0.8) and comparable Brier scores. This happened because the sRPE values were highly skewed (online supplemental figure S4). Over 90% of the data points were concentrated on the left-hand side of the U shape (figure 3, online supplemental figure S4). The linear model, which only managed to model the left-hand side of the U shape, therefore predicted most of the values well, which explains the high C-statistic. However, it could not predict the right-hand side of the U shape at all and therefore had a high RMSE. Consequently, a researcher who judges model fit by predictive ability alone may be falsely assured that the linearity assumption holds.
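To illustrate this point, the following minimal R sketch (hypothetical data, variable names and risk function, not the simulation used in this study) generates a right-skewed exposure with an assumed U-shaped true risk, fits a linear logistic model and computes the RMSE against the known risk alongside the Brier score and C-statistic (here via the pROC package):

```r
# Hypothetical illustration: a skewed exposure with an assumed U-shaped true risk.
set.seed(1)
n    <- 5000
load <- rgamma(n, shape = 1.5, scale = 200)            # right-skewed "sRPE-like" values
p    <- plogis(-4 + 2 * ((load - 600) / 600)^2)        # assumed U-shaped true probability
inj  <- rbinom(n, 1, p)                                 # simulated injuries

fit_lin <- glm(inj ~ load, family = binomial())         # model that assumes linearity
p_hat   <- predict(fit_lin, type = "response")

rmse  <- sqrt(mean((p_hat - p)^2))                      # error against the known true risk
brier <- mean((p_hat - inj)^2)                          # Brier score
cstat <- as.numeric(pROC::auc(inj, p_hat))              # C-statistic (discrimination)
round(c(RMSE = rmse, Brier = brier, C = cstat), 3)
```

Because most of the simulated observations fall on the left arm of the U, discrimination and Brier scores can look acceptable even when the fitted curve misses the right arm entirely.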
Categorisation has previously been explored thoroughly by Carey et al30 and shown to be a poor method for modelling non-linear relationships. We reproduced those results in our study using a football population, where categorisation was consistently outperformed by the other methods on RMSE and coverage (table 1). In addition, our results showed that categorising by quartiles was suboptimal not only for modelling non-linear relationships but also when the relationship between training load and injury risk was linear (coverage of 25% vs >99% for all other methods).
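For reference, categorising by quartiles typically amounts to the following minimal R sketch (assuming a hypothetical data frame d with columns injury and load; these names are not from our data):

```r
# Hypothetical sketch: quartile categorisation of a continuous training load
# variable before logistic regression (the approach that performed poorly above).
breaks     <- quantile(d$load, probs = seq(0, 1, 0.25), na.rm = TRUE)
d$load_cat <- cut(d$load, breaks = breaks, include.lowest = TRUE)   # four quartile groups
fit_cat    <- glm(injury ~ load_cat, family = binomial(), data = d)
summary(fit_cat)
```

Each observation is reduced to its quartile membership, so any variation in risk within a quartile is lost.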
Recently, some studies have added a quadratic term to the training load and injury model to test for linearity: if the term was non-significant, it was discarded in favour of a linear model; if significant, the training load variable was categorised to handle non-linearity.31–33 If the quadratic term is significant, the researchers correctly choose other options over a linear model. However, the quadratic term only tests for a parabolic shape, not for non-linearity in general. A significant quadratic term does not mean the relationship is quadratic (parabolic); it means that a quadratic shape fits better than a linear one. Conversely, a non-significant quadratic term does not necessarily mean the underlying relationship is linear, only that a parabola fits it poorly. Furthermore, testing for non-linearity with a quadratic term has been shown to inflate type I error rates by 50%.34
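In practice, this linearity "test" amounts to something like the following minimal R sketch (again assuming a hypothetical data frame d with columns injury and load):

```r
# Hypothetical sketch: adding a quadratic term and inspecting its Wald test.
fit_quad <- glm(injury ~ load + I(load^2), family = binomial(), data = d)
summary(fit_quad)$coefficients["I(load^2)", ]   # estimate, SE, z value and p value
```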
Blanch and Gabbett6 and Carey et al19 used quadratic regression, assuming a parabolic relationship between training load and injury risk. In our study, quadratic regression modelled the U-shaped risk profiles with little error (figures 3 and 4A) and performed best for the J-shaped relationship (figure 4B). This is expected, as the J shape was originally constructed from a quadratic model in Blanch and Gabbett.6 Unlike in a real-life setting, however, we knew the risk profiles before analysing our data. Quadratic regression does not explore shapes but constrains the model to follow a specific one. We therefore consider it appropriate only when strong evidence from previous studies supports a parabolic relationship. We recommend assuming a non-linear relationship of unknown shape and using methods that explore and model non-linearity, rather than merely test for linearity, to discover the relationship. Based on our findings and previous research in other fields such as medical statistics,35 FPs and RCSs appear to be the best methods for doing this.
FPs modelled all risk shapes accurately (figure 4, table 1). FP has recently been used in a training load and injury risk study.29 The method requires little subjective input, and the results are intuitive, especially for users familiar with quadratic regression. Although it appears the superior choice at first glance, it has a disadvantage: FPs are defined only for positive values, which means that an FP model cannot handle negative values or the value 0. In the context of training load and injury risk research, training load is (traditionally) never measured on a negative scale.36 If it can be justified, adding a small constant (such as 0.001, or whatever is considered small on the measurement scale) to all training load values solves the problem with zeros and allows the use of FPs.
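As an example, a fractional polynomial fit with a small constant added to handle zeros might look like the following minimal R sketch (the mfp package is one available implementation; the data frame d and its columns injury and load are hypothetical):

```r
# Hypothetical sketch: shifting zero loads and fitting a second-degree
# fractional polynomial, letting the selection procedure choose the powers.
library(mfp)
d$load_pos <- d$load + 0.001                    # FP terms require strictly positive values
fit_fp     <- mfp(injury ~ fp(load_pos, df = 4), family = binomial, data = d)
summary(fit_fp)                                 # selected powers and coefficients
```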
The performance of RCSs depended on how the knot locations (the points where the polynomials that make up cubic splines are joined, see online supplemental file 2 for details) were chosen. In the data-driven approach, where knots were placed automatically by the default setting, RCS failed to model the U-shaped scenario (figure 3). When the knot positions were chosen based on the range of the training load variable, RCS modelled the U accurately (figure 3). The results were the opposite for the J-shaped relationship, where the data-driven approach was among those with the lowest error and the subjectively placed knots produced the highest error (figure 4B). The default placement algorithm places knots by quartiles, and in the highly skewed distribution of the sRPE values used in the U-shaped relationship (online supplemental figure S4), this caused the knots to be placed tightly together (figure 3). The model therefore could not capture the shape, whereas the subjective version spread the knots across the range of the values. The ACWR values used in the J shape had a Gaussian distribution (online supplemental figure S4), and placing knots by quartiles was then a feasible choice. This shows the importance of careful model calibration using clinical knowledge and knowledge of the data.
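To make the role of knot placement concrete, the following minimal R sketch contrasts default and user-specified knots using the rms package (the data frame d, its column names and the knot locations are illustrative only, not those used in this study):

```r
# Hypothetical sketch: restricted cubic splines with default quantile-based
# knots versus knots chosen to cover the range of the training load variable.
library(rms)
dd <- datadist(d); options(datadist = "dd")                  # required for Predict()

fit_default <- lrm(injury ~ rcs(load, 4), data = d)          # 4 knots at default positions
fit_manual  <- lrm(injury ~ rcs(load, c(100, 600, 1200, 2000)), data = d)  # illustrative knot locations

plot(Predict(fit_default, load, fun = plogis))               # visualise the fitted risk curve
```

With a heavily skewed exposure, the default knots cluster where the data are dense, which mirrors the behaviour described above.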
RCS produces effect sizes that are difficult to use in a practical setting, and its results can only be interpreted through p values and visualisation (such as in figure 2). This makes RCS less suitable than FP for causal research. Its disadvantages are, however, less relevant in predictive research, where interpretability is of minor concern.25 We propose a guide for when each method is recommended (box 1).
Box 1 Recommended methods to model non-linear relationships between training load and injury risk
To model non-linear relationships, either fractional polynomials (FP) or restricted cubic splines (RCS) can be used.
Fractional polynomials are easier to interpret. We recommend FP under the following conditions:
We recommend restricted cubic splines under the following conditions:
When the main objective is predictive research.
When the training load measure must include the value 0. This includes studies that wish to capture a change in the effect, regardless of how small, when going from no training load at all to any amount of training load.
When training load is included in the study merely to adjust for it as a potential confounder and is not the main variable of interest.
We do not recommend changing the study aims or the chosen measure to use FP, nor do we recommend using FP under certain conditions and RCS for other conditions in the same study.
A step-by-step guide to performing FP and RCS in R can be accessed on the primary author’s GitHub page.39 40
Limitations
A limitation of this study was the sample size, the number of injuries and, consequently, statistical power. Neither of the two football cohorts satisfied the recommendation of >200 injuries to detect a small to moderate effect.37 The elite youth handball data, despite having a sufficient number of injuries, had a high proportion of missing sRPE values (64%), which may have caused selection bias. We emphasise that the exploration of non-linearity in these data was for illustrative purposes and not intended for causal inference.
We used statistical methods commonly applied and recommended in the field to demonstrate how non-linear relationships can be ascertained with existing methods, and we were consequently limited in our choice of methods. The ACWR model is under debate, and the pros and cons of the method have been explored extensively in recent publications.12 18 38 The purpose of this paper was not to provide additional insight into that discussion, but rather to demonstrate how a continuous training load variable should be modelled to account for non-linearity. For this reason, we opted to use the ACWR, as it is currently the most widely used training load method in the field of training load and injury risk research.4