Externally validated machine learning algorithm accurately predicts medial tibial stress syndrome in military trainees: a multicohort study
  1. Angus Shaw1,2,
  2. Phil Newman3,
  3. Jeremy Witchalls3,
  4. Tristan Hedger4
  1. 1Faculty of Health (Physiotherapy), University of Canberra, Canberra, Australian Capital Territory, Australia
  2. 2Physiotherapy, Matrix Physiotherapy & Sports Clinic, Queanbeyan, New South Wales, Australia
  3. 3Faculty of Health (Physiotherapy), Research Institute for Sport and Exercise (UCRISE), University of Canberra, Canberra, Australian Capital Territory, Australia
  4. 4Physiotherapy, Australian Defence Force Academy, Campbell, Australian Capital Territory, Australia
  1. Correspondence to Dr Angus Shaw; a.shaw680{at}


Objectives Medial tibial stress syndrome (MTSS) is a common musculoskeletal injury in both sporting and military settings. No reliable treatments exist, and reoccurrence rates are high. Prevention of MTSS is critical to reducing operational burden. Therefore, this study aimed to build a decision-making model to predict the individual risk of MTSS within officer cadets and test the external validity of the model on a separate military population.

Design Prospective cohort study.

Methods This study collected a suite of key variables previously established for predicting MTSS. Data were obtained from 107 cadets (34 women and 73 men). A follow-up survey was conducted at 3 months to determine MTSS diagnoses. Six ensemble learning algorithms were deployed and trained five times on random stratified samples of 75% of the dataset. The resultant algorithms were tested on the remaining 25% of the dataset, with models then compared for accuracy. The most accurate new algorithm was tested on an unrelated data sample of 123 Australian Navy recruits to establish external validity of the model.

Results Calibrated random forest modelling was the most accurate in identifying a diagnosis of MTSS (area under curve (AUC)=98%, classification accuracy (CA)=96%). External validation on a sample of Navy recruits resulted in comparable accuracy (AUC=95%, CA=94%). When the model was tested on the combined datasets, similar accuracy was achieved (AUC=92%, CA=91%).

Conclusion This model is highly accurate in predicting those who will develop MTSS. The model provides important preventive capacity which should be trialled as a risk management intervention.

  • Running
  • Shin
  • Injury
  • Prevention
  • Risk factor

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • Medial tibial stress syndrome (MTSS) is a common musculoskeletal injury in physically active populations, but no reliable treatments exist, and reoccurrence rates are high. Therefore, developing preventative measures is key to reducing injury burden.


  • Military institutions, clinicians and instructors are now equipped with a low cost and user-friendly decision-making model, allowing accurate and individual level risk predictions for future MTSS development.

  • The predictive power of the model was proven to be robust to population change, capable of determining MTSS risk within separate military populations.


  • Once an individual’s risk of MTSS is calculated, targeting the modifiable risk factors may serve as the strongest preventative measure for this difficult to treat condition.

  • Using this tool, interventions could be modelled and customised to reduce individual risk within their profile of modifiable and unmodifiable characteristics.


Medial tibial stress syndrome (MTSS) is a common cause of exercise-induced leg pain.1 Yates and White define MTSS as ‘pain along the posteromedial border of the tibia that occurs due to exercise, excluding pain from ischaemic origin or signs of stress fracture’.2 Pain is typically spread over a minimum of 5 cm, and is recognisable on palpation of this posteromedial tibial border.2 Symptoms are described as a dull ache following exercise, lasting for hours or days.2 In severe cases, pain may be provoked at rest and during activities of daily living.1 2

MTSS is frequently seen in active individuals, including runners, jumping athletes and military personnel.3 A 2012 systematic review, including 3500 runners, identified an incidence of 13.6%–20% over 12 months.4 This review highlighted MTSS as the most common running-related musculoskeletal injury, ahead of Achilles tendinopathy and patellofemoral pain syndrome.4 MTSS presents a significant medical burden among military populations, with up to 35% incidence reported across 10 weeks.2 Prospective data from 6608 British Army recruits found that MTSS is associated with the longest rehabilitation time, accounting for 19.8% of total recovery days.5 Injury surveillance data from 2008 at the Australian Defence Force Academy (ADFA), a triservice officer cadet training facility, showed a mean of 57.5 days of incapacity per individual due to MTSS, which equates to $A6820 of wage costs for working days lost.6 7 In running populations, some runners may take up to 300 days to recover sufficiently to complete an 18 min run.8

Considerable research has investigated risk factors for MTSS3 9 10 including leg length, ankle range of motion and chronic disease.9 11 Understanding the risk factors for MTSS is key, particularly in the absence of definitive treatment.1 10 Risk factor identification has the potential to be the foundation for designing individualised injury prevention programmes, to ultimately help decrease the incidence of MTSS.10 Garnock et al used a suite of risk factors from two independent systematic reviews with meta-analyses9 10 to produce a statistically significant model, Concordance statistic (area under curve (AUC))=0.81, for predicting MTSS development.7 Their study targeted a predominately male military population of Navy recruits, and thus, the generalisability of their predictive model is still unknown.12 The ADFA is known to have increasing rates of female participation.13 Given women are known to have increased risk of MTSS,10 the ADFA presents an ideal sample for MTSS risk prediction.

Thus, the aims of this study were threefold.

  1. To train and test a model to predict the individual risk of MTSS in first year ADFA officer cadets.

  2. To evaluate the accuracy of the model by external cross-validation on data from a separate military population.7

  3. To evaluate the accuracy of the model on the two datasets combined.


Study population

This research used a prospective cohort study design within a sample of volunteer first year ADFA cadets undergoing 3 months of initial military training. The study design replicated a previous prospective study at the Australian Navy Recruit School in Victoria, Australia.7 The results of this Navy study served as the cross-validation dataset.

The Officer Training College at ADFA was selected as the primary site for investigation, given its triservice military population.

Participants were recruited face to face during induction day in January 2020. Trainees were included if they were aged 18 years or above and gave voluntary consent to participate. Participants were excluded if they were currently experiencing shin pain or being treated for MTSS. Patients or the public were not involved in the design, conduct, reporting or dissemination plans of our research.

Data collection

Once written consent was gained from eligible cadets, screening for MTSS risk factors proceeded and included a 4 min physical examination and a 5 min paper-based survey.

The physical examination included navicular drop, body mass index (BMI) and passive ranges of motion (PROM) for ankle plantarflexion (PF) and hip external rotation (ER). BMI was calculated from height and weight, measured using a pair of digital scales and a stadiometer (CPWplus 200M Floor Scales, Seca 206 Height Measure). The primary investigator conducted all PF PROM measures, while another investigator performed all hip ER measures to eliminate inter-rater bias. Hip ER and ankle PF were measured using a digital goniometer (Halo Digital Goniometer, Halo Medical Devices, Sydney, Australia). Digital goniometry has yielded high inter-rater reliability, with intraclass correlation coefficient (ICC) estimates of ICC=0.99 in Hancock et al (2018)14 and ICC=0.89–0.98 in Correll et al.15

The measurements of hip ER PROM followed the procedure by Garnock et al.7 The measurement of ankle PF was performed with the subject in long sitting. The goniometer stationary axis was aligned with the lateral malleolus, and the fibula and base of the fifth metatarsal served as the moving axis landmark.16

Navicular drop was calculated as the difference in height of the navicular tuberosity between relaxed stance and single-leg weight-bearing.7 This process yields moderate intrarater reliability (ICC=0.61–0.79)17 and a high inter-rater reliability (ICC=0.84).18
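
The two derived baseline measures described above (BMI and navicular drop) reduce to simple arithmetic. The sketch below is illustrative only: the function names, the sign convention for navicular drop (relaxed stance height minus single-leg weight-bearing height) and the example values are assumptions, not details taken from the study protocol.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def navicular_drop_mm(relaxed_stance_mm: float, single_leg_mm: float) -> float:
    """Difference in navicular tuberosity height between the two positions.

    Assumed sign convention: a positive value means the navicular sits lower
    in single-leg weight-bearing than in relaxed stance.
    """
    return relaxed_stance_mm - single_leg_mm


# Hypothetical cadet; all values are invented for illustration.
bmi = body_mass_index(weight_kg=70.0, height_m=1.75)
drop = navicular_drop_mm(relaxed_stance_mm=48.0, single_leg_mm=42.0)
print(f"BMI: {bmi:.1f} kg/m^2, navicular drop: {drop:.1f} mm")
```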

A paper-based survey included questions about previous history of MTSS, prior orthotic use and years of running experience. The additional variables of estimated average number of runs per week and distance per run in the last 6 months were included, as they had not been investigated in previous research.7 The survey was completed by participants within 2 weeks of the initial physical examination (figure 1).

Figure 1

Flowchart of participants (first year Australian Defence Force Academy (ADFA) officer cadets). MTSS, medial tibial stress syndrome.

At 3-month follow-up, a paper-based survey captured the incidence and severity of any MTSS cases. Survey items were taken from the MTSS Score, including questions on shin pain after exercise (including marching), while walking and at rest.19 A physical examination by a Physiotherapist was conducted to confirm any symptoms of MTSS. This examination followed a three-step process (pain history, location of pain and shin palpation).2 This study was not approved for imaging investigations. As such, a systematic procedure was then followed by the Physiotherapist to minimise the risk of other lower leg injury syndromes that may masquerade as MTSS, such as stress fractures.20 Focal pain areas of only 2–3 cm are thought to be more typical of stress fracture,2 and thus the >5 cm rule was emphasised to reduce the risk of more significant bone stress injuries being included.

Strict ethics restrictions by the Departments of Defence and Veterans’ Affairs Human Research Ethics Committee did not allow participant consent to cover clinical information obtained outside of the study surveys. This meant that the researchers were unable to control whether a cadet reported having their MTSS symptoms confirmed by a Physiotherapist. The research team therefore based an MTSS diagnosis on one of two options: (1) a survey response plus confirmation by a Physiotherapist with physical examination, or (2) a survey response only, indicating medial border distal tibia pain on self-examination as per Yates and White’s definition,2 associated with walking and/or running.

Statistical analysis examined which combination of the 10 MTSS risk factors produced the best predictive model for assessing the risk of future development of MTSS.7 Data were analysed using Orange (Data Mining Toolbox in Python developed by Bioinformatics Lab at University of Ljubljana, Slovenia).21 Variables input to the models were checked for covariance and information gain. Ensemble methodology was then deployed using logistic regression and Bayesian statistics (BS). Six machine learning (ML) methods (decision tree, support vector machine, logistic regression, K-nearest neighbour, naïve Bayes, random forest and calibrated random forest)22 23 were used to build predictive models.

The use of BS over frequentist statistical approaches in applied ML is increasing.24 Bayesian methodologies view probability as related to the degree of confidence or belief in the occurrence of an event.24 Frequentist statistics is based on the null hypothesis, where the probability of an event can be defined as the long-term frequency of the event.24 BS treats uncertainty as a fundamental and inherent aspect of the problem and incorporates it explicitly into the model.24 BS provides several advantages over frequentist approaches in applied ML, including probabilistic modelling of uncertainty, improved accuracy in prediction, ability to handle complex models and better handling of missing data.24 Exploring the mathematical functions of each ML model is beyond the scope of this paper; descriptions have been published previously.22 25 26

Models were then trained using 5 random folds of 75% of the data, and resultant models were cross-validated on the remaining 25% of data in each fold. Each model was then tested for accuracy using the C-statistic/AUC, classification accuracy (CA), F1, precision and recall, and visualised with confusion matrices.27 28 The best predictive model identified was tested on the dataset from Garnock et al’s Navy population.7 For visualisation of the data separation process, see Van Eetvelde et al.29
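
The repeated 75%/25% stratified sampling described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Orange workflow used in the study: the function name `stratified_split`, the seeds and the label counts are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in each part."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomise within each class separately
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical outcome labels: 1 = MTSS, 0 = no MTSS (counts are illustrative).
labels = [1] * 32 + [0] * 68

# Five repeats, each drawn with a different random seed.
splits = [stratified_split(labels, 0.75, seed=s) for s in range(5)]
for train_idx, test_idx in splits:
    train_cases = sum(labels[i] for i in train_idx)
    test_cases = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), train_cases, test_cases)
```

Stratifying within each class keeps the proportion of MTSS cases the same in the training and test portions of every repeat, which matters when the outcome is the minority class.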

Using Garnock’s findings7 (AUC=0.81), power calculations were performed using MedCalc (MedCalc Software, Belgium). With an estimated MTSS prevalence of 30%, this study needed 216 participants, including 63 MTSS cases, for adequate power.


During this study, COVID-19 physical distancing restrictions were enforced by the Australian Government. Email correspondence with Duntroon Health Centre Physiotherapists notified the research team that cadets were no longer participating in structured physical training (PT) sessions. Participants in the study were therefore limited to their ‘own’ training, which included self-directed running and body weight exercises. Nearby bushfires on the day of initial physical testing also limited availability of volunteers.

Of the 127 cadets that underwent physical screening for MTSS risk factors, a total of 107 (84%) cadets were available to complete the initial MTSS risk-factor survey. A total of 20 participants were lost to follow-up due to unavailability at the time of survey completion on base. This left 107 cadets (73 men, 34 women) to be followed prospectively across the 3 months of initial military training. At 3-month follow-up, 99 cadets (69 men, 30 women, mean age 19.3 (table 1)) remained for inclusion in statistical analysis, with 8 lost to follow-up (unavailable to complete survey).

Table 1

Characteristics of participants, first year Australian Defence Force Academy officer cadets.

During the 3-month training period, 35 cadets (35.5%) met diagnostic criteria for MTSS: 21 men (30.4%) and 14 women (46.6%). All variables were ranked by information gain. MTSS history yielded the strongest information gain for predicting future MTSS (0.08) (table 2). Covariance analysis showed weak to very weak correlations between the variables, so none were excluded from modelling (table 3).
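
Information gain, used above to rank the variables, is the reduction in outcome entropy once a feature's value is known. The sketch below shows the calculation for a categorical feature. The function names and the toy data are hypothetical, and continuous features would first need to be discretised (an assumption about preprocessing, not a description of the authors' workflow).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Outcome entropy minus its weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical data: previous MTSS history (1/0) against MTSS outcome (1/0).
history = [1] * 20 + [0] * 80
outcome = [1] * 12 + [0] * 8 + [1] * 23 + [0] * 57
print(round(information_gain(history, outcome), 3))
```

A feature that perfectly separates cases from non-cases has a gain equal to the outcome entropy, while a feature unrelated to the outcome has a gain of zero.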

Table 2

Features ranked by contribution to combined calibrated random forest model (information gain)

Table 3

Feature correlations within combined calibrated random forest model

Calibrated random forest modelling (CRFM) trained and tested on the ADFA dataset yielded the highest AUC, CA and F1 in predicting a diagnosis of MTSS.
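
For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen trainee who developed MTSS received a higher predicted risk than a randomly chosen trainee who did not. The sketch below computes it by direct pair counting. The function name and the scores are hypothetical and are not study data.

```python
def auc_from_scores(y_true, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted MTSS probabilities for six trainees.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.90, 0.70, 0.40, 0.60, 0.20, 0.10]
print(round(auc_from_scores(y_true, scores), 3))
```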

Cross-validation and testing 5 times on the remaining 25% of the dataset revealed an AUC of 0.98, CA 0.96 and F1 0.96 on average. Testing the CRFM on an unrelated Navy dataset7 (n=123; 95 men, 28 women; 30 MTSS cases in total) revealed comparable accuracy in predicting risk for MTSS (AUC=0.95, CA=0.94, F1=0.88). When the CRFM was tested on the combined ADFA and Navy dataset, it maintained good predictive accuracy, with AUC=0.92, CA=0.91 and F1=0.85 (table 4). Confusion matrix analysis showed the model correctly classified 56 out of 67 cases of MTSS (figure 2).

Table 4

Accuracy of calibrated random forest models (CRFM)

Figure 2

Confusion matrix analysis for combined datasets (Australian Defence Force Academy and Navy Dataset)
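
The accuracy measures reported here all derive from the four cells of a confusion matrix. For instance, 56 of 67 MTSS cases correctly classified corresponds to a recall of about 0.84 for the MTSS class. The sketch below shows the standard formulas. The function name and the example counts are hypothetical (the false-positive and true-negative counts are not given in the text), so the printed values are not the study's results.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Classification accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "CA": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "F1": f1,
    }


# Hypothetical confusion matrix counts, invented for illustration.
metrics = classification_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```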

The risk estimations of the CRFM were visualised via a nomogram produced within the Microsoft Power Business Intelligence interface (Microsoft Corporation, V.3220.30820.19513.0). Individual risk calculations (minimum, maximum and mean) were modelled by adjusting values of the variables (see figure 3 for an example dashboard).

Figure 3

Example dashboard. The interface allows filtering of medial tibial stress syndrome (MTSS) risk by men/women. Adjustable sliders allow risk factor features to be dialled up or down. The green line represents the group probability for MTSS risk (37%). The value 66.59 represents the percentage probability for a target individual based on the dialling up/down of risk factor features.


Principal findings

This study has demonstrated that combining a suite of best-evidence risk factors using an ML approach leads to accurate and individualised risk prediction for MTSS. CRFM yielded the highest accuracy (AUC) when trained and tested on a triservice dataset. Importantly, the predictive power of this model appears robust to population change.12 When the model was cross-validated against an unrelated dataset from Navy recruits, the model demonstrated comparable accuracy.7 When we applied the model to the two datasets combined, the accuracy remained analogous (table 4).

Clinical implications

Having high predictive capacity should facilitate prevention of MTSS. If training, coaching, command staff or trainees themselves can identify their risk, then risks can be managed. The interactive nomogram (figure 3) allows individualised risk profiling for MTSS. The probability of MTSS development is calculated based on an individual’s risk factor scores. Once an individual’s risk of MTSS is calculated, targeting the modifiable risk factors may serve as the strongest preventative measure for this difficult-to-treat condition.10 Using this tool, interventions can be modelled and customised to reduce individual risk within their profile of modifiable and unmodifiable characteristics.10

This investigation has revealed an MTSS incidence of 35% within a first year ADFA population. This is consistent with previous prospective research in military samples with 24%7 and 35% incidence rates.2 However, the rates of reporting MTSS symptoms to Duntroon Health Centre Physiotherapists in this study were low. This is not the first time this has occurred in prospective studies of military personnel. In Yates and White’s investigation into 124 Naval recruits, only 30% of recruits who developed MTSS followed up with medical treatment.2 Furthermore, Almeida et al investigated gender differences in lower limb musculoskeletal injury reporting rates within a US military population.30 A total of 35.8% of injuries went unreported, with MTSS identified as the most common unreported diagnosis.30 These findings, in combination with the present study, may highlight a trend of reluctance to report injuries within the military, to avoid being downgraded or medically restricted during PT. Giving the cadets the option of whether to have their MTSS symptoms confirmed by a clinician also likely influenced the rates of reporting. Another factor that may explain under-reporting of MTSS symptoms was the COVID-19 physical distancing restrictions. Structured PT sessions ceased. Participants in the study were therefore limited to their own PT, including self-directed running and body weight exercises. Consequently, there may have been a reduced desire for cadets to present to physiotherapy to have any underlying MTSS complaints assessed and treated.

Out of the 10 features within our model, MTSS history had the strongest impact on the performance of the combined CRFM. A history of MTSS is a known risk factor for the condition.7 10 However, there were very weak correlations between the features, likely signifying the importance of interaction between all the risk factors combined.

Strength and limitations

This study has several strengths. The first is the high response rate to the 3-month follow-up survey (92.5%). This matches Garnock et al’s7 study and represents a highly controlled sample.15 This research confirms previous studies5 7 investigating MTSS risk in military personnel, with comparable MTSS incidence rates and proportions between sexes. The use of a self-reported survey appears to have increased compliance with reporting compared with the effort required to visit a Physiotherapist. This may have enabled this study to assess the prevalence of a ‘subclinical’ level of MTSS within this military population. Thus, the chosen approach has value in achieving broader surveillance of MTSS incidence.

The use of a self-reported MTSS diagnosis as the primary outcome is a key consideration when interpreting the results. The clinical criteria and the self-reported criteria involved the same essential steps: pain history and examination. Self-report has previously been shown to reliably match clinician diagnosis of MTSS.31 The key difference was Physiotherapist-led versus participant-led palpation for the site of symptoms. The addition of a Physiotherapist examination ensures MTSS symptoms are ≥5 cm in length and along the posteromedial tibial border.2 Physiotherapist examination also assists in ruling out stress fracture or other coexisting injuries.2 31 The prevalence of MTSS within our study was consistent with previous studies where clinical diagnosis was performed, giving further confidence in the results. The ADFA first year sample did not reach adequate size for statistical power. Arrivals at data collection day were limited due to bushfire-related travel delays. Survey completion and PT were disrupted due to COVID-19 restrictions. However, the aggregated datasets from ADFA and the Navy trainees did reach adequate sample size, with 222 individuals and 65 cases.

Future research

This study did not successfully capture week-to-week training loads, only the self-reported baseline running load. Investigation into risk factor screening combined with training load monitoring to predict future injury is not well established.32 Studies have typically used either analysis of risk factors alone, or training loads and their relationship with injury.7 33 It is proposed that those accustomed to increased training loads have fewer injuries than unaccustomed individuals.32 In team sport, evidence exists that sharp increases in training load and spikes in acute (7 days) to chronic (28 days) workload ratios (ACWR) are associated with higher injury rates.34 Research by Rossi et al35 has shed light on the accuracy of using global positioning systems (GPS) training data in combination with ML algorithms to forecast injury risk in team sport athletes. Rossi et al used GPS data to calculate individual ACWR profiles.35 However, rather than focusing purely on the ACWR, Gabbett36 recommends consideration of known moderators to the workload–injury relationship (eg, injury history and other factors known to influence the risk of injury). A moderator may either increase or decrease risk of injury at a given training load.37 Specific to MTSS, example moderators to the workload–injury relationship may include MTSS history and previous years of running experience. Importantly, Wang et al38 highlighted several potential limitations of using the ACWR. 
These authors suggest that the ACWR is vulnerable to sparse data bias, time-dependent confounding and recurrent injuries.38 An alternative may be the use of causal inference-based strategies, which account for time dependencies of activity and confounders (eg, training schedules).38 Future research using the ACWR may consider using both time-to-event and multilevel modelling.39 Nevertheless, the application of an ML methodology that targets risk factor profiling combined with training load monitoring is worth investigating within military settings.36
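
As a concrete illustration of the ACWR discussed above, the sketch below uses a simple rolling-average formulation: mean daily load over the last 7 days divided by mean daily load over the last 28 days. This is only one of several formulations in the literature and is not necessarily the one used in the cited studies. The function name and the load values are hypothetical.

```python
def acute_chronic_workload_ratio(daily_loads, acute_days=7, chronic_days=28):
    """ACWR from the most recent days of a daily training-load series.

    Acute load is the mean daily load over the last `acute_days`; chronic load
    is the mean daily load over the last `chronic_days` (rolling averages).
    """
    if len(daily_loads) < chronic_days:
        raise ValueError("need at least chronic_days of data")
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        raise ValueError("chronic load is zero; ratio undefined")
    return acute / chronic


# Hypothetical daily running distances (km): three steady weeks, then a spike.
loads = [5.0] * 21 + [10.0] * 7
print(round(acute_chronic_workload_ratio(loads), 2))
```

A ratio of 1.0 indicates that the most recent week matches the 4-week average, and values above 1.0 indicate a recent increase in load. The limitations raised by Wang et al apply to any such ratio, including this one.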

Risk factor profiling for MTSS contributes only one piece of the puzzle. The next step in managing this difficult condition may be modifying the modifiable risk factors.7 Once an individual has been identified as ‘at risk’, the design and implementation of injury prevention programmes may serve as the best approach to reducing MTSS incidence. Like other musculoskeletal conditions, risk prediction for MTSS is often complex and multivariate. In Lahti et al’s work in professional football players, multifactorial and individualised risk reduction programmes were prescribed based on the outcomes of risk factor screening.40 Such an approach is worth investigating within a military sample.


The outcomes of this study demonstrate that an inexpensive model including a suite of evidence-based risk factors can accurately predict which military trainees will develop MTSS. The model maintains accuracy when externally validated in an unrelated military sample. These outcomes enable future research to develop individualised injury prevention programmes that address the modifiable MTSS risk factors. The application of predictive modelling methodologies targeting risk factor profiling combined with training load monitoring is worth investigating within military settings. Further understanding of how these variables interact and influence MTSS outcomes will help bridge the gap in reducing MTSS incidence.


Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants. Ethics approval was obtained from The Departments of Defence and Veterans’ Affairs (DDVA) Human Research Ethics Committee (HREC) (167-19) and the University of Canberra HREC (20193336). Participants gave informed consent to participate in the study before taking part.


We would like to acknowledge the commanding officer, all staff and cadets of the Officer Training College at the Australian Defence Force Academy for their willingness to participate and for their support throughout the study. In particular, the assistance provided by the physiotherapy department at the Duntroon Health Centre during data collection was very much appreciated.



  • Contributors PN and JW contributed to the original study conception and design. Methodology: AS, PN and JW. Data collection: AS, PN, JW and TH. Statistical analysis: AS, PN and JW. Writing original draft: AS. Writing review and editing: AS, PN, JW and TH. Supervision: PN and JW. AS is the guarantor of the study and manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.