Aim To develop a standardised ultrasound imaging (USI)-based criteria for the diagnosis of tendinopathy that aligns with the continuum model of tendon pathology. Secondary aims were to assess both the intra-rater and inter-rater reliability of the criteria.
Methods A criteria was developed following a face validity assessment and a total of 31 Achilles tendon ultrasound images were analysed. Intra-rater and inter-rater reliability were assessed for overall tendinopathy stage (normal, reactive/early dysrepair or late dysrepair/degenerative) as well as for individual parameters (thickness, echogenicity and vascularity). Quadratic weighted kappa (kw) was used to report on reliability.
Results Intra-rater reliability was ‘substantial’ for overall tendinopathy staging (kw rater A; 0.77, 95% CI 0.59 to 0.94, rater B; 0.70, 95% CI 0.52 to 0.89) and ranged from ‘substantial’ to ‘almost perfect’ for thickness (kw rater A; 0.75, 95% CI 0.59 to 0.90, rater B; 0.84, 95% CI 0.71 to 0.98), echogenicity (kw rater A; 0.78, 95% CI 0.62 to 0.95, rater B; 0.73, 95% CI 0.58 to 0.89) and vascularity (kw rater A; 0.86, 95% CI 0.74 to 0.98, rater B; 0.89, 95% CI 0.79 to 0.99). Inter-rater reliability ranged from ‘substantial’ to ‘almost perfect’ for overall tendinopathy staging (kw round 1; 0.75, 95% CI 0.58 to 0.91, round 2; 0.81, 95% CI 0.63 to 0.99), thickness (kw round 1; 0.65, 95% CI 0.48 to 0.83, round 2; 0.77, 95% CI 0.60 to 0.93), echogenicity (kw round 1; 0.70, 95% CI 0.54 to 0.85, round 2; 0.76, 95% CI 0.58 to 0.94) and vascularity (kw round 1; 0.89, 95% CI 0.79 to 0.99, round 2; 0.86, 95% CI 0.74 to 0.98). Inter-rater reliability increased from ‘substantial’ in round 1 (kw 0.75, 95% CI 0.58 to 0.91) to ‘almost perfect’ in round 2 (0.81, 95% CI 0.63 to 0.99).
Conclusion Intra-rater and inter-rater reliability were ‘substantial’ to ‘almost perfect’ when utilising an USI-based criteria to diagnose Achilles tendinopathy. This is the first study to use the continuum model of tendon pathology to develop an USI-based criteria to diagnose tendinopathy.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
What are the new findings
A standardised criteria for the diagnosis of tendinopathy aligning with the continuum model of tendon pathology is presented.
The criteria is reliable in staging Achilles tendinopathy.
The proposed criteria aligns with accepted clinical terminology used to describe tendinopathy.
Tendinopathy refers to persistent tendon pain and dysfunction which is related to mechanical loading.1 Tendinopathy accounts for 30%–50% of all overuse injuries,2 with Achilles tendinopathy being among the most common.3–6 While there are multiple models to describe the pathogenesis of tendinopathy,7–10 the continuum model of tendon pathology proposed by Cook and Purdam in 2009,9 and updated in 2016,11 is widely used to clinically describe and diagnose tendinopathy.3 9 11 12 The continuum model of tendon pathology used both imaging and clinical features to characterise the different stages of tendinopathy.9 11 The continuum model proposed three key stages of tendon pathology; reactive tendinopathy, tendon dysrepair and degenerative tendinopathy.9 Although the model is described in three distinct stages, it is acknowledged that tendon pathology occurs on a continuum, with continuity between stages.9 11
A clinical diagnosis of tendinopathy is primarily derived from the patient history and clinical tests.11 13–21 Clinical tests have been shown to be sensitive for detecting tendinopathy, but they are not specific in identifying pathological change when compared with imaging22–24 and do not allow clinicians to stage their patient according to the continuum model of tendon pathology.11 Conversely, ultrasound imaging (USI) has been shown to be both accurate and sensitive for detecting pathological structural change within tendons25–27 but does not always correlate with pain and dysfunction.11 28 29 Although reviews have demonstrated both an association and dissociation between tendon structure, and function and pain,20 21 structural changes identified on USI can be considered a risk factor for the development of symptomatic tendinopathy.11 29–31 The relative risk (RR) for developing pain in asymptomatic (clinical tests negative with structural changes on imaging) Achilles tendinopathy has been reported as 5.45–7.33.30 31
While USI is useful for detecting tendon structural change, it is considered an operator-dependent modality, and standardised practices for both capturing and assessing ultrasound images can increase inter-rater reliability when assessing neovascularity, structural change, calcification (k 0.46–0.96) and tendon thickness (ICC 0.85–0.98) for tendinopathy in both the upper limb32 and lower limb.33 Although the standardisation of criteria for assessing tendon structural change improves reliability, there is significant heterogeneity of USI-based diagnostic criteria.31 34 This, in conjunction with the cross-sectional design of many imaging studies,30 and lack of a clinical gold standard with which to compare images,29 contributes to the uncertainty of the relevance of USI in the clinical diagnosis of tendinopathy.30
When assessing current methods for diagnosing tendinopathy using USI-based criteria, a recent systematic review31 reported that the most common parameters used to measure tendon pathology include tendon thickness, echogenicity and vascularity. Furthermore, while the stages of tendinopathy are distinguished by specific imaging features,9 11 no studies used the continuum model of tendon pathology to stage tendinopathy.31 Additionally, it was discovered that the use of a combination of the three most commonly reported parameters demonstrated a higher risk (RR 6.49) for developing symptomatic tendinopathy when compared with using two parameters (RR 3.66).31 Therefore, it was suggested that future USI-based criteria use a combination of three parameters (tendon thickness, echogenicity and vascularity) in addition to an ordinal-based USI criteria that better aligns with the continuum model of tendon pathology (‘normal’, ‘reactive/early dysrepair’ or ‘late dysrepair/degenerative’).31
The continuum model of tendon pathology provides a framework for describing the pathogenesis of tendinopathy and associated clinical and imaging features. The staging of tendon pathology may be beneficial for clinicians to target treatment according to the tendon structure.9 11 Tendons in the ‘reactive/early dysrepair’ stage may have the ability to regain normal structure with appropriate management35 and treatments aimed at inhibiting tendon cell response to prevent further tendon structural change may be prioritised.11 Similarly, treatments that may aggravate symptoms and facilitate further tendon structural breakdown (eg, heavy-loaded eccentrics, intra-tendinous injections, etc) may be avoided.11 Conversely, in ‘degenerative’ tendinopathy there is little capacity for the tendon structure to change, and tendon structure does not relate to clinical outcome.36 Thus, treatment should be aimed at improving the capacity of the tendon to handle variable loads rather than to stimulate a healing response in the degenerative portions of the tendon.11
Given the complex relationship between tendon structure, dysfunction and pain, there is scope to explore whether the development of a USI-based criteria that aligns with the continuum model of tendon pathology is feasible and may help integrate clinical and imaging findings, allowing for more targeted treatment.18 31 34 Therefore, the primary aim of this study was to develop a standardised USI-based criteria for the diagnosis of tendinopathy that aligns with the continuum model of tendon pathology. Following the development of the criteria, the secondary aims were to assess both the intra-rater and inter-rater reliability of the criteria.
Materials and methods
This study followed the protocols for diagnostic procedures in manual and musculoskeletal medicine.37 It consisted of three distinct phases: (1) development of USI-based criteria, (2) sonographer education session and (3) inter-rater and intra-rater reliability study (figure 1). The study design established a methodological model for enhancing imaging procedures, assessment and consistency, and reducing rater subjectivity and recall bias. The aim of the criteria development phase was to establish a criteria that was evidence-based and clinically relevant. The reliability phase was separated into three key parts in order to ensure randomisation and blinding, and to reduce recall bias.
Phase 1: criteria development
Following a systematic review,31 an initial criteria was proposed that incorporated both the continuum model of tendon pathology9 11 and common parameters used to measure tendinopathy on USI.31 Additionally, as identified in previous studies,18 31 34 an ordinal scale was used that included measures of tendon thickness, echogenicity and vascularisation. The criteria were then sent to an expert panel (two musculoskeletal physiotherapists trained in USI, two radiologists and one sonographer) to undergo face validity assessment for feedback and recommendations regarding clinical relevance, design and ease of use. Subsequently, the criteria were further refined to reflect the expert panel feedback (figure 2).
Phase 2: Sonographer training and education
A 2-hour training session was conducted with the primary researcher (WM) along with two sonographers (raters A and B). The sonographers were provided with the USI criteria (figure 2), and an education sheet defining each USI parameter along with example images to guide the education session (table 1). The sonographers were educated on the overall criteria, as well as on each individual parameter’s key features at each of the three stages of tendinopathy. Once the criteria had been explained, additional archived ultrasound images that were not included in the retrospective data collection were reviewed and discussed until there was a consensus agreement on the appropriate stage of tendinopathy. A key feature of the continuum model of tendon pathology is the possibility of tendons demonstrating structural changes from different stages of tendon pathology. Therefore, the likelihood of tendons displaying features from different stages of tendon pathology in each individual parameter was discussed. It was determined that in these instances, the sonographer would need to determine the overall stage of tendinopathy based on their clinical experience and the available data. Once the overall criteria and each individual parameter had been defined, discussed and consensus agreement reached, the sonographers discussed the best way to record the data from the USI analysis and a scoring table was then drafted and agreed on.
Phase 3: reliability study
A retrospective search for suitable ultrasound images of the Achilles tendon, collected as part of usual patient care and stored on a local radiology clinic database between January 2018 and May 2019, was conducted. Gatekeeper consent and approval were obtained prior to commencing the database search. Ultrasound images were deemed eligible for inclusion if they met the following criteria: adults (either gender) aged over 18 years; presenting with Achilles tendon pain; referred for imaging as part of usual care by a health practitioner; images contained measurements of tendon thickness and vascularity (colour/Power Doppler). General exclusion criteria were history of surgery; history of systemic inflammatory disease; injection used as an intervention; ultrasound-diagnosed tendon tear or rupture. Images were extracted in JPEG format by the primary researcher (WM) prior to being de-identified.
Retrospective analysis was performed of all USI images. In total, 31 Achilles tendon images were assessed. Images were digitally stored on an external hard drive in JPEG format and included both pathological and normal Achilles tendons. Once images were collected, one investigator (WM) independently de-identified and randomised the images. The images were then sent to two sonographers (raters A and B), who were blinded to the original diagnosis, for classification according to the USI criteria (figure 2). Each rater individually assessed each image to determine the tendinopathy stage.
The same images were then re-randomised and sent to the same two sonographers 8 weeks later for re-assessment using the same criteria. Sonographers were again blinded to their original diagnosis and tendinopathy stage. Memory recall bias has been shown to affect intra-rater reliability studies, yet the ideal duration between assessments is unknown,38 with ‘washout’ periods ranging from 24 hours to 4 weeks.39–41 Therefore, the longest of these periods (4 weeks) was doubled to 8 weeks to reduce the chance of recall bias.
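The blinding and re-randomisation step described above can be illustrated with a short sketch. This is not the procedure the investigator used, which the text does not specify. The function name `blinded_order`, the `IMG001`-style codes, the file names and the seeds are all hypothetical. The sketch assumes each round is given its own seed, so that the same 31 images are presented under new codes and in a new order.

```python
import random


def blinded_order(image_ids, seed):
    """Return {blinded_code: original_id}, with codes numbered in presentation order.

    Hypothetical helper: the key stays with the unblinded investigator,
    and raters only ever see the blinded codes.
    """
    rng = random.Random(seed)      # seeded so the allocation can be audited later
    shuffled = list(image_ids)     # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return {f"IMG{pos:03d}": original for pos, original in enumerate(shuffled, start=1)}


# Illustrative file names only; the study's actual identifiers are not reported.
originals = [f"achilles_{i:02d}.jpg" for i in range(1, 32)]
round_1_key = blinded_order(originals, seed=2018)
round_2_key = blinded_order(originals, seed=2019)  # fresh order for the re-assessment
```

Keeping one key per round lets the investigator link a rater's two scores for the same tendon afterwards, while the raters cannot match images across rounds by code or position.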
Participant characteristics were reported using mean (SD) and frequency. Statistical analysis was performed following the methods of previous reliability studies.32 33 Quadratic-weighted Cohen’s κ (kw) with 95% CI was used to calculate intra-rater and inter-rater reliability for all ordinal variables (tendon thickness, vascularity and echogenicity) and overall classification (normal, reactive/early dysrepair or late dysrepair/degenerative). Agreement was interpreted as ‘poor’ (kw ≤0.00), ‘slight’ (kw 0.01–0.20), ‘fair’ (kw 0.21–0.40), ‘moderate’ (kw 0.41–0.60), ‘substantial’ (kw 0.61–0.80) and ‘almost perfect’ (kw 0.81–1.00).42 A contingency table was produced to display the frequencies and percentages of agreement between the two raters. The observed proportions of agreement (Po) and the corresponding 95% CI for each category and the overall classification were also reported. All statistical analyses were performed using the SPSS software package (IBM SPSS Statistics for Macintosh, V.26.0, IBM) and VassarStats statistical software.43 A sample size of 30 or greater is recommended for reliability studies,44 thus we used 31 participants.
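As an illustration of the statistic described above, the following is a minimal sketch of quadratic-weighted Cohen's kappa and the agreement bands quoted in the text. It is not the SPSS or VassarStats procedure used in the study. The function names, the integer coding of stages (0 = normal, 1 = reactive/early dysrepair, 2 = late dysrepair/degenerative) and the toy ratings are assumptions for illustration, and values falling between the published bands (for example 0.805) are assigned to the lower band here.

```python
from collections import Counter


def quadratic_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with quadratic disagreement weights.

    Ratings are ordinal codes 0..n_categories-1 (an assumed coding).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = Counter(zip(ratings_a, ratings_b))  # cell counts of the contingency table
    marg_a = Counter(ratings_a)                    # row totals (rater A)
    marg_b = Counter(ratings_b)                    # column totals (rater B)
    max_dist_sq = (n_categories - 1) ** 2
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (i - j) ** 2 / max_dist_sq    # 0 on the diagonal, 1 at the extremes
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * (marg_a[i] / n) * (marg_b[j] / n)
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - weighted_observed / weighted_expected


def interpret_agreement(kw):
    """Map a kappa value to the agreement labels quoted in the text."""
    bands = [(0.00, "poor"), (0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper_bound, label in bands:
        if kw <= upper_bound:
            return label
    return "almost perfect"


# Toy ratings for four images, invented purely to exercise the function.
rater_a = [0, 1, 2, 2]
rater_b = [0, 1, 2, 1]
kw_toy = quadratic_weighted_kappa(rater_a, rater_b, n_categories=3)
```

Because the weights grow with the square of the distance between categories, a normal versus late dysrepair/degenerative disagreement is penalised four times as heavily as a disagreement between adjacent stages. Applied to the reported values, `interpret_agreement(0.77)` returns 'substantial' and `interpret_agreement(0.81)` returns 'almost perfect'.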
Patient and public involvement
It was not possible to involve patients in the design, or conduct, or reporting, or dissemination of our research. Patients were not invited to comment on the study design and were not consulted to develop patient relevant outcomes or interpret the results. Patients were not invited to contribute to the writing or editing of this document for readability or accuracy.
Results
Participant characteristics are presented in table 2. Of the 31 included participants, ultrasound images for 15 left and 16 right Achilles tendons were assessed. Pooled data for intra-rater reliability were measured, with both raters A and B demonstrating ‘substantial’ agreement (kw 0.70–0.77) for the overall staging of tendinopathy (table 3). The intra-rater reliability of individual ultrasound parameters ranged from ‘substantial’ to ‘almost perfect’ for both raters. In addition to intra-rater reliability, inter-rater reliability was measured for both rounds of image assessment (table 4). Pooled data for overall inter-rater agreement for the first round of assessment were determined to be ‘substantial’ (kw 0.75, 95% CI 0.58 to 0.91), with this increasing to ‘almost perfect’ (kw 0.81, 95% CI 0.63 to 0.99) in the second round. For both rounds of assessment, individual parameter inter-rater reliability ranged from ‘substantial’ to ‘almost perfect’.
The contingency table demonstrates the frequencies and proportions of agreement for each category in the criteria (table 5). Reactive/early dysrepair tendinopathy was the most commonly diagnosed stage of tendinopathy by both rater A (round 1, 54.8%; round 2, 61.3%) and rater B (round 1, 41.9%; round 2, 67.7%). Late dysrepair/degenerative tendinopathy was diagnosed by rater A in 19.4% of participants in round 1 and 22.6% of participants in round 2. Rater B diagnosed late dysrepair/degenerative tendinopathy in 32.3% of participants in round 1 and 22.6% of participants in round 2. The overall observed proportion of agreement (Po) for the staging of tendinopathy using USI-based criteria was 0.74 (95% CI 0.55 to 0.87) in round 1 and 0.87 (95% CI 0.69 to 0.95) in round 2.
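The observed proportion of agreement can be sketched as a simple count of matching ratings with a confidence interval for a proportion. The text does not state which interval method was used, so the Wilson score interval below is an assumption and need not reproduce the reported bounds exactly. The function names are hypothetical, and the count of 23 agreements is inferred here from Po = 0.74 with 31 images rather than taken from the article.

```python
import math


def observed_agreement(ratings_a, ratings_b):
    """Return (number of exact agreements, number of rated images)."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("ratings must be of equal length")
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return agreements, len(ratings_a)


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a proportion (z defaults to the 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# 23 of 31 is an inferred count (see the note above), used only as an example.
po_round_1 = 23 / 31
ci_low, ci_high = wilson_interval(23, 31)
```

Unlike kappa, Po makes no correction for chance agreement and treats every disagreement equally, which is why it is reported alongside the weighted statistic rather than in place of it.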
Discussion
The intra-rater and inter-rater reliability ranged from ‘substantial’ to ‘almost perfect’ for identifying individual features (tendon thickness, echogenicity and vascularity) and determining overall staging of tendinopathy, using standardised criteria. Inter-rater agreement increased from round 1 (kw 0.75, 95% CI 0.58 to 0.91) to round 2 (kw 0.81, 95% CI 0.63 to 0.99), which may indicate that as raters became more acquainted with the criteria there was a learning effect which resulted in improved reliability.
Tendon thickness demonstrated the largest variation in intra-rater reliability, with rater A demonstrating ‘substantial’ agreement (kw 0.75) and rater B demonstrating ‘almost perfect’ agreement (kw 0.84). Similarly, there was variation in inter-rater reliability, with tendon thickness demonstrating the lowest agreement in round 1 (kw 0.65) and having had the largest improvement in agreement in round 2 (kw 0.77). While other studies have demonstrated good reliability of tendon thickness using continuous measures,33 45–47 the current study is the first to report reliability of tendon thickness measured in a categorical manner. The decision to use categorical criteria followed the face validity assessment, where the expert panel highlighted inaccuracies when measuring small changes in tendon thickness, and difficulty using either the distal part of the same tendon or contralateral tendon as a reference point due to the frequent presence of asymptomatic tendon thickening.
Previous studies have demonstrated ‘almost perfect’ intra-rater and inter-rater reliability when using the intraclass correlation coefficient (ICC) to assess Achilles tendon thickness using continuous variables.33 45 Although there is difficulty in comparing ICC and kappa statistics numerically, the grading of the results allows comparisons to be made. Hence, intra-rater reliability of these previous studies was comparable with the ‘almost perfect’ reliability of rater B and better than ‘substantial’ agreement achieved by rater A.33 45 Similarly, previous studies33 45 have demonstrated ‘almost perfect’ inter-rater reliability when using the ICC compared with the ‘substantial’ agreement demonstrated in this study. This variation in both intra-rater and inter-rater reliability is in line with a recent systematic review in which both the intra-rater agreement (ICC 0.65–0.94) and inter-rater agreement (ICC 0.65–0.84) ranged from ‘good’ to ‘almost perfect’.48
The scoring of echogenicity using the proposed criteria demonstrated ‘substantial’ intra-rater and inter-rater reliability. Intra-rater agreement (rater A, kw 0.78; rater B, kw 0.73) was consistent with results reported by Sunding et al,33 in which intra-rater agreement for the measurement of tendon echogenicity ranged from k 0.54 to 0.84. Inter-rater agreement (round 1, kw 0.70; round 2, kw 0.76) for the current study was higher than that reported in previous Achilles tendon studies (k 0.07–0.58).33 49 When comparing the current results with those of other studies that have measured echogenicity in different tendons, intra-rater agreement was lower than that reported for the patella tendon (k 0.78–0.87)33 and similar to that reported for the supraspinatus tendon (k 0.76–0.79).50 However, the proposed criteria used within the current study demonstrated better inter-rater agreement than that previously reported for the patella tendon (k 0.52–0.60)33 and supraspinatus tendon (k 0.51–0.60),32 and it is comparable with using sonoelastography to assess echogenicity for the supraspinatus tendon (k 0.71–0.81).50 Agreement on structural changes within a tendon is considered difficult, with previous studies using an ordinal scale (‘normal’, ‘mild abnormality’, ‘severe abnormality’) that stages tendinopathy according to subtle changes that may be difficult to distinguish from anisotropy.32 33 By utilising criteria that base diagnosis on the distribution of structural changes (‘diffuse’ or ‘focal’ echogenicity), as suggested by the continuum model of tendon pathology,9 11 rather than just the presence of structural change, reliability may be increased.
Vascularity represented the most reliable of the USI-based parameters measured, with both intra-rater and inter-rater agreement being ‘almost perfect’. The intra-rater agreement (rater A, kw 0.86; rater B, kw 0.89) is higher than that reported by Sunding et al33 where intra-rater agreement for Achilles tendon vascularity ranged from k 0.64 to 0.78. When compared with studies that have examined vascularity in other tendons, the current results align with those reported for the patella tendon (k 0.79–0.86).33 Inter-rater agreement (round 1, kw 0.89; round 2, kw 0.86) in this study was also higher than that reported by Sunding et al33 where inter-rater agreement for the measurement of tendon vascularity ranged from k 0.59 to 0.87. Similarly, the current results are higher than those reported for the patella tendon (k 0.45–0.76) and similar to those reported for the supraspinatus tendon (k 0.96–0.98).32 The differences in reliability seen from the current study compared with those previously reported may be due to the design of the criteria used in the current study. While Sunding et al33 used a scale based on subjective interpretations of the quantity of blood vessels present (‘mild’, ‘moderate’ and ‘severe’), the current study used more objective measures such as number and size of blood vessels present. Similarly, Ingwersen et al32 used a more objective measure of the amount of Doppler activity present, expressed as a percentage of the region of interest (<25%, 25%–50%, >50%), hence the similarity of results to the current study.
The continuum model of tendon pathology
Overall, the proposed criteria demonstrated ‘substantial’ to ‘almost perfect’ intra-rater and inter-rater reliability for staging Achilles tendinopathy using the continuum model of tendon pathology. This study aligns with others in demonstrating that by using a standardised USI-based criteria to stage tendon pathology, both intra-rater and inter-rater reliability are increased.32 33 51 While many previous studies have staged tendinopathy using a variety of different criteria,25 30 32 35 52–59 no study has developed an USI-based criteria that utilises the continuum model of tendon pathology.31 34 This is the first study to propose an USI criteria that is aligned with the continuum model of tendon pathology which is widely accepted and utilised to diagnose tendinopathy based on specific clinical and imaging findings.9 11 Although the role of imaging in the diagnosis of tendinopathy is debated,11 13 17 54 60 61 imaging may provide additional information that may help stage tendinopathy and better direct treatment.11 Furthermore, there is scope to develop reliable and valid criteria that use accepted terminology to describe pathological tendon changes.1
While the proposed criteria demonstrates ‘substantial’ to ‘almost perfect’ intra-rater and inter-rater reliability for staging Achilles tendinopathy using the continuum model of tendon pathology, it is important to note that the diagnosis of tendinopathy is multifactorial.1 13 The role of USI in the diagnosis of tendinopathy remains challenging due to a lack of clinical gold standard to compare images.29 Therefore, while USI can be utilised to assist in the diagnosis of tendinopathy,1 13 the clinical diagnosis of tendinopathy does not require the presence of structural changes as shown on imaging.1 Rather, by standardising the method with which tendon pathological change is reported and aligning terminology with current conceptual models of tendon pathology, larger multicentre studies can be conducted to further investigate the clinical relevance of USI in the diagnosis and management of tendon pathology and they will allow for greater transferability between studies.
The main limitation in this study is the retrospective design of the study. This did not allow for control of imaging procedure, machine, sonographer or imaging settings. While standardised imaging procedures, settings and machines have been shown to increase reliability,32 33 46 51 there is a question as to whether there is transferability to the clinical setting, where images are captured using a variety of machines and imaging procedures.32 Standardised protocols can make reliability appear artificially high when compared with the clinical setting,32 however if reliability is low with a standardised protocol, reliability can be assumed to be poor and less relevant for the clinical setting.32 Additionally, included images were static in nature which may affect the external validity and reliability of the criteria as clinically USI is considered a dynamic investigation.62 Further study limitations include the small sample size, and assessment of only the Achilles tendon, which may make generalisation of the results difficult. While a larger sample size will give greater confidence in results, a sample size of 30–50 has been suggested as sufficient to determine clinical relevance.44 Although only the Achilles tendon was assessed in this study, other studies have shown that standardised criteria are reliable in assessing tendon pathological changes in the patella and supraspinatus tendons.32 63 64
Intra-rater and inter-rater reliability were ‘substantial’ to ‘almost perfect’ when utilising an USI-based criteria to diagnose Achilles tendinopathy based on the continuum model of tendon pathology. This is the first study to use the continuum model of tendon pathology to develop an USI-based criteria to diagnose tendinopathy. The proposed USI criteria aligns with accepted clinical terminology used to describe tendinopathy and provides an objective criteria to reliably stage tendinopathy. Future research should test both the intra-rater and inter-rater reliability of the criteria in a prospective manner within the clinical setting and in other tendons. Additionally, future research should explore the correlation of clinical assessment with the proposed criteria.
The authors would like to acknowledge Rohan de Carle and Michael Rayment for their help with analysing images. The authors would also like to acknowledge our expert panel for their assessment of the proposed criteria: Jeremy Lewis, Karen McCreesh, Scott Allen, James Linklater and Phil Lucas.
Contributors WM, RE, WH and JWF conceived and designed the study protocol. WM, RE, WH and JWF developed the criteria. WM secured access to database for ultrasound images. WM performed inclusion of ultrasound images, blinding and randomisation of images. WM conducted education session. WM and ER planned, coordinated and performed the statistical analysis. WM drafted the manuscript. RE, WH, JWF and ER contributed to the manuscript. All authors read and approved the final manuscript.
Funding This research was supported by an Australian Government Research Training Program Scholarship.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Patient consent for publication Not required.
Ethics approval Ethics approval granted by the Bond University Human Research Ethics Council (BUHREC). Approval number 15703.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information. No additional data are available.