Methods
Eligibility criteria
We used the recently published 2016 Cochrane Review evaluating the effect of manual therapy and exercise for reducing pain and improving function for patients with rotator cuff disease to identify RCTs for inclusion in this study.11 Page et al11 included RCTs that compared exercise to placebo, no treatment, usual care or another active intervention among adults (≥18 years) with rotator cuff disease. The term ‘rotator cuff disease’ was used in the review for disorders of the rotator cuff labelled and/or defined by the trial authors using terminology such as subacromial impingement syndrome, rotator cuff tendonitis or tendinopathy, supraspinatus, infraspinatus or subscapularis tendonitis, subacromial bursitis or rotator cuff tears. Trials could include interventions provided to participants in any setting (eg, outpatient, at home or in the community) and must have involved the prescription of a supervised or unsupervised exercise programme. The intervention could have been with or without the addition of other components (eg, manipulation, lifestyle modification or counselling).
We included 34 exercise trials reported up to March 2015 from Page et al’s Cochrane Review.11
Data extraction guidelines
We used previously described data extraction guidelines to standardise the information extracted from each included paper.23 Descriptive data were systematically extracted into a spreadsheet, checked for consistency and merged into one document. To ensure that the five reviewers shared a similar understanding of how to apply the CERT, all reviewers independently pilot tested the data extraction form on one study, which was not among the final 34 reviewed. The reviewers then discussed their CERT ratings, in pairs with DHM, by video conference. We estimated that this familiarisation process took approximately 1.5 hours.
Application of the CERT
Two reviewers independently scored each included study by applying the CERT.23 Five reviewers were involved in the application of the CERT (CF, RLJ, YR, MG and DHM). Three reviewers (CF, RLJ and MG) applied the CERT to five trials each; one reviewer (YR) applied it to 19 trials; and another reviewer (DHM) applied it to all included trials. The CERT includes 16 categories and 19 separate items considered essential in the reporting of reproducible exercise interventions, listed under seven domains: what (materials), who (provider), how (delivery), where (location), when and how much (dosage), tailoring (what and how) and how well (compliance/planned and actual).23 The CERT domains cover information about any equipment used for exercises, the exercise instructor, core procedural and contextual elements of the exercise intervention that are required for replication, participant motivation strategies and whether, and how well, participants complied with the exercise programme.
A more detailed description of the CERT items is available in the Explanation and Elaboration Statement.23 This statement was used to guide the scope and interpretation of each CERT item. Each CERT item was rated as ‘yes’ (criterion met, indicating item clearly reported), ‘no’ (indicating item not reported or not clearly described) or ‘unsure’, and an overall rating of the exercise description was also made. For no or unsure responses, detailed comments about what was missing or what was unclear were recorded. We summed the number of items rated as yes to compute a total score ranging from 0 to 19 (0=no items clearly described to 19=all CERT items clearly described).
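The scoring rule described above can be illustrated with a minimal Python sketch. The function names (`cert_total`, `cert_percent`) and the representation of ratings as a list of strings are assumptions made for illustration only; they are not part of the CERT or of the software used in this study.

```python
from typing import Sequence

N_CERT_ITEMS = 19  # number of separate CERT items, as stated in the text
ALLOWED_RATINGS = {"yes", "no", "unsure"}


def cert_total(ratings: Sequence[str]) -> int:
    """Return the number of CERT items rated 'yes' (range 0 to 19).

    `ratings` is a hypothetical per-study list with one rating per item.
    """
    if len(ratings) != N_CERT_ITEMS:
        raise ValueError(f"expected {N_CERT_ITEMS} ratings, got {len(ratings)}")
    unknown = set(ratings) - ALLOWED_RATINGS
    if unknown:
        raise ValueError(f"unexpected rating(s): {sorted(unknown)}")
    # Only 'yes' counts towards the total; 'no' and 'unsure' both score 0.
    return sum(rating == "yes" for rating in ratings)


def cert_percent(total: int) -> float:
    """Express a total CERT score as a percentage of the maximum score of 19."""
    return 100.0 * total / N_CERT_ITEMS
```

Under this sketch, a study with 10 items rated yes, 5 rated no and 4 rated unsure would receive a total score of 10.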
If the authors specifically referred to published protocols, online appendices and supplementary data, the reviewers retrieved and extracted these additional data when relevant. The reviewers also recorded whether the study was published in an open access journal and how easy the intervention description was to access (ie, available in the published paper or required additional data from other sources and, if so, whether this was open access).
Following completion of the review by both reviewers, any disagreements were discussed. If agreement could not be reached, an independent arbiter from the research team was to be consulted.
Risk of bias assessment
Risk of bias assessments of the included trials, based on the Cochrane Risk of Bias Tool,25 were taken from the original Cochrane Review.11 The following domains were assessed: random sequence generation, allocation concealment, blinding of participants and personnel, and blinding of outcome assessment (subjective and objective). The risk of bias figure was prepared using RevMan V.5.3 (The Nordic Cochrane Centre, Copenhagen).
Inter-rater reliability
Inter-rater reliability of the CERT was assessed for each of the 19 CERT items (including subitems a and b for items 7, 14 and 16) using percentage agreement26 and the prevalence and bias adjusted kappa (PABAK) coefficient.27 The kappa statistic measures chance-adjusted agreement and is therefore more robust than simple percentage agreement. However, when the prevalence of one category is much higher than that of the other, chance agreement will be high and kappa can take unexpectedly low values.26–28 For percentage agreement, a score of 70% or greater is considered acceptable and ≥80% is considered high.28 For PABAK coefficients, the strength of agreement is interpreted as follows: 0=poor, 0.01–0.20=slight, 0.21–0.40=fair, 0.41–0.60=moderate, 0.61–0.80=substantial and 0.81–1=excellent.28
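As an illustration of the two agreement measures, the following Python sketch computes percentage agreement and PABAK for one CERT item rated by two reviewers. It assumes the standard formulation PABAK = (k × Po − 1)/(k − 1), where Po is the observed proportion of agreement and k is the number of rating categories, which reduces to 2 × Po − 1 for two categories. The default of two categories, the function names and the treatment of any value at or below 0 as 'poor' are assumptions for this sketch, not details taken from the study.

```python
from typing import Sequence


def percentage_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Percentage of studies on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def pabak(rater_a: Sequence[str], rater_b: Sequence[str], n_categories: int = 2) -> float:
    """Prevalence and bias adjusted kappa: (k * Po - 1) / (k - 1).

    n_categories=2 is an assumption; it gives the familiar 2 * Po - 1.
    """
    if n_categories < 2:
        raise ValueError("n_categories must be at least 2")
    p_observed = percentage_agreement(rater_a, rater_b) / 100.0
    return (n_categories * p_observed - 1.0) / (n_categories - 1.0)


def interpret_pabak(value: float) -> str:
    """Map a PABAK value onto the strength-of-agreement bands given in the text."""
    if value <= 0:  # assumption: values at or below 0 are treated as 'poor'
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper_bound, label in bands:
        if value <= upper_bound:
            return label
    return "excellent"
```

With two categories, 70% agreement corresponds to a PABAK of 0.40 and 80% agreement to a PABAK of 0.60, which shows how the two sets of thresholds relate under this formulation.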
Data analysis
Data were entered into SPSS V.22 and analysed using descriptive statistics and narrative summaries. For each study, the total CERT score was presented together with its percentage of the maximum CERT score of 19. The bootstrapped median was calculated using Stata V.12 (College Station, Texas, USA). Bootstrapping is a statistical method based on simulated random sampling from the available data. We performed 10 000 repetitions of the sampling, creating samples with the same statistical properties as the original data set. The estimate of the median and its 95% CI were calculated directly from the simulated repeated sampling. In this way, we did not have to assume any statistical distribution for the median and achieved a higher level of precision when constructing the CI.
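The bootstrap procedure can be sketched in Python as follows. This is not the Stata routine used in the study: the percentile method for the CI, the use of the median of the bootstrap medians as the point estimate, the fixed random seed and the function name `bootstrap_median` are all assumptions chosen to keep the example short and reproducible.

```python
import random
import statistics
from typing import Sequence, Tuple


def bootstrap_median(
    scores: Sequence[float],
    n_reps: int = 10_000,   # number of resamples, as in the text
    alpha: float = 0.05,    # 1 - alpha = 95% confidence level
    seed: int = 0,          # assumption: fixed seed for reproducibility
) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) for the median via a percentile bootstrap."""
    if len(scores) == 0:
        raise ValueError("scores must not be empty")
    rng = random.Random(seed)
    data = list(scores)
    n = len(data)
    # Resample with replacement, keeping the original sample size each time.
    medians = sorted(
        statistics.median(rng.choices(data, k=n)) for _ in range(n_reps)
    )
    # Percentile method (assumption): take the empirical 2.5th and 97.5th
    # percentiles of the bootstrap medians as the CI limits.
    lower = medians[round((alpha / 2) * n_reps)]
    upper = medians[round((1 - alpha / 2) * n_reps) - 1]
    estimate = statistics.median(medians)
    return estimate, lower, upper
```

Calling `bootstrap_median(total_scores)` on a list of total CERT scores (one per trial) would return the bootstrapped median with its lower and upper 95% limits; no distributional assumption about the scores is needed.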