Research

Inter-rater reliability of the Shoulder Symptom Modification Procedure in people with shoulder pain

Abstract

Background Musculoskeletal conditions involving the shoulder are common and, because of the importance of the upper limb and hand in daily function, symptoms in this region are commonly associated with functional impairment in athletic and non-athletic populations. Deriving a definitive diagnosis as to the cause of shoulder symptoms is fraught with difficulty. Limitations have been recognised for imaging and for orthopaedic special tests. One solution is to partially base management on the response to tests aimed at reducing the severity of the patient's perception of symptoms. One (of many) such tests is the Shoulder Symptom Modification Procedure (SSMP). The reliability of this procedure is unknown.

Methods 37 clinician participants independently watched the videos of 11 patient participants undergoing the SSMP and recorded each patient's response as improved (partially or completely), no change or worse. Inter-rater reliability was assessed by Krippendorff's α, which ranges from 0 to 1.

Results Krippendorff's α was found to range from 0.762 to 1.000, indicating moderate to substantial reliability. In addition, short (3-hour) and longer (1-day) durations of training were associated with similar levels of reliability across the techniques.

Conclusions Deriving a definitive structural diagnosis for a person presenting with a musculoskeletal condition involving the shoulder is difficult. The findings of the present study suggest that the SSMP demonstrates a high level of reliability. More research is needed to better understand the relevance of such procedures.

Trial registration number ISRCTN95412360.

What are the new findings?

  • Deriving a definitive structural diagnosis for musculoskeletal conditions involving the shoulder is difficult.

  • Symptom improvement/correction/modification tests have been suggested by clinicians as one method of developing a management programme.

  • This study demonstrated inter-rater reliability of the Shoulder Symptom Modification Procedure.

How might it impact on clinical practice in the near future?

  • A graduated exercise programme is the most common form of management for people with the majority of musculoskeletal conditions involving the shoulder.

  • If future research demonstrates that techniques used in the Shoulder Symptom Modification Procedure confer additional benefit when incorporated into a graduated shoulder exercise programme over exercises alone, then methods such as these may have a role in the management of musculoskeletal conditions involving the shoulder.

Background

As a group, musculoskeletal conditions are associated with the second highest number of ‘years lived with disability’.1 Within this group, conditions affecting the shoulder occur frequently in sporting and non-sporting populations,2–4 and their prevalence increases with age.5 Annually, 1–2% of the general population present to their general practitioner (family physician) with a first episode of shoulder pain,6–8 and of concern, these conditions are associated with high levels of morbidity lasting for 1 year or longer.3 ,8

To understand the basis of the presenting shoulder symptoms, clinicians typically perform a clinical examination, which usually includes: taking a history, collecting disability and impairment data and performing special orthopaedic tests that have been designed to incriminate pathology, such as that involving the rotator cuff tendons, subacromial bursa or glenoid labrum, or to rule in conditions, such as subacromial impingement syndrome.9

Although orthopaedic tests are commonly used,9 findings from narrative10–13 and systematic reviews14 ,15 and research investigations16 have consistently questioned the value of these procedures as a method of implicating the structures associated with the presenting symptoms. Imaging is commonly used to support the clinical assessment.17 Likewise, the certainty with which imaging findings support or confirm the clinical diagnosis is challenged by myriad studies reporting asymptomatic structural deficits, including full-thickness rotator cuff tears and glenoid labral tears, in populations including elite athletes.18–21 One implication of current clinical practice is that people with shoulder pain may undergo operations to repair tissues that are not related to their presenting symptoms.13

The findings of these clinical and radiological investigations have challenged the basis on which a structural diagnosis may be achieved.10 ,13 ,22 This has been recognised previously and researchers have suggested that assessment and management could be based on the presenting symptoms without the need for a definite structural diagnosis.23 ,24 One such model, known as the Shoulder Symptom Modification Procedure (SSMP), was first described by Lewis10 as a systematic approach to assess clinical variables that may be associated with shoulder symptoms, to determine their relationship with the presenting symptoms. Similar to the Mulligan and McKenzie et al approaches,23 ,24 procedures identified that partially or completely improve the presenting symptoms may be considered in patient management. By placing the individual patient at the centre of the assessment and management decision process, these methods are compatible with patient-centred practice, clinical reasoning and evidence-based practice.22 ,25 ,26

Shoulder Symptom Modification Procedure

The first stage of the SSMP is for the patient to identify the movements, activities or postures that reproduce symptoms. This may include symptoms experienced while sitting at a desk, lifting a pan or kettle, dressing, swimming, performing weight-bearing activities such as push-ups and in high-powered explosive activities commonplace in sport. Pain is the most commonly reported symptom, but symptoms may also include reduction in movement, instability and symptoms that may be associated with neurovascular compromise. Once defined, the component parts of the SSMP are then applied while the patient performs the symptom-provoking movements, activities or postures to determine if an immediate change is achievable. This type of ‘real-time’ process has been recommended previously,23 ,24 and evidence (albeit limited) suggests that procedures found to improve symptoms in the cervical and lumbar regions within a session may be useful in guiding treatment selection and may help predict between-session changes in symptoms.27–29

The SSMP comprises three main sections. The first section aims to assess the relationship between thoracic posture and symptoms, the second aims to evaluate the effect of scapular position on symptoms and the third aims to assess the effect of the relationship between the humeral head and scapula on symptoms. In reality, the assessment procedures do not isolate one structure. For example, reducing the thoracic kyphosis also relatively posteriorly tilts the scapula, changes length-tension relationships of muscles, tendons and related soft tissues and may influence joint biomechanics. As all procedures involve touch, another reason for perceiving a change in symptoms may be the experience of this sensation.30 Additionally, there is only very limited evidence that humeral head procedures actually influence humeral head position.31

The SSMP assessment form is detailed in figure 1. The specific assessment procedures have been described elsewhere.10 ,11 ,13 Following agreement between both parties, the person with shoulder symptoms informs the clinician if an individual procedure: partially or completely alleviates symptoms; produces no change in symptoms; or makes the symptoms worse. Techniques may be combined; for example, if reducing the thoracic kyphosis and elevating the scapula independently partially reduce symptoms, then the clinician may assess the response of combining both these procedures. If the SSMP completely and consistently alleviates symptoms, then the procedures found to alleviate the symptoms are used to inform treatment.

Figure 1

The Shoulder Symptom Modification Procedure assessment form.

It is important for clinicians to appreciate that the SSMP is not a stand-alone procedure. If the SSMP does not change symptoms or only partially alleviates them, other management options, based on the clinician's clinical reasoning and the patient's acceptance of that management, need to be considered, such as advice, education, rotator cuff rehabilitation exercises,11 ,13 injection therapy or surgery.32 ,33

Although in clinical use,11 the reliability of the SSMP is uncertain. The primary aim of this investigation was to evaluate the intertester reliability of clinicians in determining how people with shoulder symptoms respond to SSMP procedures. The secondary aim was to investigate the differences in reliability between those who participated in long training (over 1 day) and short training (3 hours) in the SSMP. GRRAS recommendations for reporting reliability studies were used as a guide.34

Methods

Ethical approval and study registration

Ethical approval for the investigation was granted by the Faculty of Education and Health Sciences Research Ethics Committee, University of Limerick, Ireland (2015_12_13_EHS), and from the Health and Human Sciences Ethics Committee, University of Hertfordshire, UK. The investigation was registered—ISRCTN95412360.

Patients

A sample of convenience of 11 people with unilateral shoulder pain, recruited from community and clinical settings, consented to participate in the investigation. They were provided with participant information documentation and informed of their rights, including the right to withdraw from the investigation at any stage, without having to explain this decision. Prior to participation, all patients signed consent documentation, after which they provided demographic data and a rating of their present pain on a 0–10 scale (0, no pain; 10, worst imaginable pain). Once the videos were filmed, the patient participants' involvement in the study was complete.

Clinicians

A sample of convenience of 40 clinicians from physiotherapy and osteopathy were approached to participate in the investigation.

Clinician participants worked in a variety of health settings, including the public and private sectors and in primary and secondary care. They had varied training in the SSMP. Some had previous experience with the SSMP and were using it in current clinical practice, while others were new to the procedure. A number of clinicians had participated in previous training (∼1 day) and to varying extents had incorporated the SSMP into their clinical practice. Others were recruited for the purpose of the investigation and received short training (3 hours' duration). As such, clinician participants were not randomised into these long and short training subgroups. The clinicians were given consent documentation, and were made aware of their rights, including the right to withdraw from the study at any stage. Those providing consent also provided demographic data.

Procedure

Video analysis has been used in previous musculoskeletal research to investigate the reliability of assessing posture and movement,35–39 including shoulder research,40–42 and was determined to be the most appropriate method for the current investigation. The use of videos ensured that a large number of assessors were able to observe the patient's response to the SSMP from the same angle.

Video filming occurred in a clinical research room at the University of Limerick, Ireland. Videos were made of one of the investigators (JSL) conducting the SSMP on the 11 patient participants. All videos were filmed on the same day. The videos were filmed and audio recordings were made using two JVC Everio camcorders (Model No. GZ-MS210BEK; Yokohama, Japan), mounted on extendable tripods positioned ∼1.5 m from the patient participants. To standardise the position, patients were instructed to stand on a cross taped to the floor in front of the cameras. To reduce distortion, the cameras were positioned as close to perpendicular to the patients as possible. Initially, the patient participants were requested to identify and demonstrate the movement that reproduced their symptoms. Following this, the SSMP assessment procedures were performed and the patients' responses filmed, and the patients were asked whether the symptoms were the same, worse or better. At the end of data collection, 134 unique video recordings were available for analysis. The video recordings were initially edited using Adobe Creative Cloud Premiere Pro (http://www.adobe.com) and then converted to .avi files using PRISM video converter software (http://www.nchsoftware.com). These .avi files were played using Windows Media Player (http://www.microsoft.com). The duration of each audio and video clip ranged from 26 to 150 s, with most being under 1 min. The video recordings were uploaded onto a secure server located at the University of Limerick.

Clinician participants were provided with a unique log in and password to the server and independently watched the video clips and completed the data collection documentation. Each video was assigned a separate table on the documentation sheet and after watching each video, the clinician participants were required to record if the SSMP technique had produced no change, made the patient worse or resulted in either partial or complete improvement. The clinicians were instructed that their ratings were to be based on the responses provided by the patients and not on their own interpretations. Owing to technical constraints, the order of the video clips was not randomised and the clinician participants could choose to watch the video clips in any order. They were encouraged to carry out the task in their own time and in a quiet place without interruptions, and to take breaks as necessary. Clinicians were instructed that ideally they should only watch the video on one occasion but were permitted to watch on two occasions if they were unsure of the patient's responses. Confirmation of this type would occur in clinical practice in such cases of uncertainty. Clinician participants were instructed that they should record:

  • ‘no-change’ if the patient reported that the technique had not changed their symptoms,

  • ‘worse’ if the patient reported that the technique had increased their symptoms,

  • ‘partial improvement’ if the patient reported that the technique had partially improved their symptoms, which was defined as anything between 1% and 99% improvement, and

  • ‘complete improvement’ if the patient reported that the technique had completely alleviated their symptoms (ie, 100% improvement).

To reduce bias, clinicians' scores were entered into a database by a research assistant who was unaware of the purpose of the investigation. Once the data sheet was complete, the clinicians' involvement in the study was complete. At the end of the data collection period, to protect patient confidentiality, the videos were removed from the secure server and destroyed.

The focus of the analysis of the data was on inter-rater reliability. No attempt was made to assess intrarater reliability, for the following reasons:

  1. intrarater reliability can be assumed to be at least as good as inter-rater reliability, and as the primary practical concern is to assess the lower limit of reliability, a separate assessment of intrarater reliability is of little interest;

  2. a repeated assessment of the same videotaped technique by the same clinician would have little relevance to clinical practice;

  3. given that the SSMP aims to improve symptoms, if the technique had been videotaped twice, the technique itself might have altered, such that a subsequent test of the same procedure would not be testing the same response, violating a core assumption when assessing intrarater reliability.35 ,43 Other procedures that aim to modify symptoms have reported similar immediate responses.44 ,45 This phenomenon is clearly demonstrated in other symptom modification procedures (https://www.youtube.com/watch?v=Arkxz8rabGQ&utm_content=buffer6f7c4&utm_medium=social&utm_source=facebook.com&utm_campaign=buffer).

  4. asking participating clinicians to produce assessments within and between patients would have been onerous and might have discouraged their participation.

Description of techniques

Table 1 describes the techniques assessed in the current investigation.

Table 1
Description of techniques

Statistical analysis

The inter-rater reliability of clinicians' assessment of the response to SSMP procedures was calculated by analysing the responses provided by the clinicians (no change, worse, partial improvement, complete improvement) using Krippendorff's α.46 This statistic is a reliability coefficient suitable for analysing responses from multiple raters, and accommodates missing data. It ranges from 0 to 1, with 1 representing perfect agreement, and was calculated with the ratings defined as ordinal. A 95% CI for α and the probability of not attaining a coefficient of at least 0.800 were obtained through bootstrapping (10 000 samples). The 95% CI gives a range of plausible values for the ‘true’ reliability, such that we can be 95% confident that the true reliability is at least the lower limit of the CI, while the probability value indicates the probability of the ‘true’ reliability not attaining a minimum threshold of 0.800. Reliability was only calculated where at least three patients were assessed with any one procedure. The rate of missing values for each procedure was calculated as the number of times that a rater did not provide a rating, out of the total number of possible ratings (n of patients×n of raters). Analyses were conducted in SPSS V.23.
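As an illustration only, the coefficient described above can be sketched in a few lines of Python. The study's analyses were run in SPSS; the function and variable names below, the assumed ordering of the four response categories (worse, no change, partial improvement, complete improvement) and the toy ratings are assumptions made for this example, not the authors' code or data. The sketch follows the standard coincidence-matrix formulation of Krippendorff's α with the ordinal distance metric, skips missing ratings, and computes the missing-value rate as defined above. It does not reproduce the bootstrapped CI.

```python
from itertools import product

# Assumed ordinal ordering of the response categories (illustrative only).
CATEGORIES = ["worse", "no change", "partial improvement", "complete improvement"]


def krippendorff_alpha_ordinal(ratings, categories=CATEGORIES):
    """Krippendorff's alpha with the ordinal metric.

    ratings: list of units (e.g. video clips); each unit is a list holding
    one category per rater, with None marking a missing rating.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Coincidence matrix: a unit with m >= 2 ratings contributes its
    # ordered pairs of ratings, each weighted by 1 / (m - 1).
    coincidence = [[0.0] * k for _ in range(k)]
    for unit in ratings:
        values = [index[r] for r in unit if r is not None]
        m = len(values)
        if m < 2:
            continue  # a unit rated only once carries no agreement information
        counts = [values.count(c) for c in range(k)]
        for c, d in product(range(k), repeat=2):
            pairs = counts[c] * (counts[d] - (1 if c == d else 0))
            coincidence[c][d] += pairs / (m - 1)

    totals = [sum(row) for row in coincidence]
    n = sum(totals)
    if n < 2:
        return float("nan")  # not enough pairable ratings

    def delta_sq(c, d):
        # Ordinal distance: squared sum of category totals between c and d,
        # counting the two end categories at half weight.
        lo, hi = min(c, d), max(c, d)
        return (sum(totals[lo:hi + 1]) - (totals[lo] + totals[hi]) / 2) ** 2

    cells = list(product(range(k), repeat=2))
    observed = sum(coincidence[c][d] * delta_sq(c, d) for c, d in cells)
    expected = sum(totals[c] * totals[d] * delta_sq(c, d) for c, d in cells) / (n - 1)
    if expected == 0:
        return float("nan")  # no variation in the ratings: alpha is undefined
    return 1.0 - observed / expected


def missing_rate(ratings):
    """Missing ratings divided by (n of patients x n of raters)."""
    possible = sum(len(unit) for unit in ratings)
    missing = sum(r is None for unit in ratings for r in unit)
    return missing / possible


# Hypothetical toy data: three video clips, each rated by four clinicians.
toy_ratings = [
    ["partial improvement", "partial improvement", "partial improvement", None],
    ["no change", "no change", "worse", "no change"],
    ["complete improvement", "complete improvement", "complete improvement",
     "partial improvement"],
]
print(krippendorff_alpha_ordinal(toy_ratings))
print(missing_rate(toy_ratings))
```

Because the ordinal distances are built from the observed category totals, disagreements between neighbouring categories (for example, partial versus complete improvement) are penalised less than disagreements between distant ones (for example, worse versus complete improvement).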

Sample size

There do not appear to be formal methods of calculating sample size for reliability studies with ordinal outcomes and multiple raters. However, methods for continuous outcomes, such as those described by Walter et al,47 may provide some guidance. For example, with 20 or more raters, 10 patients would provide at least 80% power to detect a coefficient of 0.800 as greater than a null value of 0.500, at a 5% significance level.

Results

Eleven patient participants consented to participate. Each presented with unilateral shoulder pain and was naïve to the SSMP procedure. The mean age was 53.7 years. Seven were men and six had symptoms involving the right shoulder. Patient participant demographic information is detailed in table 2.

Table 2
Patient participant information

Of the 40 clinicians approached (as a sample of convenience) to participate in the investigation, 37 (92.5%) provided responses. Of the three who did not respond, two cited insufficient time as being the reason for not completing the data sheets; the reason for the other clinician is unknown. There were 20 female and 17 male clinician participants (36 physiotherapists and 1 osteopath). Eighteen had participated in a short (∼3 hours) training programme to explain and practise the SSMP. Nineteen had participated in a longer training programme (∼1 day). Clinician participant demographic information is detailed in table 3.

Table 3
Clinician participant information

Response to the SSMP

In total, 19 procedures and combinations were tested, representing isolated procedures (eg, scapular elevation) and, when indicated, procedures tested in combination (eg, thoracic extension and scapular posterior tilt). The responses to the procedures are detailed in table 4. Responses to each of these procedures were assessed by all 37 clinicians, though the number of patients varied from 3 to 11. On 14 (10.4%) occasions, patients reported a worsening of symptoms and no change was reported 29 (21.6%) times. On 91 occasions (67.9%), participants reported a partial or complete reduction in symptoms. The intertester reliability of the clinicians' ratings is presented in table 5.
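The percentages reported above can be checked from the counts alone. The short Python sketch below is illustrative; the dictionary labels and variable names are chosen here for the example, and it assumes that each count refers to one recorded patient response.

```python
# Counts of patient-reported responses, taken from the text above.
response_counts = {
    "worse": 14,
    "no change": 29,
    "partial or complete improvement": 91,
}

total_responses = sum(response_counts.values())

# Percentage of all responses in each category, to one decimal place.
response_percentages = {
    label: round(100 * count / total_responses, 1)
    for label, count in response_counts.items()
}

print(total_responses)
print(response_percentages)
```

The counts sum to 134 responses, and the rounded percentages agree with those given in the text.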

Table 4
Patient participant response to SSMP techniques

Table 5
Intertester reliability, whole cohort of clinicians

Nineteen clinicians had participated in longer training (over 1 day) and 18 clinicians over a shorter period (∼3 hours). The α coefficients for these two subgroups are presented in table 6. The mean difference in estimates of these coefficients (long training subgroup minus short training subgroup) was calculated as −0.001 (95% CI −0.052 to 0.051). Figure 2 indicates the extent of the discrepancy in the reliability of assessments between clinicians in these subgroups. The ends of each horizontal bar indicate the value of α for each subgroup, such that the length of the bar indicates the magnitude of difference between these values.

Table 6
Intertester reliability of those participating in long and short training

Figure 2

Differences in α for raters undergoing either short (S) or long (L) training.

Discussion

Deriving a definitive structural diagnosis for an individual presenting with shoulder pain is fraught with difficulty. Suggesting care pathways based on the responses to orthopaedic tests and imaging may not correctly represent the mechanisms underlying the presenting symptoms. This is due to a poor correlation between structural changes and symptoms and poor accuracy of the clinical orthopaedic tests themselves. Clinical diagnosis is further challenged by the need to appreciate, for those presenting with pain as the main symptom, whether the symptoms have a peripheral nociceptive driver or occur as the result of altered central pain processing.48 Owing to these complexities, for many, clinical practice is currently based on assessing the response to techniques that do not require a structural diagnosis and using the responses of the assessment procedures to inform management.24 ,49 The SSMP falls within this category of clinical assessment. The findings of the current investigation suggest that clinicians are able to assess the patient's individual responses to the components of the SSMP with a good degree of reliability; the lowest point estimate of α was 0.762, for internal rotation in flexion, which is close to the threshold value for ‘substantial’ reliability of ≥0.810 proposed by Shrout.50 The estimates were generally similar for those clinicians who had undertaken longer training (1 day) and those who had undertaken shorter training (3 hours); the largest discrepancy was for internal rotation in flexion. Moreover, there was no consistent pattern in these differences; reliability was higher in the short training subgroup for nine techniques and was higher in the long training subgroup for eight techniques.

Although the number of raters was constant, the number of patients in whom the reliability of the assessment of each technique could be assessed varied from 3 to 11, and the precision of the estimates of α (as represented by the width of the associated 95% CI) varied accordingly. Nonetheless, owing to the large number of raters and the low rate of missing values, a reasonable degree of precision was obtained even for estimates based on just three patients.

In this investigation, reliability was assessed using video analysis playback. This was chosen as pilot work prior to this research clearly demonstrated that the response to a technique could substantially change the ‘baseline’ for the second tester and therefore confound the possibility of determining the reliability of assessment. The use of videotapes ensured that all clinicians were assessing the same response. Before the SSMP should be considered to be a reliable clinical assessment procedure, the findings of this investigation need to be repeated in a larger sample of patients, as well as testing other methods of reliability such as direct observation of patients being assessed clinically.

Our findings suggest that clinicians can, in a relatively short period of time, learn the component techniques of the SSMP and reliably determine if they have influenced the patient's symptoms, and there do not appear to be substantial clinical differences in reliability if training is conducted over a 3-hour period or over the course of 1 day. However, it should be remembered that clinicians were not randomly allocated to the two durations of training, and a conclusive comparison of the two subgroups cannot therefore be made. It is important to emphasise that although the findings of this investigation suggest that the SSMP is a reliable assessment process, there is no evidence to suggest that incorporating the techniques into management positively influences outcome over natural history or other treatment procedures.

In a recent large multicentre cohort study (1030 participants at baseline, 811 participants at 6-month follow-up) investigating prognostic factors for people with shoulder pain, psychosocial factors were identified as the major determinant. Of the range of biomechanical factors included in the investigation, ‘real-time’ improvement in symptoms associated with changes to scapular posture during active shoulder elevation10 ,51 was the most consistent positive biomechanical prognostic factor identified at 6 months.52 An improvement in symptoms and/or range of shoulder elevation was demonstrated during manual facilitation of the scapula in 41% (n=426) of participants and near-complete or complete reduction in pain and/or restoration of shoulder elevation in 12% (n=122) of participants.52

One of the potential benefits of assessment and management systems such as the SSMP is that demonstrating to a patient that symptoms are modifiable may give the individual confidence to move, due to the reduction or cessation of symptoms, which, in turn, may facilitate adherence to treatment.53 Poor adherence has been shown to compromise the effectiveness of treatment,53 ,54 and as self-management is required in most chronic conditions, finding a technique that reduces or alleviates symptoms may encourage the patient and facilitate the management process. Therapy-related factors are one of the five dimensions influencing adherence to treatment.54 Although there are many subcategories within this dimension, the immediacy of beneficial effect is cited as a factor influencing adherence (p. 30). From the patient's perspective, the perception that treatment is effective in ameliorating unpleasant symptoms is a precondition for continued compliance (adherence).55 Although there is no empirical evidence to support this contention, procedures such as the SSMP, which may demonstrate immediate improvement in symptoms, may support adherence to an agreed management plan. Of relevance, people with chronic low back pain preferred exercises that were individualised and made sense, and felt their individualised needs were addressed; they were less likely to engage with exercises that were boring or lacked challenge.56 Appropriate and balanced communication with patients is vital to frame the entirety of the management plan.

When asked if they attribute the cause of presenting symptoms to anything specific, people presenting with shoulder symptoms commonly implicate ‘poor posture’. Although deviations in posture (from an idealised norm) are frequently cited as the cause of shoulder pain and symptoms,57 ,58 this relationship has been repeatedly challenged,59–62 and this in turn calls into question the extent to which clinical reasoning should be based on static observation of posture. Components of the SSMP involve changing posture during symptomatic activities. If symptoms consistently change, then these changes can be incorporated into the management plan. Also of relevance is that for an individual who is convinced, or who has been convinced, that posture is a key factor underlying the presenting symptoms, demonstrating no change or a worsening in symptoms when changing posture may alter this perception and this may thereby facilitate the acceptance of alternative management strategies.

Of importance, the SSMP is not a stand-alone procedure and must be embedded within a complete patient care management programme that includes education, support, advice, consideration of lifestyle and psychosocial factors, general fitness and other local management strategies. If SSMP techniques do not positively influence symptoms, other treatments or interventions may need to be considered. These may include (but are not restricted to) graduated shoulder exercises aimed at the rotator cuff and shoulder muscles.11 ,63–66

Limitations

The findings of this investigation need to be interpreted in the light of certain limitations. Foremost of these is that the clinicians only viewed one physiotherapist performing the SSMP in video format. If the clinician participants had observed other clinicians performing the SSMP procedures, different estimates of reliability might have resulted. The use of videos was necessary owing to the large number of assessors and the need for them to observe the same responses to each technique; however, although not practicable in this study, it would be more clinically realistic to directly observe the clinician and the patient's responses.

The use of a larger sample of patients would have allowed the reliability of the SSMP to be evaluated over a wider range of clinical presentation. There was, however, a relatively large clinician sample, which provided precise estimates of the reliability coefficients. It also allowed the relative influence of short-duration and long-duration training on reliability to be determined, though this was not a randomised comparison and is subject to confounding by other factors. In addition, it is important to emphasise that the findings only relate to the reliability of clinicians' interpretation of the SSMP procedures; the consistency with which such procedures are applied is a separate issue. Finally, being a university laboratory, the environment where the procedures were conducted and filmed was a controlled environment that may not reflect the realities of clinical practice.

Future research

The purpose of this research was to investigate the intertester reliability of the SSMP. Suggestions to assess the influence of symptom modification in a systematic way, and to use the responses to guide treatment, have been made.23 ,24 ,49 There is a pressing need to understand the relevance (if any) of these types of approaches in their ability to support patient management, not only in terms of clinical outcome (type of change, magnitude of change, duration of change), but also in terms of the mechanism(s) by which they may produce a change. There is a need to determine if SSMP procedures, embedded within a framework of care (advice, education, graduated exercise), add any additional value to overall management. If they contribute positively, their continued use should be considered and if not, concepts such as this should be abandoned. There would be benefit in qualitative research to better understand patients' perceptions of SSMP procedures.

Conclusions

Deriving a definitive structural diagnosis for a person presenting with a musculoskeletal condition involving the shoulder is difficult. Limitations have been recognised for imaging as well as for orthopaedic special tests. One solution is partially to base management on the response to tests aimed at reducing the severity of the patient's perception of symptoms. One (of many) such methods is the Shoulder Symptom Modification Procedure. The findings of the present study suggest that the procedure demonstrates a good level of reliability. More research is needed to better understand the relevance and importance of such procedures.