Original Article
The minimal detectable change cannot reliably replace the minimal important difference

https://doi.org/10.1016/j.jclinepi.2009.01.024

Abstract

Objective

We compared the minimal important difference (MID) with the minimal detectable change (MDC) generated by distribution-based methods.

Study Design

Studies of two quality-of-life instruments (Chronic Respiratory Questionnaire [CRQ] and Rhinoconjunctivitis Quality of Life Questionnaire [RQLQ]) and two physician-rated disease-activity indices (Pediatric Ulcerative Colitis Activity Index [PUCAI] and Pediatric Crohn's Disease Activity Index [PCDAI]) provided longitudinal data. The MID values were calculated from global ratings of change (small change for CRQ and RQLQ; moderate change for PUCAI and PCDAI) using both the receiver-operating characteristic (ROC) curve and the mean change. Results were compared with five distribution-based strategies.

Results

Of the methods used to calculate the MDC, the 95% limits of agreement and the reliable change index yielded the largest estimates. In the patient-rated psychometric instruments, 0.5 SD was always greater than 1 standard error of measurement (SEM), and both fell between the mean change and the ROC estimates on two of four occasions. The reliable change index came closest to the MID of moderate change.

Conclusion

For patient-rated psychometric instruments, 0.5 SD and 1 SEM provide values closest to the anchor-based estimates of the MID derived from small change; for physician-rated clinimetric indices, the reliable change index comes closest to the MID based on moderate change. Lack of consistency across measures suggests that distribution-based approaches should act only as temporary substitutes, pending availability of empirically established anchor-based MID values.

Introduction

The minimal important difference (MID) is the smallest change in a health-related outcome measure which, in association with minimal toxicity and cost, is large enough to trigger a change in management [1], [2]. The MID may be determined by the patients in quality-of-life measures, or by clinicians in disease-activity indices.

Investigators have used distribution-based and anchor-based methods to ascertain the MID [3]. Distribution-based approaches rely on statistical properties of the instrument without reference to an external standard. Anchor-based approaches use an external criterion (i.e., an anchor) to interpret whether a particular magnitude of change is important.

Crosby et al. summarized the statistical methods to establish the distribution- and anchor-based MID estimates, using a variety of cross-sectional and longitudinal anchors [4]. Distribution-based approaches include effect size statistics (including Cohen's effect size [5] and standardized response mean [6]), variations of the standard error of measurement (Wyrwich SEM [7] and Jacobson's reliable change index [8]), and Bland and Altman's 95% limits of agreement [9] (Table 1). Other proposed modifications of the reliable change index are not in common use [4].

The major role of the distribution-based strategies is in identifying the “minimal detectable change” (MDC, sometimes also termed the “smallest detectable difference”), which is the smallest change in score that can be detected beyond random error [22], [23], [24]. The MDC depends on the number of observers, the scoring methods, and the distribution of results in the population under study. Authors have suggested varying thresholds to characterize the MDC from the distribution-based methods (Table 1).
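As an illustration of how such thresholds are typically computed, the following minimal Python sketch derives four of the distribution-based estimates named above from paired scores of stable patients. Table 1 is not reproduced in this section, so the formulas are the commonly cited textbook forms and should be read as assumptions rather than as the exact definitions used in the study: SEM = SD × √(1 − reliability), reliable change threshold = 1.96 × √2 × SEM, and 95% limits of agreement = mean difference ± 1.96 × SD of the differences. The function name, the argument names, and the example scores are hypothetical.

```python
import math
import statistics


def mdc_estimates(time1, time2, reliability):
    """Distribution-based thresholds from paired scores of stable patients.

    time1, time2: scores of the same (assumed stable) patients on two occasions.
    reliability:  reliability coefficient between 0 and 1 (e.g., a test-retest
                  ICC); supplied by the caller, not estimated here.
    The formulas are common textbook forms and are assumptions of this sketch.
    """
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need at least two paired observations")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")

    sd_baseline = statistics.stdev(time1)             # between-patient SD at time 1
    sem = sd_baseline * math.sqrt(1.0 - reliability)  # standard error of measurement

    diffs = [b - a for a, b in zip(time1, time2)]     # within-patient differences
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)

    return {
        "half_sd": 0.5 * sd_baseline,
        "one_sem": sem,
        # Change needed to exceed 1.96 standard errors of a difference score.
        "rci_95": 1.96 * math.sqrt(2.0) * sem,
        # Mean difference +/- 1.96 SD of the differences.
        "loa_95": (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff),
    }


# Hypothetical scores, for illustration only.
estimates = mdc_estimates([1, 2, 3, 4, 5], [1, 3, 3, 5, 5], reliability=0.75)
```

Under these assumed formulas, 1 SEM equals 0.5 SD when the reliability is exactly 0.75 and is smaller than 0.5 SD when the reliability is higher, while the reliable change threshold is always about 2.77 times the SEM.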

For the anchor-based approach, the most widely used external criterion is a global rating of change [25]. The original statistical strategy to ascertain the MID from a global rating of change was to calculate the mean change of patients who rated themselves as having a small change [1]. More recently, receiver-operating characteristic (ROC) curve analysis has been used instead of the mean change to determine the change score from the global rating of change [10], [26], [27], [28], [29], [30], [31], [32], [33].
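The two anchor-based strategies can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: only improvement is considered, the ROC cutoff is chosen by maximizing Youden's J (sensitivity + specificity − 1), which is one common criterion among several, and all function names, parameter names, and data are hypothetical.

```python
import statistics


def mean_change_mid(changes, ratings, small=(2, 3)):
    """Mean score change among patients whose global rating falls in the
    'small improvement' band (inclusive bounds; the band is an assumption)."""
    selected = [c for c, g in zip(changes, ratings) if small[0] <= g <= small[1]]
    if not selected:
        raise ValueError("no patients in the small-change band")
    return statistics.mean(selected)


def roc_mid(changes, ratings, important_from=2):
    """Change-score cutoff that best separates patients rating themselves as
    improved (rating >= important_from) from all others, using Youden's J."""
    improved = [c for c, g in zip(changes, ratings) if g >= important_from]
    others = [c for c, g in zip(changes, ratings) if g < important_from]
    if not improved or not others:
        raise ValueError("both groups must contain at least one patient")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(changes)):
        # A patient is classified as improved when the change reaches the cutoff.
        sensitivity = sum(c >= cutoff for c in improved) / len(improved)
        specificity = sum(c < cutoff for c in others) / len(others)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical change scores and global ratings, for illustration only.
changes = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9, 1.2]
ratings = [0, 1, 0, 2, 3, 2, 5]

mid_mean = mean_change_mid(changes, ratings)
mid_roc, youden_j = roc_mid(changes, ratings)
```

The mean-change estimate uses only the patients in the small-change band, whereas the ROC estimate in this sketch uses every patient, which is one reason the two approaches need not agree.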

The MID values may be derived from small change (e.g., 2–3 on a −7 to +7 point global rating of change [2], [25], [34], [35]) as originally proposed. However, investigators have also used a cutoff of moderate change to define the MID (e.g., 4–5 on a −7 to +7 point global rating of change [30], [31], [36], [37] or 2 on a −3 to +3 point global rating of change [18], [29], [38], [39]). The optimal threshold is yet to be established. For this presentation, we have assumed that clinimetric indices constructed of only a few items, each representing a different aspect of the phenomenon, require a larger MID than psychometric instruments that are composed of a large number of closely related items. Indeed, this is how the original MID values were determined for the four instruments included in this study. A change in one or two items of a psychometric scale would be diluted by the many other equally important items, resulting in a smaller overall change than in a clinimetric index, in which each item (especially the highly weighted ones) contributes substantially to the overall score. Therefore, a small MID may prove important in the psychometric scales but not necessarily in the clinimetric indices.

The major disadvantage of the distribution-based approaches in determining the MID is that they provide no guidance regarding the importance of the change [18]. Although the anchor-based approach is the preferred strategy to ascertain the MID, researchers may wish to use the MDC for estimating the MID in studies that did not use anchors. Here, we examine the relation between commonly used anchor- and distribution-based methods using four different health-related outcome measures (two disease-activity clinimetric indices and two quality-of-life psychometric scales), to establish whether the MDC can reliably estimate the MID in the absence of an external anchor of change.

Section snippets

Population

Four prospectively collected longitudinal data sets of the following well-established instruments were used: the Pediatric Ulcerative Colitis Activity Index (PUCAI) [38], the Pediatric Crohn's Disease Activity Index (PCDAI) [29], [40], the Rhinoconjunctivitis Quality of Life Questionnaire (RQLQ) [41], and the Chronic Respiratory Questionnaire (CRQ) [2], [42].

On the 7-point global rating of change (−3 to +3) used in the clinimetric indices (PUCAI and PCDAI), “0” was referred to as “no change,”

Results

The correlations between change in the relevant instruments/domains and the global ratings were 0.50 for the RQLQ, 0.61 for the dyspnea domain of the CRQ, 0.50 for the fatigue domain of the CRQ, 0.53 for the emotions domain of the CRQ, 0.72 for the PCDAI, and 0.85 for the PUCAI. All included indices and domains also satisfied criteria for the correlation of the baseline and the follow-up scores with the global rating of change.
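A correlation of this kind, between each patient's change score and the global rating of change, could be computed as in the minimal sketch below. The excerpt does not state which coefficient was used, so the Pearson coefficient here is an assumption (a rank-based coefficient would be an equally plausible choice for ordinal ratings), and the function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between paired observations (e.g., change scores
    and global ratings of change)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical change scores and global ratings, for illustration only.
r = pearson_r([0.1, 0.2, 0.3, 0.6, 0.7, 0.9, 1.2], [0, 1, 0, 2, 3, 2, 5])
```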

Within the anchor-based methods, the ROC approach yielded values

Discussion

This systematic comparison of distribution- and anchor-based strategies has demonstrated that although there is some relation between the two approaches, the relation is inconsistent.

In interpreting the results, it is important to recognize two major differences between the psychometric instruments (CRQ and RQLQ) and the clinimetric indices (PUCAI and PCDAI). First, clinimetric and psychometric instruments are constructed in a very different way [47]. Classic psychometric measures are composed

Conclusion

The MID and the MDC measure different concepts: the former reflects change that is perceived as important, and the latter reflects the statistical margin of measurement error. No distribution-based approach bears a consistent relationship with the anchor-based MID, and the anchor-based approach should be regarded as the ultimate strategy to ascertain the MID. One could argue that, as a result, one should eschew the use of any distribution-based estimate of the MID. This, however, limits attempts to enhance the interpretability of the

References (58)

  • J.A. Cleland et al.

    Psychometric properties of the Neck Disability Index and Numeric Pain Rating Scale in patients with mechanical neck pain

    Arch Phys Med Rehabil

    (2008)
  • D. Turner et al.

    Development and evaluation of a Pediatric Ulcerative Colitis Activity Index (PUCAI): a prospective multicenter study

    Gastroenterology

    (2007)
  • A.J. Beurskens et al.

    Responsiveness of functional status in low back pain: a comparison of different instruments

    Pain

    (1996)
  • E.F. Juniper et al.

    Interpretation of rhinoconjunctivitis quality of life questionnaire data

    J Allergy Clin Immunol

    (1996)
  • H.J. Schunemann et al.

    A comparison of the original chronic respiratory questionnaire with a standardized version

    Chest

    (2003)
  • D. Turner et al.

    Using the entire cohort in the receiver operating characteristic analysis maximizes precision of the minimal important difference

    J Clin Epidemiol

    (2009)
  • J.G. Wright et al.

    A comparative contrast of clinimetric and psychometric methods for constructing indexes and rating scales

    J Clin Epidemiol

    (1992)
  • D.L. Streiner

    Clinimetrics vs. psychometrics: an unnecessary distinction

    J Clin Epidemiol

    (2003)
  • G.H. Guyatt et al.

    Methods to explain the clinical significance of health status measures

    Mayo Clin Proc

    (2002)
  • D. Cella et al.

    What is a clinically meaningful change on the Functional Assessment of Cancer Therapy-Lung (FACT-L) Questionnaire? Results from Eastern Cooperative Oncology Group (ECOG) Study 5592

    J Clin Epidemiol

    (2002)
  • H.J. Schunemann et al.

    Measurement properties and interpretability of the Chronic Respiratory Disease Questionnaire (CRQ)

    COPD

    (2005)
  • G.R. Norman et al.

    Relation of distribution- and anchor-based approaches in interpretation of changes in health-related quality of life

    Med Care

    (2001)
  • L.E. Kazis et al.

    Effect sizes for interpreting changes in health status

    Med Care

    (1989)
  • M.H. Liang et al.

    Comparisons of five health status instruments for orthopedic evaluation

    Med Care

    (1990)
  • N.S. Jacobson et al.

    Clinical significance: a statistical approach to defining meaningful change in psychotherapy research

    J Consult Clin Psychol

    (1991)
  • J.M. Bland et al.

    Statistical methods for assessing agreement between two methods of clinical measurement

    Lancet

    (1986)
  • J. Cohen

    Statistical power analysis for the behavioral sciences

    (1977)
  • D.A. Revicki et al.

    Responsiveness and minimal important differences for patient reported outcomes

    Health Qual Life Outcomes

    (2006)
  • D. Osoba et al.

    Interpreting the significance of changes in health-related quality-of-life scores

    J Clin Oncol

    (1998)