Discussion
This systematic review examined the current literature on clinical tests for the detection of cam or pincer morphology in individuals suspected of having FAI syndrome. Eight out of 4091 studies were included, and these reported on 17 clinical tests and two test combinations. Because of the insufficient number of studies per test, a meta-analysis could not be performed. There are three main findings: (1) there is only low-quality evidence; (2) no single test effectively rules in a cam or mixed morphology; (3) the FADIR, FPAW and the maximal squat test showed the best sensitivities and should be combined to cautiously rule out a cam or mixed morphology, but the validity of this combination should be tested with a multivariable regression model.
Specificity could only be calculated for nine tests. Overall results showed low specificity for all tests, ranging from 0.11 to 0.56. This indicates that these clinical tests might not be appropriate to rule in a cam or pincer morphology. High sensitivity was found for some pain provocation tests (the FADIR, FPAW, maximal squat) and for the FABER distance test. The interpretation of the FABER distance test, however, is questionable because the positivity criterion is a loss of distance between the lateral aspect of the knee and the examination table compared with the unaffected side. This requires the unaffected hip to be free of a cam or pincer morphology, but this can only be determined with imaging studies. Hence, this test is not applicable in physiotherapy practice. The lowest values for sensitivity were from a study for which we only had an abstract.21 No test reached a sensitivity above 0.75 in that study. Detailed information on test execution and criteria for a positive test were unavailable, and therefore findings from that study should be interpreted with caution.
LRs could be calculated for nine tests. The LRs only allow for small changes from pretest to post-test probabilities. However, the combination of three negative test results in the FADIR, FPAW and maximal squat tests yielded an LR− of 0.15. Unexpected results were obtained for the Stinchfield and FABER tests, where the LR+ was below 1 and the LR− above 1. These two tests were investigated in the same study,23 with a prevalence of FAI morphology of 10%. All subjects were suspected to have intra-articular pathology, which might explain the high false positive rate.
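The relationship between sensitivity, specificity, LRs and post-test probability can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code of this review: the function names are hypothetical, the pretest probabilities are assumed values chosen for illustration, and only the LR− of 0.15 is taken from the text above. The sketch also shows that a test whose sensitivity and specificity sum to less than 1 necessarily has an LR+ below 1 and an LR− above 1, which is the pattern described for the Stinchfield and FABER tests.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test.

    Assumes 0 < specificity < 1 so that neither denominator is zero.
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# LR- of 0.15 is the value reported for three negative results in the
# FADIR, FPAW and maximal squat tests; the pretest probabilities below
# are illustrative assumptions, not data from the included studies.
COMBINED_LR_NEG = 0.15
for pretest in (0.25, 0.50, 0.75):
    post = post_test_probability(pretest, COMBINED_LR_NEG)
    print(f"pretest {pretest:.2f} -> post-test {post:.2f}")
```

An LR close to 1 leaves the post-test probability almost unchanged, which is why most single tests in this review allow only small changes from pretest to post-test probability.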
A higher suspicion of cam and pincer morphology may result in a higher sensitivity of clinical tests, because evaluators might rate the test as positive in cases where the result is less clear; this would in turn decrease the number of false negatives. Three studies21 22 24 included only patients with confirmed FAI deformities. In two studies,22 24 the raters were aware of this fact, while in one study the manuscript was unclear regarding the blinding status.
The reference test in most of the included studies was radiography, though one study23 used three different imaging techniques (MRI, MRA, X-ray) and one20 used MRI/MRA. Further, the included studies showed varying criteria for positivity of the reference test. They defined different alpha angle values for the diagnosis of cam morphology, as well as varying positivity criteria for pincer morphology, making them difficult to compare. Of the three tests proposed for our test combination, the maximal squat test was compared with MRI/MRA (head-neck offset <9 mm or alpha angle >55°), while the FPAW and the FADIR were compared with radiography (alpha angle >60°).
It is known that cam or pincer morphology can lead to labral and cartilage damage. Both types of damage are considered to be risk factors for early degenerative processes and osteoarthritis of the hip joint, due to reduced hip joint motions, elevated contact pressures and shear stress caused by cam and pincer deformities.3 28–31 It is important to recognise that lesions of the labrum can occur as a consequence of impingement but are not present in all cases.32 Thus, studies including participants who have only labral lesions are not appropriate for assessing the accuracy of tests for cam or pincer morphology. There is an association between cam morphology and the development of osteoarthritis, whereas pincer morphology (in contrast to acetabular dysplasia) does not seem to be a risk factor for osteoarthritis.31 33
It is not possible to make a general statement on whether sensitivity or specificity is more important; this depends on the context in which a test is applied. In the context of professional athletes, the sensitivity of a test should be high, so as not to miss potential cam or pincer morphologies. A diagnosis of such morphologies will have consequences for the athlete’s training or competing behaviour and might even have an impact on the pursuit of their career. In contrast, in a general population screening process the specificity should be high, so as not to have too many false positives, because the impact of missing one case in that population is less serious.
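The dependence on context can be illustrated with predictive values, which change with prevalence even when sensitivity and specificity stay fixed. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the sensitivity (0.90), specificity (0.30) and prevalences (0.10 and 0.60) are assumed values for illustration only, not results from the included studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    Works with expected proportions of the tested population
    rather than raw counts.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics and prevalences (assumptions):
# a low-prevalence screening setting versus a high-prevalence setting.
for prevalence in (0.10, 0.60):
    ppv, npv = predictive_values(0.90, 0.30, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With a test of low specificity, most positive results in a low-prevalence population are false positives, which is the concern raised above for general population screening.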
Strengths of this systematic review are that, wherever possible, we clearly stated the types of clinical tests investigated (ROM, pain provocation or imaging) and precisely described the positivity criteria. We considered only studies that included symptomatic participants. This was done to meet the official definition of FAI syndrome, where symptoms are mandatory and asymptomatic individuals are not diagnosed with this condition.12 The guidelines of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy and the PRISMA-DTA checklist were followed to ensure sound scientific practice. Data extraction, estimation of risk of bias and grading of evidence were performed by two independent reviewers. The findings of this review were presented visually in forest plots to provide a simple, quick and informative overview of the accuracy of the clinical tests. Furthermore, a test combination was designed to help practitioners apply the findings to clinical practice. Figure 4 permits the quick identification of post-test probabilities for different prevalences and tests. In comparison to a review published by Reiman et al,10 this report has several advantages. First, data from two additional studies were analysed. Second, only studies with symptomatic participants were included. Third, FAI deformities were clearly differentiated from labral tears alone.
A limitation of this review is that the proposed chaining of clinical tests might result in an overestimation of the post-test probability if the combined tests are not fully independent.34 The value of this test combination should be evaluated in a new study with a multivariable logistic regression model. Additionally, three of the eight studies21 22 24 included only cases; hence, there was no clinical uncertainty, which introduces a high risk of bias.
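The independence assumption behind chaining can be made explicit in a short sketch. When tests are chained, the individual LRs are multiplied, which is only valid if the test results are conditionally independent given the true morphology. The function name, the three LR− values (0.4, 0.5, 0.6) and the pretest probability (0.50) below are hypothetical and are not the values of the tests in this review.

```python
from math import prod


def chained_post_test_probability(pretest_probability, likelihood_ratios):
    """Post-test probability after several test results.

    ASSUMPTION: the tests are conditionally independent, so their
    likelihood ratios can simply be multiplied. If the tests are
    correlated, this product overstates the shift in probability.
    """
    combined_lr = prod(likelihood_ratios)
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * combined_lr
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical LR- values for three negative tests (illustration only).
hypothetical_lr_negs = [0.4, 0.5, 0.6]
print(chained_post_test_probability(0.50, hypothetical_lr_negs))
```

If the three tests largely provoke the same pain mechanism, the second and third negative results add less information than the product suggests, which is why the combination needs to be validated in a multivariable model.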
There were several limitations of the included studies. Most of the studies had a high risk of bias and rather low statistical precision. Different diagnostic criteria were used for the radiographic definition of cam or pincer morphology, as mentioned above, and in some cases, there was no clear statement of the diagnostic criteria. A further limitation is that the diagnostic test accuracy was not reported separately for cam, pincer or mixed morphology. In our proposed test combination, we included one test (maximal squat) from a study that diagnosed cam morphology, and two tests (the FADIR and FPAW) from a study that diagnosed FAI, defined as cam, pincer or mixed morphology. The inclusion of patients with only pincer morphology would probably lower the diagnostic test accuracy of the tests (see ref 35). Therefore, our suggestion is valid for the detection of cam or mixed morphologies. We cannot make a recommendation for the detection of pure pincer morphology.
There is a need for studies with larger numbers of participants, clear definitions of the diagnostic criteria of the reference tests and clear distinctions between patient subgroups (ie, those with cam morphology only, pincer morphology only or mixed-type morphologies) and between those with or without labral tears. Symptomatic patients with acetabular labral tears alone should not be considered as having FAI syndrome. Future studies should always include cases and non-cases so that sensitivity and specificity can be calculated, and the risk of bias should be reduced, especially by blinding the assessors concerning the patient’s morphology.
There is only low-quality evidence that negative test results reduce the post-test probability of cam or mixed morphologies by a moderate amount and that consecutive testing with the FADIR, FPAW and maximal squat tests might be used as a clinical test combination. Due to the low specificity of clinical tests, we would not recommend their use to confirm the diagnosis of FAI syndrome. So far, however, the interpretability of these test results remains uncertain because of the low-quality evidence and high risk of bias. Therefore, further adequately designed studies in larger populations and with different patient settings are required.