
Diagnostic test evaluation methodology: A systematic review of methods employed to evaluate diagnostic tests in the absence of gold standard – An update

  • Chinyereugo M. Umemneku Chikere ,

    Contributed equally to this work with: Chinyereugo M. Umemneku Chikere, Kevin Wilson, Luke Vale, A. Joy Allen

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing

    C.Umemneku2@newcastle.ac.uk

    Affiliation Institute of Health & Society, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, England, United Kingdom

  • Kevin Wilson ,

    Contributed equally to this work with: Chinyereugo M. Umemneku Chikere, Kevin Wilson, Luke Vale, A. Joy Allen

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, England, United Kingdom

  • Sara Graziadio,

    Roles Investigation, Supervision, Writing – review & editing

    Affiliation National Institute for Health Research, Newcastle In Vitro Diagnostics Co-operative, Newcastle upon Tyne Hospitals National Health Service Foundation Trust, Newcastle upon Tyne, England, United Kingdom

  • Luke Vale ,

    Contributed equally to this work with: Chinyereugo M. Umemneku Chikere, Kevin Wilson, Luke Vale, A. Joy Allen

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation Institute of Health & Society, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, England, United Kingdom

  • A. Joy Allen

    Contributed equally to this work with: Chinyereugo M. Umemneku Chikere, Kevin Wilson, Luke Vale, A. Joy Allen

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation National Institute for Health Research, Newcastle In Vitro Diagnostics Co-operative, Newcastle University, Newcastle upon Tyne, England, United Kingdom

Abstract

Objective

To systematically review methods developed and employed to evaluate the diagnostic accuracy of medical tests when there is a missing gold standard or no gold standard.

Study design and settings

Articles that proposed or applied any method to evaluate the diagnostic accuracy of medical test(s) in the absence of a gold standard were reviewed. The protocol for this review was registered in PROSPERO (CRD42018089349).

Results

Identified methods were classified into four main groups: methods employed when there is a missing gold standard; correction methods (which adjust for an imperfect reference standard with known diagnostic accuracy measures); methods employed to evaluate a medical test using multiple imperfect reference standards; and other methods, such as agreement studies and a mixed group of alternative study designs. Fifty-one statistical methods were identified that were developed to evaluate medical test(s) when the true disease status of some participants is unverified by the gold standard. Seven correction methods were identified, and four methods were identified to evaluate medical test(s) using multiple imperfect reference standards. Flow-diagrams were developed to guide the selection of appropriate methods.

Conclusion

Various methods have been proposed to evaluate medical test(s) in the absence of a gold standard for some or all participants in a diagnostic accuracy study. These methods depend on the availability of the gold standard, its application to the participants in the study, and the availability of alternative reference standard(s). The clinical application of some of these methods, especially those developed for a missing gold standard, is however limited. This may be due to the complexity of these methods and/or a disconnection between the fields of expertise of those who develop the methods (e.g. mathematicians) and those who employ them (e.g. clinical researchers). This review aims to help close this gap with our classification and guidance tools.

Introduction

Before a new medical test can be introduced into clinical practice, it should be evaluated for analytical validity (does the test work in the laboratory?), clinical validity (does the test work in the patient population of interest?) and clinical utility (is the test useful, i.e. can it lead to improvement in health outcomes?) [1, 2]. Clinical validity studies, also called diagnostic accuracy studies, evaluate the test's accuracy in discriminating between patients with or without the target condition (disease) [3]. The characteristics of the test (e.g. sensitivity and specificity) may inform what role the index test (the new test under evaluation) plays in the diagnostic pathway: is it a triage, add-on or replacement test? [4] Sensitivity (the proportion of participants correctly identified by the index test as having the target condition, e.g. those with the disease) and specificity (the proportion of participants correctly identified by the index test as not having the target condition) [5–7] are basic measures of the diagnostic accuracy of a test. Other common measures are predictive values, likelihood ratios, overall accuracy [8, 9], the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUROC) [10], the ROC surface, and the volume under the ROC surface (VUS) [11–13]. These measures are obtained by comparing the index test results with the results of the best currently available test for diagnosing the same target condition in the same participants; both tests should be applied to all participants in the study [14]. The test employed as the benchmark to evaluate the index test is called the reference standard [15]. The reference standard could be a gold standard (GS), with sensitivity and specificity equal to 100%. This means that the gold standard perfectly discriminates between participants with or without the target condition and provides unbiased estimates of the diagnostic accuracy measures of the index test, as described in Fig 1. The term “bias” in this review is defined as the difference between the estimated value and the true value of the parameter of interest [16].

Fig 1. Classical method of evaluating the diagnostic accuracy of a medical test with binary test result and dichotomized disease status.

https://doi.org/10.1371/journal.pone.0223832.g001
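
To make the classical method in Fig 1 concrete, the basic accuracy measures can be computed directly from the 2x2 cross-classification of index test results against the gold standard. A minimal sketch in R, with hypothetical counts chosen purely for illustration:

```r
# Classical 2x2 evaluation against a gold standard (hypothetical counts)
tp <- 90; fn <- 10   # gold standard positive: index test + / index test -
fp <- 15; tn <- 85   # gold standard negative: index test + / index test -

sensitivity <- tp / (tp + fn)               # P(index + | disease present)
specificity <- tn / (tn + fp)               # P(index - | disease absent)
ppv <- tp / (tp + fp)                       # positive predictive value
npv <- tn / (tn + fn)                       # negative predictive value
accuracy <- (tp + tn) / (tp + fn + fp + tn) # overall accuracy

round(c(sensitivity = sensitivity, specificity = specificity,
        ppv = ppv, npv = npv, accuracy = accuracy), 3)
```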

It is also expected that, when evaluating the diagnostic accuracy of a medical test, the participants undertake both the index and reference tests within a short time period, if not simultaneously. This avoids biases caused by changes in their true disease status, which can also affect the estimated diagnostic accuracy of the index test.

In addition to the aforementioned common diagnostic accuracy measures, there are other ways to evaluate the performance of an index test. These include studies of agreement or concordance [17] between the index test and the reference standard, and the test positivity (or negativity) rate, that is, the proportion of test results that are positive (or negative) for the target condition [18].
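
As a brief illustration of these exploratory summaries, the sketch below computes Cohen's kappa (one common agreement statistic for binary results) and the test positivity rate from a hypothetical paired 2x2 table; the counts are invented for the example:

```r
# Agreement and positivity rate between an index test and a comparator
n_pp <- 40; n_pn <- 10   # index +: comparator + / comparator -
n_np <- 5;  n_nn <- 45   # index -: comparator + / comparator -
n <- n_pp + n_pn + n_np + n_nn

p_obs <- (n_pp + n_nn) / n                      # observed agreement
p_exp <- ((n_pp + n_pn) * (n_pp + n_np) +       # agreement expected
          (n_np + n_nn) * (n_pn + n_nn)) / n^2  # by chance alone
kappa <- (p_obs - p_exp) / (1 - p_exp)          # Cohen's kappa

positivity_rate <- (n_pp + n_pn) / n            # index test positivity rate
c(kappa = kappa, positivity_rate = positivity_rate)
```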

In practice, there are deviations from the classical method (Fig 1). These deviations are:

  1. Scenarios where the gold standard is not applied to all participants in the study (i.e. there is a missing gold standard) because it is expensive or invasive, because patients do not consent to it, or because clinicians decide not to apply the gold standard to some patients for medical reasons [19, 20]. Evaluating the new test using data only from participants whose disease status was confirmed with the gold standard can produce work-up or verification bias [21].
  2. Scenarios where the reference standard is not a gold standard (i.e. it is an imperfect reference standard) because it has a misclassification error or because there is no generally accepted reference standard for the target condition. Using an imperfect reference standard produces reference standard bias [22, 23].

Several methods have been developed and used to evaluate the test performance of a medical test in these two scenarios.

Reviews of some of these methods have been undertaken previously. The reviews by Zhou [24], Alonzo [25] and the report by Naaktgeboren et al [26] focused on methods for when the gold standard or reference standard is not applied to all participants in the study; Van Smeden et al [27] and Collins and Huynh [28] focused on latent class models (LCMs); and Hui and Zhou [29], Trikalinos and Balion [30] and Enøe et al [31] focused on methods employed when the reference standard is imperfect. Zaki et al [32] focused on the agreement between medical tests whose results are reported as a continuous response. Branscum et al [33] focused on Bayesian approaches; and the reviews by Walsh [23], Rutjes et al [14] and Reitsma et al [34] focused on methods for evaluating diagnostic tests when there is a missing or imperfect reference standard.

The existing comprehensive reviews on this topic were published about 11 years ago [14, 34]; knowledge, ideas, and research in this field have evolved significantly since then. Several new methods have been proposed and some existing methods have been modified. It is also possible that some previously identified methods may now be obsolete. Therefore, one of the aims of this systematic review is to review new and existing methods employed to evaluate the performance of medical test(s) in the absence of a gold standard for all or some of the participants in the study. It also aims to provide easy-to-use tools (flow-diagrams) for selecting methods to consider when evaluating medical tests where a sub-sample of the participants does not undergo the gold standard. The review builds upon the earlier reviews by Rutjes et al and Reitsma et al [14, 34]. This review also sought to identify methods developed to evaluate a medical test with continuous results in the presence of verification bias and when the diagnostic outcome (disease status) is classified into three or more groups (e.g. diseased, intermediate and non-diseased); this is a gap identified in the review conducted by Alonzo [25] in 2014.

The subsequent sections describe the methods employed to undertake the review, present the results, discuss the findings, and provide guidance for researchers involved in test accuracy studies.

Methodology

A protocol for this systematic review was developed, peer-reviewed and registered on PROSPERO (CRD42018089349).

Eligibility criteria

The review includes methodological articles (that is, papers that proposed or developed a method) and application articles (that is, papers where any of the proposed methods were applied).

Inclusion.

  • Articles published in the English language in a peer-reviewed journal.
  • Articles that focus on evaluating the diagnostic accuracy of a new (index) test when there is a missing gold standard, no gold standard, or an imperfect reference standard.

Exclusion.

  • Articles that assumed that the reference standard was a gold standard and the gold standard was applied to all participants in the study.
  • Books, dissertations, theses, conference abstracts, and articles not published in a peer-reviewed journal.
  • Systematic reviews and meta-analyses of the diagnostic accuracy of medical test(s) for a target condition (disease) in the absence of gold standard for some or all of the participants. However, individual articles included in these reviews that met the inclusion criteria were included.

Search strategies and selection of articles

The PRISMA statement [35] was used as a guideline when conducting this systematic review. The PRISMA checklist for this review, S1 Checklist, is included as one of the supplementary materials. The following bibliographic databases were searched: EMBASE, MEDLINE, SCOPUS, WILEY online library (which includes the Cochrane library and EBM), PSYCINFO, Web of Science, and CINAHL. The details of the search strategies are reported in S1 Appendix. The search dates were from January 2005 to February 2019, because this review is an update of the reviews by Rutjes et al and Reitsma et al [14, 34], whose searches extended to 2005. However, original methodological articles published before 2005 that proposed and described a method to evaluate medical test(s) when there is a missing or no gold standard were also included in the review. These original articles were identified by "snowballing" [36] from the references of some articles. All articles obtained from the electronic databases were imported into Endnote X8.0.2. The selection of articles to be included in this review was done by three reviewers (CU, AJA, and KW). The sifting process was in two stages: by title and abstract, and then by full text against the inclusion and exclusion criteria. Any discrepancies between reviewers were resolved in a group meeting.

Data synthesis

A data collection form was developed for this review (S1 Data), which was piloted on seven studies and modified to fit the purpose of this review. Information extracted from the included articles was synthesized narratively.

Results

A total of 6127 articles were identified; 5472 articles were left after removing duplicates; 5071 articles were excluded after sifting by title and abstract; 401 articles went forward to full-text assessment; and a total of 209 articles were included in the review. The search and selection procedure is depicted in the PRISMA [35] flow-diagram (Fig 2).

Fig 2. PRISMA flow-diagram of articles selected and included in the systematic review.

https://doi.org/10.1371/journal.pone.0223832.g002

The articles included in this review used a wide variety of study designs, including cross-sectional, retrospective, cohort, prospective and simulation studies.

The identified methods were categorized into four groups based on the availability and/or application of the gold standard to the participants in the study. These groups are:

  • Group 1: Methods employed when there is a missing gold standard.
  • Group 2: Correction methods which adjust for using an imperfect reference standard whose diagnostic accuracy is known.
  • Group 3: Methods employed when using multiple imperfect reference standards.
  • Group 4: “Other methods”. This group includes methods such as studies of agreement, test positivity rate, and alternative study designs such as validation.

Methods in groups 2, 3 and 4 are employed when there is no gold standard with which to evaluate the diagnostic accuracy of the index test, while methods in group 1 are employed when there is a gold standard but it is applied to only a sub-sample of the participants.

A summary of all methods identified in the review, their key references and the clinical applications of these methods is reported in Table 1.

Table 1. Summary of the classification of methods employed when there is a missing or no gold standard.

https://doi.org/10.1371/journal.pone.0223832.t001

Methods employed when the gold standard is missing

Fifty-one statistical methods were identified from the review that were developed to evaluate the diagnostic accuracy of index test(s) when the true disease status of some participants is not verified with the gold standard. These methods are divided into two subgroups:

  • Imputation and bias-correction methods: These methods correct for verification bias while the disease status of the unverified participants is left unverified. Forty-eight statistical methods were identified in this subgroup. These methods are further classified based on the result of the index test (binary, ordinal or continuous), the number of index tests evaluated (single or multiple), the assumption made about the verification mechanism (ignorable/missing at random, MAR, or non-ignorable/missing not at random, MNAR), and the classification of the diagnostic outcomes (disease status). The identified methods in this subgroup are displayed in Figs 3 and 4; a minimal sketch of one early correction of this type is given after this list.
Fig 3. Imputation and bias-correction methods in binary diagnostic outcomes.

https://doi.org/10.1371/journal.pone.0223832.g003

Fig 4. Imputation and bias-correction methods in three-class diagnostic outcomes where the ROC surface and VUS are estimated.

https://doi.org/10.1371/journal.pone.0223832.g004

  • Differential verification approach: Participants whose disease status was not verified with the gold standard could undergo another reference standard (one that is imperfect or less invasive than the gold standard [84]) to ascertain their disease status. This is known as differential verification [200]. Differential verification has been explored by Alonzo et al, De Groot et al and Naaktgeboren et al [200–202], who discussed the bias associated with differential verification and how results using this approach could be presented. There are three identified statistical methods in this group: a Bayesian latent class approach proposed by De Groot et al [82], a Bayesian method proposed by Lu et al [203] and a ROC approach proposed by Glueck et al [16]. These three methods aim to simultaneously adjust for differential verification bias and the reference standard bias that arises from using an alternative (i.e. imperfect) reference standard for participants whose true disease status was not verified with the gold standard.
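
As an illustration of the imputation and bias-correction subgroup referred to above, the sketch below implements the early correction of Begg and Greenes [21], which assumes verification is ignorable (MAR) given the index test result: P(disease | test result) is estimated from the verified subset and then re-weighted by the test-result distribution in the full sample. All counts are hypothetical:

```r
# Begg-Greenes correction for (ignorable) verification bias
n_pos <- 500; n_neg <- 1500   # all tested subjects: index + / index -
v_pos <- 300; d_pos <- 240    # verified among index +, of whom diseased
v_neg <- 150; d_neg <- 15     # verified among index -, of whom diseased

p_d_pos <- d_pos / v_pos      # P(D+ | T+), from the verified subset
p_d_neg <- d_neg / v_neg      # P(D+ | T-), from the verified subset

sens_corr <- (n_pos * p_d_pos) /
  (n_pos * p_d_pos + n_neg * p_d_neg)
spec_corr <- (n_neg * (1 - p_d_neg)) /
  (n_pos * (1 - p_d_pos) + n_neg * (1 - p_d_neg))
c(sensitivity = sens_corr, specificity = spec_corr)
```

Restricting the analysis to the verified subjects alone would overestimate sensitivity here, since test-positive subjects are verified more often.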

Correction methods

This group includes algebraic methods developed to correct the estimated sensitivity and specificity of the index test when the sensitivity and specificity of the imperfect reference standard are known. There are seven statistical methods in this group, described in five different articles [91–95]. The method by Emerson et al [95], unlike the other correction methods [91–94], does not estimate a single value for sensitivity or specificity but produces upper and lower bounds for the sensitivity and specificity of the index test. These bounds are used to express the uncertainty around the estimated sensitivity and specificity of the index test.
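
A minimal sketch of this type of algebraic correction, assuming conditional independence between the index test and the imperfect reference standard given the true disease status (in the spirit of Gart and Buck [92] and Staquet et al [93]); the function and counts are illustrative only, and the corrected estimates can fall outside [0, 1] if the assumed reference accuracy is wrong:

```r
# Correct index-test accuracy for a known imperfect reference standard.
# a, b, c, d: counts with (index, reference) = (+,+), (+,-), (-,+), (-,-)
# se_r, sp_r: known sensitivity and specificity of the reference standard
correct_accuracy <- function(a, b, c, d, se_r, sp_r) {
  sens <- (a * sp_r - b * (1 - sp_r)) /
          ((a + c) * sp_r - (b + d) * (1 - sp_r))
  spec <- (d * se_r - c * (1 - se_r)) /
          ((b + d) * se_r - (a + c) * (1 - se_r))
  c(sensitivity = sens, specificity = spec)  # may leave [0, 1] if inputs are wrong
}

correct_accuracy(a = 80, b = 30, c = 20, d = 170, se_r = 0.90, sp_r = 0.95)
```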

Methods with multiple imperfect reference standards

Often, neither a gold standard nor accurate information about the diagnostic accuracy of the imperfect reference standard is available to evaluate the index test. In these situations, multiple imperfect reference standards can be employed to evaluate the index test. Methods in this group include:

  • Discrepancy analysis: this compares the index test with an imperfect reference standard. Participants with discordant results undergo another imperfect test, called the resolver test, to ascertain their disease status. Discrepancy analysis is typically not recommended because it produces biased estimates [100, 204]. Modifications of this approach have been proposed [18, 101, 136], in which some of the participants with concordant responses (true positives and true negatives) are sampled to undertake the resolver test alongside participants with discordant responses (false negatives, FN, and false positives, FP). However, further research is needed to explore whether these modified approaches are adequate to remove or reduce the potential bias.
  • Latent class analysis (LCA): The performance of all the tests employed in the study is evaluated simultaneously using probabilistic models, with the basic assumption that the disease status is latent or unobserved. There are frequentist LCAs and Bayesian LCAs. Frequentist LCAs use only the data from the participants in the study to estimate the diagnostic accuracy measures of the tests, while Bayesian LCAs employ external information (e.g. expert opinion or estimates from previous research) on the diagnostic accuracy measures of the tests evaluated, in addition to the empirical data obtained from participants within the study. LCAs assume that the tests (new test and reference standards) are either conditionally independent given the true disease status or conditionally dependent; a minimal sketch of the conditionally independent case is given after this list. To model conditional dependence among the tests, various latent class models (LCMs) with different dependence structures have been developed, such as the log-linear LCM [102], probit LCM [103], extended log-linear and probit LCM [108], Gaussian random effects LCM [105] and two-crossed random effects LCM [107], among others. However, some studies [205, 206] have shown that latent class models with different conditional dependence structures produce different estimates of sensitivities and specificities while each model still fits the data well. Thus, further research could explore whether each conditional-dependence LCM is case-specific.
  • Composite reference standard: this method combines results from multiple imperfect tests (excluding the index test) using a predetermined rule to construct a reference standard against which the index test is evaluated. By excluding the index test from the composite reference standard, incorporation bias can be avoided [131]. A novel method identified under the composite reference standard is the “dual composite reference standard” proposed by Tang et al [134].
  • Panel or consensus diagnosis: this method uses the decision from a panel of experts to ascertain the disease status of each participant, which is then used to evaluate the index test.
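
To make the latent class approach concrete, below is a minimal illustrative EM implementation in R of a two-class LCM under the conditional independence assumption (frequentist LCA). The function name, starting values and fixed iteration count are arbitrary choices for the sketch; in practice at least three tests are needed for identifiability in a single population, convergence should be monitored, and label switching (the two latent classes swapping roles) should be checked:

```r
# Two-class latent class model for K binary tests, assuming conditional
# independence given the latent disease status; fitted by EM.
# 'tests' is an n x K matrix of 0/1 results.
lca_em <- function(tests, n_iter = 500) {
  prev <- 0.5                          # starting value for P(D = 1)
  K <- ncol(tests)
  se <- rep(0.8, K); sp <- rep(0.7, K) # arbitrary starting values
  for (it in seq_len(n_iter)) {
    # E-step: posterior probability that each subject is diseased
    l1 <- prev       * apply(tests, 1, function(y) prod(se^y * (1 - se)^(1 - y)))
    l0 <- (1 - prev) * apply(tests, 1, function(y) prod((1 - sp)^y * sp^(1 - y)))
    w <- l1 / (l1 + l0)
    # M-step: update prevalence, sensitivities and specificities
    prev <- mean(w)
    se <- colSums(w * tests) / sum(w)
    sp <- colSums((1 - w) * (1 - tests)) / sum(1 - w)
  }
  list(prevalence = prev, sensitivity = se, specificity = sp)
}

# Example with three simulated conditionally independent tests
set.seed(1)
d <- rbinom(1000, 1, 0.3)                      # unobserved disease status
tests <- sapply(c(0.9, 0.8, 0.85), function(se_k)
  rbinom(1000, 1, ifelse(d == 1, se_k, 0.1)))  # specificity 0.9 for all tests
lca_em(tests)
```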

Other methods

This group includes methods that fit the inclusion criteria but could not be placed into the other three groups. They include studies of agreement, test positivity rate and the use of alternative study designs such as analytical validation. Studies of agreement and test positivity rate are best used as exploratory tools alongside other methods [152, 178], because they are not robust enough to assess the diagnostic ability of a medical test. Validation of a medical test cuts across different disciplines in medicine, such as psychology, laboratory medicine and experimental medicine. With this approach, the medical test is assessed against what it is designed to do [191]. Other designs include case-control designs (where the participants are known to have or not have the target condition) [207, 208], and laboratory-based or experimental studies undertaken with the aim of evaluating the analytical sensitivity and specificity of the index test [190, 209, 210].

Guidance to researchers

The guidance flowchart (Fig 5) is a modification and extension of the guidance for researchers flow-diagram developed by Reitsma et al [34].

Fig 5. Guidance flowchart of methods employed to evaluate medical tests in missing and no gold standard scenarios.

https://doi.org/10.1371/journal.pone.0223832.g005

Since evaluating the accuracy measures of the index test is the focus of any diagnostic accuracy study, the flowchart starts by asking “Is there a gold standard to evaluate the index test?” Following the responses from each question box (not bold), methods are suggested (bold boxes at the bottom of the flowchart) to guide clinical researchers and test evaluators as to the different methods to consider.

Although this review aims to provide up-to-date approaches that have been proposed or employed to evaluate the diagnostic accuracy of an index test in the absence of a gold standard for some or all of the participants in the accuracy study, some things researchers can consider when designing an accuracy study, aside from the aim of their studies, are outlined in Box 1 [26, 211–218].

Box 1: Suggestions when designing a diagnostic accuracy study.

  • Design a protocol: The protocol describes every step of the study. It states the problem and how it will be addressed.
  • Selection of participants from target population: The target population determines the criteria for including participants in the study. Also, the population is important in selecting the appropriate setting for the study.
  • Selection of an appropriate reference standard: The reference standard should diagnose the same target condition as the index test. The choice of reference standard (gold or non-gold) determines the methods to apply when evaluating the index test (see Fig 5).
  • Sample size: Having an adequate sample size is necessary to make precise inferences from the statistical analysis that will be carried out. Studies that discuss the appropriate sample size when planning test accuracy studies include [211–215]; see also the sketch after this box.
  • Selection of accuracy measure to estimate: The researchers need to decide which accuracy measures they wish to estimate, and this is often determined by the test’s response (binary or continuous).
  • Anticipate and eliminate possible bias: multiple forms of bias may exist [26, 216–218]. Exploring how to avoid these biases, or adjust for them if they are unavoidable, is important.
  • Validation of results: Is validation of the results from the study on an independent sample feasible? Validation ensures an understanding of the reproducibility, strengths, and limitations of the study.
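
For the sample-size point above, a minimal sketch of one common normal-approximation calculation, which targets a desired confidence-interval half-width for sensitivity and then inflates the total sample for the expected prevalence; the formula and numbers are illustrative assumptions here, not taken from the cited works [211–215]:

```r
# Sample size to estimate sensitivity with a given 95% CI half-width
se_expected <- 0.85   # anticipated sensitivity of the index test
d <- 0.05             # desired half-width of the 95% confidence interval
prev <- 0.20          # anticipated prevalence in the study population
z <- qnorm(0.975)     # ~1.96 for a 95% interval

n_diseased <- ceiling(z^2 * se_expected * (1 - se_expected) / d^2)
n_total <- ceiling(n_diseased / prev)  # recruit enough to observe the diseased
c(diseased = n_diseased, total = n_total)
```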

Some guidelines and tools have been developed to assist in designing, conducting and reporting diagnostic accuracy studies, such as the STARD guidelines [219–223], the GATE framework [224] and the QUADAS tools [225], which can aid the design of a robust test accuracy study.

Discussion

This review sought to identify and review new and existing methods employed to evaluate the diagnostic accuracy of a medical test in the absence of a gold standard. The identified methods are classified into four main groups based on the availability and/or application of the gold standard to the participants in the study. The four groups are: methods employed when only a sub-sample of the participants have their disease status verified with the gold standard (group 1); correction methods (group 2); methods using multiple imperfect reference standards (group 3); and other methods (group 4), such as studies of agreement, test positivity rate and alternative study designs like validation.

In this review, additional statistical methods have been identified that were not included in the earlier reviews on this topic by Reitsma et al [34] and Alonzo [25]. A list of all the methods identified in this review is presented in the supplementary material (S1 Supplementary Information), including a brief description of the methods, a discussion of their strengths and weaknesses, and any identified case studies where the methods have been clinically applied. Only a small number of the methods we have identified have been applied clinically and published [38, 63]. This may be due to the complexity of these methods (in terms of application and interpretation of results), and/or a disconnection between the fields of expertise of those who develop the methods (e.g. mathematicians or statisticians) and those who employ them (e.g. clinical researchers). For example, such methods are often published in specialist statistical journals that may not be readily accessible to clinical researchers designing a study. In order to close this gap, two flow-diagrams (Figs 3 and 4) were constructed, in addition to the modified guidance flowchart (Fig 5), as guidance tools to aid clinical researchers and test evaluators in the choice of methods to consider when evaluating medical tests in the absence of a gold standard. Also, an R package (bcROCsurface) and an interactive web application (Shiny app) that estimate the ROC surface and VUS in the presence of verification bias have been developed by To Duc [78] to help bridge the gap.

One issue not addressed in this review is methods that evaluate the differences in diagnostic accuracy of two or more tests in the presence of verification bias. Some published articles that consider this issue are Nofuentes and Del Castillo [226–230], Marin-Jimenez and Nofuentes [231], Harel and Zhou [232] and Zhou and Castelluccio [233]. This review also did not consider methods employed to estimate the time-varying sensitivity and specificity of a diagnostic test in the absence of a gold standard; this issue has recently been addressed by Wang et al [234].

In terms of the methodology, a limitation of this review is the exclusion of books, dissertations, theses, conference abstracts and articles not published in the English language (such as the review by Masaebi et al [235], published in 2019), which means that some methods may not have been identified by this review.

Regarding the methods identified in this review, further research could be carried out to explore the different modifications to the discrepancy analysis approach, to understand whether these modifications reduce or remove the potential bias. In addition, further research is needed to determine whether the different methods developed to evaluate an index test in the presence of verification bias are robust. Given the large number of statistical methods that have been developed, especially to evaluate medical tests when there is a missing gold standard, and the complexity of some of these methods, more interactive web applications (e.g. built with the Shiny package in R [236]) could be developed to implement these methods, in addition to the Shiny apps developed by To Duc [78] and Lim et al [237]. The development of such interactive web tools would expedite the clinical application of these methods and help bridge the gap between the method developers and the clinical researchers and test evaluators who are their end users.

Conclusion

Various methods have been proposed and applied to evaluate medical tests when there is a missing gold standard result for some participants, or no gold standard at all. These methods depend on the availability of the gold standard, its application to all or a sub-sample of participants in the study, the availability of alternative reference standard(s), and the underlying assumption(s) made with respect to the index test(s) and/or participants in the study.

Knowing the appropriate method to employ when analysing data from a diagnostic accuracy study conducted in the absence of a gold standard helps to make statistically robust inferences about the accuracy of the index test. This, in addition to data on the cost-effectiveness, utility and usability of the test, will support clinicians, policy makers and stakeholders in deciding whether or not to adopt the new test in practice.

Acknowledgments

The authors would like to acknowledge Professor Patrick Bossuyt of the Department of Clinical Epidemiology and Biostatistics, Academic Medical Centre, University of Amsterdam, the Netherlands, for giving consent to update his review, reviewing the protocol and his continued advice throughout this work. We would also like to acknowledge the authors of the previous review: Dr Anne Rutjes of the University of Bern, Switzerland; Professor Johannes Reitsma of the Department of Epidemiology, Julius Center Research Program Methodology, UMC Utrecht, the Netherlands; Professor Arri Coomarasamy of the College of Medical and Dental Sciences, University of Birmingham, UK; and Professor Khalid Saeed Khan of Queen Mary, University of London, for the guidance flowchart which was modified and extended. AJA, SG, and LV are supported by the National Institute for Health Research (NIHR) Newcastle In Vitro Diagnostics Co-operative. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.

References

  1. Bossuyt PM, Reitsma JB, Linnet K, Moons KG. Beyond diagnostic accuracy: the clinical utility of diagnostic tests. Clinical Chemistry. 2012;58(12):1636–43. pmid:22730450
  2. Burke W. Genetic tests: clinical validity and clinical utility. Current Protocols in Human Genetics. 2014;81(1):9.15.1–9.15.8.
  3. Mallett S, Halligan S, Thompson M, Collins GS, Altman DG. Interpreting diagnostic accuracy studies for patient care. BMJ. 2012;345(7871). pmid:22750423
  4. Bossuyt PM, Irwig L, Craig J, Glasziou P. Comparative accuracy: assessing new tests against existing diagnostic pathways. British Medical Journal. 2006;332(7549):1089–92. pmid:16675820
  5. Altman DG, Bland JM. Diagnostic tests 1: sensitivity and specificity. British Medical Journal. 1994;308(6943):1552. pmid:8019315
  6. Eusebi P. Diagnostic accuracy measures. Cerebrovascular Diseases. 2013;36(4):267–72. pmid:24135733
  7. Šimundić A-M. Measures of diagnostic accuracy: basic definitions. EJIFCC. 2009;19(4):203. pmid:27683318
  8. Altman DG, Bland JM. Diagnostic tests 2: predictive values. British Medical Journal. 1994;309(6947):102. pmid:8038641
  9. Wong HB, Lim GH. Measures of diagnostic accuracy: sensitivity, specificity, PPV and NPV. Proceedings of Singapore Healthcare. 2011;20(4):316–8.
  10. Alonzo TA, Pepe MS. Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2005;54(1):173–90.
  11. Duc KT, Chiogna M, Adimari G. Bias-corrected methods for estimating the receiver operating characteristic surface of continuous diagnostic tests. Electronic Journal of Statistics. 2016;10(2):3063–113.
  12. Chi YY, Zhou XH. Receiver operating characteristic surfaces in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2008;57(1):1–23.
  13. Zhang Y, Alonzo TA, for the Alzheimer's Disease Neuroimaging Initiative. Inverse probability weighting estimation of the volume under the ROC surface in the presence of verification bias. Biometrical Journal. 2016;58(6):1338–56. pmid:27338713
  14. Rutjes AW, Reitsma JB, Coomarasamy A, Khan KS, Bossuyt PM. Evaluation of diagnostic tests when there is no gold standard. A review of methods. Health Technology Assessment. 2007;11(50):iii, ix–51.
  15. Kohn MA, Carpenter CR, Newman TB. Understanding the direction of bias in studies of diagnostic test accuracy. Academic Emergency Medicine. 2013;20(11):1194–206. pmid:24238322
  16. Glueck DH, Lamb MM, O'Donnell CI, Ringham BM, Brinton JT, Muller KE, Lewin JM, Alonzo TA, Pisano ED. Bias in trials comparing paired continuous tests can cause researchers to choose the wrong screening modality. BMC Medical Research Methodology. 2009;9:4. pmid:19154609
  17. Theel ES, Hilgart H, Breen-Lyles M, McCoy K, Flury R, Breeher LE, et al. Comparison of the QuantiFERON-TB Gold Plus and QuantiFERON-TB Gold In-Tube interferon gamma release assays in patients at risk for tuberculosis and in health care workers. Journal of Clinical Microbiology. 2018;56(7). pmid:29743310
  18. Van Dyck E, Buvé A, Weiss HA, Glynn JR, Brown DWG, De Deken B, et al. Performance of commercially available enzyme immunoassays for detection of antibodies against herpes simplex virus type 2 in African populations. Journal of Clinical Microbiology. 2004;42(7):2961–5. pmid:15243045
  19. Naaktgeboren CA, De Groot JAH, Rutjes AWS, Bossuyt PMM, Reitsma JB, Moons KGM. Anticipating missing reference standard data when planning diagnostic accuracy studies. BMJ. 2016;352. pmid:26861453
  20. Karch A, Koch A, Zapf A, Zerr I, Karch A. Partial verification bias and incorporation bias affected accuracy estimates of diagnostic studies for biomarkers that were part of an existing composite gold standard. Journal of Clinical Epidemiology. 2016;78:73–82. pmid:27107877
  21. Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics. 1983;39(1):207–15. pmid:6871349
  22. Thompson M, Van den Bruel A. Sources of bias in diagnostic studies. In: Diagnostic Tests Toolkit. Wiley-Blackwell; 2011. p. 26–33.
  23. Walsh T. Fuzzy gold standards: approaches to handling an imperfect reference standard. Journal of Dentistry. 2018;74:S47–S9. pmid:29929589
  24. Zhou XH. Correcting for verification bias in studies of a diagnostic test's accuracy. Statistical Methods in Medical Research. 1998;7(4):337–53. pmid:9871951
  25. Alonzo TA. Verification bias-impact and methods for correction when assessing accuracy of diagnostic tests. REVSTAT Statistical Journal. 2014;12(1):67–83.
  26. Naaktgeboren CA, de Groot JA, Rutjes AW, Bossuyt PM, Reitsma JB, Moons KG. Anticipating missing reference standard data when planning diagnostic accuracy studies. BMJ. 2016;352:i402. pmid:26861453
  27. Van Smeden M, Naaktgeboren CA, Reitsma JB, Moons KGM, de Groot JAH. Latent class models in diagnostic studies when there is no reference standard: a systematic review. American Journal of Epidemiology. 2014;179(4):423–31. pmid:24272278
  28. Collins J, Huynh M. Estimation of diagnostic test accuracy without full verification: a review of latent class methods. Statistics in Medicine. 2014;33(24):4141–69. pmid:24910172
  29. Hui SL, Zhou XH. Evaluation of diagnostic tests without gold standards. Statistical Methods in Medical Research. 1998;7(4):354–70. pmid:9871952
  30. Trikalinos TA, Balion CM. Chapter 9: Options for summarizing medical test performance in the absence of a "gold standard". Journal of General Internal Medicine. 2012;27(Suppl 1):S67–S75.
  31. Enøe C, Georgiadis MP, Johnson WO. Estimation of sensitivity and specificity of diagnostic tests and disease prevalence when the true disease state is unknown. Preventive Veterinary Medicine. 2000;45(1):61–81. https://doi.org/10.1016/S0167-5877(00)00117-3
  32. Zaki R, Bulgiba A, Ismail R, Ismail NA. Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: a systematic review. PLoS ONE. 2012;7(5):e37908. pmid:22662248
  33. Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine. 2005;68(2–4):145–63. pmid:15820113
  34. Reitsma JB, Rutjes AWS, Khan KS, Coomarasamy A, Bossuyt PM. A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. Journal of Clinical Epidemiology. 2009;62(8):797–806. pmid:19447581
  35. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JP, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ. 2009;339. pmid:19622552
  36. Sayers A. Tips and tricks in performing a systematic review. British Journal of General Practice. 2008;58(547):136.
  37. Harel O, Zhou XH. Multiple imputation for correcting verification bias. Statistics in Medicine. 2006;25(22):3769–86. pmid:16435337
  38. He H, McDermott MP. A robust method using propensity score stratification for correcting verification bias for binary tests. Biostatistics. 2012;13(1):32–47. pmid:21856650
  39. Zhou XH. Maximum likelihood estimators of sensitivity and specificity corrected for verification bias. Communications in Statistics—Theory and Methods. 1993;22(11):3177–98.
  40. Kosinski AS, Barnhart HX. Accounting for nonignorable verification bias in assessment of diagnostic tests. Biometrics. 2003;59(1):163–71. pmid:12762453
  41. Kosinski AS, Barnhart HX. A global sensitivity analysis of performance of a medical diagnostic test when verification bias is present. Statistics in Medicine. 2003;22(17):2711–21. pmid:12939781
  42. Martinez EZ, Achcar JA, Louzada-Neto F. Estimators of sensitivity and specificity in the presence of verification bias: a Bayesian approach. Computational Statistics and Data Analysis. 2006;51(2):601–11.
  43. Buzoianu M, Kadane JB. Adjusting for verification bias in diagnostic test evaluation: a Bayesian approach. Statistics in Medicine. 2008;27(13):2453–73. pmid:17979150
  44. Hajivandi A, Shirazi HRG, Saadat SH, Chehrazi M. A Bayesian analysis with informative prior on disease prevalence for predicting missing values due to verification bias. Open Access Macedonian Journal of Medical Sciences. 2018;6(7):1225–30. pmid:30087725
  45. Zhou XH. Comparing accuracies of two screening tests in a two-phase study for dementia. Journal of the Royal Statistical Society: Series C (Applied Statistics). 1998;47(1):135–47.
  46. Lloyd CJ, Frommer DJ. An application of multinomial logistic regression to estimating performance of a multiple-screening test with incomplete verification. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2008;57:89–102.
  47. Albert PS. Imputation approaches for estimating diagnostic accuracy for multiple tests from partially verified designs. Biometrics. 2007;63(3):947–57. pmid:17825024
  48. Albert PS, Dodd LE. On estimating diagnostic accuracy from studies with multiple raters and partial gold standard evaluation. Journal of the American Statistical Association. 2008;103(481):61–73. pmid:19802353
  49. Martinez EZ, Achcar JA, Louzada-Neto F. Bayesian estimation of diagnostic tests accuracy for semi-latent data with covariates. Journal of Biopharmaceutical Statistics. 2005;15(5):809–21. pmid:16078387
  50. Xue X, Kim MY, Castle PE, Strickler HD. A new method to address verification bias in studies of clinical screening tests: cervical cancer screening assays as an example. Journal of Clinical Epidemiology. 2014;67(3):343–53. pmid:24332397
  51. Walter SD. Estimation of test sensitivity and specificity when disease confirmation is limited to positive results. Epidemiology. 1999:67–72. pmid:9888282
  52. Böhning D, Patilea V. A capture–recapture approach for screening using two diagnostic tests with availability of disease status for the test positives only. Journal of the American Statistical Association. 2008;103(481):212–21.
  53. Chu H, Zhou Y, Cole SR, Ibrahim JG. On the estimation of disease prevalence by latent class models for screening studies using two screening tests with categorical disease status verified in test positives only. Statistics in Medicine. 2010;29(11):1206–18. pmid:20191614
  54. Baker SG. Evaluating multiple diagnostic tests with partial verification. Biometrics. 1995;51(1):330–7. pmid:7539300
  55. Van Geloven N, Broeze KA, Opmeer BC, Mol BW, Zwinderman AH. How to deal with double partial verification when evaluating two index tests in relation to a reference test? Statistics in Medicine. 2012;31(11–12):1265–76. pmid:22161741
  56. Van Geloven N, Broeze KA, Opmeer BC, Mol BW, Zwinderman AH. Correction: How to deal with double partial verification when evaluating two index tests in relation to a reference test? Statistics in Medicine. 2012;31(28):3787–8.
  57. Aragon DC, Martinez EZ, Achcar JA. Bayesian estimation for performance measures of two diagnostic tests in the presence of verification bias. Journal of Biopharmaceutical Statistics. 2010;20(4):821–34. pmid:20496208
  58. Gray R, Begg CB, Greenes RA. Construction of receiver operating characteristic curves when disease verification is subject to selection bias. Medical Decision Making. 1984;4(2):151–64. pmid:6472063
  59. Zhou XH. A nonparametric maximum likelihood estimator for the receiver operating characteristic curve area in the presence of verification bias. Biometrics. 1996;52(1):299–305. pmid:8934599
  60. Rodenberg C, Zhou XH. ROC curve estimation when covariates affect the verification process. Biometrics. 2000;56(4):1256–62. pmid:11129488
  61. Zhou XH, Rodenberg CA. Estimating an ROC curve in the presence of non-ignorable verification bias. Communications in Statistics—Theory and Methods. 1998;27(3):635–57.
  62. Hunink MG, Richardson DK, Doubilet PM, Begg CB. Testing for fetal pulmonary maturity: ROC analysis involving covariates, verification bias, and combination testing. Medical Decision Making. 1990;10(3):201–11. pmid:2370827
  63. He H, Lyness JM, McDermott MP. Direct estimation of the area under the receiver operating characteristic curve in the presence of verification bias. Statistics in Medicine. 2009;28(3):361–76. pmid:18680124
  64. Adimari G, Chiogna M. Nearest-neighbor estimation for ROC analysis under verification bias. International Journal of Biostatistics. 2015;11(1):109–24. pmid:25781712
  65. Adimari G, Chiogna M. Nonparametric verification bias-corrected inference for the area under the ROC curve of a continuous-scale diagnostic test. Statistics and its Interface. 2017;10(4):629–41.
  66. Gu J, Ghosal S, Kleiner DE. Bayesian ROC curve estimation under verification bias. Statistics in Medicine. 2014;33(29):5081–96. pmid:25269427
  67. Fluss R, Reiser B, Faraggi D, Rotnitzky A. Estimation of the ROC curve under verification bias. Biometrical Journal. 2009;51(3):475–90. pmid:19588455
  68. Rotnitzky A, Faraggi D, Schisterman E. Doubly robust estimation of the area under the receiver-operating characteristic curve in the presence of verification bias. Journal of the American Statistical Association. 2006;101(475):1276–88.
  69. Fluss R, Reiser B, Faraggi D. Adjusting ROC curves for covariates in the presence of verification bias. Journal of Statistical Planning and Inference. 2012;142(1):1–11.
  70. Liu D, Zhou XH. A model for adjusting for nonignorable verification bias in estimation of the ROC curve and its area with likelihood-based approach. Biometrics. 2010;66(4):1119–28. pmid:20222937
  71. Yu W, Kim JK, Park T. Estimation of area under the ROC curve under nonignorable verification bias. Statistica Sinica. 2018;28(4):2149–66. pmid:31367164
  72. Page JH, Rotnitzky A. Estimation of the disease-specific diagnostic marker distribution under verification bias. Computational Statistics and Data Analysis. 2009;53(3):707–17. pmid:23087495
  73. Liu D, Zhou XH. Covariate adjustment in estimating the area under ROC curve with partially missing gold standard. Biometrics. 2013;69(1):91–100. pmid:23410529
  74. Liu D, Zhou XH. Semiparametric estimation of the covariate-specific ROC curve in presence of ignorable verification bias. Biometrics. 2011;67(3):906–16. pmid:21361890
  75. Yu B, Zhou C. Assessing the accuracy of a multiphase diagnosis procedure for dementia. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2012;61(1):67–81.
  76. Chi YY, Zhou XH. Receiver operating characteristic surfaces in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2008;57(1):1–23.
  77. Duc KT, Chiogna M, Adimari G. Nonparametric estimation of ROC surfaces under verification bias. 2016.
  78. To Duc K. bcROCsurface: an R package for correcting verification bias in estimation of the ROC surface and its volume for continuous diagnostic tests. BMC Bioinformatics. 2017;18(1).
  79. Zhang Y, Alonzo TA, for the Alzheimer's Disease Neuroimaging Initiative. Estimation of the volume under the receiver-operating characteristic surface adjusting for non-ignorable verification bias. Statistical Methods in Medical Research. 2018;27(3):715–39. pmid:29338546
  80. Zhu R, Ghosal S. Bayesian semiparametric ROC surface estimation under verification bias. Computational Statistics and Data Analysis. 2019;133:40–52.
  81. To Duc K, Chiogna M, Adimari G, for the Alzheimer's Disease Neuroimaging Initiative. Estimation of the volume under the ROC surface in presence of nonignorable verification bias. Statistical Methods and Applications. 2019.
  82. De Groot JAH, Dendukuri N, Janssen KJM, Reitsma J, Bossuyt PM, Moons KGM. Adjusting for differential verification bias in diagnostic accuracy studies: a Bayesian approach. American Journal of Epidemiology. 2010;(11):S140.
  83. Lu Y, Dendukuri N, Schiller I, Joseph L. A Bayesian approach to simultaneously adjusting for verification and reference standard bias in diagnostic test studies. Statistics in Medicine. 2010;29(24):2532–43. pmid:20799249
  84. Glueck DH, Lamb MM, O'Donnell CI, Ringham BM, Brinton JT, Muller KE, et al. Bias in trials comparing paired continuous tests can cause researchers to choose the wrong screening modality. BMC Medical Research Methodology. 2009;9. pmid:19154609
  85. Capelli G, Natale A, Nardelli S, Frangipane di Regalbono A, Pietrobelli M. Validation of a commercially available cELISA test for canine neosporosis against an indirect fluorescent antibody test (IFAT). Preventive Veterinary Medicine. 2006;73(4):315–20. pmid:16293328
  86. Ferreccio C, Barriga MI, Lagos M, Ibáñez C, Poggi H, González F, et al. Screening trial of human papillomavirus for early detection of cervical cancer in Santiago, Chile. International Journal of Cancer. 2012;132(4):916–23. pmid:22684726
  87. Iglesias-Garriz I, Rodríguez MA, García-Porrero E, Ereño F, Garrote C, Suarez G. Emergency nontraumatic chest pain: use of stress echocardiography to detect significant coronary artery stenosis. Journal of the American Society of Echocardiography. 2005;18(11):1181–6. pmid:16275527
  88. Cronin AM, Vickers AJ. Statistical methods to correct for verification bias in diagnostic studies are inadequate when there are few false negatives: a simulation study. BMC Medical Research Methodology. 2008;8. pmid:19014457
  89. de Groot JAH, Janssen KJM, Zwinderman AH, Bossuyt PMM, Reitsma JB, Moons KGM. Correcting for partial verification bias: a comparison of methods. Annals of Epidemiology. 2011;21(2):139–48. pmid:21109454
  90. Heida A, Van De Vijver E, Van Ravenzwaaij D, Van Biervliet S, Hummel TZ, Yuksel Z, et al. Predicting inflammatory bowel disease in children with abdominal pain and diarrhoea: calgranulin-C versus calprotectin stool tests. Archives of Disease in Childhood. 2018;103(6):565–71. pmid:29514815
  91. Brenner H. Correcting for exposure misclassification using an alloyed gold standard. Epidemiology. 1996;7(4):406–10. pmid:8793367
  92. Gart JJ, Buck AA. Comparison of a screening test and a reference test in epidemiologic studies. II. A probabilistic model for the comparison of diagnostic tests. American Journal of Epidemiology. 1966;83(3):593–602. pmid:5932703
  93. Staquet M, Rozencweig M, Lee YJ, Muggia FM. Methodology for the assessment of new dichotomous diagnostic tests. Journal of Chronic Diseases. 1981;34(12):599–610. pmid:6458624
  94. Albert PS. Estimating diagnostic accuracy of multiple binary tests with an imperfect reference standard. Statistics in Medicine. 2009;28(5):780–97. pmid:19101935
  95. Emerson SC, Waikar SS, Fuentes C, Bonventre JV, Betensky RA. Biomarker validation with an imperfect reference: issues and bounds. Statistical Methods in Medical Research. 2018;27(10):2933–45. pmid:28166709
  96. Thibodeau L. Evaluating diagnostic tests. Biometrics. 1981:801–4.
  97. Hahn A, Lütgehetmann M, Landt O, Schwarz NG, Frickmann H. Comparison of one commercial and two in-house TaqMan multiplex real-time PCR assays for detection of enteropathogenic, enterotoxigenic and enteroaggregative Escherichia coli. Tropical Medicine & International Health. 2017;22(11):1371–6. pmid:28906580
  98. Matos R, Novaes TF, Braga MM, Siqueira WL, Duarte DA, Mendes FM. Clinical performance of two fluorescence-based methods in detecting occlusal caries lesions in primary teeth. Caries Research. 2011;45(3):294–302. pmid:21625126
  99. Mathews WC, Cachay ER, Caperna J, Sitapati A, Cosman B, Abramson I. Estimating the accuracy of anal cytology in the presence of an imperfect reference standard. PLoS ONE. 2010;5(8). pmid:20808869
  100. Hadgu A, Dendukuri N, Hilden J. Evaluation of nucleic acid amplification tests in the absence of a perfect gold-standard test: a review of the statistical and epidemiologic issues. Epidemiology. 2005:604–12. pmid:16135935
  101. Hawkins DM, Garrett JA, Stephenson B. Some issues in resolution of diagnostic tests using an imperfect gold standard. Statistics in Medicine. 2001;20(13):1987–2001. pmid:11427955
  102. 102. Hagenaars JA. Latent structure models with direct effects between indicators: local dependence models. Sociological Methods & Research. 1988;16(3):379–405.
  103. 103. Uebersax JS. Probit latent class analysis with dichotomous or ordered category measures: Conditional independence/dependence models. Applied Psychological Measurement. 1999;23(4):283–97.
  104. 104. Yang I, Becker MP. Latent variable modeling of diagnostic accuracy. Biometrics. 1997:948–58. pmid:9290225
  105. 105. Qu Y, Tan M, Kutner MH. Random effects models in latent class analysis for evaluating accuracy of diagnostic tests. Biometrics. 1996;52(3):797–810. pmid:8805757
  106. 106. Albert PS, McShane LM, Shih JH, Network USNCIBTM. Latent class modeling approaches for assessing diagnostic error without a gold standard: with applications to p53 immunohistochemical assays in bladder tumors. Biometrics. 2001;57(2):610–9. pmid:11414591
  107. 107. Zhang BC Z.; Albert P. S. Estimating Diagnostic Accuracy of Raters Without a Gold Standard by Exploiting a Group of Experts. Biometrics. 2012;68(4):1294–302. pmid:23006010
  108. 108. Xu HB, Michael A.; Craig, Bruce A. Evaluating accuracy of diagnostic tests with intermediate results in the absence of a gold standard. Statistics in Medicine. 2013;32(15):2571–84. pmid:23212851
  109. 109. Wang Z, Zhou X-H, Wang M. Evaluation of diagnostic accuracy in detecting ordered symptom statuses without a gold standard. Biostatistics. 2011;12(3):567–81. pmid:21209155
  110. 110. Wang ZZ, Xiao-Hua . Random effects models for assessing diagnostic accuracy of traditional Chinese doctors in absence of a gold standard. Statistics in Medicine. 2012;31(7):661–71. pmid:21626532
  111. 111. Liu WZ B.; Zhang Z. W.; Chen B. J.; Zhou X. H. A pseudo-likelihood approach for estimating diagnostic accuracy of multiple binary medical tests. Computational Statistics & Data Analysis. 2015;84:85–98. WOS:000348263200007.
  112. 112. Xue X, Oktay M, Goswami S, Kim MY. A method to compare the performance of two molecular diagnostic tools in the absence of a gold standard. Statistical Methods in Medical Research. 2019;28(2):419–31. pmid:28814156
  113. 113. Nérette P, Stryhn H, Dohoo I, Hammell L. Using pseudogold standards and latent-class analysis in combination to evaluate the accuracy of three diagnostic tests. Preventive veterinary medicine. 2008;85(3–4):207–25. pmid:18355935
  114. 114. Dendukuri N, Hadgu A, Wang L. Modeling conditional dependence between diagnostic tests: a multiple latent variable model. Statistics in medicine. 2009;28(3):441–61. pmid:19067379
  115. 115. Johnson WO, Gastwirth JL, Pearson LM. Screening without a "gold standard": The Hui-Walter paradigm revisited. American Journal of Epidemiology. 2001;153(9):921–4. pmid:11323324
  116. 116. Martinez EZL-N F.; Derchain S. F. M.; Achcar J. A.; Gontijo R. C.; Sarian L. O. Z.; Syrjänen K. J. Bayesian estimation of performance measures of cervical cancer screening tests in the presence of covariates and absence of a gold standard. Cancer Informatics. 2008;6:33–46. pmid:19259401
  117. 117. Zhang J, Cole SR, Richardson DB, Chu H. A Bayesian approach to strengthen inference for case‐control studies with multiple error‐prone exposure assessments. Statistics in medicine. 2013;32(25):4426–37. pmid:23661263
  118. 118. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2002;64(4):583–639.
  119. 119. Pereira da Silva HD, Ascaso C, Gonçalves AQ, Orlandi PP, Abellana R. A Bayesian approach to model the conditional correlation between several diagnostic tests and various replicated subjects measurements. Statistics in Medicine. 2017;36(20):3154–70. pmid:28543307
  120. 120. Zhou X-HC, Pete; Zhou Chuan. Nonparametric Estimation of ROC Curves in the Absence of a Gold Standard. Biometrics. 2005;61(2):600–9. pmid:16011710
121. Henkelman RM, Kay I, Bronskill MJ. Receiver operator characteristic (ROC) analysis without truth. Medical Decision Making. 1990;10(1):24–9. pmid:2325524
122. Beiden SV, Campbell G, Meier KL, Wagner RF, editors. The problem of ROC analysis without truth: The EM algorithm and the information matrix. Medical Imaging 2000: Image Perception and Performance; 2000: International Society for Optics and Photonics.
123. Choi YK, Johnson WO, Collins MT, Gardner IA. Bayesian inferences for receiver operating characteristic curves in the absence of a gold standard. Journal of Agricultural, Biological, and Environmental Statistics. 2006;11(2):210–29.
124. Wang C, Turnbull BW, Gröhn YT, Nielsen SS. Nonparametric estimation of ROC curves based on Bayesian models when the true disease state is unknown. Journal of Agricultural, Biological, and Environmental Statistics. 2007;12(1):128–46.
125. Branscum AJ, Johnson WO, Hanson TE, Gardner IA. Bayesian semiparametric ROC curve estimation and disease diagnosis. Statistics in Medicine. 2008;27(13):2474–96. pmid:18300333
126. Erkanli A, Sung M, Costello EJ, Angold A. Bayesian semi-parametric ROC analysis. Statistics in Medicine. 2006;25(22):3905–28. pmid:16416403
127. García Barrado L, Coart E, Burzykowski T. Development of a diagnostic test based on multiple continuous biomarkers with an imperfect reference test. Statistics in Medicine. 2016;35(4):595–608. pmid:26388206
128. Coart E, Barrado LG, Duits FH, Scheltens P, Van Der Flier WM, Teunissen CE, et al. Correcting for the Absence of a Gold Standard Improves Diagnostic Accuracy of Biomarkers in Alzheimer's Disease. Journal of Alzheimer's Disease. 2015;46(4):889–99. pmid:25869788
129. Jafarzadeh SR, Johnson WO, Gardner IA. Bayesian modeling and inference for diagnostic accuracy and probability of disease based on multiple diagnostic biomarkers with and without a perfect reference standard. Statistics in Medicine. 2016;35(6):859–76. pmid:26415924
130. Hwang BS, Chen Z. An Integrated Bayesian Nonparametric Approach for Stochastic and Variability Orders in ROC Curve Estimation: An Application to Endometriosis Diagnosis. Journal of the American Statistical Association. 2015;110(511):923–34. pmid:26839441
131. Alonzo TA, Pepe MS. Using a combination of reference tests to assess the accuracy of a new diagnostic test. Statistics in Medicine. 1999;18(22):2987–3003. pmid:10544302
132. Schiller I, van Smeden M, Hadgu A, Libman M, Reitsma JB, Dendukuri N. Bias due to composite reference standards in diagnostic accuracy studies. Statistics in Medicine. 2016;35(9):1454–70. pmid:26555849
133. Naaktgeboren CA, Bertens LC, van Smeden M, de Groot JA, Moons KG, Reitsma JB. Value of composite reference standards in diagnostic research. BMJ. 2013;347:f5605. pmid:24162938
134. Tang S, Hemyari P, Canchola JA, Duncan J. Dual composite reference standards (dCRS) in molecular diagnostic research: A new approach to reduce bias in the presence of imperfect reference. Journal of Biopharmaceutical Statistics. 2018;28(5):951–65. pmid:29355450
135. Bertens LC, Broekhuizen BD, Naaktgeboren CA, Rutten FH, Hoes AW, van Mourik Y, et al. Use of expert panels to define the reference standard in diagnostic research: a systematic review of published methods and reporting. PLoS Medicine. 2013;10(10):e1001531. pmid:24143138
136. Juhl DV A.; Luhm J.; Ziemann M.; Hennig H.; Görg S. Comparison of the two fully automated anti-HCMV IgG assays: Abbott Architect CMV IgG assay and Biotest anti-HCMV recombinant IgG ELISA. Transfusion Medicine. 2013;23(3):187–94. pmid:23578169
137. Rostami MNR B. H.; Aghsaghloo F.; Nazari R. Comparison of clinical performance of antigen based-enzyme immunoassay (EIA) and major outer membrane protein (MOMP)-PCR for detection of genital Chlamydia trachomatis infection. International Journal of Reproductive Biomedicine. 2016;14(6):411–20. pmid:27525325
138. Spada E, Proverbio D, Baggiani L, Bagnagatti De Giorgi G, Perego R, Ferro E. Evaluation of an immunochromatographic test for feline AB system blood typing. Journal of Veterinary Emergency and Critical Care. 2016;26(1):137–41. pmid:26264678
139. Brocchi E, Bergmann IE, Dekker A, Paton DJ, Sammin DJ, Greiner M, et al. Comparative evaluation of six ELISAs for the detection of antibodies to the non-structural proteins of foot-and-mouth disease virus. Vaccine. 2006;24(47):6966–79.
140. Williams GJ, Macaskill P, Kerr M, Fitzgerald DA, Isaacs D, Codarini M, McCaskill M, Prelog K, Craig JC. Variability and accuracy in interpretation of consolidation on chest radiography for diagnosing pneumonia in children under 5 years of age. Pediatric Pulmonology. 2013;48(12):1195–200. pmid:23997040
141. Asselineau J, Paye A, Bessède E, Perez P, Proust-Lima C. Different latent class models were used and evaluated for assessing the accuracy of campylobacter diagnostic tests: Overcoming imperfect reference standards? Epidemiology and Infection. 2018;146(12):1556–64. pmid:29945689
142. Sobotzki C, Riffelmann M, Kennerknecht N, Hulsse C, Littmann M, White A, von Kries R, Wirsing von König CH. Latent class analysis of diagnostic tests for adenovirus, Bordetella pertussis and influenza virus infections in German adults with longer lasting coughs. Epidemiology and Infection. 2016;144(4):840–6. pmid:26380914
143. Poynard T, de Ledinghen V, Zarski JP, Stanciu C, Munteanu M, Vergniol J, France J, Trifan A, Le Naour G, Vaillant JC, Ratziu V, Charlotte F. Relative performances of FibroTest, Fibroscan, and biopsy for the assessment of the stage of liver fibrosis in patients with chronic hepatitis C: A step toward the truth in the absence of a gold standard. Journal of Hepatology. 2012;56(3):541–8. pmid:21889468
144. De La Rosa GDV M. L.; Arango C. M.; Gomez C. I.; Garcia A.; Ospina S.; Osorno S.; Henao A.; Jaimes F. A. Toward an operative diagnosis in sepsis: A latent class approach. BMC Infectious Diseases. 2008;8:18.
145. Xie Y, Chen Z, Albert PS. A crossed random effects modeling approach for estimating diagnostic accuracy from ordinal ratings without a gold standard. Statistics in Medicine. 2013;32(20):3472–85. pmid:23529923
146. See CW, Alemayehu W, Melese M, Zhou Z, Porco TC, Shiboski S, Gaynor BD, Eng J, Keenan JD, Lietman TM. How reliable are tests for trachoma?—A latent class approach. Investigative Ophthalmology and Visual Science. 2011;52(9):6133–7. pmid:21685340
147. Nérette P, Dohoo I, Hammell L. Estimation of specificity and sensitivity of three diagnostic tests for infectious salmon anaemia virus in the absence of a gold standard. Journal of Fish Diseases. 2005;28(2):89–99. pmid:15705154
148. Pak SIK D. Evaluation of diagnostic performance of a polymerase chain reaction for detection of canine Dirofilaria immitis. Journal of Veterinary Clinics. 2007;24(2):77–81.
149. Jokinen J, Snellman M, Palmu AA, Saukkoriipi A, Verlant V, Pascal T, et al. Testing Pneumonia Vaccines in the Elderly: Determining a Case Definition for Pneumococcal Pneumonia in the Absence of a Gold Standard. American Journal of Epidemiology. 2018;187(6):1295–302. pmid:29253067
150. Santos FLN, Campos ACP, Amorim LDAF, Silva ED, Zanchin NIT, Celedon PAF, et al. Highly accurate chimeric proteins for the serological diagnosis of chronic Chagas disease: A latent class analysis. American Journal of Tropical Medicine and Hygiene. 2018;99(5):1174–9. pmid:30226130
151. Mamtani M, Jawahirani A, Das K, Rughwani V, Kulkarni H. Bias-corrected diagnostic performance of the naked eye single tube red cell osmotic fragility test (NESTROFT): An effective screening tool for β-thalassemia. Hematology. 2006;11(4):277–86. pmid:17178668
152. Karaman BF, Açıkalın A, Ünal İ, Aksungur VL. Diagnostic values of KOH examination, histological examination, and culture for onychomycosis: a latent class analysis. International Journal of Dermatology. 2019;58(3):319–24. pmid:30246397
153. Yan Q, Karau MJ, Greenwood-Quaintance KE, Mandrekar JN, Osmon DR, Abdel MP, et al. Comparison of diagnostic accuracy of periprosthetic tissue culture in blood culture bottles to that of prosthesis sonication fluid culture for diagnosis of prosthetic joint infection (PJI) by use of Bayesian latent class modeling and IDSA PJI criteria for classification. Journal of Clinical Microbiology. 2018;56(6). pmid:29643202
154. Lurier T, Delignette-Muller ML, Rannou B, Strube C, Arcangioli MA, Bourgoin G. Diagnosis of bovine dictyocaulosis by bronchoalveolar lavage technique: A comparative study using a Bayesian approach. Preventive Veterinary Medicine. 2018;154:124–31. pmid:29685436
155. Falley BN, Stamey JD, Beaujean AA. Bayesian estimation of logistic regression with misclassified covariates and response. Journal of Applied Statistics. 2018;45(10):1756–69.
156. Dufour SD J.; Dubuc J.; Dendukuri N.; Hassan S.; Buczinski S. Bayesian estimation of sensitivity and specificity of a milk pregnancy-associated glycoprotein-based ELISA and of transrectal ultrasonographic exam for diagnosis of pregnancy at 28–45 days following breeding in dairy cows. Preventive Veterinary Medicine. 2017;140:122–33. pmid:28460745
157. Bermingham ML, Handel IG, Glass EJ, Woolliams JA, Bronsvoort BMDC, McBride SH, Skuce RA, Allen AR, McDowell SWJ, Bishop SC. Hui and Walter's latent-class model extended to estimate diagnostic test properties from surveillance data: A latent model for latent data. Scientific Reports. 2015;5. pmid:26148538
158. Busch EL, Don PK, Chu H, Richardson DB, Keku TO, Eberhard DA, et al. Diagnostic accuracy and prediction increment of markers of epithelial-mesenchymal transition to assess cancer cell detachment from primary tumors. BMC Cancer. 2018;18(1). pmid:29338703
159. de Araujo Pereira G, Louzada F, de Fatima Barbosa V, Ferreira-Silva MM, Moraes-Souza H. A general latent class model for performance evaluation of diagnostic tests in the absence of a gold standard: an application to Chagas disease. Computational and Mathematical Methods in Medicine. 2012;2012:487502. pmid:22919430
160. Hubbard RA, Huang J, Harton J, Oganisian A, Choi G, Utidjian L, et al. A Bayesian latent class approach for EHR-based phenotyping. Statistics in Medicine. 2019;38(1):74–87. pmid:30252148
161. Caraguel C, Stryhn H, Gagné N, Dohoo I, Hammell L. Use of a third class in latent class modelling for the diagnostic evaluation of five infectious salmon anaemia virus detection tests. Preventive Veterinary Medicine. 2012;104(1):165–73. https://doi.org/10.1016/j.prevetmed.2011.10.006
162. De Waele V, Berzano M, Berkvens D, Speybroeck N, Lowery C, Mulcahy GM, et al. Age-Stratified Bayesian Analysis To Estimate Sensitivity and Specificity of Four Diagnostic Tests for Detection of Cryptosporidium Oocysts in Neonatal Calves. Journal of Clinical Microbiology. 2011;49(1):76–84. pmid:21048012
163. Dendukuri N, Wang LL, Hadgu A. Evaluating Diagnostic Tests for Chlamydia trachomatis in the Absence of a Gold Standard: A Comparison of Three Statistical Methods. Statistics in Biopharmaceutical Research. 2011;3(2):385–97.
164. Habib I, Sampers I, Uyttendaele M, De Zutter L, Berkvens D. A Bayesian modelling framework to estimate Campylobacter prevalence and culture methods sensitivity: application to a chicken meat survey in Belgium. Journal of Applied Microbiology. 2008;105(6):2002–8. pmid:19120647
165. Vidal EM A.; Bertolini E.; Cambra M. Estimation of the accuracy of two diagnostic methods for the detection of Plum pox virus in nursery blocks by latent class models. Plant Pathology. 2012;61(2):413–22.
166. Aly SSA R. J.; Whitlock R. H.; Adaska J. M. Sensitivity and Specificity of Two Enzyme-linked Immunosorbent Assays and a Quantitative Real-time Polymerase Chain Reaction for Bovine Paratuberculosis Testing of a Large Dairy Herd. International Journal of Applied Research in Veterinary Medicine. 2014;12(1):1–7.
167. Rahman AKMA, Saegerman C, Berkvens D, Fretin D, Gani MO, Ershaduzzaman M, et al. Bayesian estimation of true prevalence, sensitivity and specificity of indirect ELISA, Rose Bengal Test and Slow Agglutination Test for the diagnosis of brucellosis in sheep and goats in Bangladesh. Preventive Veterinary Medicine. 2013;110(2):242–52. pmid:23276401
168. Praet N, Verweij JJ, Mwape KE, Phiri IK, Muma JB, Zulu G, van Lieshout L, Rodriguez-Hidalgo R, Benitez-Ortiz W, Dorny P, Gabriël S. Bayesian modelling to estimate the test characteristics of coprology, coproantigen ELISA and a novel real-time PCR for the diagnosis of taeniasis. Tropical Medicine & International Health. 2013;18(5):608–14. pmid:23464616
169. Espejo LA, Zagmutt FJ, Groenendaal H, Munoz-Zanzi C, Wells SJ. Evaluation of performance of bacterial culture of feces and serum ELISA across stages of Johne's disease in cattle using a Bayesian latent class model. Journal of Dairy Science. 2015;98(11):8227–39. pmid:26364104
170. Haley C, Wagner B, Puvanendiran S, Abrahante J, Murtaugh MP. Diagnostic performance measures of ELISA and quantitative PCR tests for porcine circovirus type 2 exposure using Bayesian latent class analysis. Preventive Veterinary Medicine. 2011;101(1–2):79–88. pmid:21632130
171. Menten J, Boelaert M, Lesaffre E. Bayesian latent class models with conditionally dependent diagnostic tests: A case study. Statistics in Medicine. 2008;27(22):4469–88. pmid:18551515
172. Tasony-Wagener EA. Evaluation of Antigen Detection Assays for the Avian Influenza Virus [Ph.D. thesis]. Ann Arbor: University of Prince Edward Island (Canada); 2012.
173. Weichenthal S, Joseph L, Bélisle P, Dufresne A. Bayesian Estimation of the Probability of Asbestos Exposure from Lung Fiber Counts. Biometrics. 2010;66(2):603–12. pmid:19508240
174. Jafarzadeh SR, Warren DK, Nickel KB, Wallace AE, Fraser VJ, Olsen MA. Bayesian estimation of the accuracy of ICD-9-CM- and CPT-4-based algorithms to identify cholecystectomy procedures in administrative data without a reference standard. Pharmacoepidemiology and Drug Safety. 2016;25(3):263–8. pmid:26349484
175. García Barrado L, Coart E, Burzykowski T. Estimation of diagnostic accuracy of a combination of continuous biomarkers allowing for conditional dependence between the biomarkers and the imperfect reference-test. Biometrics. 2017;73(2):646–55. pmid:27598904
176. Jafarzadeh SR, Johnson WO, Utts JM, Gardner IA. Bayesian estimation of the receiver operating characteristic curve for a diagnostic test with a limit of detection in the absence of a gold standard. Statistics in Medicine. 2010;29(20):2092–106.
177. Saugar JM, Merino FJ, Martin-Rabadan P, Fernandez-Soto P, Ortega S, Garate T, et al. Application of real-time PCR for the detection of Strongyloides spp. in clinical samples in a reference center in Spain. Acta Tropica. 2015;142:20–5. pmid:25447829
178. Peterson LR, Young SA, Davis TE, Wang ZX, Duncan J, Noutsios C, Liesenfeld O, Osiecki JC, Lewinski MA. Evaluation of the cobas Cdiff test for detection of toxigenic Clostridium difficile in stool samples. Journal of Clinical Microbiology. 2017;55(12):3426–36. pmid:28954901
179. Fiebrich HB, Brouwers AH, Kerstens MN, Pijl MEJ, Kema IP, De Jong JR, Jager PL, Elsinga PH, Dierckx RAJO, Van Der Wal JE, Sluiter WJ, De Vries EGE, Links TP. 6-[F-18]fluoro-L-dihydroxyphenylalanine positron emission tomography is superior to conventional imaging with 123I-metaiodobenzylguanidine scintigraphy, computer tomography, and magnetic resonance imaging in localizing tumors causing catecholamine excess. Journal of Clinical Endocrinology and Metabolism. 2009;94(10):3922–30. pmid:19622618
180. Wu HM, Cordeiro SM, Harcourt BH, Carvalho M, Azevedo J, Oliveira TQ, et al. Accuracy of real-time PCR, Gram stain and culture for Streptococcus pneumoniae, Neisseria meningitidis and Haemophilus influenzae meningitis diagnosis. BMC Infectious Diseases. 2013;13:26.
181. Dendukuri N, Schiller I, De Groot J, Libman M, Moons K, Reitsma J, et al. Concerns about composite reference standards in diagnostic research. BMJ. 2018;360. pmid:29348126
182. Driesen M, Kondo Y, de Jong BC, Torrea G, Asnong S, Desmaretz C, et al. Evaluation of a novel line probe assay to detect resistance to pyrazinamide, a key drug used for tuberculosis treatment. Clinical Microbiology and Infection. 2018;24(1):60–4. pmid:28587904
183. Bessède E, Asselineau J, Perez P, Valdenaire G, Richer O, Lehours P, et al. Evaluation of the diagnostic accuracy of two immunochromatographic tests detecting campylobacter in stools and their role in campylobacter infection diagnosis. Journal of Clinical Microbiology. 2018;56(4). pmid:29436423
184. Alcántara R, Fuentes P, Antiparra R, Santos M, Gilman RH, Kirwan DE, et al. MODS-Wayne, a colorimetric adaptation of the Microscopic-Observation Drug Susceptibility (MODS) assay for detection of Mycobacterium tuberculosis pyrazinamide resistance from sputum samples. Journal of Clinical Microbiology. 2019;57(2). pmid:30429257
185. Ziswiler HR, Reichenbach S, Vögelin E, Bachmann LM, Villiger PM, Jüni P. Diagnostic value of sonography in patients with suspected carpal tunnel syndrome: A prospective study. Arthritis and Rheumatism. 2005;52(1):304–11. pmid:15641050
186. Taylor SA, Mallett S, Bhatnagar G, Baldwin-Cleland R, Bloom S, Gupta A, et al. Diagnostic accuracy of magnetic resonance enterography and small bowel ultrasound for the extent and activity of newly diagnosed and relapsed Crohn's disease (METRIC): a multicentre trial. The Lancet Gastroenterology and Hepatology. 2018;3(8):548–58. pmid:29914843
187. Eddyani M, Sopoh GE, Ayelo G, Brun LVC, Roux JJ, Barogui Y, et al. Diagnostic accuracy of clinical and microbiological signs in patients with skin lesions resembling Buruli ulcer in an endemic region. Clinical Infectious Diseases. 2018;67(6):827–34. pmid:29538642
188. Lerner EB, McKee CH, Cady CE, Cone DC, Colella MR, Cooper A, et al. A consensus-based gold standard for the evaluation of mass casualty triage systems. Prehospital Emergency Care. 2015;19(2):267–71. pmid:25290529
189. van Houten CB, de Groot JAH, Klein A, Srugo I, Chistyakov I, de Waal W, et al. A host-protein based assay to differentiate between bacterial and viral infections in preschool children (OPPORTUNITY): a double-blind, multicentre, validation study. The Lancet Infectious Diseases. 2017;17(4):431–40. pmid:28012942
190. Elliott DG, Applegate LJ, Murray AL, Purcell MK, McKibben CL. Bench-top validation testing of selected immunological and molecular Renibacterium salmoninarum diagnostic assays by comparison with quantitative bacteriological culture. Journal of Fish Diseases. 2013;36(9):779–809. pmid:23346868
191. Bland JM, Altman DG. Validating scales and indexes. BMJ. 2002;324(7337):606–7. pmid:11884331
192. Hsia EC, Schluger N, Cush JJ, Chaisson RE, Matteson EL, Xu S, Beutler A, Doyle MK, Hsu B, Rahman MU. Interferon-γ release assay versus tuberculin skin test prior to treatment with golimumab, a human anti-tumor necrosis factor antibody, in patients with rheumatoid arthritis, psoriatic arthritis, or ankylosing spondylitis. Arthritis & Rheumatism. 2012;64(7):2068–77.
193. Itza F, Zarza D, Salinas J, Teba F, Ximenez C. Turn-amplitude analysis as a diagnostic test for myofascial syndrome in patients with chronic pelvic pain. Pain Research and Management. 2015;20(2):96–100. pmid:25848846
194. Booi ANM Jerome; Norton H. James; Anderson William E.; Ellis Amy C. Validation of a Screening Tool to Identify Undernutrition in Ambulatory Patients With Liver Cirrhosis. Nutrition in Clinical Practice. 2015;30(5):683–9. pmid:26024676
195. von Heymann W, Moll H, Rauch G. Study on sacroiliac joint diagnostics: Reliability of functional and pain provocation tests. Manuelle Medizin. 2018;56(3):239–48.
196. Schliep KC, Stanford JB, Chen Z, Zhang B, Dorais JK, Boiman Johnstone E, et al. Interrater and intrarater reliability in the diagnosis and staging of endometriosis. Obstetrics and Gynecology. 2012;120(1):104–12. pmid:22914398
197. Pérez-Warnisher MT, Gómez-García T, Giraldo-Cadavid LF, Troncoso Acevedo MF, Rodríguez Rodríguez P, Carballosa de Miguel P, González Mangado N. Diagnostic accuracy of nasal cannula versus microphone for detection of snoring. The Laryngoscope. 2017;127(12):2886–90. pmid:28731530
198. Soltan MA, Tsai YL, Lee PYA, Tsai CF, Chang HFG, Wang HTT, et al. Comparison of electron microscopy, ELISA, real time RT-PCR and insulated isothermal RT-PCR for the detection of Rotavirus group A (RVA) in feces of different animal species. Journal of Virological Methods. 2016;235:99–104. pmid:27180038
199. Palit ST N.; Knowles C. H.; Lunniss P. J.; Bharucha A. E.; Scott S. M. Diagnostic disagreement between tests of evacuatory function: a prospective study of 100 constipated patients. Neurogastroenterology & Motility. 2016;28(10):1589–98. pmid:27154577
200. Alonzo TA, Brinton JT, Ringham BM, Glueck DH. Bias in estimating accuracy of a binary screening test with differential disease verification. Statistics in Medicine. 2011;30(15):1852–64. pmid:21495059
201. Naaktgeboren CA, de Groot JA, van Smeden M, Moons KG, Reitsma JB. Evaluating diagnostic accuracy in the face of multiple reference standards. Annals of Internal Medicine. 2013;159(3):195–202. pmid:23922065
202. de Groot JAH, Bossuyt PMM, Reitsma JB, Rutjes AWS, Dendukuri N, Janssen KJM, Moons KGM. Verification problems in diagnostic accuracy studies: Consequences and solutions. BMJ. 2011;343(7821). pmid:21810869
203. Lu Y, Dendukuri N, Schiller I, Joseph L. A Bayesian approach to simultaneously adjusting for verification and reference standard bias in diagnostic test studies. Statistics in Medicine. 2010;29(24):2532–43. pmid:20799249
204. Dendukuri N, Wang L, Hadgu A. Evaluating diagnostic tests for Chlamydia trachomatis in the absence of a gold standard: A comparison of three statistical methods. Statistics in Biopharmaceutical Research. 2011;3(2):385–97.
205. Albert PS, Dodd LE. A Cautionary Note on the Robustness of Latent Class Models for Estimating Diagnostic Error without a Gold Standard. Biometrics. 2004;60(2):427–35. pmid:15180668
206. Pepe MS, Janes H. Insights into latent class analysis of diagnostic test performance. Biostatistics. 2006;8(2):474–84. pmid:17085745
207. Nortunen T, Puustinen J, Luostarinen L, Huhtala H, Hänninen T. Validation of the Finnish version of the Montreal Cognitive Assessment test. Acta Neuropsychologica. 2018;16(4):353–60.
208. Cheng MF, Guo YL, Yen RF, Chen YC, Ko CL, Tien YW, et al. Clinical Utility of FDG PET/CT in Patients with Autoimmune Pancreatitis: A Case-Control Study. Scientific Reports. 2018;8(1). pmid:29483544
209. Gorman SLR S.; Melnick M. E.; Abrams G. M.; Byl N. N. Development and validation of the function in sitting test in adults with acute stroke. Journal of Neurologic Physical Therapy. 2010;34(3):150–60. pmid:20716989
210. Young GP, Senore C, Mandel JS, Allison JE, Atkin WS, Benamouzig R, et al. Recommendations for a step-wise comparative approach to the evaluation of new screening tests for colorectal cancer. Cancer. 2016;122(6):826–39. pmid:26828588
211. Flahault A, Cadilhac M, Thomas G. Sample size calculation should be performed for design accuracy in diagnostic test studies. Journal of Clinical Epidemiology. 2005;58(8):859–62. pmid:16018921
212. Cheng D, Branscum AJ, Johnson WO. Sample size calculations for ROC studies: parametric robustness and Bayesian nonparametrics. Statistics in Medicine. 2012;31(2):131–42. pmid:22139729
213. Branscum AJ, Johnson WO, Gardner IA. Sample size calculations for studies designed to evaluate diagnostic test accuracy. Journal of Agricultural, Biological, and Environmental Statistics. 2007;12(1):112.
214. Hajian-Tilaki K. Sample size estimation in diagnostic test studies of biomedical informatics. Journal of Biomedical Informatics. 2014;48:193–204. pmid:24582925
215. Dendukuri N, Rahme E, Bélisle P, Joseph L. Bayesian sample size determination for prevalence and diagnostic test studies in the absence of a gold standard test. Biometrics. 2004;60(2):388–97. pmid:15180664
216. Schmidt RL, Factor RE. Understanding sources of bias in diagnostic accuracy studies. Archives of Pathology & Laboratory Medicine. 2013;137(4):558–65.
217. Whiting PF, Rutjes AWS, Westwood ME, Mallett S, QUADAS-2 Steering Group. A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. Journal of Clinical Epidemiology. 2013;66(10):1093–104. pmid:23958378
218. Whiting P, Rutjes AW, Reitsma JB, Glas AS, Bossuyt PM, Kleijnen J. Sources of variation and bias in studies of diagnostic accuracy. A systematic review. Annals of Internal Medicine. 2004;140(3):189–202. pmid:14757617
219. Cohen JF, Korevaar DA, Altman DG, Bruns DE, Gatsonis CA, Hooft L, et al. STARD 2015 guidelines for reporting diagnostic accuracy studies: Explanation and elaboration. BMJ Open. 2016;6(11). pmid:28137831
220. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ. 2015;351. pmid:26511519
221. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. The STARD statement for reporting studies of diagnostic accuracy: Explanation and elaboration. Croatian Medical Journal. 2003;44(5):639–50. pmid:14515429
222. Kostoulas P, Nielsen SS, Branscum AJ, Johnson WO, Dendukuri N, Dhand NK, et al. Reporting guidelines for diagnostic accuracy studies that use Bayesian latent class models (STARD-BLCM). Statistics in Medicine. 2017;36(23):3603–4. pmid:28675923
223. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Croatian Medical Journal. 2003;44(5):635–8. pmid:14515428
224. Jackson R, Ameratunga S, Broad J, Connor J, Lethaby A, Robb G, et al. The GATE frame: critical appraisal with pictures. BMJ Evidence-Based Medicine. 2006;11(2):35–8.
225. Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology. 2003;3(1):25.
226. Nofuentes JAR, Del Castillo JDL. Comparing the likelihood ratios of two binary diagnostic tests in the presence of partial verification. Biometrical Journal. 2005;47(4):442–57. pmid:16161803
227. Nofuentes JAR, Del Castillo JDL. Comparison of the likelihood ratios of two binary diagnostic tests in paired designs. Statistics in Medicine. 2007;26(22):4179–201. pmid:17357992
228. Nofuentes JAR, Del Castillo JDL. EM algorithm for comparing two binary diagnostic tests when not all the patients are verified. Journal of Statistical Computation and Simulation. 2008;78(1):19–35.
229. Nofuentes JAR, Del Castillo JDL, Marzo PF. Computational methods for comparing two binary diagnostic tests in the presence of partial verification of the disease. Computational Statistics. 2009;24(4):695–718.
230. Nofuentes JAR, Del Castillo JDL, Marin Jimenez AE. Comparison of the accuracy of multiple binary tests in the presence of partial disease verification. Journal of Statistical Planning and Inference. 2010;140(9):2504–19.
231. Marin-Jimenez AE, Roldan-Nofuentes JA. Global hypothesis test to compare the likelihood ratios of multiple binary diagnostic tests with ignorable missing data. SORT-Statistics and Operations Research Transactions. 2014;38(2):305–23.
232. Harel O, Zhou XH. Multiple imputation for the comparison of two screening tests in two-phase Alzheimer studies. Statistics in Medicine. 2007;26(11):2370–88. pmid:17054089
233. Zhou XH, Castelluccio P. Nonparametric analysis for the ROC areas of two diagnostic tests in the presence of nonignorable verification bias. Journal of Statistical Planning and Inference. 2003;115(1):193–213.
234. Wang C, Turnbull BW, Nielsen SS, Grohn YT. Bayesian analysis of longitudinal Johne's disease diagnostic data without a gold standard test. Journal of Dairy Science. 2011;94(5):2320–8. pmid:21524521
235. Masaebi F, Zayeri F, Nasiri M, Azizmohammadlooha M. Contrastive analysis of diagnostic tests evaluation without gold standard: Review article. Tehran University Medical Journal. 2019;76(11):708–14.
236. Beeley C. Web Application Development with R Using Shiny. Packt Publishing Ltd; 2013.
237. Lim C, Wannapinij P, White L, Day NP, Cooper BS, Peacock SJ, et al. Using a web-based application to define the accuracy of diagnostic tests when the gold standard is imperfect. PLoS ONE. 2013;8(11):e79489. pmid:24265775