
Normal reference values for aerobic fitness in cystic fibrosis: a scoping review
  1. Owen W Tomlinson1,2,3,
  2. Curtis A Wadey1,
  3. Craig A Williams1,2
  1. 1Children’s Health and Exercise Research Centre, Department of Public Health and Sport Sciences, Faculty of Health and Life Sciences, University of Exeter, Exeter, UK
  2. 2Academic Department of Respiratory Medicine, Royal Devon University Healthcare NHS Foundation Trust, Exeter, UK
  3. 3Department of Clinical and Biomedical Sciences, Faculty of Health and Life Sciences, University of Exeter, Exeter, UK
  1. Correspondence to Dr Owen W Tomlinson; O.W.Tomlinson{at}


Objective The importance of aerobic fitness (VO2peak) in cystic fibrosis (CF) is well established, and regular exercise testing is recommended. To standardise VO2peak, a ‘percentage of predicted’ (%pred) derived from normative reference values (NRV), as promoted by the 2015 European Cystic Fibrosis Society Exercise Working Group (ECFS EWG), can be reported. However, the NRVs used in CF and their relative frequency are unknown.

Method A scoping review was performed via systematic database searches (PubMed, Embase, Web of Science, SciELO, EBSCO) and forward citation searches for studies that include people with CF and report VO2peak as %pred. Studies were screened using Covidence, and data related to patient demographics, testing modality and reference equations were extracted. Additional analyses were performed on studies published in 2016–2021, following the ECFS EWG statement in 2015.

Results A total of 170 studies were identified, dating from 1984 to 2022, representing 6831 patients with CF, citing 34 NRV. Most studies (154/170) used cycle ergometry, 15/170 used treadmills, and the remainder used alternative, combination or undeclared modalities. In total, 61/170 failed to declare the NRV used. There were 61 studies published since the ECFS EWG statement, whereby 18/61 used the suggested NRV.

Conclusion There is a wide discrepancy in NRV used in the CF literature base to describe VO2peak as %pred, with few studies using NRV from the ECFS EWG statement. This high variance compromises the interpretation and comparison of studies while leaving them susceptible to misinterpretation and limiting replication. Standardisation and alignment of reporting of VO2peak values are urgently needed.

  • Aerobic fitness
  • Review
  • Respiratory

Data availability statement

All data relevant to the study are included in the article or uploaded as online supplemental information. Not applicable.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • Aerobic fitness is a valuable outcome in people with cystic fibrosis and can be presented as a ‘per cent of predicted’ against a normative reference value to aid clinical decision making.

  • However, the normative reference values used in cystic fibrosis and how often they are used are unknown.


  • This review shows a wide variation in the number, and frequency, of normative values, used to describe aerobic fitness as a ‘per cent of predicted’ in cystic fibrosis.

  • Approximately one-third of studies fail to state which normative values they used, which has notable consequences on the interpretation of data.


It has been well established that aerobic fitness (as represented by peak oxygen uptake, VO2peak) is an important biomarker in people with cystic fibrosis (CF). A higher level of aerobic fitness is associated with a reduced risk of early mortality or transplant,1 reduced risk of being hospitalised,2 and enhanced quality of life.3 As such, regular exercise testing is recommended4–6 for people with CF to monitor changes and guide exercise training interventions to improve fitness.

Cardiopulmonary exercise testing (CPET) is noted as the gold-standard procedure for assessing fitness and establishing VO2peak (and where possible, maximal oxygen uptake, VO2max7) and is typically performed using cycle or treadmill ergometry.5 Moreover, there are multiple ways to report VO2peak data, which are typically displayed either: (A) in absolute units (mL/min), although this does not account for body size and therefore smaller individuals can be unfairly penalised or (B) relative to body mass (mL/kg/min), although these reports can be biased by body composition, that is, those individuals with larger fat mass can be unfairly penalised and misclassified as having low fitness. There are several further assumptions and errors in using these approaches,8 and therefore precautions should be taken prior to their use in reports.

Consequently, presenting data as a ‘per cent of predicted’ (%pred)—reported relative to an expected value for a certain age, sex, height and weight—can be used to present data in an intuitive way that can be easily understood by clinicians and patients alike. Using %pred in CF is commonplace for scoring values derived from spirometry, such as forced expiratory volume in one second and forced vital capacity. To facilitate this, normal reference values (NRV) are available for lung function,9 and are used routinely in registry reports.10 The available lung function NRVs are multi-ethnic, derived from ~100 000 patient records in over 30 countries, and are collaboratively developed by multiple international organisations, leading to widespread acceptance as the gold-standard NRV for spirometry.9 However, unlike spirometry, there is no universal agreement on the most appropriate NRV to use for CPET, and interpretation of VO2peak.
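As a minimal sketch of the %pred calculation described above, the following Python snippet expresses a measured value as a percentage of a predicted value. The function name and both predicted values are hypothetical and are not taken from any NRV cited in this review; they only illustrate how the choice of reference changes the reported %pred for the same measurement.

```python
def percent_predicted(measured: float, predicted: float) -> float:
    """Express a measured value as a percentage of its predicted (reference) value."""
    if predicted <= 0:
        raise ValueError("predicted value must be positive")
    return 100.0 * measured / predicted


# One measured VO2peak and two HYPOTHETICAL predicted values, standing in for
# two different reference equations (illustrative numbers only).
measured_vo2peak = 2.0  # L/min
predicted_a = 2.5       # L/min, hypothetical reference A
predicted_b = 2.0       # L/min, hypothetical reference B

pct_a = percent_predicted(measured_vo2peak, predicted_a)
pct_b = percent_predicted(measured_vo2peak, predicted_b)
print(f"Reference A: {pct_a:.0f}% predicted")
print(f"Reference B: {pct_b:.0f}% predicted")
```

Under these assumed inputs, the same measurement appears as either reduced or normal fitness depending solely on the reference chosen.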

Recent literature reviews have identified a high volume of NRV available,11 12 with 29 sets of NRV dedicated to CPET parameters from 2014 to 2019 alone.12 These NRV are not wholly focused on VO2peak, and also include reference to work rate, peak heart rate, oxygen pulse and ventilation, among others.11 12 This heterogeneity of NRV presents a dilemma for clinicians as it is not clear which is the ‘correct’ NRV (and parameter) to use. To facilitate this choice, the European Cystic Fibrosis Society Exercise Working Group (ECFS EWG) has published a statement on exercise testing in CF,5 detailing protocols and strategies for implementing and interpreting CPET data, including VO2peak. As part of this statement, several sets of NRV have been recommended for use, dependent on exercise modality (table 1). However, since the publication of this statement, it is unclear to what extent these have been adopted for use; and to what extent NRV are generally used in the CF literature base. Recent survey work of CF clinics in the UK has established a wide variation in NRV used for interpreting CPET,13 suggesting that this variation in available literature may translate to variable implementation in clinical practice.

Table 1

Normative reference values recommended for use by European Cystic Fibrosis Society Exercise Working Group (ECFS EWG)

Therefore, the purpose of this scoping review was to establish which NRV are used to report VO2peak as %pred in the CF literature and identify how many studies since the publication of the ECFS EWG statement used the recommended NRV.


Search strategy

A multifaceted search strategy, guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist,14 was used, with three components:

  1. A search using the terms [(cystic fibrosis) AND (vo2* OR vo2max OR vo2peak)] was employed in the PubMed, Embase (Ovid MEDLINE, APA PsycInfo, Embase, HMIC Health Management Information Consortium, Social Policy and Practice, Global Health, CAB Abstracts, APA PsycExtra), Web of Science (Science Citation Index Expanded [SCI-EXPANDED], Emerging Sources Citation Index [ESCI], Conference Proceedings Citation Index- Science [CPCI-S], Social Sciences Citation Index [SSCI]), SciELO, and EBSCO (The Allied and Complementary Medicine Database [AMED], Child Development & Adolescent Studies, CINAHL Complete, MEDLINE, SPORTDiscus) databases, from inception to December 2021. Articles were then screened using freely available specialist software (Covidence, Veritas Health Innovation, Melbourne, Australia).

  2. Forward citation searches from two key papers in the CF and exercise literature. First, the ECFS EWG Statement from Hebestreit et al,5 the only CF-specific exercise testing document to date, advocates for the aforementioned equations to report normative data. Second, the landmark study of Nixon et al15 was the first to establish the association between VO2peak as a per cent of predicted and mortality, and has thus become a cornerstone study in the field with hundreds of citations. Forward citations were obtained from Web of Science, from respective publication dates to December 2021, filtered to only include ‘article’ and ‘early view’ studies.

  3. A manual search of PubMed, using the term [(cystic fibrosis) AND (exercise)], from inception to December 2021.

All searches and screening were undertaken by a single author (OT). To increase the speed of conducting the scoping review, double-screening was not performed.

Inclusion/exclusion criteria

Articles were included if they satisfied the following: (1) original investigation, (2) partial or complete inclusion of people with CF, (3) inclusion of VO2max or VO2peak data as a directly measured outcome, and (4) VO2max or VO2peak presented as %pred.

Studies were excluded if they: (1) were not an original investigation (eg, review, protocol paper, conference abstract), (2) did not include people with CF, (3) did not include VO2max or VO2peak data (ie, only submaximal data), or (4) did not present VO2max or VO2peak as a percentage of predicted (ie, only L/min, mL/kg/min). No exclusions were made based on language.

Data extraction

Once studies were screened, identified and selected, full texts were retrieved, and the following data related to the study extracted: study title and year of publication; participant sample (sample size, age, sex and the number of people with CF if part of a larger cohort); testing modality used for determination of VO2peak; NRV cited and year of publication. In studies that cited a further study for methodology (eg, ‘this test was conducted as previously described by (author)’), the original reference was traced and examined to determine the exact NRV used.

A list of the cited NRV studies was also compiled, with individual equations extracted from each study, alongside the derived population (sample size, age, sex) and the testing modality used to derive VO2peak.

Quality assessment/risk of bias

This scoping review aimed to obtain descriptive data on NRV equations used within the literature base. Therefore, a formal risk of bias (RoB) assessment was not applicable, and no suitable tool was available. However, a customised RoB approach was designed, verifying whether a study citing an NRV equation was doing so correctly.

This verification process included examining categories of sex, age, modality and date (two categories). Within this process, each category could be rated ‘yes’, ‘unknown’ or ‘no’ and awarded +1, 0, or −1 points, respectively (ie, a study correctly matching all five categories would be awarded five points); akin to ‘low’, ‘moderate’ and ‘high’ RoB seen in traditional scoring models. A full explanation and examples of RoB are provided in online supplemental file 3.
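The scoring scheme described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' actual tool: the category labels (in particular `date_within_5_years` and `date_within_10_years` for the two date categories) and the function name are assumptions.

```python
from typing import Mapping

# Five categories; the two date labels are assumed names for the
# "date (two categories)" item described in the text.
CATEGORIES = (
    "sex",
    "age",
    "modality",
    "date_within_5_years",
    "date_within_10_years",
)

# Points awarded per rating, as described in the text.
POINTS = {"yes": 1, "unknown": 0, "no": -1}


def rob_score(ratings: Mapping[str, str]) -> int:
    """Sum the points across all five categories for one study/NRV pairing."""
    missing = [c for c in CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(POINTS[ratings[c]] for c in CATEGORIES)


# A study matching its cited NRV on every category scores +5.
all_yes = {c: "yes" for c in CATEGORIES}
# Assumption: a study that does not state its NRV is 'unknown' throughout.
all_unknown = {c: "unknown" for c in CATEGORIES}

print(rob_score(all_yes))      # 5
print(rob_score(all_unknown))  # 0
```

The possible range under this scheme is −5 to +5, with higher totals indicating a better match between a study and its cited NRV.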


A quasi-random sample of 10% of studies, identified using an online pseudo-randomisation programme (CalculatorSoup), was independently verified by a second author (CAWa). If any disputes arose, a third coauthor (CAWi) was consulted to resolve conflicts.

Statistical analysis

Analyses comprised absolute frequencies and percentages. Separate and combined analyses related to RoB were undertaken for studies that cited NRV and those that did not. Additional frequencies and percentages were calculated to identify which NRV recommended by the ECFS EWG are used within the CF literature.


Included studies and study characteristics

Following searches and screening, a total of n=170 eligible studies were identified, with a PRISMA flow diagram14 provided in figure 1. A full list of studies, with individual characteristics, including sample, exercise modality and NRV used, is provided in online supplemental file 1.


Figure 1

PRISMA flow chart detailing identification and inclusion of studies in scoping review. CF, cystic fibrosis; NR, narrative review; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; SR, systematic review; VO2, oxygen uptake; VO2max, maximal oxygen uptake; VO2peak, peak oxygen uptake.

The n=170 studies spanned from 1984 to 2022, covering a total sample of n=6831 people with CF (n=3555 males, n=2711 females, remainder unspecified). Of these studies, n=109 (64%) were published from 1984 to 2015, and n=61 (36%) were published from 2016 to 2022 (postpublication of the ECFS EWG statement). With regard to exercise modality, n=154 used cycle ergometry, n=15 used treadmill ergometry, n=2 were of unknown modality and n=1 for each of 10 m shuttle walk, arm ergometry, and quadriceps exercise, with n=4 studies using more than one modality.

Normal reference values

Of the n=170 studies, 61 (36%) provided no details on the NRV used to present VO2peak data as a percentage of predicted, leaving n=109 studies (64%) to explicitly state which NRV were used. Within these studies, n=34 sets of NRV were used, dating from 1971 to 2019. The mean difference in time between a study and its cited NRV was 18±11 years (median=17 years, range=1–48 years).

Of the n=34 NRV cited, n=18 (53%) were only cited once. Moreover, the NRV recommended by the ECFS EWG5 were cited a total of n=32 times (18% of 179 uses of NRV). Of these n=32 citations, n=18 (56%) were made after the statement’s publication. This n=18 also represented 30% of the n=61 studies published since the ECFS EWG statement. None of the NRV recommended for treadmill testing by the ECFS EWG were cited. The n=16 NRV cited more than once are provided in table 2.
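The percentages reported in this section follow directly from the stated counts. As a simple arithmetic check, they can be reproduced with the short Python sketch below; the helper name `pct` is an assumption, while all counts are those given in the text.

```python
def pct(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# (count, total) pairs taken from the counts reported in the text.
reported = {
    "studies not stating the NRV used": (61, 170),
    "studies stating the NRV used": (109, 170),
    "NRV cited only once": (18, 34),
    "citations of ECFS EWG-recommended NRV among all NRV uses": (32, 179),
    "ECFS EWG citations made after the statement": (18, 32),
    "post-statement studies citing ECFS EWG-recommended NRV": (18, 61),
}

for label, (count, total) in reported.items():
    print(f"{label}: {count}/{total} = {pct(count, total)}%")
```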

Table 2

Normal reference values identified by scoping review and count of frequency of use


Additional analyses for RoB were also performed based on this split of inclusion versus non-inclusion of NRV.

Quality assessment/RoB

From the total of n=170 studies identified, n=179 RoB analyses were performed because n=8 studies used more than one set of NRV. When considering all studies (n=170 studies, n=179 RoB analyses), 50% of studies used NRV that were of an appropriate derivation population for sex, 13% for age, and 18% for modality. Only 8% of studies used an NRV from within the prior 5 years, and 18% from within the prior 10 years. When only considering studies that stipulated the NRV used (n=109 studies, n=118 RoB analyses), 76% used NRV that were of an appropriate derivation population for sex, 20% for age, and 27% for modality. Only 13% used an NRV from the prior 5 years, and 28% from the prior 10 years. A full breakdown for each RoB category is provided in figure 2.

Figure 2

Risk of bias (RoB) assessment for included studies, presented as absolute counts and as percentages. (A) RoB for all studies and analyses, presented as absolute numbers; (B) RoB for all studies and analyses, presented as a percentage; (C) RoB for all studies and analyses to explicitly state NRV used (ie, excluding those who do not state NRV), presented as whole numbers; (D) RoB for all studies and analyses to explicitly state NRV used, presented as a percentage. Red=wrong details/high RoB; yellow=unclear details/moderate RoB; Green=correct details/low RoB. NRV, normal reference value.

Scores ranged from −3 to +5, with 0 being the most common score (n=79) due to the high number of studies that did not report the NRV used. Otherwise, the most prevalent scores were −1 (n=42) and +1 (n=23). Only n=3 studies were awarded a score of +5 points, matching their cited NRV for age, sex, modality and time frame (≤5 and ≤10 years). A schematic detailing RoB scores and their prevalence is displayed in figure 3. The full RoB analyses are provided in online supplemental file 3.

Figure 3

Number of studies with each risk of bias (RoB) score. Figure details the possible combination of RoB scores (and number of total n=179 analyses with each score). Figure does not state explicit categories themselves (eg, sex, age), but the distribution of possible scores (Y/?/N). This is because equivalent scores can be obtained via multiple categories and methods (eg, a score of +3 can be obtained by four +1 scores and a −1 score, but also via three +1 scores and two 0 scores, all regardless of explicit category).
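To illustrate why the same total can arise from different rating patterns, the Python sketch below enumerates every way that five categories, each rated yes (+1), unknown (0) or no (−1), can sum to a given score. The function name is an assumption introduced for illustration.

```python
from typing import List, Tuple


def score_combinations(total: int, n_categories: int = 5) -> List[Tuple[int, int, int]]:
    """List (n_yes, n_unknown, n_no) counts whose points sum to `total`."""
    combos = []
    for n_yes in range(n_categories + 1):
        for n_no in range(n_categories - n_yes + 1):
            n_unknown = n_categories - n_yes - n_no
            # yes = +1, unknown = 0, no = -1
            if n_yes - n_no == total:
                combos.append((n_yes, n_unknown, n_no))
    return combos


# A score of +3 arises from three 'yes' and two 'unknown' ratings,
# or from four 'yes' and one 'no' rating.
print(score_combinations(3))  # [(3, 2, 0), (4, 0, 1)]
# A perfect score of +5 has only one route.
print(score_combinations(5))  # [(5, 0, 0)]
```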


For the first time, this scoping review has characterised the reference values and equations used to express VO2peak as a ‘per cent of predicted’ in people with CF. Given the inherent value of VO2peak in the clinical management of this disease, the main findings, that approximately one-third of studies do not report the NRV used and that a wide range of NRV are used (34 in total), are a cause for concern.

Reporting of values

The lack of reporting in this one-third of studies is concerning, as this under-reporting introduces bias and can misrepresent the data.16 If, for example, a study consists of adult participants but uses an NRV designed for a paediatric population, this will likely result in inflation of results (ie, scoring better than anticipated) and thus can inadvertently manipulate the data. Without reporting of the cited NRV, there can be no assurance that such practices do not occur. Moreover, the unavailability of methodological details has been noted as a contributory factor to the current replication crisis facing the wider scientific community,17 and this scoping review found that the CF literature base is not immune to this problem.

Conversely, approximately two-thirds of studies (64%) did indeed provide data on the NRV used to describe VO2peak as a per cent of predicted, although NRV suggested by the ECFS EWG were cited only 32 times. While this large proportion declaring the NRV could initially be considered an encouraging statistic, there is a notable range in the NRV used: 34 distinct sets of values are used, and most are used only once. A lack of agreement on which NRV to use is reflected in recent survey work,13 whereby CF clinics in the UK present with wide variation in NRV used, and a lack of understanding of what constitutes the best set of values to use. There is equally a wide level of variation seen in the NRV recommended for use by leading medical organisations, and their documentation for how to perform CPET in a clinical scenario.5 18–20

Quality of reporting

In addition to the wide range of NRV used, very few studies are using NRV that are recent and have an appropriately matched derivation population (age, sex and modality), as reflected by only three studies in this review scoring a perfect five points for RoB (figure 3). This finding does not mean that other studies are deficient in their respective study designs, as many are high-quality randomised controlled trials and cohort studies that are well designed and executed, nor that they are deliberately using inappropriate NRV. It does mean, however, that studies are citing NRV that are deficient in their own reporting, and that the literature base itself is limited by the number of NRV that robustly report how data are generated. For example, one NRV recommended by the ECFS EWG is that of Orenstein,21 which was cited 20 times by studies in this review. However, on inspection of this work, no information is available on the characteristics of the derivation population. Studies that cite this work therefore cannot be assured that they are using an appropriate NRV for their own population, and were consequently awarded few points for RoB in this scoping review.

There is notable heterogeneity in how NRV are derived, as shown by the equations in online supplemental file 2, whereby some studies solely use age to derive an NRV,22–24 whereas some will incorporate further variables such as height and weight,25 26 and further studies will use exercise-derived factors such as heart rate or time to exhaustion.27 28 This variance in how NRV are established can have a notable impact on NRV selection, particularly if studies are not collating certain types of data, or NRV are not suitable for the population in question.

Implications for clinical practice

This discrepancy in the NRV used in clinical situations can have genuine adverse clinical impacts, as highlighted in a recent case report from Waterfall et al,29 whereby a patient underwent exercise testing at two different hospitals (which used two different sets of NRV) with a delay in medical treatment occurring as a result. In addition, use of multiple NRV can result in alternative interpretations of the same data, as shown by a paper within this review30 that used two sets of NRV to reveal one statistically significant, and one non-significant, result for VO2peak as per cent predicted, despite the underlying raw data being the same. Such manipulation of data is poor practice and has partially occurred by virtue of the number of NRV available. This case therefore indicates the drastic clinical consequences that can occur due to the lack of standardisation and use of differing NRV.

It should also be noted that this lack of consistency in reporting is not limited to VO2peak, and therefore, other variables for which NRV exist, such as work rate, heart rate, oxygen pulse and ventilation,11 12 are all equally likely to be affected by the poor and inappropriate reporting shown in the current work. Moreover, this is not a phenomenon wholly related to clinical groups. For example, interpretation of exercise responses in children can vary depending on the choice of heart rate thresholds,31 which can affect whether a response is determined to be a true VO2max or a potentially submaximal response. For children with clinical considerations, this can have even further negative impacts.

To counter the negative findings within this scoping review—a lack of reporting, and wide variation in data to be reported—the wider exercise and clinical physiology community must take action. Several large NRV studies and databases exist,32–35 and therefore, pooling of data has been advocated for by leading organisations,18 36 to create a singular and comprehensive set of normative values. Therefore, a Task Force has been established by the European Respiratory Society (ERS; TF-2021–09),37 in collaboration with the Global Lung Initiative (GLI), to create such a database for a range of CPET values, including VO2peak. The GLI has previously created reference values for spirometry9 and enhanced interpretation of lung function in CF38 and therefore it is anticipated that a similar, positive, outcome may be found with this new ERS Task Force.

In the interim, it is not clear which is the most appropriate method by which to present VO2peak, not just for people with CF, but for all populations. As previously mentioned, use of absolute values (L/min) or values normalised to body mass (mL/kg/min) can be biased. Therefore, use of allometric scaling (which removes residual effects of body size) may be a viable option, although several scaling exponents are available,39 and these are specific to the measured population and have limited transferability. Therefore, until a solution is found, the authors recommend that clinical and research staff who use CPET be as open as possible when reporting VO2peak to avoid misinterpretation. This includes simultaneously providing data (A) as absolute values, (B) scaled relative to body mass, (C) allometrically scaled for the specific population, and (D) as %pred, but only if an explicit equation is provided, and not just a reference, as the data shown in online supplemental file 2 indicate that a single reference can provide multiple equations, further compounding interpretation of data.
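The four reporting options (A) to (D) above can be sketched together in Python as follows. This is an illustration under stated assumptions, not a recommended implementation: the class and function names are hypothetical, the predicted value must come from an explicitly stated equation chosen by the user, and the scaling exponent has no default because, as noted above, exponents are population-specific. The example inputs, including the exponent of 0.67, are arbitrary illustrative values.

```python
from dataclasses import dataclass


@dataclass
class Vo2peakReport:
    absolute_l_min: float          # (A) absolute value, L/min
    relative_ml_kg_min: float      # (B) ratio-scaled to body mass, mL/kg/min
    allometric_ml_kgb_min: float   # (C) allometrically scaled, mL/kg^b/min
    percent_predicted: float       # (D) per cent of predicted


def report_vo2peak(
    vo2_l_min: float,
    body_mass_kg: float,
    predicted_l_min: float,
    exponent: float,
) -> Vo2peakReport:
    """Report one VO2peak measurement in all four formats at once."""
    if body_mass_kg <= 0 or predicted_l_min <= 0:
        raise ValueError("body mass and predicted value must be positive")
    vo2_ml_min = vo2_l_min * 1000.0
    return Vo2peakReport(
        absolute_l_min=vo2_l_min,
        relative_ml_kg_min=vo2_ml_min / body_mass_kg,
        allometric_ml_kgb_min=vo2_ml_min / body_mass_kg ** exponent,
        percent_predicted=100.0 * vo2_l_min / predicted_l_min,
    )


# Arbitrary illustrative inputs (not patient data, not a recommended exponent).
example = report_vo2peak(
    vo2_l_min=2.0, body_mass_kg=50.0, predicted_l_min=2.5, exponent=0.67
)
print(example)
```

Reporting all four values side by side, along with the equation behind the predicted value, lets a reader recompute %pred against any other NRV.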

Strengths and limitations

There are several strengths and weaknesses to this scoping review to acknowledge. First, the wide remit for inclusion (ie, CF, and VO2peak as %pred) has led to a notably large number of studies being included, thus enhancing the confidence in the findings. Moreover, referencing the existing ECFS EWG as a source of existing NRV has ensured that this review maintains a high level of clinical relevance. In contrast, as no standardised method for RoB is available for such a scoping review, a customised approach was designed, which will inevitably be open to scrutiny. However, as clinical guidelines recommend that NRV used in studies should match population characteristics and CPET protocols,20 the RoB approach used in this review is deemed an ecologically suitable approach and warrants replication in further clinical groups.


In summary, this scoping review has identified wide discrepancies in how VO2peak is reported as a ‘per cent of predicted’ within the CF literature base. A singular, comprehensive, dataset is required by the wider medical and exercise physiology communities, and it is anticipated that ongoing projects using enhanced reporting and collaborative integration of existing databases will address this gap in the near future.


Ethics statements

Patient consent for publication


The authors acknowledge Neil Armstrong, Bart Bongers, Christiaan Saris, Wim Saris, and the University of Exeter Library Services for valued assistance in acquiring and sharing literature to ensure the completion of this review.




  • Contributors OT: conceptualisation, data searching, data extraction, data analysis, interpretation, writing of manuscript, review of manuscript, approval of final manuscript, guarantor of manuscript. CAWa: data extraction, interpretation, review of manuscript, approval of final manuscript. CAWi: conceptualisation, interpretation, review of manuscript, approval of final manuscript.

  • Funding CAWa is funded by an industrial PhD scholarship by Canon Medical Systems UK Ltd. and the University of Exeter. There is no further funding to report.

  • Competing interests OT and CAWa are members of the European Cystic Fibrosis Society Exercise Working Group. This work is not written on behalf of, nor endorsed by, this group.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.