Submaximal exercise testing in the assessment of interstitial lung disease secondary to systemic sclerosis: reproducibility and correlations of the 6-min walk test
M H Buch (1), C P Denton (2), D E Furst (3), L Guillevin (4), L J Rubin (5), A U Wells (6), M Matucci-Cerinic (7), G Riemekasten (8), P Emery (9), H Chadha-Boreham (10), P Charef (10), S Roux (10), C M Black (2), J R Seibold (1)

  1. University of Michigan Scleroderma Program, University of Michigan Health System, Ann Arbor, Michigan, USA
  2. Centre for Rheumatology, Royal Free Campus, Royal Free and University College Medical School, London, UK
  3. Division of Rheumatology, UCLA School of Medicine, University of California, Los Angeles, California, USA
  4. Department of Internal Medicine, Hopital Cochin, Assistance Publique-Hopitaux de Paris, Universite Rene Descartes, Paris, France
  5. Division of Pulmonary and Critical Care Medicine, Department of Medicine, University of California, San Diego, La Jolla, California, USA
  6. Interstitial Lung Disease Unit, Royal Brompton Hospital, Fulham, London, UK
  7. Division of Rheumatology, Department of Medicine, University of Florence, Florence, Italy
  8. Department of Rheumatology and Clinical Immunology, Charite University Hospital, Berlin, Germany
  9. Academic Unit of Musculoskeletal Disease, University of Leeds, Leeds, UK
  10. Actelion, Allschwil, Switzerland

Correspondence to:
    Professor J R Seibold
    University of Michigan Scleroderma Program, 3918 Taubman Centre, Box 0358, 1500 E Medical Center Drive, Ann Arbor, MI 48109-0358, USA; jseibold{at}


Background: The 6-min walk test (6MWT) is increasingly used as an outcome measure in interstitial lung disease (ILD).

Aim: To evaluate the usefulness of the 6MWT in a cohort of patients with ILD secondary to systemic sclerosis (SSc) and to correlate with established physiological parameters.

Methods: 163 patients with SSc-ILD were recruited for a multicentre, randomised, double-blind clinical trial. Available data at protocol screening included repeated 6MWTs, pulmonary function testing with diffusing capacity, Doppler echocardiography and high-resolution computed tomography of the thorax. Borg Dyspnoea Index was evaluated before and after 6MWT.

Results: Mean (standard deviation (SD)) distance walked during walk test 1 was 396.6 (84.55) m compared with 399.5 (86.28) m at walk test 2. The within-subject, intertest correlation as determined by Pearson’s correlation coefficient testing was 0.95 (p<0.001). However, only weak correlations of 6MWT with percentage forced vital capacity and the Borg Dyspnoea Index were observed, and no correlation was observed with percentage diffusing capacity.

Conclusion: These data confirm the high reproducibility of the 6MWT in patients with SSc-ILD and therefore the validity of the test in this cohort. The lack of correlation of 6MWT with standard physiological parameters of ILD suggests a multifactorial basis for limited exercise capacity in patients with SSc and calls into question the utility of the 6MWT as a measure of outcome in future studies on SSc-ILD.

  • 6MWT, six-min walk test
  • DLCO, single-breath diffusing capacity
  • FVC, forced vital capacity
  • HRCT, high-resolution computed tomography
  • IIP, idiopathic interstitial pneumonia
  • ILD, interstitial lung disease
  • OMERACT, outcome measures in rheumatological clinical trials
  • PAH, pulmonary arterial hypertension
  • SSc, systemic sclerosis

Pulmonary involvement in systemic sclerosis (SSc) has emerged as the leading cause of disease-related morbidity and mortality.1,2 It is characterised by pulmonary arteriolar fibrosis leading to pulmonary arterial hypertension (PAH), by interstitial lung disease (ILD) that usually manifests as non-specific interstitial pneumonia, or by an admixture of both processes. Moderate to severe restrictive lung disease has been identified in up to 40% of patients.3 Early reductions in forced vital capacity (FVC) represent an important risk factor for the development of end-stage ILD, with the greatest rate of deterioration observed in the first 4 years.3,4

The unfavourable clinical outcome associated with pulmonary complications of SSc has fuelled an intense search for new therapeutic strategies and an emphasis on the early detection of pulmonary pathology. This in turn has stimulated investigations of the utility of outcome measures appropriate for short-term and moderate-term clinical trials. Presently, pulmonary function testing and submaximal exercise testing in the form of a 6-min walk test (6MWT) are used to provide information on the severity of impairment of lung physiology and functional capacity respectively.

Despite its increasing use in studies on fibrotic idiopathic interstitial pneumonia (fibrotic IIP), the 6MWT has not been validated in SSc-ILD. A recent study was designed to evaluate bosentan, a dual endothelin-receptor antagonist, in patients with SSc-ILD. This report uses the 6MWT data from the screening phase of this study to determine the reproducibility of the 6MWT, to correlate results with pulmonary function indices and to examine its characteristics as a measure of outcome.


Methods

Study population

This study was conducted with approval from the institutional review board, and was a multicentre, double-blind, randomised, placebo-controlled, 12-month study comparing bosentan with placebo as treatment for SSc-ILD. A total of 163 patients were recruited from 29 centres in 10 countries and completed the screening phase of the study.

The eligibility criteria included (1) age ⩾18 years; (2) SSc as defined by the American College of Rheumatology classification criteria;5 and (3) diagnosis of limited or diffuse SSc as defined by the extent of skin involvement.

Pulmonary involvement required (1) severe ILD as evidenced on high-resolution computed tomography (HRCT; reticular changes with or without ground-glass appearances extending to venous confluence); (2) single-breath diffusing capacity (DLCO) <80% predicted; and (3) dyspnoea-limited 6MWT distance of ⩾150 m and <500 m, or ⩾500 m with marked exercise desaturation.

Characteristics of SSc included disease duration <3 years (from first non-Raynaud’s symptom) and dyspnoea on exertion, or SSc duration ⩾3 years (from first non-Raynaud’s symptom) and two of the following four criteria within the past 12 months: (1) worsening of FVC (⩾7%) or worsening of DLCO (⩾10%); (2) new area(s) of ILD on HRCT scan; (3) worsening of dyspnoea; or (4) alveolitis on bronchoalveolar lavage (does not apply to current and former smokers)—that is, differential count in neutrophils ⩾5% or in eosinophils ⩾4%.

These various criteria were believed to select a population with “active” SSc-ILD.

Patients with ILD due to conditions other than SSc were excluded. Severe comorbidities that would limit life expectancy (<1 year), severe restrictive lung disease (FVC <40% predicted (or <1.2 l) or DLCO <30% predicted) and severe obstructive lung disease (forced expiratory volume in 1 s/FVC <0.65) were also exclusion criteria, as was oxygen saturation on pulse oximetry (SpO2) <84% on room air and at rest. Patients with severe PAH (tricuspid regurgitation peak velocity ⩾3.2 m/s (pulmonary arterial systolic pressure ⩾50 mm Hg)), confirmed by Doppler echocardiography at rest, were also excluded. Other exclusion factors were severe heart failure (New York Heart Association Class III/IV), left ventricular ejection fraction <25%, treatment with oral corticosteroids (>10 mg/day prednisone or equivalent) and recent use of potentially confounding treatments.
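The numeric exclusion thresholds in the paragraph above can be written as a simple screening check. The sketch below is illustrative only: `ScreeningValues` and `exclusion_reasons` are hypothetical names rather than part of the study protocol, the tricuspid regurgitation velocity threshold is assumed to be in m/s, and the non-numeric criteria (comorbidity, heart failure class, concomitant treatments) are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScreeningValues:
    """Hypothetical container for one patient's screening measurements."""
    fvc_pct_predicted: float
    fvc_litres: float
    dlco_pct_predicted: float
    fev1_fvc_ratio: float
    resting_spo2_pct: float
    tr_peak_velocity_m_s: float  # assumed unit: metres per second
    lvef_pct: float


def exclusion_reasons(v: ScreeningValues) -> List[str]:
    """Return the numeric exclusion criteria that are met (empty list if none)."""
    reasons = []
    if v.fvc_pct_predicted < 40 or v.fvc_litres < 1.2 or v.dlco_pct_predicted < 30:
        reasons.append("severe restrictive lung disease")
    if v.fev1_fvc_ratio < 0.65:
        reasons.append("severe obstructive lung disease")
    if v.resting_spo2_pct < 84:
        reasons.append("resting SpO2 below 84% on room air")
    if v.tr_peak_velocity_m_s >= 3.2:
        reasons.append("severe PAH on Doppler echocardiography")
    if v.lvef_pct < 25:
        reasons.append("left ventricular ejection fraction below 25%")
    return reasons
```

A patient is a candidate on these criteria only when the returned list is empty; the inclusion criteria described earlier would still need to be checked separately.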

6MWT (with immediate evaluation of the Borg Dyspnoea Index), pulmonary function testing and HRCT of the thorax (unless performed within the preceding 3 months) were performed at screening.

6MWTs 1 and 2 were performed at screening and baseline (at a minimum 2-h and a maximum 4-week interval). The mean of the two 6MWT measurements was taken as the baseline measurement. The 6MWT was required to be limited by dyspnoea, as judged by the administering technician. To ensure consistency of testing, 6MWT 2 was required to be within 15% of 6MWT 1. In the event of >15% variability, a third test (6MWT 3) was required and had to be within 15% of 6MWT 2; otherwise the patient was excluded.
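The consistency rule described above can be sketched as a small function. This is an illustration under stated assumptions, not the trial's actual procedure: the text does not say which test is the denominator for the 15% comparison, so the earlier test is used here, and when a third test is needed the baseline is taken as the mean of tests 2 and 3. The helper names are hypothetical.

```python
from typing import Optional, Sequence


def within_tolerance(reference_m: float, repeat_m: float, tolerance: float = 0.15) -> bool:
    """True if the repeat distance differs from the reference by at most `tolerance`.

    Assumption: the difference is expressed relative to the earlier (reference) test.
    """
    return abs(repeat_m - reference_m) <= tolerance * reference_m


def baseline_distance(walks_m: Sequence[float]) -> Optional[float]:
    """Return the baseline 6MWT distance in metres, or None if the patient is excluded.

    `walks_m` holds two or three walk distances in the order they were performed.
    """
    if len(walks_m) < 2:
        raise ValueError("need at least two walk tests")
    first, second = walks_m[0], walks_m[1]
    if within_tolerance(first, second):
        return (first + second) / 2
    # More than 15% variability: a third test must agree with the second.
    if len(walks_m) >= 3 and within_tolerance(second, walks_m[2]):
        return (second + walks_m[2]) / 2
    return None
```

For example, `baseline_distance([400, 420])` gives 410.0, whereas `baseline_distance([400, 500, 600])` gives `None` because neither consecutive pair agrees within 15%.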

Pulmonary function test

Pulmonary function indices at screening included FVC and single-breath DLCO according to American Thoracic Society standards.6,7

Six-min walk test

The 6MWT was performed as per American Thoracic Society guidelines8 and designed to ensure accuracy of assessment. It was undertaken on room air without additional oxygen and administered by the same tester, who was not involved in the patient’s daily care, at the same location and time throughout the study. After the walk test, the person administering the test obtained a rating of dyspnoea using the Borg Scale.

Patients were instructed as follows: “The object of this test is to walk as far as possible for 6 min. You will walk back and forth in this hallway. Six minutes is a long time to walk, so you will be exerting yourself. You will probably get out of breath or become exhausted. You are permitted to slow down, to stop, and to rest as necessary. You may lean against the wall while resting, but resume walking as soon as you are able to”.

The 6MWT was terminated before 6 min had elapsed if SpO2 fell below 80%, or if the patient became exhausted, experienced chest pain or intolerable leg cramps, or developed diaphoresis. Total distance walked (6MWT distance), and baseline and post-test Borg Dyspnoea Indices were documented.

Borg Dyspnoea Index

The Borg Scale is a well-validated scoring system (on a 0–10-point scale) to gauge a person’s perceived effort of exertion and degree of fatigue experienced (0 = nothing at all, 10 = maximum).9

Statistical analyses

Means with standard deviations (SD) were calculated for the walk test and pulmonary function testing indices. Pearson’s correlation coefficient was applied to determine the reproducibility of the two walk tests. Pearson’s correlation test was also used to determine any correlation between the walk test distance and the pulmonary function test and Borg Dyspnoea Indices.
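For readers who want to reproduce this kind of analysis, the following is a minimal sketch of Pearson's correlation coefficient computed from paired measurements. The function name and the example values are hypothetical; the study data themselves are not reproduced here, and the software used by the authors is not stated.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient for two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical walk distances (m) for five patients on two occasions; not study data.
walk_1 = [310.0, 365.0, 402.0, 450.0, 488.0]
walk_2 = [322.0, 358.0, 410.0, 441.0, 495.0]
r_example = pearson_r(walk_1, walk_2)
```

Pearson's r captures the strength of linear association between the two walks, so it is usefully read alongside the per-patient absolute difference, which describes how far apart the two distances actually were.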


Results

Disease characteristics

The dataset comprised 163 patients, 122 women and 41 men. In all, 95 (58%) patients were classified as having diffuse SSc and 68 (42%) as limited SSc. Mean (SD) age was 52.3 (11.6) years, with mean (SD) disease duration of 6.39 (6.47) years (table 1).

Table 1

 Baseline patient characteristics

Reproducibility of 6MWT

Of the 163 patients randomised, 152 showed <15% variability between 6MWTs 1 and 2, with 11 requiring a third test. All of these 11 patients showed <15% variability between 6MWTs 2 and 3. The last two walk tests were included in the analysis. Mean (SD) distance walked during walk test 1 was 396.6 (84.55) m compared with a mean (SD) of 399.5 (86.28) m at walk test 2, giving an overall mean (SD) distance of the two walks of 398 (84.29) m. The mean (SD) absolute difference for an individual patient between the two walks was 20.75 (18.5) m. These recordings showed a within-subject, intertest correlation of 0.95 (p<0.001) by Pearson’s correlation coefficient test (table 2, fig 1).
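To put the mean absolute difference in context, it can be compared with the overall mean distance. The sketch below first shows how a mean absolute difference is obtained from paired walks (using hypothetical values, not study data) and then relates the two summary figures reported above. Dividing one summary statistic by the other is only a rough guide, because it is not the same as averaging each patient's percentage difference.

```python
from typing import Sequence, Tuple


def mean_absolute_difference(pairs: Sequence[Tuple[float, float]]) -> float:
    """Mean of |walk 1 - walk 2| across patients, in metres."""
    if not pairs:
        raise ValueError("need at least one pair of walks")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Hypothetical paired distances (m) for three patients; not study data.
example_pairs = [(380.0, 400.0), (420.0, 410.0), (350.0, 365.0)]
example_mad = mean_absolute_difference(example_pairs)  # (20 + 10 + 15) / 3 = 15.0

# Summary values reported in the text.
reported_mad_m = 20.75
reported_mean_distance_m = 398.0
relative_difference = reported_mad_m / reported_mean_distance_m
print(f"{relative_difference:.1%}")  # 5.2%
```

On this rough reading, a typical patient's two walks differed by about 5% of the average distance walked.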

Table 2

 Reproducibility of 6-min walk test and Borg Dyspnoea Index

Figure 1

 Inter-test reproducibility of the 6-min walk test (6MWT) distance. Two 6MWTs were performed at randomisation, with the total distance walked being recorded (m). The high positive correlation observed with a Pearson’s correlation coefficient of 0.95 confirms good reproducibility.

Mean distances walked for the two 6MWTs performed at randomisation are shown, with the overall mean and the mean absolute difference. The two tests showed high intertest reproducibility, with a Pearson’s correlation coefficient (r) of 0.95. The mean Borg Dyspnoea Index recorded on the same two occasions is also included.

Borg Dyspnoea Index

The mean (SD) Borg Dyspnoea Index was 2.75 (2.05) after walk test 1 and 2.79 (2.0) after walk test 2, yielding an overall mean (SD) score of 2.77 (1.91) and a mean (SD) absolute difference between the two assessments of 0.8 (1.1).

Pulmonary function testing

The mean (SD) predicted DLCO was 46% (12.4%; range 21–78%) and mean (SD) predicted FVC was 71% (15.4%; range 38–119%).

Correlation of 6MWT with pulmonary function indices and Borg Dyspnoea Index

The 6MWT distance weakly correlated with Borg Dyspnoea Index, with a Pearson’s correlation coefficient of −0.28 (p<0.001; table 3). A weak correlation was evident between 6MWT distance and percentage predicted FVC (r = 0.19; p<0.02). The 6MWT distance did not correlate (r = 0.06) with percentage predicted DLCO (p<0.5). A weak correlation of % predicted DLCO with Borg Dyspnoea Index (r = −0.17; p<0.04) was found.
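One way to gauge how weak these correlations are is to square each coefficient, which gives the proportion of variance shared by the two measures. The sketch below also computes the usual t statistic for a correlation coefficient, t = r·sqrt(n − 2)/sqrt(1 − r²), with an approximate two-sided p-value. It assumes complete data for all 163 patients in every comparison and uses a normal approximation to the t distribution, so the output is a consistency check rather than a reproduction of the authors' analysis. The function names are hypothetical.

```python
import math


def r_squared(r: float) -> float:
    """Proportion of variance shared by two variables with correlation r."""
    return r * r


def t_statistic(r: float, n: int) -> float:
    """t statistic for the null hypothesis of no linear correlation (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def approx_two_sided_p(t: float) -> float:
    """Two-sided p-value using a normal approximation (reasonable for large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


n_patients = 163  # assumption: no missing values in any comparison

reported = {
    "6MWT distance vs Borg Dyspnoea Index": -0.28,
    "6MWT distance vs % predicted FVC": 0.19,
    "6MWT distance vs % predicted DLCO": 0.06,
    "% predicted DLCO vs Borg Dyspnoea Index": -0.17,
}

for label, r in reported.items():
    t = t_statistic(r, n_patients)
    print(f"{label}: r^2 = {r_squared(r):.3f}, t = {t:.2f}, p ~ {approx_two_sided_p(t):.3f}")
```

Under these assumptions, even the strongest of these correlations (r = −0.28) accounts for under 8% of the variance, whereas the intertest correlation of 0.95 corresponds to roughly 90%.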

Table 3

 Correlation of 6-min walk test with Borg Dyspnoea Index and pulmonary function indices

No striking correlations were observed between the 6MWT distance and pulmonary function or Borg Dyspnoea Indices. Weak correlations were evident between 6MWT distance and both the Borg Dyspnoea Index and percentage predicted FVC.


Discussion

Organ-based complications form the current focus of management of SSc, with pulmonary involvement being a dominant clinical concern. The 6MWT is increasingly used as a functional tool to complement physiological data from pulmonary function testing in the assessment of ILD. The objective of this study was to evaluate the utility of the 6MWT in a well-characterised SSc-ILD cohort. The primary finding of this study is the high intertest reproducibility of the test. However, we failed to show a correlation of the 6MWT with pulmonary function parameters.

The international SSc research community has embraced the OMERACT (outcome measures in rheumatological clinical trials) initiative, the principal aim of which is to achieve data-driven, consensus opinion on optimal outcome measures. The 6MWT has been discussed at OMERACT workshops as an outcome measure predominantly in relation to PAH in SSc.10 We therefore considered the OMERACT filter in the context of our results—that is, applying the three categories of “truth” (face, content, construct and criterion validity), “discrimination” (reliability and sensitivity to change) and “feasibility”.

Firstly, the 6MWT has face validity with regard to assessing SSc-ILD; clearly some form of exercise capacity assessment would be important in evaluating its effect.

The availability of the 6MWT at investigational sites and its practicality both in specialised and community practices make it an eminently feasible test. Further, the remarkably high correlation coefficient (r = 0.95; p<0.001) between the two walk test distances in this study confirmed high reproducibility and therefore reliability of the 6MWT in this cohort. This is consistent with findings from a similar study on fibrotic IIP.11

Discriminant validity (sensitivity to change) is not dealt with in our data and remains undetermined. However, we do know that increases in 6MWT distance in patients with primary pulmonary hypertension (PPH) and SSc-associated PAH after short-term administration of investigational drugs have enabled Food and Drug Administration and European approval of these agents, such that this test is now established as a primary clinical outcome measure in the management of PPH and SSc-associated PAH.12–14

Although the 6MWT has content validity in chronic obstructive pulmonary disease,15 congestive heart failure16 and PAH, our data show that this is not the case in SSc-ILD. Our results also seem to show a lack of criterion validity. FVC represents the gold-standard measure of ILD. In fibrotic IIP, baseline lung function parameters are predictive of an adverse prognosis,3,4,17,18 and the 6MWT correlates with pulmonary function variables, particularly DLCO.11 Reduced FVC (and DLCO) also provide important prognostic information in SSc-ILD.19 Yet, our results show only weak or absent correlations between the 6MWT distance and FVC, DLCO or the Borg Dyspnoea Index. This raises the question of what exactly is being measured by the test. This comes under further scrutiny with the observations made by Bouros et al.19 In patients with non-specific interstitial pneumonia (NSIP), FVC remained stable, with the lack of major decline in serial pulmonary function tests over a 3-year period after presentation. Two recent double-blind, placebo-controlled, randomised studies also showed little change in lung function parameters despite improvement in quality of life measures and dyspnoea indices.20,21 Together these data suggest multiple confounding variables in the assessment of SSc-ILD. The lack of an agent with robust therapeutic effect could partly explain this poor sensitivity to change of FVC. Although the inclusion criteria used in this study identified a patient population typical of SSc-ILD, we excluded subjects with either very mild or more severe lung impairment. The utility of exercise testing in these subgroups of SSc-ILD remains to be determined.

Comorbidity and subtle cardiopulmonary responses not detectable by current lung function tests may also contribute to the findings described above. Skin and musculoskeletal components of SSc could lead to ineffective performance of exercise testing. Multisystem disease and inactivity may lead to deconditioning and subsequent reduced cardiopulmonary capacity. Changes in cardiac physiology, including the effect of left ventricular systolic or diastolic dysfunction22–24 on exercise capacity in the SSc group, are another area that has not been fully evaluated. Similarly, the importance of right ventricular diastolic dysfunction in SSc25 is unclear. This study did not use serial Doppler echocardiography or detailed left ventricular functional parameters, thereby limiting the ability to test the above parameters. Doppler echocardiography estimations of pulmonary arterial systolic pressure may also compromise patient inclusion.26

Other possibilities not specific to SSc include variability in FVC, especially in those subjects with better preserved DLCO, and interlaboratory variation in DLCO measurement, which may also have a bearing on the lack of correlation between 6MWT and DLCO.

Finally, variable responses to exercise testing remain unexplored. Different psychological responses to exercise (unrelated to disease) could lead to either a training effect or decompensation. Separately, temporal and adaptive changes in breathing patterns could also facilitate exercise performance. Indeed, such factors are suggested by the considerable variation in the 6MWT distance (but static lung function) observed in a recent study on fibrotic IIP evaluating the efficacy of etanercept.27

In summary, these data from a large SSc-ILD cohort provide convincing evidence for the intertest reproducibility of the 6MWT, adding further validity to its use. However, the lack of criterion validity and the poor correlation with gas-exchange measurements raise important questions on the overall suitability of this test in SSc-ILD. The need for effective, targeted treatments to truly assess the 6MWT as an appropriate outcome measure, and clarification of the relative contributions of non-pulmonary manifestations of SSc to exercise capacity, will hopefully provide further insight into this challenging area of outcome measurement in SSc-ILD and lead to the application of a tool in line with the key OMERACT principles.



  • Published Online First 25 July 2006

  • Funding: This study was supported by Actelion Pharmaceuticals, Allschwil, Switzerland. MHB and JRS were supported by the Jonathan and Lisa Rye Scleroderma Research Fund and the Linda Dolce Research Fund of the University of Michigan, Ann Arbor, Michigan, USA.

  • Competing interests: HC-B, PC and SR work for Actelion Pharmaceuticals. CPD, DEF, LG, LJR, AUW, MM-C, CMB and JRS are on the Actelion Steering Committee and have acted as advisors and investigators in clinical trials.
