Objectives Extent of systemic sclerosis (SSc)-related interstitial lung disease (ILD) assessed from thoracic high-resolution CT (HRCT) predicts disease course, mortality and treatment response. While quantitative HRCT analyses of extent of lung fibrosis (QLFib) or total interstitial lung disease (QILD) are more sensitive and reproducible than visual HRCT assessments of SSc-ILD, these analyses are not widely available. This study evaluates the relationship between clinical disease parameters and QLFib and QILD scores to identify potential surrogate measures of radiographic extent of ILD.
Methods Using baseline data from the Scleroderma Lung Study I (SLS I; N=158), multivariate regression analyses were performed using the best subset selection method to identify one to five variable models that best correlated with QLFib and QILD scores in both whole lung (WL) and the zone of maximal involvement (ZM). These models were subsequently validated using baseline data from SLS II (N=142). Bivariate analyses of the radiographic and clinical variables were also performed using pooled data. SLS I and II did not include patients with clinically significant pulmonary hypertension (PH).
Results Diffusing capacity for carbon monoxide (DLCO) was the single best predictor of both QLFib and QILD in the WL and ZM in all of the best subset models. Adding other disease parameters to the models did not substantially improve model performance. Forced vital capacity (FVC) did not predict QLFib or QILD scores in any of the models.
Conclusions In the absence of PH, DLCO provides the best overall estimate of HRCT-measured lung disease in patients from two large SSc cohorts. FVC, although commonly used, may not be the best surrogate measure of extent of SSc-ILD at any point in time.
Trial registration numbers SLS I: www.clinicaltrials.gov NCT00004563; SLS II: www.clinicaltrials.gov NCT00883129.
- Pulmonary Fibrosis
- Systemic Sclerosis
Previous studies have demonstrated that the extent of interstitial lung disease (ILD) on a baseline thoracic high-resolution CT (HRCT) scan significantly predicts disease course of systemic sclerosis-associated ILD (SSc-ILD) and response to treatment.1–5 In addition, both retrospective and prospective observational studies have found that extent of HRCT-measured ILD significantly predicts mortality in patients with SSc-ILD.6 ,7
Quantitative assessment of fibrotic patterns, in particular, has improved sensitivity and reproducibility compared with visual assessment8 ,9 and provides an objective measurement of treatment efficacy in SSc-ILD.4 However, quantitative HRCT image analysis requires standardised image acquisition and post-hoc processing of image data sets using computer algorithms that are not as yet widely available.
Given the inaccessibility of quantitative HRCT image analysis to most practising physicians, the present study sought to identify potential surrogate measures of extent of SSc-ILD as measured by this validated approach. While prior studies have examined correlations between the severity/extent of SSc-ILD from HRCT scans and other non-radiographic SSc disease features,10–18 these studies used visual, semiquantitative assessments of extent of reticulations (fibrosis), ground glass opacity (GGO) and/or honeycombing (HC) and are therefore susceptible to observer bias and inter-reader and intrareader variability.16 To our knowledge, this is the first study to evaluate correlations between quantitative analyses of extent of lung fibrosis (QLFib) or total ILD (QILD) and traditional physiological and patient-centred indices of disease in patients with active SSc-ILD from two large, randomised, interventional studies (Scleroderma Lung Study I and II, or SLS I and II).
SLS I consisted of 158 participants randomised to oral cyclophosphamide (titrated to 2.0 mg/kg once daily) or matching placebo for 1 year.1 In SLS II, 142 patients were randomised to mycophenolate mofetil (titrated as tolerated to 3.0 g/day in divided doses) for 2 years or oral cyclophosphamide (titrated as tolerated to 2 mg/kg once daily) for 1 year followed by an additional year on placebo using a double-dummy design to maintain the blinding.19 Recruitment for the latter trial and baseline assessments have been completed but the double-blind treatment phase of the trial is ongoing. Eligibility criteria for both studies were similar (see online supplementary text).
Baseline measurements included the following physiological variables: spirometry (forced vital capacity (FVC), forced expiratory volume in 1 s (FEV1) and FEV1/FVC ratio), lung volumes (functional residual capacity, residual volume and total lung capacity (TLC) by whole-body plethysmography or helium dilution), diffusing capacity for carbon monoxide (DLCO) corrected for haemoglobin and the ratio of DLCO to alveolar volume (DL/VA). Published regression equations were used to express these physiological measures as a percentage of the reference values.20–23 FVC/DLCO ratio was also included as a baseline variable. In both studies, pulmonary function test results were read centrally for quality assurance. The following patient-centred measures were obtained in both studies: dyspnoea assessment using the Mahler Baseline Dyspnea Index (BDI) obtained using the interviewer-administered paper version24 in SLS I and the self-administered, computer-assisted version25 in SLS II; modified Rodnan Skin Score (mRSS)26 and the Health Assessment Questionnaire-Disability Index (HAQ-DI),27 as well as the Visual Analogue Scale (VAS) for breathing problems interfering with physical activities.
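As an illustration of how these physiological variables are expressed, the following minimal sketch converts measured values to a percentage of the reference value and forms the FVC/DLCO ratio. The reference values are supplied as inputs rather than computed, because the published regression equations are not reproduced here; the patient values are hypothetical, and taking the ratio of the two percentage-of-reference values is an assumption about how the FVC/DLCO variable was constructed.

```python
def percent_predicted(measured, reference):
    """Express a measured value as a percentage of its reference (predicted) value."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * measured / reference


# Hypothetical patient; all numbers are illustrative only.
fvc_pct = percent_predicted(measured=2.45, reference=3.50)   # litres
dlco_pct = percent_predicted(measured=11.0, reference=25.0)  # haemoglobin-corrected DLCO

# Assumption: the FVC/DLCO ratio is taken between the two % reference values.
fvc_dlco_ratio = fvc_pct / dlco_pct

print(f"FVC % reference:  {fvc_pct:.1f}")
print(f"DLCO % reference: {dlco_pct:.1f}")
print(f"FVC/DLCO ratio:   {fvc_dlco_ratio:.2f}")
```

A ratio well above 1 in this sketch indicates that diffusion is reduced out of proportion to lung volume.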
HRCT scans at baseline were obtained at maximal inspiration in both studies using a standardised protocol monitored by the Radiology Core at University of California at Los Angeles.16 In SLS I, non-volumetric CT scans of 1–2 mm slice thickness were acquired at 10 mm increments, whereas in SLS II volumetric CT scans of 1–1.5 mm slice thickness were acquired contiguously. Scans were reconstructed and entered into a quantitative image workstation to produce quantitative scores. For computer-aided diagnosis (CAD) scoring, the zones were defined non-anatomically by dividing the total number of slices covering the full chest into thirds, giving upper, middle and lower lung zones, as described previously.8
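The non-anatomical zone definition described above can be sketched as follows. This is a minimal illustration, not the workstation's implementation: the function name is hypothetical, slice 0 is assumed to be the most cranial slice, and the handling of slice counts that are not divisible by three is an assumption, since the source does not specify it.

```python
def split_into_zones(n_slices):
    """Assign slice indices (0 = most cranial, by assumption) to three zones.

    Each zone receives roughly one-third of the slices; when n_slices is not
    divisible by 3, the extra slices fall in the earlier zones (an assumption).
    """
    names = ("upper", "middle", "lower")
    zones = {name: [] for name in names}
    for i in range(n_slices):
        third = 3 * i // n_slices  # 0, 1 or 2
        zones[names[third]].append(i)
    return zones


zones = split_into_zones(30)
for name, indices in zones.items():
    print(name, indices[0], "to", indices[-1])
```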
HRCT scores for quantitative lung fibrosis (QLFib), GGO and HC were determined separately from the percentage of counts in which the classified abnormal pattern comprised reticular opacity with architectural distortion (QLFib), hazy parenchymal opacity through which normal lung markings were visible in the absence of reticular opacity or architectural distortion (pure GGO) or clustered air-filled cysts with dense walls (HC), respectively. Quantitative ILD (QILD) score was the sum of all abnormally classified scores, including scores for fibrosis, pure GGO and HC. Scores were each summated for the whole lung (WL), including both lungs, and for the zone of maximal involvement (ZM). Because baseline QLFib scores were previously shown in SLS I to correlate with disease progression, QLFib scores were selected as the primary outcomes in the multivariate regression model, whereas QILD scores (the sum of QLFib, GGO and HC scores) were analysed as subsidiary outcomes.
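To make the relationship between the component scores concrete, the sketch below computes QLFib, GGO, HC and QILD as percentages from classified counts, for each zone and for the whole lung, and then picks the zone of maximal involvement. The counts are hypothetical, the function names are illustrative, and pooling counts across zones to obtain the WL score is an assumption about how the scores were summated.

```python
PATTERNS = ("fibrosis", "ggo", "hc")


def zone_scores(counts):
    """Percentage of classified counts per abnormal pattern, plus their sum (QILD)."""
    total = sum(counts.values())
    scores = {p: 100.0 * counts[p] / total for p in PATTERNS}
    scores["qild"] = sum(scores[p] for p in PATTERNS)
    return scores


def whole_lung_scores(zones):
    """Pool counts across all zones (assumption), then score the pooled counts."""
    pooled = {}
    for counts in zones.values():
        for label, n in counts.items():
            pooled[label] = pooled.get(label, 0) + n
    return zone_scores(pooled)


def zone_of_max(zones, key):
    """Name of the zone with the highest score for the given measure."""
    return max(zones, key=lambda z: zone_scores(zones[z])[key])


# Hypothetical classified counts per zone (both lungs combined).
example = {
    "upper": {"normal": 900, "fibrosis": 50, "ggo": 40, "hc": 10},
    "middle": {"normal": 800, "fibrosis": 120, "ggo": 70, "hc": 10},
    "lower": {"normal": 500, "fibrosis": 300, "ggo": 150, "hc": 50},
}

wl = whole_lung_scores(example)
zm = zone_of_max(example, "fibrosis")
print("QLFib-WL:", round(wl["fibrosis"], 1), "QILD-WL:", round(wl["qild"], 1))
print("ZM:", zm, "QLFib-ZM:", round(zone_scores(example[zm])["fibrosis"], 1))
```

In this toy example the lower zone carries most of the disease, so the ZM scores are considerably higher than the WL scores.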
Summary statistics (mean, SD and frequency distribution) were generated for baseline characteristics. A two-sample t test was used to compare continuous variables and the χ² test was used to compare categorical variables between SLS I and SLS II patients. To identify the physiological and patient-centred variables that correlated best with QLFib scores, we performed multivariate regression analyses using baseline data from SLS I. The best subset selection method was used to identify the one to five non-CT variables that correlated best with CT parameters according to the R² criterion.28 Once identified, each regression model was also evaluated for correlations with HRCT parameters in SLS II separately. Similar analyses were also performed using QILD scores. Bivariate analysis was also performed to determine pairwise associations among the baseline radiographic, physiological and clinical features in pooled SLS I and SLS II participants. All tests were two-sided, and all analyses were performed using SAS V.9.2 (SAS Institute, Cary, North Carolina, USA).
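The idea behind best subset selection can be sketched in a few lines: for each model size, fit an ordinary least squares model for every combination of candidate variables and keep the combination with the highest R². The sketch below is a plain Python illustration on hypothetical toy data, not the SAS procedure used in the study; the function names and variable names are illustrative assumptions.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(data, outcome, predictors):
    """R² of an OLS fit (with intercept) of outcome on the given predictors."""
    y = data[outcome]
    n = len(y)
    X = [[1.0] + [data[p][i] for p in predictors] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve_linear(xtx, xty)  # normal equations
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def best_subsets(data, outcome, candidates, max_size=5):
    """For each model size, return (R², variables) of the best-fitting subset."""
    best = {}
    for size in range(1, min(max_size, len(candidates)) + 1):
        scored = [(r_squared(data, outcome, combo), combo)
                  for combo in combinations(candidates, size)]
        best[size] = max(scored)
    return best


# Hypothetical toy data: y depends mainly on x1.
toy = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
    "x3": [5, 3, 8, 1, 9, 2, 7, 4],
    "y": [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8],
}
best = best_subsets(toy, "y", ["x1", "x2", "x3"], max_size=3)
for size, (r2, combo) in best.items():
    print(size, combo, round(r2, 3))
```

Because R² can never decrease when a variable is added, comparing models of different sizes requires a penalised measure such as the adjusted r² reported in the Results.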
While SLS I and SLS II employed very similar entry criteria, a few relatively modest differences were noted with respect to baseline characteristics (table 1). Compared with the participants in SLS I, SLS II patients were slightly older, had approximately 6 months shorter SSc disease duration and had slightly more ventilatory restriction, but less diffusion impairment. The greatest difference was in the BDI total score, possibly due to the implementation of a computerised self-administered version of the BDI test in SLS II instead of the written observer-administered version of the test employed in SLS I. In view of this difference, for the present analysis, we chose the breathing VAS, which showed much closer between-study agreement than the BDI.
Quantitative HRCT image analysis
Extent of lung fibrosis and total ILD based on quantitative HRCT image analysis differed between SLS I and SLS II (table 1). Compared with SLS I participants, SLS II patients had slightly higher scores for fibrosis (QLFib) and total ILD burden (QILD) in the ZM, but slightly lower scores in WL. This comparison may be confounded by the use of discrete imaging slices (1–2 mm sections at 10 mm intervals) in SLS I, in contrast to the use of volumetric scanning in SLS II to obtain WL imaging. Figure 1 shows the quantitative extent of SSc-ILD on coronal and axial HRCT images from two study participants with mild and severe fibrosis, respectively.
Primary analysis: correlates with extent of lung fibrosis (QLFib) in the zone of maximum involvement
The best one to five variable models that correlated with QLFib-ZM in SLS I are presented in table 2. Among all of the baseline non-CT variables examined (eg, age, gender, SSc type, disease duration, percentage reference values for FVC, FEV1, TLC, DLCO, DL/VA, FVC/DLCO, BDI, HAQ-DI, breathing VAS and mRSS), the DLCO significantly correlated with QLFib-ZM in all of the models, including the best one-variable model. Figure 2A and B shows the correlations between DLCO and QLFib-ZM in SLS I and II, respectively.
Based on the QLFib-ZM regression analyses, there was only a marginal improvement in the adjusted r² value by increasing the number of variables in the model (r²=0.23 for the two-variable and three-variable models; r²=0.24 for the four-variable model and r²=0.25 for the five-variable model), compared with the one-variable model of DLCO% predicted alone (r²=0.22). Moreover, the correlations (r) for all of the models were moderately strong for both SLS I and SLS II (range 0.44–0.54).
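For readers relating the adjusted r² values to the correlations (r), the standard definitions are given below. This is a general reminder rather than a description of the exact SAS output used: n is the number of participants and p the number of variables in the model, and the equality between the multiple correlation and the correlation of observed with fitted values holds only within the sample used to fit a least squares model with an intercept.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
R = \sqrt{R^2} = \operatorname{corr}\!\left(y, \hat{y}\right)
```

As a rough check, an r² of 0.22 corresponds to r ≈ 0.47, which lies within the reported range of 0.44–0.54; the match is only approximate because the reported r² values are adjusted.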
Correlates with extent of lung fibrosis (QLFib) in the whole lung
The best one to five variable models that correlated with extent of QLFib-WL in SLS I are presented in table 3. Among all of the baseline non-CT variables examined, the DLCO was also significantly correlated with extent of QLFib-WL in all of the models. Figure 2B and D shows the correlations between DLCO and QLFib-WL in SLS I and II, respectively.
Of the QLFib-WL regression analyses, the two-variable model of DLCO% reference and FEV1/FVC ratio had a slightly higher adjusted r² value (0.24), compared with the one-variable model of DLCO% reference alone (0.20). However, there was only marginal improvement in the adjusted r² value by further increasing the variables in the model (r²=0.26–0.28 for the three-variable to five-variable models). In addition, the correlations (r) for all of the models were moderately strong for both SLS I and SLS II (range 0.39–0.56).
Relationship between QLFib and pulmonary function measurements
Notably, none of the models for QLFib-ZM or QLFib-WL included FVC% reference. While the TLC% reference came into the two-variable, four-variable and five-variable models for QLFib-ZM, this variable was not a statistically significant independent correlate of extent of QLFib-ZM. As demonstrated in tables 2 and 3, the only variables that were independent correlates of QLFib-ZM in any of the models were DLCO% reference (p<0.0001) and, marginally, diffuse disease subtype (p=0.05). Independent predictors of QLFib-WL in any of the models besides DLCO% reference (p=0.0001) included FEV1/FVC (p=0.02), diffuse disease (p=0.03) and mRSS (p=0.03).
Correlates with extent of QILD in the zone of maximum involvement and the whole lung
When we applied the same methodology to analysing the QILD scores, the results were fairly similar to those obtained in the QLFib analyses. The best one-variable to five-variable models that correlated with extent of QILD-ZM and QILD-WL are presented in online supplementary tables S1 and S2, respectively. Among all of the baseline non-CT variables examined, the DLCO also significantly correlated with extent of QILD-ZM and QILD-WL in all of the models (p<0.0001). There was only marginal improvement in the adjusted r² value by further increasing the variables in all of the QILD-ZM and QILD-WL models. Strong correlations were observed between the SLS II and SLS I findings.
Intercorrelations among baseline characteristics (bivariate analysis)
Correlations between radiographic and selected physiological and clinical characteristics in pooled SLS I and SLS II subjects are presented in table 4 and the online supplementary table S3. DLCO was somewhat better correlated with QLFib and QILD than was either FVC or TLC % predicted. The clinical measures (breathing VAS and mRSS) were moderately correlated with each other but poorly correlated with each of the radiographic and physiological variables.
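A pairwise (bivariate) analysis of this kind can be sketched as a table of correlation coefficients between every pair of variables. The example below uses Pearson's r, which is an assumption because the source does not state which coefficient was used, and the data are hypothetical values chosen only to show the mechanics.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_table(data):
    """Correlation for every unordered pair of variables in data."""
    names = list(data)
    return {(a, b): pearson_r(data[a], data[b])
            for i, a in enumerate(names) for b in names[i + 1:]}


# Hypothetical baseline values for five participants (% reference or % score).
pooled = {
    "DLCO": [30, 45, 50, 62, 70],
    "FVC": [60, 72, 65, 80, 70],
    "QLFib": [35, 22, 20, 9, 5],
}
table = correlation_table(pooled)
for pair, r in table.items():
    print(pair, round(r, 2))
```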
While severity of SSc-ILD is conventionally defined by the degree of ventilatory restriction and diffusion impairment, these physiological variables are indirect and variable surrogates for the extent of structural disease abnormality. In contrast, the extent of ILD, including reticulations (fibrosis), GGO and/or HC on HRCT images, defined by quantitative computer-aided diagnostic techniques,9 is a more direct and precise measure of the underlying pathological process and is one of the best predictors of disease progression and response to therapy.2–7
However, because the methodology and expertise for quantitative scoring of ILD on HRCT images are not widely available and repeated exposure to radiation can be detrimental, the present study examined the relationship between quantitatively scored radiographic indicators of SSc-ILD and more accessible and readily repeatable parameters.
Regression analyses using a best subset selection method for predicting the extent of SSc-ILD on HRCT revealed that DLCO% predicted was the single variable that best correlated with the extent of both lung fibrosis and total ILD (more so than the restrictive physiological measures, FVC and TLC) when assessed both in the ZM and in the WL (tables 2 and 3 and online supplementary tables S1 and S2). Similarly, DLCO% predicted was better correlated with the radiographic extent of lung fibrosis and total ILD than either FVC or TLC % predicted in bivariate analyses. These observations suggest that parenchymal lung disease in SSc-ILD may have a greater impact on gas transfer than on lung compliance and are consistent with prior studies.14 ,15 ,17 ,18 Although subclinical pulmonary vascular disease associated with SSc that is independent of ILD could also contribute to the decrement in DLCO, one would expect that the confounding influence of an independent pulmonary vasculopathy would reduce, rather than strengthen, the association of DLCO with the extent of fibrosis or ILD. It is noteworthy that clinically significant pulmonary hypertension was an exclusion criterion for randomisation into SLS I and II.
While TLC was not independently correlated with QLFib, it did come into the two-variable, four-variable and five-variable models for predicting QLFib-ZM. Findings from prior studies examining the relationship between TLC and radiographic extent of ILD are conflicting; some studies demonstrated a significant negative correlation between these two factors,14 ,17 whereas another study found that TLC correlated poorly with disease extent on HRCT.18 Possible explanations for this discrepancy are that most of the prior studies examining these relationships were relatively small (approximately 50 patients per study) and that TLC is a derived variable from direct measurements of functional residual capacity and inspiratory capacity, thereby increasing its variability. The implementation of a centralised quality assurance core for pulmonary function may have helped reduce this variability with respect to our data.
Interestingly, while FVC did show significant, although modest, correlations with extent of QLFib and QILD in bivariate pairwise analyses (table 4), it was not an independent predictor of extent of QLFib or QILD on HRCT in any of the multivariate models. These findings are consistent with observations in prior studies based on visual radiographic assessment13 ,14 ,18 and raise the question as to whether FVC is the best surrogate measure of extent of SSc-ILD. However, although FVC was not the best correlate of QLFib or QILD at baseline, it still appears to be a useful outcome measure, if measured in a standardised manner,29 based on its performance in two randomised controlled trials in SSc-ILD.1 ,30
While the correlations we derived from models incorporating combinations of two to five variables were to a modest degree incrementally higher than those derived from models using DLCO alone (for both WL and ZM), these increments were relatively small; the largest increment occurred between the best one-variable and two-variable models. From a practical standpoint, our findings suggest that the most useful surrogate of the radiographic extent of lung disease in patients with SSc-ILD without clinical evidence of pulmonary hypertension is a measure of diffusion, with little added predictive value from the addition to the model of other physiological or clinical variables.
Not surprisingly, baseline measures, such as the HAQ-DI, were not associated with QLFib or QILD in any of the models. While functional status as measured by the HAQ-DI may be influenced by respiratory impairment, other factors, such as arthritis and myopathy, may affect outcomes on this measure.
Given the good agreement in the QLFib model correlations between SLS I and SLS II, as shown in tables 2 and 3, our findings may be generally applicable to patients with SSc who share the basic characteristics that define the SLS trials, namely, patients with symptomatic, early, progressive SSc-ILD without clinically significant pulmonary vasculopathy. In addition, the QILD models (see online supplementary tables S1 and S2) yielded results similar to those of the QLFib models, further supporting the importance of DLCO% predicted in assessing the extent of both HRCT-defined fibrosis and total ILD.
Moreover, correlations between QLFib and the non-HRCT variables were consistently stronger, although only slightly so, when applied to the ZM compared with the WL. These findings are consistent with our earlier observations that QLFib-ZM scores best predicted subsequent rate of decline in the placebo arm of the SLS I trial,1 ,2 ,7 as well as the response to treatment with cyclophosphamide when assessed by HRCT visually.1 ,2
Strengths of the present analysis include the relatively large sample size in both studies and the use of standardised assessment techniques and central quality control measures.
Our study has limitations. This is a cross-sectional analysis; therefore, it is unknown whether the observed relationships persist over time. Moreover, this analysis is limited to participants enrolled in clinical trials with specific entry criteria, thereby limiting the generalisability of the findings. In addition, ∼20% of the SLS I participants did not have HRCT scans that were technically suitable for quantitative scoring, although we believe that the resulting missing quantitative data could be considered as missing at random. A further potential limitation is that CT slices were obtained at 10 mm increments in SLS I but contiguously in SLS II. However, comparison of quantitative scores for both lung fibrosis and total ILD derived from contiguous versus non-contiguous slices yielded nearly identical results (data not shown).
Another limitation of the current analysis is that using a single data set to select the models may overestimate the model association performances when applied to another data set. However, because SLS I had a large sample size, we expect the estimate biases to be small.31 Moreover, when we applied the models to a separate patient cohort (SLS II), the results were similar.
In summary, DLCO% predicted was the single best surrogate measure of HRCT-defined severity of fibrosis and total extent of ILD in patients with SSc from two large SSc-ILD trials. Additional, longitudinal studies are warranted to determine the correlation between changes in these surrogate measures with changes in the more direct radiographic evidence of disease severity over time.
We are indebted to Ms. Gail Marlis for her invaluable assistance as the SLS Project Manager and to the rheumatology and pulmonary investigators at the SLS I and II clinical centres. The following people and institutions participated in the Scleroderma Lung Study 1: University of California at Los Angeles (UCLA), Los Angeles: P.J. Clements, D.P. Tashkin, R. Elashoff, J. Goldin, M. Roth, D. Furst, K. Bulpitt, D. Khanna, W.-L.J. Chung, S. Viasco, M. Sterz, L. Woolcock, X. Yan, J. Ho, S. Vasunilashorn and I. da Costa; University of Medicine and Dentistry of New Jersey, New Brunswick: J.R. Seibold, D.J. Riley, J.K. Amorosa, V.M. Hsu, D.A. McCloskey and J.E. Wilson; University of Illinois Chicago, Chicago: J. Varga, D. Schraufnagel, A. Wilbur, D. Lapota, S. Arami and P. Cole-Saffold; Boston University, Boston: R. Simms, A. Theodore, P. Clarke, J. Korn, K. Tobin and M. Nuite; Medical University of South Carolina, Charleston: R. Silver, M. Bolster, C. Strange, S. Schabel, E. Smith, J. Arnold, K. Caldwell and M. Bonner; Johns Hopkins School of Medicine, Baltimore: R. Wise, F. Wigley, B. White, L. Hummers, M. Bohlman, A. Polito, G. Leatherman, E. Forbes and M. Daniel; Georgetown University, Washington, D.C.: V. Steen, C. Read, C. Cooper, S. Wheaton, A. Carey and A. Ortiz; University of Texas at Houston, Houston: M. Mayes, E. Parsley, S. Oldham, T. Filemon, S. Jordan and M. Perry; University of California at San Francisco, San Francisco: K. Connolly, J. Golden, P. Wolters, R. Webb, J. Davis, C. Antolos and C. Maynetto; University of Alabama at Birmingham, Birmingham: B. Fessler, M. Olman, C. Sanders, L. Heck and T. Parkhill; University of Connecticut Health Center, Farmington: N. Rothfield, M. Metersky, R. Cobb, M. Aberles, F. Ingenito and E. Breen; Wayne State University, Detroit: M. Mayes, K. Mubarak, J.L. Granda, J. Silva, Z. Injic and R. Alexander; Virginia Mason Research Center, Seattle: D. Furst, S. Springmeyer, S. Kirkland, J. Molitor, R. Hinke and A. Mondt; Data Safety and Monitoring Board: Harvard Medical School, Boston, T. Thompson; Veterans Affairs Medical Center, Brown University, Providence, R.I., S. Rounds; Cedars Sinai–UCLA, Los Angeles, M. Weinstein; Clinical Trials Surveys, Baltimore, B. Thompson; Mortality and Morbidity Review Committee: UCLA, Los Angeles, H. Paulus and S. Levy; Johns Hopkins University, Baltimore, D. Martin. The following people and institutions participated in the Scleroderma Lung Study 2: University of Boston, Boston: A.C. Theodore, R.W. Simms, E. Kissin and F.Y. Cheong; Georgetown University, Washington, D.C.: V.D. Steen, C.A. Read Jr., C. Fridley and M. Zulmatashvili; Johns Hopkins University, Baltimore: R.A. Wise, F.M. Wigley, L. Hummers and G. Leatherman; Medical University of South Carolina, Charleston: R.M. Silver, C. Strange, F.N. Hant, J. Ham, K. Gibson and D. Rosson; University of California, Los Angeles (UCLA), Los Angeles: D.P. Tashkin, R.M. Elashoff, M.D. Roth, P.J. Clements, D. Furst, S. Kafaja, E. Kleerup, D. Elashoff, J. Goldin, E. Ariola, G. Marlis, J. Mason-Berry, P. Saffold, M. Rodriguez, L. Guzman and J. Brook; University of California, San Francisco (UCSF), San Francisco: J. Golden, M.K. Connolly, A. Eller, D. Leong, M. Lalosh and J. Obata; University of Illinois, Chicago: S. Volkov, D. Schraufnagel, S. Arami and D. Franklin; Northwestern University, Chicago: J. Varga, J. Dematte, M. Hinchcliff, C. DeLuca, H. Donnelly and C. Marlin; University of Medicine and Dentistry of New Jersey, New Brunswick: D.J. Riley, V.M. Hsu and D.A. McCloskey; University of Michigan, Ann Arbor: K. Phillips, D. Khanna, F.J. Martinez, E. Schiopu and J. Konkle; University of Texas, Houston: M. Mayes, B. Patel, S. Assassi and F. Tan; National Jewish Health, Denver: A. Fischer, J. Swigris, R. Meehan, K. Brown, T. Warren and M. Morrison; University of Utah, Salt Lake City: M. B. Scholand, T. Frecht, P. Carey and M. Villegas; University of Minnesota, Minneapolis: J. Molitor and P. Carlson.
We are also grateful to Bristol-Myers Squibb for supplying cyclophosphamide for use in SLS I and to Hoffmann-LaRoche for supplying mycophenolate mofetil for use in SLS II.
Handling editor Tore K Kvien
Contributors All of the authors have contributed to the data collection, interpretation of the data and the writing of the manuscript in its multiple drafts. C-HT and RE were principally responsible for the statistical analysis of the data.
Funding Supported by grants from the US Public Health Service (U01 HL 60587 and U01 HL 60606 for SLS I and R01 HL 089758 and R01 HL 089901 for SLS II).
Competing interests None.
Patient consent Obtained.
Ethics approval IRBs of the David Geffen School of Medicine at UCLA and all of the clinical centres participating in SLS I and SLS II.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Anonymised data from SLS I are available to investigators on application to the SLS I Executive Committee (DPT dtashkin@mednet.ucla.edu). Data from SLS II will become similarly available after completion of the treatment phase of the study and publication of the main results.