Background Skin involvement is of major prognostic value in systemic sclerosis (SSc) and often the primary outcome in clinical trials. Nevertheless, an objective, validated biomarker of skin fibrosis is lacking. Optical coherence tomography (OCT) is an imaging technology providing high-contrast images with 4 μm resolution, comparable with microscopy (‘virtual biopsy’). The present study evaluated OCT to detect and quantify skin fibrosis in SSc.
Methods We performed 458 OCT scans of hands and forearms on 21 SSc patients and 22 healthy controls. We compared the findings with histology from three skin biopsies and by correlation with clinical assessment of the skin. We calculated the optical density (OD) of the OCT images employing Matlab software and performed statistical analysis of the results, including intraobserver/interobserver reliability, employing SPSS software.
Results Comparison of OCT images with skin histology indicated a progressive loss of visualisation of the dermal–epidermal junction associated with dermal fibrosis. Furthermore, SSc affected skin showed a consistent decrease of OD in the papillary dermis, progressively worse in patients with worse modified Rodnan skin score (p<0.0001). Additionally, clinically unaffected skin was also distinguishable from healthy skin for its specific pattern of OD decrease in the reticular dermis (p<0.001). The technique showed an excellent intraobserver and interobserver reliability (intraclass correlation coefficient >0.8).
Conclusions OCT of the skin could offer a feasible and reliable quantitative outcome measure in SSc. Studies determining OCT sensitivity to change over time and its role in defining skin vasculopathy may pave the way to defining OCT as a valuable imaging biomarker in SSc.
- Systemic Sclerosis
- Outcomes research
- Autoimmune Diseases
The term Scleroderma (from the two Greek words ‘scleros’=hard and ‘derma’=skin) defines any form of skin fibrosis including linear scleroderma, morphea and systemic sclerosis (SSc). SSc is clinically heterogeneous and has a prominent autoimmune component that affects the skin and major internal organs with variable severity and prognosis.1–3 The major clinical features of SSc are related to the severe fibrotic changes occurring in multiple tissues and very prominently in the microvasculature. The extent and rate of progression of tissue fibrosis is of paramount importance in determining the clinical features and the prognosis of SSc, correlating with both survival and functional limitations.4 ,5
The clinical semiquantitative assessment of skin thickness (modified Rodnan skin score or mRSS) is currently the gold standard for skin evaluation in SSc and it is often the primary outcome measure of intervention clinical trials but has several limitations, including interobserver variability and high level of skill required for its utilisation.6 ,7 Consequently, imaging modalities are being increasingly employed to measure SSc related skin disease and, thus far, high frequency ultrasound (HFUS) of skin has proved reliable for dermal thickness measurement.8–10 However, the resolution provided even by the highest frequency probes that are currently available is not adequate to depict the finer superficial skin structures, including the dermal–epidermal junction (DEJ) region,10 which makes corresponding histological evaluation poorly feasible. This is of considerable importance for a biomarker in SSc since it has been suggested that the fibrosis of the superficial papillary dermis (PD) precedes the reticular dermis (RD) in SSc and that the other major histological findings of skin fibrosis in scleroderma1 (eg, capillary rarefaction and perivascular infiltration) are foremost represented in the superficial dermis. Indeed, recent work has shown a pivotal interaction between keratinocytes and superficial dermis in early skin fibrosis in SSc.11 ,12 Therefore, we hypothesised that an imaging technique capable of visualising with high definition the first millimetre below the surface of the skin could offer a potential sensitive imaging biomarker to assess and quantify skin fibrosis.
Optical coherence tomography (OCT) is a powerful imaging technology which employs a low-intensity infrared laser beam and is capable of producing high-contrast images of skin up to 2 mm deep with resolutions of 4–10 μm, both features making it an ideal tool to explore the most superficial layers of the skin. It is routinely used in ophthalmology and increasingly used in cardiology and dermatology.13–15 This is the first study to perform histological comparison and explore OCT as a potential imaging technique for the assessment of SSc. This has considerable potential value as an outcome measure in SSc.
Twenty-two patients attending the Scleroderma out-patient clinic at Chapel Allerton Hospital, Leeds, UK, were enrolled in the study. Twenty-one of them fulfilled the American College of Rheumatology Classification Criteria for SSc;16 one patient suffered from plaque morphea. Twenty-two healthy volunteers were recruited from the staff. All patients and healthy volunteers underwent clinical and demographic data collection including mRSS assessment by a fully trained rheumatologist, which was done independently of the OCT evaluation. All subjects had OCT scans performed on the dorsal and volar aspects of at least two fingers, the dorsal aspect of the hands, and the dorsal and volar aspects of the forearms, for a total of 458 OCT scans. The scanned skin regions included both clinically involved (mRSS=1, 2, 3) and uninvolved (mRSS=0) skin. The mRSS=0 group also included skin in the atrophic phase. The morphea patient underwent OCT scanning at clinically involved and uninvolved skin regions of the forearms. Local ethical committee approval was obtained for both the biopsies and the non-invasive imaging study of the skin. All patients and controls signed informed consent to participate in the study.
Two diffuse cutaneous systemic sclerosis (dcSSc) patients with ≤3-year-disease duration from skin sclerosis onset and one healthy control (HC) underwent skin biopsy in the same OCT scanned region, which was used for histopathological comparison of the findings. All three biopsies were processed for H&E staining. The average A-scan curves and corresponding histology were overlaid and calibrated using Photoshop software (Adobe Systems Inc.) to align the corresponding scale bars of the histology and OCT images. Three-dimensional representations of the OCT data were produced using ImageJ17 and Voxx18 software (see online supplementary figure S1).
Five HC and four SSc patients underwent OCT scans on the forearms (total 14 sites) by two different operators in the same scanning session. A total of 26 forearm sites from 11 HC and seven SSc patients were scanned by the same operator in two different scanning sessions, 1 week apart. The mRSS at the site of analysis ranged from 0 to 3.
Optical coherence tomography
The OCT scans were performed using the ‘VivoSight’ topical OCT probe (Michelson Diagnostics), which comprises four parallel Swept-Source OCT systems using a laser with a central wavelength of 1310 nm and a 150 nm laser sweep. For each sample, the handheld OCT probe was used to automatically capture 100 OCT 4 mm B-scans with an inter-frame spacing of 4 μm. The scan is performed by direct contact through a disposable stand-off, which maintains the correct focal distance between tissue and scanner, and does not require the use of any gel. The resultant 4×0.4×2 mm data volume (lateral×lateral×depth) was instantly reviewed before being stored in DICOM and TIFF file formats for later analysis. Mean A-scans (mean OCT signal plotted against depth-in-tissue) were derived for each data volume using custom processing written in Matlab (Mathworks Inc.). The processing first detected and flattened the skin surface to remove any natural irregularities as well as any patient–operator movement that occurred during the scan and then averaged the data at each depth beneath the skin surface. In normal skin, two clear peaks can be identified in the mean A-scan. The first peak is at the skin surface and is caused by the difference between the refractive indices of air and the stratum corneum. The second peak corresponds to the PD, which appears hyper-reflective in OCT images, in contrast to the DEJ, which appears as a hypo-reflective valley. The distance between the entrance peak and the valley preceding the second peak can therefore be taken to represent the epidermis (ED) thickness.13
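The flattening and averaging steps described above can be sketched as follows. This is a minimal illustration in Python, not the custom Matlab processing used in the study: the threshold-based surface detection, the function names and the list-of-columns data layout are all assumptions made for the example.

```python
def detect_surface(column, threshold):
    """Return the index of the first sample at or above `threshold`.

    Assumption: the skin surface is the first bright sample in a column.
    The study's actual surface detector is not described in the text.
    """
    for depth_index, value in enumerate(column):
        if value >= threshold:
            return depth_index
    return None  # no surface found in this column


def mean_a_scan(columns, threshold, n_depths):
    """Flatten each column to its detected surface, then average per depth.

    `columns` is a list of A-scans (each a list of intensities ordered from
    shallow to deep). Columns with no detectable surface are skipped.
    """
    sums = [0.0] * n_depths
    counts = [0] * n_depths
    for column in columns:
        surface = detect_surface(column, threshold)
        if surface is None:
            continue
        # Re-reference depth so that index 0 is the skin surface.
        flattened = column[surface:surface + n_depths]
        for i, value in enumerate(flattened):
            sums[i] += value
            counts[i] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]
```

In this sketch, columns whose surface sits at different pixel rows are shifted to a common origin before averaging, so surface irregularities and small movements do not blur the depth profile.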
The highest optical density (OD) at the skin surface was assigned a value of 1, and all the other values were expressed relative to it and referred to as relative OD. The scale bars were added to the images using ImageJ,17 which converts the distance in pixels into micrometres using a fixed scaling value of 0.74; this conversion factor is based upon measurements of refractive index in tissue at 1310 nm in previous work and automated isotropy correction applied by the VivoSight equipment.
Continuous variables were expressed as mean and standard deviation (SD) or error (SE) as appropriate. The OD values of the mean A-scan of each group were correlated with clinical skin score using Pearson's correlation test and compared using an unpaired t test or one-way analysis of variance (ANOVA) when appropriate. The Bonferroni test was used for multiple comparisons correction. The intraclass correlation coefficient (ICC) was calculated to estimate the intraobserver and interobserver reliability. The ICC values were classified according to the scale from Altman, in which values >0.80 were considered as excellent agreement.19 Finally, Bland-Altman bias and 95% limits of agreement (LoA) were calculated as an indication of the typical error associated with these measurements. A p value <0.05 was considered statistically significant. Data were analysed employing the SPSS software, V.18 (SPSS, Chicago, Illinois, USA).
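The normalisation to relative OD can be illustrated with a short Python sketch. It assumes that the surface entrance peak is the largest value in the mean A-scan, which holds for the curves described here but is an assumption of the example rather than a documented step of the study's Matlab code; the function name is hypothetical.

```python
def relative_od(mean_scan):
    """Scale a mean A-scan so that the surface peak equals 1.

    Assumption: the surface peak is the maximum of the whole scan.
    """
    surface_peak = max(mean_scan)
    if surface_peak <= 0:
        raise ValueError("mean A-scan has no positive signal to normalise by")
    return [value / surface_peak for value in mean_scan]
```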
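For readers unfamiliar with the Bland-Altman quantities mentioned above, the following Python sketch shows the standard calculation of bias and 95% LoA from paired measurements. The analysis in the study was run in SPSS; the function name is hypothetical and the input values in the usage example are invented purely for illustration.

```python
import statistics


def bland_altman(first, second):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    bias = mean of the paired differences
    LoA  = bias +/- 1.96 * sample SD of the differences
    """
    if len(first) != len(second):
        raise ValueError("paired measurements must have equal length")
    if len(first) < 2:
        raise ValueError("at least two pairs are needed to estimate the SD")
    differences = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Illustrative values only, not study data.
session_1 = [0.60, 0.62, 0.58, 0.64]
session_2 = [0.59, 0.63, 0.57, 0.65]
bias, lower, upper = bland_altman(session_1, session_2)
```

A bias close to zero with narrow limits indicates that repeated measurements agree closely; the ICC addresses a related but distinct question and is not computed in this sketch.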
Patients and controls
We enrolled 11 SSc patients with diffuse (all at fibrotic phase, five of them with ≤3-year-disease duration from skin sclerosis onset) and 10 with limited subset (at either fibrotic or atrophic phase) according to LeRoy and coworkers.20 All patients had a scleredema score=0. Nineteen were female. The mean age was 53.8±3.5 years. Additionally one patient with morphea (male subject, aged 50 years) and 22 healthy volunteers (16 female subjects) were recruited as controls. The mean age of HC was 49±2.1 years. The mean age of SSc patients and HC was not statistically different (p>0.05). Clinical summary of the SSc patients enrolled is shown in table 1.
Virtual skin biopsy by OCT
OCT image acquisition allowed the reconstruction of a virtual skin biopsy measuring 4×0.4×2 mm. The main structure of the healthy skin was easily recognisable by OCT (figure 1). As already described,13 the stratum corneum was visualised as a somewhat irregular and highly reflective line on the skin surface. The ED was visualised with lower reflective properties compared with the stratum corneum and the PD underneath. Of particular note, the DEJ was identifiable as a low density line between the ED and the PD. The transition to the RD was similarly recognisable by the weaker scattering properties of the RD. Blood vessels appeared as signal-free tube-like structures (figure 1).
In contrast, the OCT images of SSc patients (patients with ≤3-year-disease duration from skin sclerosis onset and mRSS at the site=3) showed a skin surface without any superficial folds, and the ED was visualised as a homogeneous textured layer and appeared less hypo-reflective than in normal skin (figure 1B). Consequently, the visualisation of the DEJ was difficult. Furthermore, the OD faded quite abruptly in the PD, not allowing any clear distinction between PD and RD. Blood vessels were less numerous than in normal skin. The 3D rendering software allowed virtual slices to be cut along all three axes, which could be used for further analysis (see online supplementary figures S1 and S2).
Histological comparison of OCT findings
Quantitative analysis of the mean A-scan OCT signal generated a bimodal curve, already described.13 Histology comparison by overlay of the mean A-scan curve with H&E image from the corresponding skin biopsy confirmed that the first OD peak in the A-scan corresponded to the stratum corneum (figure 2A). The ED showed a consistent decrease in OD along its depth reaching a minimum (Min OD=0.6024) at 80 μm from the surface (green arrow, figure 2B,E). In contrast, the PD showed an increase in OD up to a second peak in the A-scan (OD=0.6392), which was placed at 128 μm of depth in the healthy skin controls (green arrowhead, figure 2B,E). As a consequence, OCT was able to identify the DEJ as a valley depicted by the decreasing OD in the ED and the increasing OD in the PD. The OD then faded gradually through the RD.
Comparison of OCT and skin biopsies from two dcSSc patients with severe skin involvement (mRSS=3 at the site of analysis) indicated that the OD continued to drop through the most superficial part of the PD making the visualisation of the DEJ impossible (figure 2C,D). The minimum OD was reached deeper, at 124 μm from the surface (red arrow, figure 2C–E). Additionally, a weak second peak in OD was visible more deeply in the PD, at 160 μm depth (OD=0.5615) (red arrowhead, figure 2C–E), sketching a subtle valley well within the dermis. The OD then faded more abruptly with a deep decrease of signal before 0.5 mm of depth.
OCT analysis of affected and unaffected skin in a patient with plaque morphea showed patterns similar to those of severe SSc and HC, respectively. Indeed, the mean A-scan of the affected skin did not show the typical valley of the DEJ and the second OD peak was remarkably deeper than in the unaffected skin (OD=0.4787 at 152 μm depth vs OD=0.6726 at 124 μm depth) (figure 3).
An OCT based algorithm for determining quantitative measures of skin fibrosis
Having compared OCT with histology, we further investigated the potential of OCT skin imaging by analysing the forearm A-scans of all the HC and all the SSc patients enrolled in the study, the latter classified by mRSS at the site of the OCT scan. The average of the mean A-scan in the different groups showed that the most apparent differences were within the first 500 μm of depth (figure 4A).
We noted that the increase in OD following the first valley from the surface (DEJ in the healthy skin) was the only positive gradient observable in the mean A-scan (figure 3B). Hence, we developed an automated algorithm in Matlab to analyse the Min OD before the gradient, the Max OD of the PD peak (figure 4B) and the area defined by the two gradients (data not shown). Furthermore, we observed that the OD of the HC reached 50% of the maximum value (OD=0.5) at 296 μm of depth (±16 μm). To determine whether this value could be of any utility in quantifying the drop of OD observed in SSc, we considered the value of OD at 296 μm (±16 μm) for further analysis (OD300, figure 4B).
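The logic of extracting Min OD, Max OD and OD300 from a relative-OD curve can be sketched in Python as below. This is a hedged illustration of the idea only, not the Matlab algorithm used in the study: the function name, the `step_um` depth-sampling parameter and the rule of taking the first rising sample as the start of the positive gradient are assumptions, and real data would likely need smoothing before such a rule is applied. The area between the two gradients is not computed here.

```python
def analyse_a_scan(rel_od, step_um, od300_depth_um=296.0):
    """Extract Min OD, Max OD and OD300 from a relative-OD mean A-scan.

    rel_od         -- relative OD values, index 0 at the skin surface
    step_um        -- depth spacing between samples in micrometres (assumed)
    od300_depth_um -- depth at which OD300 is read (296 um in the text)
    """
    # Find the valley: the first sample after which the OD starts to rise.
    valley = None
    for i in range(len(rel_od) - 1):
        if rel_od[i + 1] > rel_od[i]:
            valley = i
            break

    result = {"min_od": None, "min_depth_um": None,
              "max_od": None, "max_depth_um": None, "od300": None}

    if valley is not None:
        # Follow the positive gradient up to the second (PD) peak.
        peak = valley
        while peak + 1 < len(rel_od) and rel_od[peak + 1] > rel_od[peak]:
            peak += 1
        result.update(min_od=rel_od[valley], min_depth_um=valley * step_um,
                      max_od=rel_od[peak], max_depth_um=peak * step_um)

    # OD300: value at the sample closest to the requested depth.
    index_300 = round(od300_depth_um / step_um)
    if index_300 < len(rel_od):
        result["od300"] = rel_od[index_300]
    return result
```

A curve with no positive gradient yields `None` for the Min and Max entries while still reporting OD300, which mirrors the complementary role the two kinds of measure play in the analysis.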
Correlation of OCT values with clinical score
The overlay of all the mean A-scans grouped by clinical score at the site of analysis suggested that the classical bi-modal pattern of decrease in OD was progressively lost with increasingly severe mRSS (figure 4C), reflecting the progressive loss of visualisation of the DEJ. Direct comparison of the Max OD values in HC and the four mRSS score subgroups indicated that there was a significant overall difference in the Max OD of the PD across the five groups (p<0.0001). After Bonferroni correction for multiple comparisons, the difference remained significant between HC or mRSS=0 and patients with mRSS=2 or 3, suggesting that this measure could not discriminate between HC and patients with mRSS=0 or 1. Indeed, an unpaired t test between HC and mRSS=0 showed no significant difference (figure 4D). Nevertheless, Max OD values showed a robust correlation within the mRSS groups (r=−0.7). Similarly, Min OD values were significantly different across the five groups (p<0.001) and showed significant correlation within different mRSS subgroups (r=−0.61). In this case too, there were no significant differences between HC and patients with mRSS=0 or 1 (p>0.05) (figure 4D).
The OD300 showed a complementary discriminative value compared with Max and Min OD. In fact, OD300 did not correlate among mRSS subgroups (data not shown) but was significantly different between HC and patients with mRSS=0 (p<0.001) or mRSS=1 (p<0.0001) (figure 4F).
We were able to assess the reliability of the technique only in a subgroup of HC and SSc patients due to operator availability limitations. Nevertheless, the intraobserver reliability was excellent (ICC>0.80) for Min and Max OD (ICC=0.96 and 0.91, respectively) (figure 5). Similarly, the interobserver reliability was excellent for Min and Max OD (ICC=0.91 and 0.89, respectively). Bland-Altman plots, bias and LoA for both intraobserver and interobserver reliability of Min and Max OD are shown in figure 5.
The current gold standard for semiquantitative assessment of skin fibrosis, the mRSS, suffers from several shortcomings, including the subjectivity of skin palpation assessments and the high level of skill required from the clinical investigator. Even more importantly, a meta-analysis of three independent studies determined an overall within patient interobserver SD of five units independently of the mean skin score,6 ,21 which represents an SE ranging from 20% to 26%. A primary outcome measure with an SE of 25% entails the recruitment of a large number of patients to detect minimally significant changes with statistical validity, a task often difficult to accomplish given the comparatively low incidence of SSc.
A robust imaging biomarker for the assessment of skin fibrosis in SSc has not previously been reported. Herein we report the first study aimed at validating OCT for the quantitative assessment of skin involvement in SSc.
To date, the limited data on surrogate outcome measures for skin involvement are largely composed of histopathological or molecular changes in affected skin.22 ,23 Although conceptually very valuable, these studies, involving skin biopsies, are invasive and limited because of a site bias, referring to only one precise body area. Moreover, they are difficult to repeat in a longitudinal manner and showed no sensitivity to change over time.24 In this study, we evaluated OCT skin scanning as a reliable and quantitative tool that could be used as a surrogate marker of skin fibrosis. The technique requires minimal operator training, takes less than 10 s per site examined, and offers the great advantage of saving image files for further or centralised operator independent analysis. The latter is particularly useful, limiting the ‘hands on’ time in the clinic office and allowing a centralised, blinded assessment of results in clinical trials.
We observed an excellent correlation of OCT mean A-Scan curves and mRSS score at the site of analysis. More importantly, the corroboration of our OCT findings with pathological changes at the DEJ provides a robust construct validity for the technique. Of interest, we found that the changes of the OD of the dermis in SSc are similar to the ones observed in a case of plaque morphea, corroborating even further the potential value of OCT in measuring skin fibrosis.
Comparative analysis of SSc skin biopsies and OCT images showed that the decrease in OD in the PD and the loss of the typical increase in OD right below the germinal layer of the ED are measures that do reflect the fibrotic skin involvement in SSc. Indeed both Min and Max OD did not differ between HC and patients with no or mild fibrotic involvement, whereas they showed a very robust correlation with the severity of skin thickness. The correlation between the depth at which we observe the valley in OD (Min OD) and the mRSS is still of uncertain clinical relevance but does pave the way for the use of this measure as an imaging biomarker of progressive skin fibrosis. The sensitivity to change over time of this measure needs to be assessed to determine whether this could be used as an outcome of response to therapy or improvement of skin fibrosis over time.
Furthermore, the analysis of the differences in Min OD, Max OD and OD300 indicated that these values could be complementary in measuring different aspects of SSc skin pathology. Min and Max OD showed a regular pattern of decrease with the worsening of clinical score but could not measure the difference between mRSS=0 and HC, whereas OD300 was able to detect a significant difference between HC and patients with clinically unaffected or mildly affected skin. In this regard, the significant difference between healthy skin and patients with mRSS=0 as measured by the OD300 may prove, if confirmed in larger studies, extremely useful for clinical studies and may mirror recent microarray studies suggesting that clinically unaffected skin shares the peculiar gene signatures and therefore the pathology of affected SSc skin.23 At the same time, microarray data and our OCT findings may suggest that the employment of mRSS as primary outcome may not reflect potentially important biological changes in SSc. The longitudinal study of OCT in SSc patients should also consider whether the difference that we notice in the RD of patients with mRSS=0 is correlated more with preclinically involved skin or atrophic skin.
With respect to reproducibility, in this study both the intraobserver and interobserver reliability were excellent, again suggesting that OCT could be a very valuable tool in quantitative assessment of skin fibrosis. The SE of the technique averages 5%, which, compared with mRSS, clearly offers a more accurate tool for skin assessment. However, we plan to perform, within the longitudinal study, a larger prospective reliability study involving two operators to validate these positive pilot findings.
Additionally, the latest generation OCT probes are extremely easy to handle. The probe is applied directly to the skin, the scanning lasts a few seconds and it is easily used at different anatomical sites.15
HFUS has been recently suggested to offer a quantitative assessment of skin thickness in SSc by several studies.8–10 In contrast with ultrasound, OCT does not require any use of gels, gives higher-resolution images, and its analysis algorithm is automatic, not involving any operator interpretation. Nevertheless, since the penetration of OCT is limited to the first millimetre of skin, OCT and HFUS may be explored as complementary imaging biomarkers in SSc.
The potential of OCT to yield an objective measurement of the number of vessels is clearly another point of interest in the technique. We are currently developing a software algorithm for the quantification of the total vessel volume and plan to include this measurement in the longitudinal study we are about to begin.
Though we do not have the data to support OCT as a replacement of skin histology for differential diagnosis of skin disease, the results presented here strongly support the use of OCT in longitudinal studies to evaluate response to therapy in patients with SSc and, potentially, to detect preclinical changes in SSc skin.
Our study had some limitations. First of all, since the aim of this study was to explore the potential of a novel imaging tool never used before for SSc skin assessment, we scanned only selected skin regions and were not able to determine the sensitivity in all regions and compare the results with total mRSS. Second, though we did not deem it necessary to compare the findings with histology in all 21 patients, both SSc skin biopsies were from sites with mRSS=3, whereas a direct comparison with less affected skin would offer a more complete validation.
Nevertheless, our findings strongly support the development of a novel OCT based technique as an outcome measure in skin research. This technique is capable of measuring both the fibrotic aspects of disease and potentially reversible changes including blood vessel number and visualisation of the DEJ region. Longitudinal studies including a larger number of patients with different degrees of fibrosis and assessing the sensitivity to change over time and over treatment are needed and currently in progress to validate OCT as a quantitative outcome measure in SSc.
The authors wish to acknowledge Dr Alper Aydin for his irreplaceable suggestions that helped in the design of the study.
Handling editor Tore K Kvien
Contributors GA acquired the clinical and OCT data, contributed to experimental plan design, performed all the statistical analysis and drafted the manuscript. SZA, VL and CC-G contributed to the acquisition of OCT data. AM and DW performed Matlab image analysis. SZA, RJW, DGM and PE contributed to experimental plan design and critical revision of the data and the manuscript. FDG conceived and coordinated the study as well as the revision of the data and the drafting of the manuscript.
Funding This work was funded in part by NIHR Leeds Musculoskeletal Biomedical Research Unit grant to PE and FDG, and by EULAR ODP grant to FDG.
Competing interests AM and DW are employees of Michelson Diagnostics.
Patient consent Obtained.
Ethics approval LTHT REC.
Provenance and peer review Not commissioned; externally peer reviewed.