Introduction

Tocilizumab (TCZ), a humanized anti-IL-6 receptor antibody, has been shown in previous clinical trials not only to improve the symptoms of rheumatoid arthritis (RA) but also to prevent progressive joint destruction in patients with moderate to severe RA refractory to conventional disease-modifying antirheumatic drugs (DMARDs), whether administered as monotherapy or in combination with conventional DMARDs [1–6]. The results of the RADIATE study also suggest that TCZ is a safe and effective alternative for patients who fail to respond to anti-tumor necrosis factor (TNF) therapy [7]. TCZ was approved for clinical use against RA in Japan in April 2008. Since then, actual clinical practice has confirmed that TCZ is effective for treating RA patients refractory to conventional DMARDs or anti-TNF agents [8–10].

However, evaluating clinical activity in RA patients receiving TCZ is difficult because TCZ blocks IL-6 signaling and rapidly suppresses the serum C-reactive protein (CRP) level and the erythrocyte sedimentation rate (ESR), both of which are components of the Disease Activity Score in 28 joints (DAS28).

Over the past decade, musculoskeletal ultrasonography (MSUS) has been established as a new imaging modality for assessing RA-affected joints. Ultrasonography is reported to be more sensitive and reliable than physical examination in the detection of synovial hypertrophy, effusion, and inflammatory activity [11–14]. Power Doppler ultrasonography (PDUS) in particular detects synovial perfusion in inflamed joints, and a decrease in composite power Doppler (PD) signal scores in response to treatment correlates significantly with the DAS28 score and with CRP and ESR [11, 15]. PDUS is also a useful tool for monitoring patients under TNF antagonist therapy, and PDUS findings have predictive value for radiographic outcomes [16–19].

In this study, we prospectively monitored joint lesions by ultrasonography for the first 12 months of TCZ therapy and evaluated the responsiveness of ultrasonography compared with conventional measures of disease activity and structural damage.

Patients and methods

Patients

Seven patients with RA according to the American College of Rheumatology (formerly, the American Rheumatism Association) 1987 criteria [20], who were refractory to DMARDs, including TNF inhibitors, were enrolled in the study. They were treated with TCZ (8 mg/kg) every 4 weeks, with or without DMARDs and low-dose prednisolone. The patients underwent clinical, laboratory, and PDUS evaluation at baseline and at 1, 3, 6, 9, and 12 months. Radiographs of the hands were obtained at baseline and after 12 months. The study was conducted in accordance with the Declaration of Helsinki, and informed consent was obtained from all patients before study enrollment.

Clinical and laboratory assessment

At each visit, patients were evaluated clinically by the same physicians who assessed 28 joints (the bilateral glenohumeral, elbow, and wrist joints, metacarpophalangeal joints, proximal interphalangeal joints of the fingers, and knee joints) for tenderness and swelling. The general VAS (gVAS, 100-mm visual analog scale) was rated individually for each patient.

CRP, as an inflammatory marker, and matrix metalloproteinase-3 (MMP-3) were measured. Disease activity was estimated by calculating the CRP-based DAS28 and the Clinical Disease Activity Index (CDAI).
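The two indices can be illustrated with a short sketch. The text does not give the formulas, so the code below assumes the commonly published definitions of DAS28-CRP (CRP in mg/L, global assessment on a 100-mm VAS) and CDAI (patient and evaluator global assessments on 0–10 cm scales). The function and parameter names are hypothetical, and the use of an evaluator global assessment is an assumption, since only the patient-level gVAS is described here.

```python
import math


def das28_crp(tjc28, sjc28, crp_mg_per_l, global_vas_mm):
    """DAS28-CRP from the commonly published formula (assumed, not from this study).

    tjc28, sjc28   : tender and swollen joint counts (0-28)
    crp_mg_per_l   : serum CRP in mg/L (unit is an assumption)
    global_vas_mm  : global assessment on a 100-mm VAS
    """
    return (
        0.56 * math.sqrt(tjc28)
        + 0.28 * math.sqrt(sjc28)
        + 0.36 * math.log(crp_mg_per_l + 1.0)
        + 0.014 * global_vas_mm
        + 0.96
    )


def cdai(tjc28, sjc28, patient_global_cm, evaluator_global_cm):
    """CDAI as the plain sum of joint counts and two 0-10 cm global scores."""
    return tjc28 + sjc28 + patient_global_cm + evaluator_global_cm


# Hypothetical patient: 4 tender and 4 swollen joints, CRP fully suppressed.
print(round(das28_crp(4, 4, 0.0, 0.0), 2))
print(cdai(4, 3, 5.0, 4.0))
```

Because the CRP term enters through a logarithm, suppressing CRP to near zero removes that term entirely, which is one way to see why a CRP-based score may understate activity under IL-6 blockade.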

PDUS assessment

PDUS was performed by two well-trained rheumatologists: one scanned the target joints to obtain images, and both agreed on the assessment of the PD score. They were blinded to the clinical, laboratory, and radiographic findings. An Aplio SSA-700A (Toshiba, Tokyo, Japan) with linear array transducers (12 MHz for fingers and hands, 7.5 MHz for knees) was used in this study. The ultrasound scanning method has been described previously [21–24]. Of the 28 joints, 24 (excluding the bilateral glenohumeral and elbow joints) were assessed by PDUS. The joints were scanned longitudinally and transversely from the dorsal view. PD imaging was performed by selecting a region of interest that included the bony margins and synovial site. PD signals in each joint were graded on a semiquantitative scale of 0–3 (0: absent [no synovial flow]; 1: mild [single-vessel signal or isolated signals]; 2: moderate [confluent signals in less than half of the synovial area]; 3: marked [signals in more than half of the synovial area]), and the joint score corresponded to the maximum score obtained from the synovial sites evaluated in that joint [11]. The total PD score was calculated as the sum of the individual joint scores at each examination.
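The scoring rule described above (maximum grade per joint, summed over joints) can be sketched as follows. The joint labels, the data layout, and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
def joint_pd_score(site_grades):
    """Return the PD score of one joint: the maximum grade over its scanned sites."""
    if not site_grades:
        raise ValueError("at least one synovial site must be graded")
    for grade in site_grades:
        if grade not in (0, 1, 2, 3):
            raise ValueError(f"PD grade must be 0, 1, 2, or 3, got {grade!r}")
    return max(site_grades)


def total_pd_score(grades_by_joint):
    """Sum the per-joint PD scores for one examination."""
    return sum(joint_pd_score(grades) for grades in grades_by_joint.values())


# Hypothetical examination with three joints; each list holds the grades
# of the synovial sites scanned in that joint.
examination = {
    "right wrist": [2, 1, 0],
    "left MCP3": [1],
    "right knee": [0, 0],
}
print(total_pd_score(examination))
```

With 24 joints graded 0–3, the total PD score under this rule ranges from 0 to 72.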

To calculate intraobserver reliability, the ultrasound investigators scored the PD signals of 50 images randomly selected from stored images and then, after an interval, re-scored the same images arranged in a different order. Interobserver reliability was evaluated by using the first-assessed scores.

Radiographic assessment

Two radiologists who were unaware of the clinical and ultrasound findings measured structural damage of the hands at baseline and at 12 months by using the Genant-modified Sharp scoring system; total score (with a maximum possible score of 200) was composed of erosion score (maximum possible 100) plus joint-space narrowing score (JSN, maximum possible 100) [25, 26].

Interobserver reliability was assessed by comparing baseline scores.

Statistical analysis

The data are reported as mean ± SE. The paired t-test was used to test for differences. Correlations between each of the clinical, laboratory, and ultrasound parameters were obtained by Pearson’s correlation coefficient. To compare these parameters with radiographic progression, changes in each parameter during the study were evaluated by calculating time-integrated values (TIV) throughout the year using the area under the curve (AUC) method [27]. P values less than 0.05 were considered to be statistically significant.
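As an illustration of the time-integrated value, the sketch below computes an area under the curve over the study visits. The text cites the AUC method without giving details, so the trapezoidal rule and the use of months as the time unit are assumptions, and the function name is hypothetical.

```python
# Visit schedule of the study, in months from baseline.
VISIT_MONTHS = (0, 1, 3, 6, 9, 12)


def time_integrated_value(values, months=VISIT_MONTHS):
    """Area under the value-versus-time curve by the trapezoidal rule (assumed)."""
    if len(values) != len(months):
        raise ValueError("one value is required per visit")
    tiv = 0.0
    for i in range(1, len(months)):
        interval = months[i] - months[i - 1]
        # Area of one trapezoid: mean of the two visit values times the interval.
        tiv += 0.5 * (values[i - 1] + values[i]) * interval
    return tiv


# Hypothetical joint: grade 3 at baseline, no PD signal from month 1 onward.
print(time_integrated_value([3, 0, 0, 0, 0, 0]))
# Hypothetical joint with grade 3 persisting at every visit.
print(time_integrated_value([3, 3, 3, 3, 3, 3]))
```

Under these assumptions the value for a single joint is expressed in grade-months and cannot exceed 36 over the 12-month follow-up.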

Intraobserver reliability for the PD score of each joint was estimated by calculating the intraclass correlation coefficient (ICC). Interobserver reliabilities for the PD score and Sharp score of each joint were evaluated by using Cohen’s kappa value. Kappa values <0.40 were considered poor, 0.40–0.50 moderate, 0.50–0.70 good, and 0.70–1 excellent.
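A minimal sketch of the interobserver statistic is given below. It assumes the unweighted form of Cohen's kappa, which the text does not specify, and it assigns each shared boundary value (0.40, 0.50, 0.70) to the higher category, which is also an assumption. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map kappa to the categories used in this study (boundaries assumed)."""
    if kappa < 0.40:
        return "poor"
    if kappa < 0.50:
        return "moderate"
    if kappa < 0.70:
        return "good"
    return "excellent"


# Hypothetical PD grades given by two observers to the same eight images.
observer_1 = [0, 1, 2, 3, 0, 1, 2, 3]
observer_2 = [0, 1, 2, 3, 0, 1, 2, 2]
kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 3), interpret_kappa(kappa))
```

For ordinal grades such as the 0–3 PD scale, a weighted kappa is a common alternative that penalizes large disagreements more than small ones; the unweighted form is used here only for simplicity.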

Results

Patient characteristics

All patients completed the study without any severe adverse effects. Baseline characteristics of patients are shown in Table 1. All were women and refractory to one or more conventional DMARDs. Five patients had received anti-TNF agents previously, but switched to TCZ because of inefficacy or adverse effects. Methotrexate, azathioprine, and prednisolone were used in combination with TCZ in 5, 1, and 5 patients, respectively.

Table 1 Baseline characteristics of the patients

Course of clinical, laboratory, and PDUS findings

At baseline, total PD score of each patient correlated with tender joint count (TJC) (r = 0.90, P = 0.02), but not with other clinical parameters including CRP and DAS28.

The means of the clinical parameters TJC, gVAS, CDAI, and DAS28 improved rapidly within 3 months, and at the 6-month visit all patients had achieved a good response based on DAS28 and the criteria of the European League Against Rheumatism (Table 2). Rapid normalization of serum CRP levels was observed in each patient. Serum MMP-3 levels of each patient also decreased, correlating well with serum CRP levels at baseline (r = 0.86, P = 0.03) and at 1 month (r = 0.99, P = 0.0003). The changes in each patient’s clinical and laboratory parameters tended to follow the average. On the other hand, although the average total PD score appeared to decline in parallel with clinical improvement, the changes in each patient’s total PD score were diverse (Fig. 1): 1 patient (Pt. 5) with a high total PD score at baseline experienced a dramatic decrease in PD signals after only 2 courses of TCZ infusions; another patient (Pt. 2) did not obtain a response until 12 months; and in 1 patient (Pt. 3) the score increased with a clinical exacerbation at 9 months.

Table 2 Mean ± SE values for clinical, laboratory, and PDUS parameters at the baseline and follow-up assessments
Fig. 1

Changes in average and individual patients’ total PD scores. The average total PD score appeared to decline gradually in parallel with clinical improvement, but the changes in individual patients’ total PD scores were diverse

Radiographic progression

Radiographic progression of joint destruction was detected in 5 patients; in these patients, the mean Δtotal Sharp score was 3.78 (range 1.02–10.9), the mean Δerosion score was 2.24 (range 1.02–6.12), and the mean ΔJSN score was 1.54 (range 0–4.81). Among them, 1 patient had a flare-up, but the rest were judged by clinical assessment to be responding to TCZ treatment.

Predictors of final activity and joint destruction

To analyze which of the parameters could predict final disease activity and joint destruction, correlations between the TIV of each parameter and DAS28 at 12 months and ΔSharp score were calculated. TIVs of clinical parameters including gVAS, CDAI, and DAS28 correlated significantly with final DAS28 (gVAS: r = 0.90, P = 0.01; CDAI: r = 0.82, P = 0.04; DAS28: r = 0.85, P = 0.03), but no relationship with joint destruction was observed. On the other hand, TIV of total PD scores correlated with ΔSharp score (Δtotal: r = 0.77, P = 0.04; Δerosion: r = 0.78, P = 0.04; ΔJSN: r = 0.75, P = 0.05), but not with final DAS28.

Comparison between 1-year radiographic progression and cumulative PD scores in individual joints

Inflammation remaining in a joint is thought to be the main cause of bone and cartilage destruction, and previous studies have reported that the existence of synovial perfusion detected by PDUS is related to subsequent radiographic progression [17, 28, 29]. Based on these views, we compared the TIVs of the PD scores for individual joints (TIV-individual PD scores) throughout the study with the 1-year radiographic progression of the joint. The cut-off point for TIV-individual PD scores predicting an increase in total Sharp score was 16, as estimated by using receiver operating characteristic (ROC) curve analysis and the Youden index (AUC was 0.953, false positive fraction 0.034, true positive fraction 0.875). All but 1 of the joints showing radiographic progression were joints whose TIV-individual PD scores were 16 or more, and no progression of joint destruction was seen among the joints with no PD signals throughout the year (Fig. 2). Also, TIV-individual PD scores of each joint correlated with ΔSharp score (total score: r = 0.63, P < 0.0001; erosion score: r = 0.64, P < 0.0001; JSN score: r = 0.58, P < 0.0001) (Table 3).
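The cut-off selection can be illustrated with a small sketch of the Youden index, J = true positive fraction − false positive fraction, maximized over candidate cut-offs. The function name and the example data below are hypothetical and are not the study data; a joint is counted as test-positive when its score is at or above the cut-off, which is an assumption about the convention used.

```python
def youden_cutoff(scores, progressed):
    """Return (cutoff, J, tpf, fpf) for the cut-off with the largest Youden index."""
    positives = sum(1 for p in progressed if p)
    negatives = len(progressed) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both progressed and non-progressed joints are required")
    best = None
    for cutoff in sorted(set(scores)):
        # A joint is test-positive when its score is >= the candidate cut-off.
        tp = sum(1 for s, p in zip(scores, progressed) if p and s >= cutoff)
        fp = sum(1 for s, p in zip(scores, progressed) if not p and s >= cutoff)
        tpf = tp / positives
        fpf = fp / negatives
        j = tpf - fpf
        if best is None or j > best[1]:
            best = (cutoff, j, tpf, fpf)
    return best


# Hypothetical TIV-individual PD scores and 1-year progression flags.
tiv_scores = [0, 1.5, 4.5, 8.5, 12, 18, 16, 20, 33, 6]
progression = [False, False, False, False, False, False, True, True, True, True]
cutoff, j, tpf, fpf = youden_cutoff(tiv_scores, progression)
print(cutoff, round(j, 3), round(tpf, 3), round(fpf, 3))
```

When several cut-offs tie on J, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.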

Fig. 2

Proportion of joints with damage, grouped by TIV of PD scores of individual joints. Among the 12 joints with a TIV-individual PD score of ≥16, erosion had progressed in 7 joints (58.3%), JSN in 6 joints (50%), and total Sharp score in 7 joints (58.3%). Among the 46 joints with a positive TIV-individual PD score of <16, only 1 joint developed new erosion. No progression was seen among any of the joints with a TIV-individual PD score of 0

Table 3 Correlation (r) between time-integrated value (TIV) of PD scores of individual joints and radiographic progression

Representative images are shown in Fig. 3.

Fig. 3

Representative data. a Pt. 1. Residual Grade 2 PD signals were detected in the wrist for at least 9 months (TIV-individual PD score was 33). Carpal joint-space narrowing and erosion of the ulnar head progressed throughout the study. b Pt. 5. After 2 courses of TCZ infusions PD signals decreased dramatically in each joint (TIV-individual PD scores of right 3PIP = 4.5, left 3PIP = 1.5, and left 3MCP = 8.5). No radiographic progression was seen in these joints. PIP proximal interphalangeal joint, MCP metacarpophalangeal joint

Intra- and interobserver reliability

The intraobserver ICC for PD signals of each joint was 0.99 (95% confidence interval [95% CI] 0.98–0.99), and the interobserver kappa value was 0.92.

The interobserver kappa values for Sharp score were 0.96 for total score, 0.95 for erosion score, and 0.97 for JSN score.

Discussion

Although all patients enrolled in this study were refractory to previous treatments, including anti-TNF agents, they achieved a moderate or better response after TCZ therapy. Only 1 patient experienced an exacerbation. Nonetheless, in some patients, progressive radiographic damage was observed independent of clinical response. In contrast to clinical assessments, the cumulative PD signal indicated by the TIV of total PD scores was a strong predictor of joint destruction in these individuals. Moreover, when we focused on each joint, the relationship between cumulative PD signal and joint destruction was clearer: no joints without a PD signal had radiographic progression of joint damage, whereas a high TIV-individual PD score correlated with radiographic progression in both erosion score and JSN score. These results suggest that a high cumulative PD signal, which indicates that PDUS detected long-lasting synovitis in spite of TCZ treatment, can lead directly to joint destruction at a high rate.

Naredo et al. reported the relationship between radiographic progression and PDUS findings in 2 studies [17, 28]. Those studies showed that TIVs of PDUS parameters correlated strongly with radiographic progression among patients with early RA treated with DMARDs and among patients starting anti-TNF therapy. Our observation is consistent with these reports.

TCZ blocks IL-6 signaling and therefore suppresses inflammatory markers such as CRP or ESR without exception, regardless of ongoing synovitis. Although our study presents preliminary data from only a small number of patients, this is the first time it has been shown that, under TCZ therapy, PDUS can evaluate residual synovitis related to joint damage more sensitively than any of the other assessments included in this study.

Increased joint damage may cause functional impairment. In this era of biological agents, the goal of RA treatment is now to achieve not only clinical remission but also radiographic remission and no disability. Anti-TNF agents have excellent efficacy in inhibiting radiographic progression regardless of baseline levels of inflammatory markers, treatment response, or disease activity after treatment [30–34]. The SAMURAI study reported that the group receiving TCZ monotherapy showed less radiographic change than the DMARDs group [2, 6]. However, joint damage still increased significantly over time in some patients under these biologic therapies, and in such cases treatment may need to be intensified. This suggests that to achieve true radiographic remission, the response to treatment should be evaluated on a joint-by-joint basis in addition to using a conventional clinical score such as DAS28. From this perspective, because a high cumulative PD score tends to relate to joint destruction, PDUS is a powerful tool for monitoring the change in synovitis in each joint and is helpful in deciding the appropriate treatment plan.

There were several limitations in this study. First, the patients’ backgrounds were not uniform. Disease duration, previous treatments, adverse prognostic factors, and concomitant drugs were diverse among the patients. Furthermore, there was no control group. However, the aim of this study was to clarify the usefulness of PDUS in comparison with conventional clinical parameters in evaluating treatment response and, especially, in predicting structural damage. From this viewpoint, all of the enrolled patients were worthy of evaluation because they presented with moderate or greater disease activity at baseline and all had the potential for joint destruction to progress.

Second, although many reports showed that at baseline the composite PD score of several joints or modified DAS28 calculated by using the number of PD positive joints correlated well with DAS28 and CRP, in this study there were no correlations between these parameters. One explanation for this is that the baseline levels of CRP were low in some patients despite high total PD scores.

Third, the PDUS assessment differed from the Sharp scoring system in the method for assessing wrist lesions. We evaluated each wrist by PDUS in 3 areas (carpal joint, radiocarpal joint, and ulnocarpal joint), and the maximum PD score of the 3 was taken as the wrist’s PD score. On the other hand, in the Genant-modified Sharp scoring system each wrist was divided into 4 areas to determine the erosion score and 3 areas to determine JSN, and we regarded damage in the wrist joint as ‘progressed’ when progression was observed in at least 1 area. Therefore, we could not analyze accurately whether the location of residual PD signal corresponded with the site of radiographic progression, and it is possible that the correlation between ΔSharp score and TIV-individual PD score of the wrist was overestimated.

Intra- and interobserver reliabilities of PD scoring were excellent. Those reliabilities were calculated by using stored images, and we did not evaluate the reliability of acquiring appropriate PDUS images. However, in this study, 2 ultrasound operators took part in each scan, and they double-checked and conferred with each other to decide the score, thus raising the precision of the PDUS assessment.

In summary, this is the first report of PDUS monitoring of RA joint lesions in patients undergoing TCZ therapy. Although large-scale studies will be needed to obtain clearer conclusions, we found that ultrasonography can independently evaluate disease activity in RA patients receiving TCZ and is superior to DAS28, especially in predicting joint destruction.