2016 American College of Rheumatology/European League Against Rheumatism criteria for minimal, moderate, and major clinical response in adult dermatomyositis and polymyositis
  1. Rohit Aggarwal1,
  2. Lisa G Rider2,
  3. Nicolino Ruperto3,
  4. Nastaran Bayat2,
  5. Brian Erman4,
  6. Brian M Feldman5,
  7. Chester V Oddis1,
  8. Anthony A Amato6,
  9. Hector Chinoy7,
  10. Robert G Cooper8,
  11. Maryam Dastmalchi9,
  12. David Fiorentino10,
  13. David Isenberg11,
  14. James D Katz2,
  15. Andrew Mammen12,
  16. Marianne de Visser13,
  17. Steven R Ytterberg14,
  18. Ingrid E Lundberg9,
  19. Lorinda Chung10,
  20. Katalin Danko15,
  21. Ignacio García-De la Torre16,
  22. Yeong Wook Song17,
  23. Luca Villa3,
  24. Mariangela Rinaldi3,
  25. Howard Rockette1,
  26. Peter A Lachenbruch2,
  27. Frederick W Miller2,
  28. Jiri Vencovsky18
  29. for the International Myositis Assessment and Clinical Studies Group and the Paediatric Rheumatology International Trials Organisation
    1. 1University of Pittsburgh, Pittsburgh, Pennsylvania, USA
    2. 2NIEHS, NIH, Bethesda, Maryland, USA
    3. 3Istituto Giannina Gaslini, Pediatria II - Rheumatologia, PRINTO, Genoa, Italy
    4. 4Social and Scientific Systems, Inc., Durham, North Carolina, USA
    5. 5The Hospital for Sick Children, Toronto, Ontario, Canada
    6. 6Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
    7. 7Central Manchester University Hospitals NHS Foundation Trust, University of Manchester, Manchester, UK
    8. 8University of Liverpool, Liverpool, UK
    9. 9Karolinska University Hospital, Karolinska Institute, Stockholm, Sweden
    10. 10Stanford University, Redwood City, California, USA
    11. 11University College London, London, UK
    12. 12Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
    13. 13Academic Medical Center, Amsterdam, The Netherlands
    14. 14Mayo Clinic, Rochester, Minnesota, USA
    15. 15University of Debrecen, Debrecen, Hungary
    16. 16Hospital General de Occidente de la Secretaría de Salud and University of Guadalajara, Guadalajara, México
    17. 17Graduate School of Convergence Science and Technology and Seoul National University Hospital, Seoul, Korea
    18. 18Charles University, Prague, Czech Republic
    1. Correspondence to Dr Rohit Aggarwal, Division of Rheumatology and Clinical Immunology, Department of Medicine, University of Pittsburgh, 3601 5th Avenue, Suite 2B, Pittsburgh, PA 15261, USA; aggarwalr{at}upmc.edu

    An International Myositis Assessment and Clinical Studies Group/Paediatric Rheumatology International Trials Organisation Collaborative Initiative

    Abstract

    To develop response criteria for adult dermatomyositis (DM) and polymyositis (PM). Expert surveys, logistic regression, and conjoint analysis were used to develop 287 definitions using core set measures. Myositis experts rated greater improvement among multiple pairwise scenarios in conjoint analysis surveys, where different levels of improvement in 2 core set measures were presented. The PAPRIKA (Potentially All Pairwise Rankings of All Possible Alternatives) method determined the relative weights of core set measures and conjoint analysis definitions. The performance characteristics of the definitions were evaluated on patient profiles using expert consensus (gold standard) and were validated using data from a clinical trial. The nominal group technique was used to reach consensus. Consensus was reached for a conjoint analysis-based continuous model using absolute per cent change in core set measures (physician, patient, and extramuscular global activity, muscle strength, Health Assessment Questionnaire, and muscle enzyme levels). A total improvement score (range 0–100), determined by summing scores for each core set measure, was based on improvement in and relative weight of each core set measure. Thresholds for minimal, moderate, and major improvement were ≥20, ≥40, and ≥60 points in the total improvement score. The same criteria were chosen for juvenile DM, with different improvement thresholds. Sensitivity and specificity in DM/PM patient cohorts were 85% and 92%, 90% and 96%, and 92% and 98% for minimal, moderate, and major improvement, respectively. Definitions were validated in the clinical trial analysis for differentiating the physician rating of improvement (p<0.001). The response criteria for adult DM/PM consisted of the conjoint analysis model based on absolute per cent change in 6 core set measures, with thresholds for minimal, moderate, and major improvement.

    • Dermatomyositis
    • Polymyositis
    • Treatment


    Idiopathic inflammatory myopathies are a group of acquired, heterogeneous, systemic connective tissue diseases that include adult dermatomyositis (DM) and polymyositis (PM) and juvenile DM.1 Despite significant morbidity and mortality associated with DM/PM, there are currently no therapies approved for these syndromes by the Food and Drug Administration or the European Medicines Agency based on randomised controlled trials. However, with the advancement in novel therapeutic agents that target various biologic pathways implicated in the pathogenesis of DM/PM,2 there is a need for well-designed clinical trials using validated and universally accepted outcome measures. Recently completed clinical trials in adult DM/PM and juvenile DM have used varying response criteria,3–5 again highlighting the need for both data- and consensus-driven criteria to be used uniformly in future studies. Core set measures of myositis disease activity for adult DM/PM clinical trials have been established and validated by the International Myositis Assessment and Clinical Studies Group (IMACS);6–8 these measures were used as the foundation for the current study. We undertook this study because there is a need for composite response criteria in myositis, given the heterogeneity of the disease and the fact that no single core set measure adequately covers all the domains in myositis. For example, muscle enzyme levels can be normal in active DM, and active muscle weakness in DM can occur without active rash.

    Preliminary response criteria for adult DM/PM had been developed and partially validated by IMACS; these criteria were based on at least 20% improvement in 3 of 6 core set measures, with no more than 2 core set measures worsening by at least 25% (which cannot be muscle strength).8 ,9 However, those criteria were considered preliminary, because they were not prospectively validated. Moreover, newer methodologies such as conjoint analysis and other continuous or hybrid approaches for developing response criteria had not been evaluated.10–14 The preliminary criteria had other potential limitations, including equal weights being applied to each core set measure and the lack of quantitative or continuous outcomes. With the growing repertoire of potential therapeutic agents, some of which may yield better results than only minimal clinical improvement, there is also a need to develop criteria for moderate and major clinical improvement.
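
The preliminary categorical rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the dictionary keys, and the sign convention (positive values mean improvement, negative values mean worsening) are assumptions made for the example and are not part of the published criteria.

```python
from typing import Mapping

# Hypothetical key used in this sketch for manual muscle testing.
MUSCLE_STRENGTH = "muscle_strength"


def meets_preliminary_criteria(pct_change: Mapping[str, float]) -> bool:
    """Check the preliminary IMACS-style rule on 6 core set measures.

    pct_change maps each measure to its per cent change, oriented (by
    assumption) so that positive = improvement and negative = worsening.
    """
    if len(pct_change) != 6:
        raise ValueError("expected exactly 6 core set measures")
    improved = [m for m, v in pct_change.items() if v >= 20]
    worsened = [m for m, v in pct_change.items() if v <= -25]
    return (
        len(improved) >= 3                    # at least 20% better in 3 of 6
        and len(worsened) <= 2                # no more than 2 worse by >= 25%
        and MUSCLE_STRENGTH not in worsened   # worsening cannot be strength
    )


# Made-up patient, for illustration only.
example = {
    "physician_global": 30, "patient_global": 25, "muscle_strength": 22,
    "haq": 0, "enzyme": -30, "extramuscular": 5,
}
print(meets_preliminary_criteria(example))  # True
```

Because every measure counts equally and the outcome is yes or no, the sketch also makes the two limitations noted above (equal weights, no continuous outcome) easy to see.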

    For these reasons, and with support from the American College of Rheumatology, European League Against Rheumatism, IMACS, and the Paediatric Rheumatology International Trials Organisation (PRINTO),15 a collaboration was established to develop a data- and consensus-driven process involving multiple clinical data sets and the international myositis community in order to develop and validate response criteria for adult DM/PM and juvenile DM. This effort involved a comprehensive approach to developing candidate definitions for the response criteria, including continuous or hybrid definitions, using conjoint analysis,13 ,14 ,16–19 and for developing criteria for minimal as well as greater degrees of improvement. This article focuses on the criteria for minimal and moderate improvement for adult DM/PM, whereas the threshold for major improvement is considered preliminary. A companion article focuses on the juvenile DM response criteria.20

    Methods

    Core set measures and patient profile consensus

    To develop patient profiles as well as candidate definitions for response criteria in adult PM and DM, we used previously validated IMACS myositis core set measures for patients with adult DM/PM, which include physician and patient global activity on a 10-cm visual analogue scale (VAS), muscle strength measured by manual muscle testing (MMT), physical function measured by the Health Assessment Questionnaire (HAQ),21 extramuscular global activity measured by the physician on a 10-cm VAS, and the most abnormal serum muscle enzyme.8 ,22 The entire process, from the development of these profiles and candidate definitions through final consensus voting, is shown in the flow diagram in figure 1.23 ,24 Details of the methodology used to develop patient profiles, candidate definitions, validation, and expert consensus will be described in a separate publication.24 Briefly, patient data from natural history studies and uncontrolled clinical trials were used to develop patient profiles, which were then rated by adult myositis experts to achieve consensus as to whether improvement was none, minimal, moderate, or major. The expert consensus of improvement was used as the gold standard to validate various candidate definitions. The Bohan and Peter classification was used to designate definite or probable adult DM/PM.25

    Figure 1

    Flow diagram of the entire process used to develop and validate the approved response criteria for adult dermatomyositis and polymyositis.

    Candidate definitions of response criteria

    Six different types of candidate definitions for minimal, moderate, and major response (table 1) were developed:23 ,26 3 types of definitions were traditional (categorical), and 3 were continuous (hybrid). Traditional definitions provide only categorical outcomes of minimal, moderate, and major improvement, or not improved, based on the criteria, whereas continuous definitions yield an improvement score as a continuous outcome measure, with thresholds of minimal, moderate, and major improvement serving as categorical outcomes. Continuous definitions are considered hybrid definitions, because the same definition can be used as a continuous or categorical outcome measure based on the study requirements. Definitions utilising either absolute per cent change (final minus baseline divided by range and multiplied by 100) or relative per cent change (final minus baseline, divided by baseline and multiplied by 100) were evaluated as candidate definitions.
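
The two definitions of per cent change given above translate directly into code. The following is a minimal sketch; the function and parameter names are illustrative choices, and the patient values are made up. Note that the sign of the result follows the text's formula (final minus baseline), so whether a negative value means improvement depends on the measure: on a VAS of disease activity a decrease is an improvement, whereas for MMT an increase is.

```python
def absolute_pct_change(baseline: float, final: float,
                        scale_min: float, scale_max: float) -> float:
    """(final - baseline) divided by the range of the scale, times 100."""
    scale_range = scale_max - scale_min
    if scale_range <= 0:
        raise ValueError("scale_max must exceed scale_min")
    return 100 * (final - baseline) / scale_range


def relative_pct_change(baseline: float, final: float) -> float:
    """(final - baseline) divided by baseline, times 100."""
    if baseline == 0:
        raise ZeroDivisionError(
            "relative per cent change is undefined for a baseline of 0")
    return 100 * (final - baseline) / baseline


# Hypothetical patient: physician global activity falls from 6 to 3 cm
# on a 10-cm VAS.
print(absolute_pct_change(6, 3, 0, 10))  # -30.0
print(relative_pct_change(6, 3))         # -50.0
```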

    Table 1

    Types of candidate definitions for response criteria that were developed and tested

    Conjoint analysis surveys

    Conjoint analysis surveys were administered to myositis experts using 1000Minds online software.11 Experts were presented with pairs of hypothetical patient scenarios; each patient had different levels of improvement in the same 2 core set measures, assuming other core set measures remained the same. Experts rated which of the 2 scenarios had greater improvement. Based on the rater's response, all other hypothetical patients that could be pairwise ranked were eliminated via the property of transitivity, thereby significantly reducing the number of scenarios presented. The PAPRIKA (Potentially All Pairwise Rankings of All Possible Alternatives) method was used to determine the relative importance of the core set measures. Relative weights of core set measures and their levels of improvement were used to develop a scoring system by mathematical methods based on linear programming,13 such that when all 6 core set measures are considered together, the maximum score (total improvement score) possible for representing a patient's improvement is 100 and the minimum score is 0. The thresholds for minimal, moderate, and major improvement in the total improvement score were based on optimum sensitivity and specificity (using the Youden index27) in the subset of patient cohort data.
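
As an illustration of threshold selection with the Youden index (J = sensitivity + specificity − 1), the sketch below scans every observed score as a candidate cut-off and keeps the one with the largest J. The data are made up, the function name is hypothetical, and the tie-breaking rule (lowest threshold wins) is an assumption of this example; the study's actual procedure may have differed in such details.

```python
from typing import Sequence, Tuple


def youden_threshold(scores: Sequence[float],
                     improved: Sequence[bool]) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A patient is classified as improved when score >= threshold;
    `improved` holds the gold-standard rating for each patient.
    """
    positives = sum(improved)
    negatives = len(improved) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both improved and not-improved patients")
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, improved) if y and s >= threshold)
        tn = sum(1 for s, y in zip(scores, improved)
                 if not y and s < threshold)
        j = tp / positives + tn / negatives - 1
        if j > best_j:  # strict comparison: ties keep the lowest threshold
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Made-up total improvement scores and gold-standard ratings.
scores = [5, 10, 15, 22.5, 25, 30, 45, 60]
gold = [False, False, False, True, False, True, True, True]
print(youden_threshold(scores, gold))  # (22.5, 0.75)
```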

    Validation of candidate response criteria

    The performance characteristics of candidate criteria were evaluated using consensus profile ratings as the gold standard, assessing sensitivity, specificity, and area under the curve (AUC) to compare the performance of these candidate definitions. Those that performed well in the consensus profiles (sensitivity and specificity ≥80%, AUC ≥0.9 for minimal improvement, and AUC ≥0.8 for moderate and major improvement) were externally validated using data for adult DM/PM patients (n=142) enrolled in the Rituximab in Myositis (RIM) trial.3 The treating physician's rating of improvement (0–7 scale) at 24 weeks in the RIM trial was used for validation, and a 1-point change in the physician's rating was considered clinically significant.3 We then selected the top candidate definitions (up to 4 top-performing definitions from each of the 6 different types of candidate definitions) for consideration at the final consensus conference, in order to discuss a manageable number of definitions at the conference.

    Consensus conference

    The nominal group technique (NGT) was applied to develop consensus among experts in adult DM/PM regarding the top-performing candidate definitions for minimal and moderate improvement in adult DM/PM.28–30 Experienced moderators (RA and FWM) led the NGT consensus-development process for the adult working group and the combined adult and paediatric working group (RA, LGR, NR, and FWM). Given the paucity of data on major improvement, we considered the major improvement thresholds as preliminary for the final consensus meeting. For each candidate definition, the methodologic details used to develop it and its performance characteristics in the consensus patient profiles and the RIM trial were presented to the adult working group. Each of the 12 participants in the adult working group independently reviewed the performance characteristics of all 18 top candidate definitions for adult DM/PM. Detailed data for each candidate definition, including sensitivity, specificity, and AUC as well as kappa values and ORs for minimal, moderate, and major improvement, were provided. The AUC was determined from the receiver operating characteristic curve as a plot of sensitivity versus (1 − specificity) for total improvement scores as well as for thresholds.27
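
For readers unfamiliar with how an AUC is obtained from a plot of sensitivity versus (1 − specificity), the following sketch builds the receiver operating characteristic points for a continuous score and integrates them with the trapezoidal rule. The data and function names are hypothetical, and this is not necessarily the software or exact procedure used in the study.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               improved: Sequence[bool]) -> List[Tuple[float, float]]:
    """(1 - specificity, sensitivity) pairs, strictest cut-off first."""
    positives = sum(improved)
    negatives = len(improved) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both improved and not-improved patients")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, improved) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, improved)
                 if not y and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up total improvement scores and gold-standard ratings.
scores = [5, 10, 15, 22.5, 25, 30, 45, 60]
gold = [False, False, False, True, False, True, True, True]
print(auc(roc_points(scores, gold)))  # 0.9375
```

In this toy data set, 15 of the 16 possible (improved, not improved) patient pairs are ranked correctly by the score, which matches the area of 0.9375.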

    Adult working group

    The primary goal for the adult working group was to develop consensus response criteria for minimal and moderate clinical improvement in adult DM/PM based on the data presented, as well as the face validity, feasibility, and generalisability of the proposed candidate criteria. The experts in the adult working group included internationally recognised rheumatologists, neurologists, and dermatologists who have considerable experience in myositis and with the core set measures. Voting was conducted in an independent, anonymous, and systematic manner via a web-based system developed by staff at the PRINTO coordinating centre.31 ,32 In the initial rounds of voting, participants were asked to rank their top 5 choices. The results were compiled, and aggregate votes and rank of each candidate definition were shared with the group after each round of voting. Participants were then asked in a random manner to discuss their top-ranked and bottom-ranked choices. Candidate definitions receiving a small proportion of votes were eliminated. In subsequent voting rounds, participants were asked to re-rank their choices after reviewing the previous round's voting and discussion. When fewer than 5 candidate definitions remained, each participant selected one as the top response criteria. The objective was to continue the rounds of voting in the same manner until a single candidate definition reached consensus (≥80% of the votes) or until it was clear that consensus would not be reached.

    Combined adult and paediatric working group

    After consensus was achieved by each working group, both groups then came together to vote on common response criteria to be used for both adult DM/PM and juvenile DM20 as the outcome measure for combined clinical trials. For this voting round, the top candidate definitions from the final round of voting in each working group were considered, and a similar online voting system and the NGT were used until consensus of ≥80% was reached.28–30 For determining the thresholds of improvement for the selected definition, the required consensus was ≥70%, and voting took place after the conference.

    Results

    Candidate definitions

    A total of 287 adult DM/PM candidate response criteria were drafted or derived using data-driven methods. Included were 10 previously published definitions, 134 newly drafted definitions based on expert survey results, 63 weighted definitions, 68 logistic regression definitions, 6 conjoint analysis definitions, and 6 definitions in which differential weights were applied to the improvement achieved in each core set measure. Among these definitions, 163 used relative per cent change and 124 used absolute per cent change in the core set measures.

    Validation

    Candidate definitions with a sensitivity and specificity of ≥80%, AUC ≥0.9 for minimal, and AUC ≥0.8 for moderate and major improvement in the patient profile analysis using expert consensus rating as the gold standard were evaluated for external validation using RIM clinical trial data3 (see online supplementary table S1, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40064/abstract). Thus, of 122 adult DM/PM candidate definitions evaluated using the RIM trial data, 36 adult DM/PM candidate definitions, including 25 using relative and 11 using absolute per cent change in core set measures, had AUC ≥0.7 and showed validation in the clinical trial analysis.

    Top candidate definitions

    Of 36 validated definitions, 17 top-performing adult candidate definitions and the top paediatric response criteria20 were considered by the adult working group at the consensus conference so that, in total, 18 candidate definitions were evaluated (table 2 and see online supplementary table S2, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40064/abstract). They included 9 categorical definitions and 9 continuous definitions, in which 14 used relative per cent change and 4 used absolute per cent change in core set measures. In each categorical definition, a patient would either meet or not meet the response criteria of minimal, moderate, or major improvement based on the degree of improvement or worsening in each core set measure. In the continuous definitions, however, each subject generates a total improvement score on a continuous scale, such that a greater degree of improvement corresponds to a higher score. Furthermore, patients could be categorised as achieving minimal, moderate, or major clinical improvement based on reaching the pre-set threshold score on the continuous scale. Table 2 shows the performance characteristics of the top 5 candidate definitions for the response criteria selected at the consensus conference (see online supplementary table S2 for definitions 6–18).

    Table 2

    Detailed performance characteristics of patient profiles and clinical trial data for the top 5 candidate response criteria definitions presented at the consensus conference*

    In the patient profiles, with expert consensus as the gold standard, all top candidate definitions presented at the conference had excellent performance characteristics, with median sensitivity of 87% (IQR 84–90%) and specificity of 94% (IQR 92–95%) for minimal improvement with a median AUC of 0.91 (IQR 0.90–0.92) (table 2 and see online supplementary tables S1 and S2, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40064/abstract). Sensitivity, specificity, and AUC were similarly high for moderate and major improvement criteria for these definitions (table 2 and see online supplementary tables S1 and S2). All candidate definitions presented at the conference were validated using the RIM trial data at the 24-week time point and were shown to differentiate (p<0.001) between the treating physician's improvement score at week 24 in patients rated as improved versus not improved3 (table 2 and see online supplementary tables S1 and S2).

    Consensus conference voting

    The top-choice definition for the adult working group, which received 80% of the votes, was the conjoint analysis-based continuous definition model 1, which includes relative per cent change in core set measures, including physician and patient global activity, muscle strength, physical function, most abnormal serum enzyme level, and extramuscular activity (see online supplementary table S3, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40064/abstract). The second-choice definition, receiving 20% of the votes, was the conjoint analysis-based continuous model 2, which also includes relative per cent change in core set measures (see online supplementary table S3). Models 1 and 2 differ only in the scores associated with each level of improvement in each core set measure.

    However, in the final round of voting and discussion, adult working group participants reached unanimous consensus that the response criteria for adult DM/PM would be identical to the top-choice response criteria for juvenile DM, which is a conjoint analysis-based continuous definition (model 3) using absolute per cent change in core set measures (table 3).20 Participants favoured using the same response criteria for adult DM/PM and juvenile DM so that data from different studies can be harmonised more effectively and to facilitate combined trials, especially given that the definitions were similar with similar performance characteristics. Moreover, the absolute per cent change in core set measures (model 3, table 3) was thought to be more representative of meaningful clinical change compared with relative per cent change in core set measures (models 1 and 2, online supplementary table S3). Participants also voted to evaluate all top 5 candidate definitions from the adult working group in future clinical trials, with the other 4 as secondary outcome measures. The top 3 of these criteria, the conjoint analysis definitions, are the same for both adult DM/PM and juvenile DM, with different thresholds of improvement.

    Table 3

    Final myositis response criteria for minimal, moderate, and major improvement in adult dermatomyositis/polymyositis (DM/PM) and combined adult DM/PM and juvenile DM clinical trials and studies*

    The sensitivity and specificity of the top-choice criteria, the conjoint analysis-based definition using absolute per cent change (table 3), were 85% and 92% for minimal improvement, 90% and 96% for moderate improvement, and 92% and 98% for major improvement, respectively (table 2). The AUC was 0.96 for the total improvement score and 0.89, 0.93, and 0.95 for minimal, moderate, and major improvement thresholds, respectively (table 2). In the RIM trial,3 these response criteria showed a significant difference in the physician's rating of improvement when the response criteria rated the patient as improved versus not improved for minimal, moderate, and major improvement (p<0.001) (table 2 and see online supplementary table S2, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40064/abstract). Myositis experts in the consensus conference favoured the conjoint analysis-based continuous response criteria because the total improvement score is a continuous measure that corresponds to the magnitude of improvement in a patient and provides the ability to categorise a patient's degree of improvement as minimal, moderate, or major (making it truly a hybrid definition). Moreover, the differential weights for various core set measures were also thought to be congruent with an expert's assessment of the relative importance of each core set measure. An important consideration in the final selection was that the top-choice definition be based on absolute per cent change in the core set measure, which was favoured by the participants because, given the various VAS measurements used, the absolute per cent change was thought to be more representative of meaningful clinical change.

    Top candidate definitions considered by the combined paediatric/adult working group

    Three candidate definitions were considered by the combined adult/paediatric working group; these included the top adult definitions (see online supplementary table S3) and the top paediatric definitions,20 one of which was identical in both groups. Final consensus was reached for the combined adult DM/PM and juvenile DM response criteria, with 91% of participants voting for the conjoint analysis-based continuous definition, based on absolute per cent change in the core set measure (table 3). The combined working group agreed that the same final response criteria will be used for clinical trials of both adult DM/PM and juvenile DM, but with different thresholds for improvement in adult versus paediatric patients as well as different core set measures for adult patients (IMACS) and paediatric patients (IMACS and PRINTO). Participants favoured using the same response criteria for adult DM/PM and juvenile DM, because the top definition from each working group was very similar (ie, both being conjoint analysis-based continuous models, with excellent and similar performance characteristics) and because it would permit comparison of outcomes in separate studies. Although only the IMACS core set measures were used for adult DM/PM, for further congruence with paediatric core set measures, the experts in adult myositis agreed to include the Short Form-3633 as a health-related quality-of-life measure to correspond to the PRINTO quality-of-life core set measure, the parent form of the Child Health Questionnaire.34–36 In a post-conference final vote, consensus (74%) was reached on threshold values for minimal, moderate, and major response for adult DM/PM patients, which are ≥20 in the total improvement score for minimal improvement, ≥40 for moderate improvement, and ≥60 for major improvement. In contrast, the final consensus threshold values for minimal, moderate, and major response for juvenile DM were ≥30, ≥45, and ≥70 points, respectively.
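
The thresholds stated above can be applied to a total improvement score as in the following sketch. The threshold values are those given in the text; the function name, dictionary keys, and the "none" label for scores below the minimal threshold are illustrative assumptions.

```python
# Thresholds on the total improvement score (0-100), as stated in the text.
THRESHOLDS = {
    "adult_dm_pm": {"minimal": 20, "moderate": 40, "major": 60},
    "juvenile_dm": {"minimal": 30, "moderate": 45, "major": 70},
}


def improvement_category(total_improvement_score: float,
                         population: str = "adult_dm_pm") -> str:
    """Map a total improvement score to the highest category reached."""
    if not 0 <= total_improvement_score <= 100:
        raise ValueError("total improvement score must lie between 0 and 100")
    cutoffs = THRESHOLDS[population]
    # Check the most demanding category first.
    for label in ("major", "moderate", "minimal"):
        if total_improvement_score >= cutoffs[label]:
            return label
    return "none"


print(improvement_category(42.5))                 # moderate
print(improvement_category(42.5, "juvenile_dm"))  # minimal
```

The same score of 42.5 is classified differently in the two populations, which is how a shared definition with population-specific thresholds behaves.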

    Discussion

    After a systematic data- and consensus-driven process, a conjoint analysis-based continuous (ie, hybrid) definition based on absolute per cent change in core set measures was selected as the response criteria for adult DM/PM for minimal and moderate improvement in future clinical trials and studies (figure 1). Because the total number of cases in the trial data sets and clinical profiles that achieved major improvement was small, it was decided that the thresholds for major improvement would be considered preliminary. The same continuous (or hybrid) definition, but with different thresholds for minimal, moderate, and major improvement in IMACS or PRINTO core set measures, will be used for juvenile DM clinical trials and studies, as well as for combined adult DM/PM and juvenile DM studies and clinical trials in the future.20 ,24

    The process for developing and validating the candidate definitions for the response criteria was extensive and comprehensive, as we used large prospective clinical cohort data sets to develop patient profiles, and myositis expert consensus was used as the gold standard for clinical response. Consequently, we derived six different types of candidate definitions, each with many variations, leading to a total of 287 candidate definitions tested, which were validated using natural history cohorts and data from a randomised clinical trial. Subsequently, a representative number of international myositis experts from various disciplines (rheumatology, neurology, and dermatology) agreed on an innovative continuous (or hybrid) model using absolute per cent change in validated core set measures.

    These response criteria were developed using a novel conjoint analysis methodology, the 1000Minds software.13 Conjoint analysis, or discrete choice experiment, is a statistical technique used to determine expert group decision-making around various measures (and multiple levels within each measure), providing the ability to develop differential weighting of measures and composite criteria using those measures. The 1000Minds software for conjoint analysis has been used recently to develop rheumatologic classification and/or outcome criteria for rheumatoid arthritis (RA), systemic sclerosis,12 ,13 ,37 ,38 and gout.11 ,16 ,17 ,39

    The criteria developed are continuous in nature and generate a total improvement score (on a scale of 0–100), which can provide a quantitative degree of improvement for each patient rather than a dichotomous or categorical assessment of improvement. The total improvement score is the sum of the improvement reflected in each of the 6 core set measures, but the individual core set measures are weighted, such that those deemed more important provide a greater contribution to the final score. For example, changes in the MMT and physician global disease activity scores are weighted more heavily than changes in the most abnormal enzyme or the HAQ. These weights were consistent with our myositis expert survey,26 which was independent of the process used to develop and validate our response criteria.
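
The structure of such a weighted score can be sketched as follows. The cut-points and points in this example are hypothetical placeholders chosen only so that the maxima sum to 100 and so that MMT and physician global activity outweigh the enzyme and the HAQ; they are not the published values, which appear in table 3. The treatment of worsening (it contributes 0 points) is likewise an assumption of this sketch.

```python
from typing import Mapping

# HYPOTHETICAL scoring table, NOT the published values from table 3.
# Each measure lists (lower bound of absolute per cent improvement, points)
# in ascending order; improvement must exceed the bound to earn the points.
HYPOTHETICAL_LEVELS = {
    "manual_muscle_testing": [(5, 10), (15, 20), (30, 30)],
    "physician_global":      [(5, 5), (15, 12), (30, 20)],
    "extramuscular_global":  [(5, 5), (15, 12), (30, 20)],
    "patient_global":        [(5, 3), (15, 6), (30, 10)],
    "haq":                   [(5, 3), (15, 6), (30, 10)],
    "muscle_enzyme":         [(5, 3), (15, 6), (30, 10)],
}


def measure_score(measure: str, pct_improvement: float) -> float:
    """Points for the highest level whose lower bound is exceeded."""
    points = 0
    for lower_bound, level_points in HYPOTHETICAL_LEVELS[measure]:
        if pct_improvement > lower_bound:
            points = level_points
    return points


def total_improvement_score(pct_improvements: Mapping[str, float]) -> float:
    """Sum of per-measure points; ranges from 0 to 100 by construction."""
    return sum(measure_score(m, v) for m, v in pct_improvements.items())


# Made-up patient; values are absolute per cent improvement per measure.
patient = {
    "manual_muscle_testing": 20, "physician_global": 35,
    "extramuscular_global": 10, "patient_global": 0,
    "haq": 16, "muscle_enzyme": -5,
}
print(total_improvement_score(patient))  # 51
```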

    There are significant advantages of using continuous response criteria (especially in pilot studies). For example, it might be possible to enrol fewer subjects and still have sufficient statistical power to differentiate between treatment groups by using the mean or median total improvement score. Continuous measures also have the best sensitivity to change, which allows modest treatment differences to be detected as statistically significant and in turn leads to better clinical trials.10 In addition, the criteria developed provide thresholds for both minimal and moderate improvement, with a preliminary threshold for major improvement. Therefore, larger, adequately powered clinical trials and studies can use the threshold of minimal clinically significant improvement to differentiate the treatment groups, because this difference will be considered clinically significant. Similarly, the proportions of patients achieving minimal or moderate improvement can be determined and compared between treatment arms. The ability of the same response criteria to be used not only as a continuous measure, where a higher score implies greater improvement, but also as a categorical response of minimal and moderate improvement, results in a unique hybrid aspect to these criteria.

    Another advantage of continuous response criteria over the previous IMACS response criteria is that inclusion criteria for clinical trials will not require minimal severity in any core set measure, because all levels of improvement in each core set measure contribute more or less to the response. However, for each trial the investigators will have to determine the entry criteria for baseline core set measure abnormality, but those will depend on the effect size, disease or organ target, recruitment, and feasibility rather than on the response criteria alone. This is an improvement over the previous IMACS preliminary response criteria, where the clinical trial inclusion criteria required a baseline deficit of at least 20% in each core set measure to enable reaching the threshold of ≥20% improvement in core set measures after treatment.

    Another important aspect of these response criteria is that they are based on an absolute per cent change in core set measures rather than the relative per cent change used for scoring other rheumatologic diseases such as RA40,41 and in prior myositis response criteria.9 The panellists strongly believed that absolute per cent change, rather than relative per cent change, in core set measures more accurately reflects the degree of change. For example, when a patient's disease activity improved from 2 to 1 cm on a 10-cm VAS, experts interpreted this as more consistent with a 10% improvement (absolute per cent change) than with the 50% improvement given by the relative per cent change. Also, because many of the myositis core set measures, such as those scored on a 10-cm VAS, arbitrarily have 0 as the lower limit of normal, the relative per cent change is difficult to calculate if there is a change from 0 to a higher value.
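
    The distinction between the two calculations can be made concrete with a short Python sketch. It assumes a measure on which lower values are better (such as a disease activity VAS) and takes the scale range as an explicit argument; the function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: assumes lower values mean less disease activity,
# so improvement is baseline minus follow-up. Function names are hypothetical.

def absolute_percent_change(baseline, followup, scale_range):
    """Improvement expressed as a percentage of the full range of the scale."""
    if scale_range <= 0:
        raise ValueError("scale_range must be positive")
    return (baseline - followup) * 100.0 / scale_range


def relative_percent_change(baseline, followup):
    """Improvement expressed as a percentage of the baseline value.

    Returns None when the baseline is 0, because the ratio is undefined.
    """
    if baseline == 0:
        return None
    return (baseline - followup) * 100.0 / baseline
```

    For the example in the text, `absolute_percent_change(2, 1, 10)` gives 10.0, whereas `relative_percent_change(2, 1)` gives 50.0, and `relative_percent_change(0, 3)` returns `None` because a baseline of 0 leaves the relative change undefined.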

    The myositis experts decided to use similar response criteria for adult DM/PM and juvenile DM, to facilitate combined clinical trials, such as the RIM trial.3 Although the form of the criteria is the same for adult DM/PM and juvenile DM, they account for the difference in core set measure responsiveness between the 2 disease entities by specifying higher thresholds for juvenile DM than for adult DM/PM, reflecting the greater responsiveness seen in juvenile DM patients in clinical trials.3,5 Additionally, the juvenile DM response criteria allow for the possibility of using the IMACS or PRINTO core set measures and provide a more definitive threshold for major improvement.20

    Some limitations of the new response criteria should be noted. First, most of the core set measures, although proven to have good reliability and validity, are subjective and evaluator dependent. However, similar metrics have been used successfully in RA trials that used a physician global measure similar to that used for myositis.

    Second, only one major clinical trial was available for validation, and it failed to meet its primary end point and was not truly placebo controlled. Thus, we validated the results using the treating physician's improvement scores in the clinical trial.

    Third, the threshold for major improvement in the response criteria is considered preliminary due to an insufficient number of adult DM/PM cases showing major improvement. We believe that future studies using therapeutic agents that have a greater impact on myositis disease activity will lead to better clinical responses, thus allowing investigators to determine a final threshold for major improvement. We plan to validate major improvement in future studies.

    Fourth, because the criteria are focused on improvement and thus fail to differentiate between no change and worsening, they might not be applicable in studies of worsening disease activity (ie, disease flare designs) in myositis. Criteria for flare in myositis will therefore need to be developed in the future.

    Fifth, the response criteria were developed using a PM diagnosis based on the Bohan and Peter classification criteria, but experts now recognise that PM, according to those criteria, may include different syndromes, such as necrotising myopathy, the antisynthetase syndrome, and others.42,43 We believe that these response criteria will still be applicable to these newer entities given that the data- and consensus-driven processes described herein were inclusive of those syndromes. In the future, with changes in classification criteria terminology,44 the response criteria terminology will need to be modified accordingly.

    Sixth, because the criteria are complex and might be difficult to apply in research studies, we are developing a web-based tool as well as a downloadable calculator that will allow easy application of the response criteria. The time required to apply these criteria is estimated to be 25 min to complete the core set measures at each visit6 and 3 min to hand-calculate the total improvement score and degree of response, while with a computer-based system the calculation time is negligible. Moreover, although the criteria may appear to be complicated, the core set measures to be collected by any study or investigators are simple and are essentially the same as those in previous myositis studies and trials.

    Finally, patient-reported outcomes as core set measures, with the exception of the HAQ and patient global assessment, were not part of the response criteria, perhaps due to the paucity of sensitive and responsive patient-reported outcomes for DM/PM.45

    In conclusion, the development of data- and consensus-driven conjoint analysis-based continuous response criteria with quantitative assessment of improvement on a scale of 0–100 and with thresholds for minimal, moderate, and major (preliminary threshold) improvement marks a major advancement in assessing response in myositis clinical trials and studies. These response criteria are sensitive and specific and provide a way to determine clinically meaningful change corresponding to degree of clinical improvement. These response criteria were valid in a clinical trial and had excellent face validity and acceptance among myositis experts from various specialties who care for adult DM/PM patients in different parts of the world. A conjoint analysis-based definition with a continuous improvement score using absolute per cent change in core set measures with thresholds for minimal, moderate, and major improvement was selected as the response criteria to be used for adult clinical trials.

    This criteria set has been approved by the American College of Rheumatology (ACR) Board of Directors and the European League Against Rheumatism (EULAR) Executive Committee. This signifies that the criteria set has been quantitatively validated using patient data, and it has undergone validation based on an independent data set. All ACR/EULAR-approved criteria sets are expected to undergo intermittent updates.

    The ACR is an independent, professional, medical and scientific society that does not guarantee, warrant, or endorse any commercial product or service.

    Acknowledgments

    We thank the following individuals for providing invaluable input and feedback on project development and support: members of the American College of Rheumatology Criteria Committee; Dr Daniel Aletaha (European League Against Rheumatism), Drs Suzette Peng and Sarah Yim (US Food and Drug Administration), Drs Thorsten Vetter and Richard Vesely (European Medicines Agency), Bob Goldberg and Theresa Curry (The Myositis Association), Rhonda McKeever and Patti Lawler (Cure JM Foundation), and Irene Oakley (Myositis UK). We also thank Drs Michael Ward, Steven Pavletic, and Adam Schiffenbauer for their critical review of the manuscript. Paul Hansen, who with Franz Ombler owns and co-invented the 1000Minds software referred to in the article, provided intellectual and logistic support for this project.

    References

    Footnotes

    • Handling editor Tore K Kvien

    • See Appendix A for members of the International Myositis Assessment and Clinical Studies Group and the Paediatric Rheumatology International Trials Organisation who contributed to developing the response criteria.

    • RA and LGR contributed equally and FWM and JV contributed equally.

    • This article is published simultaneously in the May 2017 issue of Arthritis & Rheumatology.

    • Collaborators APPENDIX A: MEMBERS OF THE INTERNATIONAL MYOSITIS ASSESSMENT AND CLINICAL STUDIES GROUP AND THE PAEDIATRIC RHEUMATOLOGY INTERNATIONAL TRIALS ORGANISATION WHO CONTRIBUTED TO DEVELOPING THE RESPONSE CRITERIA: Steering committee: Lisa G Rider (co-principal investigator), Nicolino Ruperto (co-principal investigator), Rohit Aggarwal (methodology lead), Frederick W Miller, Jiri Vencovsky. Statistical team: Rohit Aggarwal, Brian Erman, Nastaran Bayat, Angela Pistorio, Adam M Huber, Brian M Feldman, Paul Hansen, Howard Rockette, Peter A Lachenbruch, Nicolino Ruperto, Lisa G Rider. Adult core set survey group: Anthony A Amato, Hector Chinoy, Lisa Christopher-Stine, Lorinda Chung, Robert G Cooper, Lisa Criscione-Schreiber, Leslie Crofford, Mary E Cronin, Katalin Dankó, David Fiorentino, Ignacio García-De la Torre, Patrick Gordon, Gerald Hengstman, James D Katz, Andrew Mammen, Galina Marder, Neil McHugh, Chester V Oddis, Elena Schiopu, Albert Selva-O'Callaghan, Yeong Wook Song, Jiri Vencovsky, Gil Wolfe, Robert Wortmann. Clinical trial or natural history study data set contributions: Anthony A Amato, Hector Chinoy, Lorinda Chung, Robert G Cooper, Katalin Dankó, David Fiorentino, Ignacio García-De la Torre, Mark Gourley, Ingrid Lundberg, Frederick W Miller, Chester V Oddis, Paul Plotz, Lisa G Rider, Yeong Wook Song, Jiri Vencovsky. 
Adult patient profile working group: Rohit Aggarwal, Anthony A Amato, Dana Ascherman, Richard Barohn, Olivier Benveniste, Jan De Bleecker, Jeffrey Callen, Christina Charles-Schoeman, Hector Chinoy, Lisa Christopher-Stine, Lorinda Chung, Robert G Cooper, Leslie Crofford, Mary E Cronin, Katalin Dankó, Sonye Danoff, Maryam Dastmalchi, Marianne de Visser, Mazen Dimachkie, Steve DiMartino, Lyubomir Dourmishev, Floranne Ernste, David Fiorentino, Ignacio García-De la Torre, Takahisa Gono, Patrick Gordon, Mark Gourley, David Isenberg, Yasuhiro Katsumata, James D Katz, John Kissel, Richard L Leff, Todd Levine, Ingrid Lundberg, Andrew Mammen, Herman Mann, Galina Marder, Isabelle Marie, Neil McHugh, Joseph Merola, Frederick W Miller, Chester V Oddis, Marzena Olesinska, Nancy Olsen, Nicolo Pipitone, Sindhu Ramchandren, Seward Rutkove, Lesley Ann Saketkoo, Adam Schiffenbauer, Albert Selva-O'Callaghan, Samuel Katsuyuki Shinjo, Rachel Shupak, Yeong Wook Song, Katarzyna Swierkocka, Jiri Vencovsky, Julia Wanschitz, Victoria Werth, Irene Whitt, Robert Wortmann, Steven R Ytterberg. Conjoint analysis, adult group: Rohit Aggarwal, Anthony A Amato, Hector Chinoy, Lisa Christopher-Stine, Lorinda Chung, Robert G Cooper, Mary E Cronin, Katalin Dankó, Mazen Dimachkie, Steve DiMartino, David Fiorentino, Ignacio García-De la Torre, Patrick Gordon, Ingrid Lundberg, Herman Mann, Frederick W Miller, Chester V Oddis, Albert Selva-O'Callaghan, Jiri Vencovsky, Victoria Werth, Robert Wortmann, Steven R Ytterberg. Participants in consensus conference, adult working group: Anthony A Amato, Hector Chinoy, Robert G Cooper, Maryam Dastmalchi, Marianne de Visser, David Fiorentino, David Isenberg, James D Katz, Andrew Mammen, Chester V Oddis, Jiri Vencovsky, Steven R Ytterberg. 
Participants in consensus conference, paediatric working group: Rolando Cimaz, Rubén Cuttica, Sheila Knupp Feitosa de Oliveira, Brian M Feldman, Adam M Huber, Carol B Lindsley, Clarissa Pilkington, Marilynn Punaro, Angelo Ravelli, Ann Reed, Kelly Rouster-Stevens, Annet van Royen-Kerkhof.

    • Contributors All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Drs Aggarwal and Rider had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors: Study conception and design; Acquisition of data; and Analysis and interpretation of data.

    • Funding Supported in part by the American College of Rheumatology, the European League Against Rheumatism, Cure JM Foundation, Myositis UK, Istituto G. Gaslini and the Paediatric Rheumatology International Trials Organisation (PRINTO), the Myositis Association, and the NIH (National Institute of Environmental Health Sciences (NIEHS), National Center for Advancing Translational Sciences, and National Institute of Arthritis and Musculoskeletal and Skin Diseases). IG-DlT's work was supported in part by CONACYT (Programa Nacional de Posgrados de Calidad). YWS's work was supported by the Korea Health Technology R & D Project through the Korea Health Industry Development Institute funded by the Ministry of Health & Welfare, Republic of Korea (grant HI14C1277). JV's work was supported by the Ministry of Health, Czech Republic (Institute of Rheumatology project for conceptual development of a research organisation, 00023728).

    • Competing interests None declared.

    • Provenance and peer review Not commissioned; internally peer reviewed.