Extended report
Interrater reliability and aspects of validity of the myositis damage index
  1. Shabina M Sultan1,
  2. Elizabeth Allen2,
  3. Robert G Cooper3,
  4. Sangita Agarwal4,
  5. Patrick Kiely4,
  6. Chester V Oddis5,
  7. Jiri Vencovsky6,
  8. Ingrid E Lundberg7,
  9. Maryam Dastmalchi7,
  10. Michael G Hanna8,
  11. David A Isenberg1
  1. 1University College London Hospitals, London, UK
  2. 2University College London, London, UK
  3. 3Hope Hospital, Manchester, UK
  4. 4St George's Hospital, London, UK
  5. 5University of Pittsburgh, School of Medicine, Pittsburgh, Pennsylvania, USA
  6. 6Institute of Rheumatology, Prague, Czech Republic
  7. 7Rheumatology Unit, Karolinska University Hospital, Stockholm, Sweden
  8. 8Institute of Neurology, University College London, London, UK
  1. Correspondence to Dr Shabina Sultan, Airedale General Hospital, Skipton Road, Keighley, West Yorkshire BD20 6TD, UK; ssultan{at}


Objective To test the interrater reliability, internal consistency and aspects of validity of the myositis damage index (MDI) in the assessment of damage in adult patients with idiopathic inflammatory myopathy (IIM).

Methods 95 patients were assessed in six centres as part of this cross-sectional international study. Two parts of a MDI were used to assess disease damage, the MDI and the myositis damage score (MYODAM). The myositis disease activity assessment tool (MDAAT) was used to assess disease activity. Interrater reliability was assessed using intraclass correlation coefficient (ICC). Spearman's rank correlation coefficient was used to measure the convergent validity of cross-sectional scores between the two parts of the damage tool and to determine the correlation between the respective components of the damage and activity tools.

Results In general, the damage index appears to have good interrater reliability for most of the systems with an ICC greater than 0.65. Convergent validity between the two parts of the damage tool showed good correlation for the individual organ systems (r>0.8). There were weak correlations between some parts of the MDI and corresponding components of the MDAAT.

Conclusion The MDI is a comprehensive tool to assess damage in patients with myositis. With physician education and emphasis to record items that have been diagnosed since the myositis diagnosis, the MDI will provide a valuable tool to assess damage in future clinical trials and longitudinal studies.

Idiopathic inflammatory myopathies (IIM) are a heterogeneous group of autoimmune diseases that include dermatomyositis and polymyositis. The mortality from these diseases appears to have declined over the past five decades.1 However, despite therapy, full remission is unusual and patients often remain on either corticosteroids alone or in combination with immunosuppressive drugs. It has been reported that long-term treatment with corticosteroids is associated with significant morbidity.2 It has become clear that a method to measure cumulative morbidity (ie, damage) in these patients is also necessary. Damage is defined as a persistent/permanent change in anatomy, physiology, pathology or function, which is considered to have occurred after the diagnosis of myositis and has been present for at least 6 months.3 The current lack of a standardised tool to record disease damage in IIM reflects the absence of any gold standard by which to judge individual organ involvement. This is an impediment to interpreting the existing literature, in which a variety of measures have been used, and to the design of future clinical trials. Furthermore, with the emergence of new therapies for patients with IIM, there is an immediate need to reach an international consensus with respect to standardised assessment tools, to allow comparison of clinical trial data.

An international group of specialists with expertise in IIM, the International Myositis Assessment and Clinical Studies Group (IMACS), has made substantial progress in the development and validation of several measures to capture the totality of the effect of the multisystem nature of the disease in an individual.3 These measures include the assessment of disease activity, damage and quality of life. Initial discussions regarding the development of the activity and damage tools and a comprehensive overview of these tools have already been published.4

Three core measures to assess damage have been suggested:3 (1) the physician global damage assessments (as measured by a visual analogue scale (VAS)); (2) the health assessment questionnaire/childhood health assessment questionnaire as measures of physical function (this has been shown to measure a cumulative decline in function in adult patients);2 (3) the myositis damage index (MDI).

Initial development of the tool has been published using real patients interviewed and examined by a group of myositis experts, but not in a clinical setting.4 The aims of this study were to determine: (1) the interrater reliability of the MDI in the assessment of patients in clinical practice; (2) the internal consistency of the MDI; (3) the convergent validity between the MDI and the myositis damage score (MYODAM) tools; and (4) the validity of the MDI with respect to disease activity. Disease activity and damage are distinct domains; however, a weak association between them would be expected, as continuous disease activity should result in progressive damage over time. We therefore hypothesised that the components of the damage index would correlate weakly or not at all with disease activity in the respective organ system during the 4 weeks before the assessment visit. A weak or absent association would also demonstrate that the two tools measure distinct constructs.

Patients and methods

Study design

This was a prospective cross-sectional international multicentre study conducted in the setting of routine clinical practice. All patients were categorised as having probable or definite polymyositis or dermatomyositis according to the Bohan and Peter criteria5 at the onset of the study.

Data collection

Ninety-five patients were assessed in seven centres (University College London Hospital (DAI), Hope Hospital, Manchester (RGC), St George's Hospital, London (PK, SA), University of Pittsburgh (CVO), Rheumatology Unit, Karolinska University Hospital, Sweden (IEL, MD), Prague (JV) and University College London, Centre for Neuromuscular Disease – Queen Square (MGH)). Patients were assessed by an external rater (SMS) as well as the local physician, independently of each other. In each centre, the case history, clinical notes and laboratory investigations were available to both assessors. Both the local physician and the external rater took a history and performed a physical examination. Although there is a potential for bias, as the treating physicians' decisions were available from the case notes, an independent assessment (SMS) was made based on history, physical examination and laboratory investigations in order to complete the disease activity and damage tools. The assessments were based on a judgement at the time point when the patient was seen for the study. The MDI (see supplementary material S1, available online only) was used to record damage and the myositis disease activity assessment tool (MDAAT;4 see supplementary material S2, available online only) was used to record disease activity. Assessments were based on patients' histories and physical examinations, taking into account basic laboratory tests and, if performed, chest x-ray, pulmonary function tests and high-resolution CT of the thorax.

Assessment of damage

The MDI measures cumulative organ damage that has occurred since the onset of the disease. Both parts of the tool (S1) take a comprehensive approach to capture the presence and extent of organ involvement. No attempt is made to attribute the cause to the disease itself, treatment or comorbidity. The MDI counts the items of damage in 11 organ systems; this portion of the MDI is in essence a modification of the Systemic Lupus International Collaborating Clinics (SLICC)/American College of Rheumatology (ACR) damage index.6 Briefly, damage is defined as being present or absent in 11 organ systems: muscle (0–3); skeletal (0–4); cutaneous (0–5); gastrointestinal (0–3); pulmonary (0–4); cardiovascular (0–4); peripheral vascular disease (PVD) (0–4); endocrine (0–6); ocular (0–2); infection (0–2) and malignancy (0–1). Damage within each organ system is measured by the satisfaction of specific attributes for that organ system. Each item is scored 0 if it has never been present, 1 if it has been present for at least 6 months, or NA if neither score applies. The number of attributes (items) ranges from 1 for malignancy to 6 for endocrine. The sum of the 0–1 item scores is divided by the maximum possible score (excluding items scored NA), giving the MDI score for the individual; the maximum possible score is 38. The definitions and determinants of the index are mainly based on clinical grounds, or on the results of readily available investigations, such as chest radiographs and, if clinically indicated, high-resolution CT of the chest. The MYODAM consists of a series of 10 cm VAS used to quantitate the severity of damage in the same organs/systems as the MDI. The total of the 11 VAS scores is divided by the maximum possible score (excluding systems that were not assessed), giving the severity score for the individual. The maximum total possible score for the MYODAM is 110.
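The scoring rules described above can be illustrated with a short sketch. This is not the authors' scoring software: the function names, the dictionary layout and the use of `None` to stand for NA or "not assessed" are assumptions made for illustration. Only the item counts per organ system, the 0/1/NA coding and the division by the maximum possible score are taken from the text.

```python
from typing import Dict, List, Optional

# Number of 0/1 items per organ system, as listed in the text (sums to 38).
MDI_ITEMS: Dict[str, int] = {
    "muscle": 3, "skeletal": 4, "cutaneous": 5, "gastrointestinal": 3,
    "pulmonary": 4, "cardiovascular": 4, "peripheral vascular": 4,
    "endocrine": 6, "ocular": 2, "infection": 2, "malignancy": 1,
}

VAS_LENGTH_CM = 10.0  # each MYODAM scale is a 10 cm VAS


def mdi_score(items: Dict[str, List[Optional[int]]]) -> float:
    """Sum of item scores divided by the maximum possible score.

    Each item is 0, 1 or None (None stands for NA, an assumed encoding).
    Items scored NA are excluded from the maximum possible score.
    """
    for system, values in items.items():
        if len(values) != MDI_ITEMS[system]:
            raise ValueError(f"unexpected number of items for {system}")
    scored = [v for values in items.values() for v in values if v is not None]
    if any(v not in (0, 1) for v in scored):
        raise ValueError("items must be scored 0, 1 or None")
    if not scored:
        raise ValueError("no scorable items")
    return sum(scored) / len(scored)


def myodam_score(vas_cm: Dict[str, Optional[float]]) -> float:
    """Total VAS score divided by the maximum possible score.

    Systems that were not assessed (None) are excluded from the maximum.
    """
    assessed = [v for v in vas_cm.values() if v is not None]
    if not assessed:
        raise ValueError("no systems assessed")
    return sum(assessed) / (VAS_LENGTH_CM * len(assessed))


# Hypothetical patient: two muscle items and one endocrine item present,
# one endocrine item not applicable.
patient = {system: [0] * n for system, n in MDI_ITEMS.items()}
patient["muscle"] = [1, 1, 0]
patient["endocrine"] = [1, None, 0, 0, 0, 0]
print(round(mdi_score(patient), 3))
```

With every item scorable the denominator is 38; for the hypothetical patient above, one NA item reduces it to 37.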

Assessment of disease activity

The MDAAT is divided into two tools, and consists of the myositis intention to treat index (MITAX) and the myositis activity assessment (MYOACT) by VAS (S2). Seven target organs/systems are assessed: constitutional; mucocutaneous; articular; gastrointestinal; respiratory; cardiovascular and muscle. A physician global VAS and an extramuscular global VAS are also obtained for the disease activity. Initial development of the MDAAT has been reported.4 7

Statistical considerations

Data were collected using the appropriate tool and immediately entered into a database. Each individual value was entered (ie, no summary measures were calculated). There are a number of measures that can be used to assess reliability; however, for presentation purposes, it was decided that a commonality in analysis would be useful. Therefore, for each system, the interrater reliability of the MDI and the MYODAM scale was assessed using an intraclass correlation coefficient (ICC). The ICC is well defined for all outcomes, but the distributional properties are best understood for the more continuous measures. Although the numerical values must therefore be interpreted with some caution, they should provide qualitative guidance for the comparison of the behaviour of the different tools. A three-way model was used and, following the approach of Shrout and Fleiss,8 a pooled within-centre ICC with 95% CI was defined on the basis of physician, patient and error variation. (The variance components were calculated from the individual values entered in the database.) This is a slight generalisation of the ICC, as described by Shrout and Fleiss.8 Although centre was adjusted for in the analysis, it can be considered to be an artefact of the design and has not been incorporated into the ICC. An ICC greater than 0.65 has been used as indicating good reliability between physicians.

Analysis of variance was used to estimate the variance components, under the assumption that patients and physicians were chosen randomly from larger populations. This assumption allows the results to be generalised beyond this study. This approach has been used before.4 Details of the statistical considerations have been published.9
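As a rough illustration of an ICC built from patient, physician and error variation, the sketch below computes the two-way random-effects, single-rater coefficient, ICC(2,1) in the notation of Shrout and Fleiss, from a patients-by-raters table using analysis of variance mean squares. It is a simplification and not the model used in this study: it ignores the adjustment for centre and does not produce a confidence interval. The function name and the input layout are assumptions.

```python
from typing import Sequence


def icc_two_way_random(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): rows are patients, columns are raters (no missing values)."""
    n = len(ratings)                      # number of patients
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >=2 patients and >=2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for patients (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("no variation in ratings; ICC is undefined")
    return (ms_rows - ms_err) / denom


# Hypothetical muscle VAS scores (cm): local physician vs external rater.
scores = [[2.0, 2.5], [0.0, 0.5], [6.0, 5.0], [3.5, 3.0], [1.0, 1.0]]
print(round(icc_two_way_random(scores), 2))
```

Because the rater variance sits in the denominator, a systematic difference between the two raters lowers this coefficient, which is consistent with treating physicians as a random sample from a larger population.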

To assess the convergent validity of the MDI and MYODAM for each outcome, each participant's cross-sectional MDI and MYODAM scores were correlated using Spearman's rank correlation coefficient. Spearman's rank correlation was used because, although both the MDI and MYODAM are continuous measures, they are not normally distributed. Spearman's rank correlation was also used to determine the correlation between the individual components of the damage and activity tools (to assess construct validity). Statistical analyses were performed using S-Plus and Stata.
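For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranks, with tied values given the average of the ranks they span. The sketch below shows this using the Python standard library only; the study itself used S-Plus and Stata, and the function names and example values here are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: MDI muscle item counts and MYODAM muscle VAS (cm).
mdi_muscle = [0, 1, 2, 3]
myodam_muscle = [0.0, 1.5, 1.0, 4.0]
print(round(spearman_rho(mdi_muscle, myodam_muscle), 2))
```

Working on ranks makes the coefficient insensitive to the skewed, non-normal distributions of the damage scores, which is the reason given above for preferring it.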


Results

Patients and disease characteristics

Ninety-five myositis patients were assessed. There was a 2:1 female to male ratio. The mean age at diagnosis was 45.3 years (range 6–70) and the mean disease duration was 7.5 years (range 0.4–23). Eighty per cent of patients were Caucasian, 10% Afro-Caribbean or African-American, 8% Asian (from the Indian subcontinent) and 2% other. Fifty-eight per cent of patients had polymyositis, 36% dermatomyositis and 6% had overlap with systemic lupus erythematosus (SLE) or scleroderma. Tables 1 and 2 show the disease activity and damage indices of the patients. A total of 92% of patients had a score greater than 0 for at least one category in the MDI, most commonly in the muscle (81%), endocrine (51%) and pulmonary (39%) systems. The mean total damage score was 5 (range 0–14) (the maximum possible total damage score is 38). The mean score for the total damage as scored on the MYODAM was 5.67 (SD 5.06) and the median score was 4.5 (range 0–22.7) (the maximum score for the total MYODAM is 110). The mean global damage score was 2.69 (SD 1.91), median 2.7 (range 0–7). The global damage is recorded on a 0–10 cm VAS.

Table 1

Myositis damage index scores in 95 patients

Table 2

Disease activity scores as measured by MITAX in 95 patients

A section of ‘other damage’ is available to record items not specifically recorded under the 11 organ systems. Depression was the most frequently recorded item, highlighting the significant impact of the disease on patients.

Eighty-four per cent of patients had some degree of disease activity as defined by the MITAX. Four per cent had grade A (denotes disease thought to be sufficiently active to require high-dose daily corticosteroids alone or in combination with high doses of other immunosuppressive drugs or intravenous gammaglobulin) and 66% had grade B (denotes disease that is less active than in ‘A’, requiring moderate doses of prednisolone, ie, <20 mg; if immunosuppressive agents or intravenous gammaglobulin were used to treat signs and symptoms of category A, the doses of at least one agent would be reduced from the levels required in category A). Two thirds of patients had active muscle disease at the time of assessment and a third of patients had active respiratory disease. Fatigue was the most common constitutional symptom recorded.

Analysis of reliability of the damage index

Data from the seven centres were pooled and tested for interrater reliability using ICC. In general, the MDI and MYODAM tools appeared to have good interrater reliability for most of the systems, with an ICC greater than 0.65 (table 3). Only three systems had relatively low reliability: (1) gastrointestinal, as a result of raters failing to ask specifically about gastrointestinal symptoms; (2) cardiovascular, as a result of physicians recording events such as hypertension and ischaemic heart disease diagnosed before the diagnosis of myositis, as there was often disagreement as to when the pathology may have occurred; and finally (3) PVD, as a result of disagreement in recording what represented a rare event. For the remainder of the organ/system assessments, agreement was good, in particular for the global assessment as measured by a 0–10 cm VAS (ICC 0.72).

Table 3

Interrater reliability for each organ system

The results reported for the validity exercise are from the independent assessor.

Analysis of the convergent validity of the damage index

There was good correlation between the MDI and the MYODAM with r greater than 0.8 in all systems, suggesting that they measure the same phenomenon (table 4).

Table 4

Correlations between MDI and MYODAM

Analysis of association among the MDI components

No association was found between the different MDI domains (data not shown).

Analysis of association between the activity and damage components (construct validity)

There was evidence of weak correlations between the MDI scores and the corresponding MITAX components, and between the MYODAM and MYOACT components (table 5), suggesting relative independence of the damage and activity assessment instruments.

Table 5

Spearman's rank correlations between activity and damage scores


Discussion

These findings demonstrate that the MDI is a reliable tool for assessing cumulative damage in patients with polymyositis/dermatomyositis in clinical practice. Previously published studies of these indices were performed using real patients, but not in a clinical setting.4 In the study by Isenberg et al4 an hour was allowed for each patient assessment. The assessors were provided with a one-page synopsis of patients' histories and relevant investigations.

The current study was performed in routine outpatient clinical practice, with the usual constraints of time and note keeping. Despite these constraints, the results have demonstrated that the MDI is a reliable tool for assessing damage in routine clinical practice, with an ICC of greater than 0.6 in most organ systems (table 3). The raters involved were physicians experienced in the care of myositis patients. Since the initial study,4 physician education and familiarity with the instrument have contributed to an improvement in its reliability.

The majority of the disagreements between raters were not related to the index. Instead, almost all disagreements were due to rater errors, for instance: (1) incorrectly including items diagnosed before the diagnosis of myositis; (2) recording errors (misclassifications); (3) failing specifically to address items on the index (eg, not asking about hirsutism, irregular menses and sexual dysfunction in the endocrine section). In addition, there was less variability in the cardiac and PVD systems; this could account for the smaller ICC, as reliability is harder to measure in homogeneous populations.

There was good convergent validity between the MDI and the MYODAM with r greater than 0.8, thus suggesting that they measure the same phenomenon (table 4). There was also a good correlation between the total MDI score and the total MYODAM scores (r=0.87), and between the total MDI and the physician global scores (r=0.84). This suggests that the summary scores measure the same phenomenon: accumulated damage. However, completion of the MYODAM relies on information recorded in the MDI, so we suggest that the MYODAM should not be used alone. We appreciate that both indices were recorded simultaneously by the same rater at any one visit, and thus the above correlation is not surprising; owing to study limitations, we were not able to have them recorded by different observers. Nevertheless, we would expect that in any clinical setting only one rater would be involved in completing these tools.

No association was found between the different MDI domains (data not shown). This has also been found for the SLICC/ACR damage index (SDI) when used as a measure of accumulated damage in SLE.10 It was therefore suggested that the total SDI would not fulfil the minimal metric criteria for an internally consistent scale.10 In SLE it was found that the renal component of the damage index was predictive of renal failure while the pulmonary component was predictive of mortality.11 In contrast, the total damage scores were of little prognostic value. Further investigation is underway to assess the prognostic value of the MDI tool.

An international study in SLE patients showed that there was no correlation between disease activity and damage at a single point in time.11 However, a single centre assessing a larger number of patients reported a weak but significant relationship between the corresponding damage and activity for the cardiovascular, pulmonary, peripheral vascular and musculoskeletal systems in SLE patients.12 It should be pointed out though that the tools used to determine disease activity differed between these two studies. The current study in myositis patients demonstrates that, at a single point in time, there are weak correlations between the muscular, pulmonary, skeletal and gastrointestinal components of the damage index and the activity tool (table 5). A weak association cross-sectionally would be expected as continuous disease activity would be expected to result in a progressive degree of damage over time. This would support the validity of these components of the MDI. The low correlations between the activity and damage indices indicate that the two tools measure different aspects of the disease, and so are distinct tools measuring separate but related constructs. This supports the complementary value of the activity and damage assessment tools in IIM.

In summary, the MDI is a comprehensive tool for assessing damage in patients with myositis. This is the first major attempt to assess the interrater reliability and aspects of the validity of a damage index in myositis in an outpatient setting. This was an international study involving physicians from different healthcare systems, demonstrating that physicians were able to record damage in patients in a similar way. This information is potentially important for future collaborative studies of patients with myositis that include the assessment of damage. Provided that physicians are reminded to record only items that have arisen since the diagnosis of myositis, the MDI will provide a valuable tool to assess damage in future clinical trials. However, incorporation of the MDI into clinical trials would require formal training of the users. To facilitate the scoring of these indices a computerised version is being developed.



  • Funding The UK Myositis Support Group supported SMS for this study. JV obtained institutional support (MSM 0021620812) from the Ministry of Education, Youth and Sports. Grant support for IEL was received from the Swedish Rheumatism Association.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
