Background Different diagnostic and classification criteria are available for hereditary recurrent fevers (HRF)—familial Mediterranean fever (FMF), tumour necrosis factor receptor-associated periodic fever syndrome (TRAPS), mevalonate kinase deficiency (MKD) and cryopyrin-associated periodic syndromes (CAPS)—and for the non-hereditary, periodic fever, aphthosis, pharyngitis and adenitis (PFAPA). We aimed to develop and validate new evidence-based classification criteria for HRF/PFAPA.
Methods Step 1: selection of clinical, laboratory and genetic candidate variables; step 2: classification of 360 random patients from the Eurofever Registry by a panel of 25 clinicians and 8 geneticists blinded to patients’ diagnosis (consensus ≥80%); step 3: statistical analysis for the selection of the best candidate classification criteria; step 4: nominal group technique consensus conference with 33 panellists for the discussion and selection of the final classification criteria; step 5: cross-sectional validation of the novel criteria.
Results The panellists achieved consensus to classify 281 of 360 (78%) patients (32 CAPS, 36 FMF, 56 MKD, 37 PFAPA, 39 TRAPS, 81 undefined recurrent fever). Consensus was reached for two sets of criteria for each HRF, one including genetic and clinical variables, the other with clinical variables only, plus new criteria for PFAPA. The four HRF criteria demonstrated sensitivity of 0.94–1 and specificity of 0.95–1; for PFAPA, criteria sensitivity and specificity were 0.97 and 0.93, respectively. Validation of these criteria in an independent data set of 1018 patients shows a high accuracy (from 0.81 to 0.98).
Conclusion Eurofever proposes a novel set of validated classification criteria for HRF and PFAPA with high sensitivity and specificity.
- classification criteria
- inherited periodic fevers
- mevalonate kinase deficiency
What is already known about this subject?
Hereditary recurrent fever (HRF) syndromes are genetic disorders secondary to mutations in genes involved in the innate immune response.
A number of classification or diagnostic criteria have been developed in the past.
Overall, these criteria lack accuracy and do not consider the results of genetic analyses, now an essential tool for the accurate diagnosis and classification of HRF.
What does this study add?
We developed and validated new evidence-based classification criteria for HRF and periodic fever, aphthosis, pharyngitis and adenitis, combining international expert consensus and statistical evaluation of real patients from a large data set in the Eurofever Registry.
The new classification criteria combine for the first time clinical manifestations with genotype.
How might this impact on clinical practice or future developments?
The use of these classification criteria is recommended for inclusion of patients in translational and clinical studies, but they cannot be used as diagnostic criteria.
In the last 20 years, the discovery of the inflammasome and of the genes underlying what are now called systemic autoinflammatory diseases (SAIDs) has led to a completely new line of research. SAIDs are caused by exaggerated activation of the innate immune system, in the absence of high-titre autoantibodies or antigen-specific T-cells.1 Recurrent (or periodic) fevers are characterised by inflammatory flares separated by intervals of general well-being. Some conditions are caused by a genetic defect and are collectively referred to as hereditary recurrent fever (HRF). Familial Mediterranean fever (FMF) is caused by mutations of MEFV 2 3; mevalonate kinase deficiency (MKD), by mutations of the mevalonate kinase gene (MVK)4 5; tumour necrosis factor (TNF) receptor-associated periodic fever syndrome (TRAPS), by mutations of type I TNF receptor (TNFRSF1A)6; and cryopyrin-associated periodic syndromes (CAPS), by mutations of NLRP3.7 8 More common forms of recurrent fever syndromes include periodic fever, aphthosis, pharyngitis and adenitis (PFAPA) syndrome, which is a multifactorial disorder.9 So far, several clinical diagnostic and classification criteria have been proposed for HRF10–15 and PFAPA.9 16 Overall, these criteria lack accuracy and do not consider the results of genetic analyses, now an essential tool for the accurate diagnosis and classification of HRF.
This distinction between classification and diagnostic criteria is not always clear in clinical practice, and the two terms are often (wrongly) used interchangeably.17 Classification criteria facilitate accurate identification of diseases for clinical or epidemiological studies, in this context reliably differentiating one autoinflammatory disease from another, but are not designed to diagnose that autoinflammatory disease; hence, classification criteria make the assumption that important disease mimics (eg, chronic infection or malignancy) have already been excluded. In contrast, diagnostic criteria are designed to positively rule in a specific diagnosis in an individual patient, while excluding all conditions with different overlapping disease manifestations based on derivation and validation in cohorts that include important disease mimics. As such, classification criteria cannot be used as diagnostic criteria.18 19 The purpose of this study was to develop and validate new evidence-based classification criteria for HRF and PFAPA, combining international expert consensus and statistical evaluation of real patients from a large data set of patients in the Eurofever Registry.
A multistep process using consensus formation techniques (Delphi and nominal group technique (NGT)) and statistical evaluations on real patients was used to develop and test the classification criteria17 (online supplementary figure 1 and supplementary material), based on a methodological framework used successfully in previous studies in rheumatology.20–25
Step 1: selection of clinical, laboratory and genetic candidate variables
A panel of 162 international adult and paediatric experienced clinicians completed successive Delphi questionnaires in order to propose and then select and rank the variables (clinical manifestations, genetic analyses, laboratory examinations) from 1 (less important) to 10 (most important), for classification of each HRF26 and PFAPA.27
Step 2: classification of patients from the Eurofever Registry
After selection (online supplementary figure 2), a random sample of 360 patients, 60 for each disease group (FMF, TRAPS, MKD, CAPS, PFAPA and undefined recurrent fevers (uRF)), was drawn from the Eurofever Registry.28 The inclusion criteria for enrolment in the registry have been previously described28 (see online supplementary material).
Twenty-five experienced international clinicians/researchers and eight geneticists in the field of SAID (33 panellists in total), blinded to patients’ original diagnosis, were invited to participate in a multiround, secured web process to classify each of the 360 patients into one of six mutually exclusive diagnoses.29 Clinicians and geneticists worked separately in the first steps (clinicians blinded to genetic results and geneticists blinded to clinical data) and then together to reach consensus ≥80% on all classifiable patients.
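The ≥80% consensus rule can be illustrated with a minimal sketch. It assumes each panellist casts one vote per patient and that consensus means the most frequent label reaches the threshold; the function name and the ballots are hypothetical and not taken from the study.

```python
from collections import Counter

CONSENSUS_THRESHOLD = 0.80  # the >=80% agreement level stated in the text


def panel_consensus(votes, threshold=CONSENSUS_THRESHOLD):
    """Return (label, share): label is None when no diagnosis reaches the threshold."""
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    share = count / len(votes)
    return (label if share >= threshold else None), share


# Invented ballots from 25 hypothetical clinicians for two patients.
votes_a = ["FMF"] * 21 + ["uRF"] * 4     # 21 of 25 agree
votes_b = ["CAPS"] * 15 + ["uRF"] * 10   # 15 of 25 agree

print(panel_consensus(votes_a))  # consensus reached
print(panel_consensus(votes_b))  # no consensus, patient goes to a further round
```

Under this reading, a patient without consensus is simply carried into the next round rather than being forced into a category.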
Step 3: statistical analysis for the selection of the best candidate classification criteria
The statistical analysis plan (full details in the online supplementary material) foresaw the following steps:
Evaluation through a univariate logistic regression of the relationship between each individual top variable identified in step 1 and each disease as derived from the panel’s classification.
Computer generation of more than 30 000 new candidate sets of classification criteria through linear combinations of genetic and clinical variables with improper linear modelling. Additionally, 11 sets of criteria were derived from the literature9–16 or proposed by members of the panel based on their expertise.
Identification of the top-performing criteria through ranking according to the Akaike information criterion (AIC), with best model having the lowest AIC.
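The two ideas in this plan, unit-weighted ("improper") linear scores and ranking by AIC, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the variable names, log-likelihoods and parameter counts are invented, and the score simply adds one point per positive variable present and subtracts one per negative variable present.

```python
def unit_weight_score(patient, positive, negative=()):
    """Improper linear model: every variable gets weight +1 or -1 instead of a fitted weight."""
    plus = sum(1 for v in positive if patient.get(v, False))
    minus = sum(1 for v in negative if patient.get(v, False))
    return plus - minus


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical patient and hypothetical candidate definition.
patient = {"fever": True, "rash": True, "diarrhoea": True}
score = unit_weight_score(patient,
                          positive=("fever", "rash", "chest_pain"),
                          negative=("diarrhoea",))

# Invented (log-likelihood, number of parameters) pairs for three candidate sets.
candidates = {
    "definition_A": (-40.0, 3),
    "definition_B": (-38.5, 6),
    "definition_C": (-45.0, 2),
}
ranked = sorted(candidates, key=lambda name: aic(*candidates[name]))
print(score, ranked)
```

In this toy ranking, definition_B fits slightly better than definition_A but pays a larger penalty for its extra parameters, which is the trade-off AIC is designed to capture.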
Step 4: NGT Consensus Conference for the selection of the final classification criteria
The Consensus Conference was held in Genoa, Italy, on 18–21 March 2017. Clinicians and geneticists, who participated in the step 2 web consensus classification exercise, attended a meeting. The overall goal of the meeting was to decide on the final set of criteria, using a combination of statistical and consensus (≥80%) formation techniques with the consensus panel classification as reference standard.
Step 5: cross-sectional validation of the final classification criteria
The performance of the final set of classification criteria to discriminate patients with the different HRF and PFAPA was tested, using the original treating physician patients’ diagnosis as reference standard for the cross-sectional validation, postconsensus, in a separate set of 1018 patients selected after random computer generation from the Eurofever Registry, which contains all variables included in the final set of classification criteria.
A total of 100 different genotypes were reported in the 360 patients included in the classification process as reported in online supplementary table 2.
Nine patients with CAPS and two with TRAPS had no mutations detected using Sanger sequencing; thus, at the time of enrolment, somatic mosaicism could not be formally excluded in them. Low penetrant or incidental (non-confirmatory) genetic findings were also reported in 7 patients with PFAPA and 14 with uRF (online supplementary table 3).
Classification of patients from the Eurofever Registry
In the first two rounds, evaluation of clinical data by clinicians (blinded to genetic results) resulted in consensus of ≥80% for a total of 216 of 360 (60%) patients (figure 1); consensus was reached in 51 patients with MKD, 43 with TRAPS, 29 with FMF, 29 with CAPS, 26 with PFAPA and 38 with uRF. Similarly, evaluation of demographic and genetic data by geneticists (blinded to clinical data) in two separate rounds reached consensus on 319 of 360 (89%) patients, with 278 (77%) patients reaching 80% consensus after the first round. At the end of the two initial rounds, 128 (36%) patients were concordant between the independent evaluations of the clinicians and the geneticists. At the end of the fourth round, consensus was achieved in 281 of 360 (78%) as follows: 56 (95%) MKD, 39 (76%) TRAPS, 37 (70%) PFAPA, 36 (71%) FMF, 32 (63%) CAPS and 81 (85%) uRF (figure 1, online supplementary table 4). Agreement (κ concordance coefficient) between the panel reference standard classification and the original patient diagnosis by the treating physician was 0.99 for MKD, 0.87 for TRAPS, 0.86 for CAPS, 0.84 for FMF and 0.68 for PFAPA.
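As an illustration of the agreement statistic, the sketch below computes Cohen's kappa between two sets of labels. The text does not state which concordance coefficient was used, so Cohen's kappa is an assumption here, and the labels are invented.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Invented one-vs-rest labels for ten hypothetical patients.
panel = ["MKD"] * 4 + ["other"] * 6
physician = ["MKD"] * 3 + ["other"] * 7

kappa = cohen_kappa(panel, physician)
print(round(kappa, 3))
```

Here the raters agree on 9 of 10 patients, but because about half of that agreement is expected by chance, kappa is noticeably lower than the raw 90% agreement.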
Statistical analysis for the selection of the best classification criteria
The top variables arising from step 1 (see the Methods section) were included in a univariate logistic regression analysis using the 281 patients for which consensus was achieved by the panel as outcome. Clinical variables positively and negatively associated with each disease are reported in online supplementary table 5 with the related OR and 95% CI. The strategy for the classification of the genotypes is described in online supplementary table 6.
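For a single binary variable, the OR from a univariate logistic regression equals the cross-product ratio of the 2×2 table, and the Wald 95% CI equals the classical log-OR interval. The sketch below uses that equivalence; the cell counts are invented and the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """OR and Wald 95% CI from a 2x2 table.

    a: disease, variable present    b: disease, variable absent
    c: no disease, variable present d: no disease, variable absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple formula")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: a variable seen in 20 of 30 patients with the disease
# and in 5 of 45 patients without it.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 10, 5, 40)
print(or_value, round(ci_low, 2), round(ci_high, 2))
```

An OR above 1 with a CI excluding 1 would mark a positively associated variable, and an OR below 1 a negatively associated one, which is how the "positive" and "negative" variables in the candidate definitions can be read.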
A total of 198 of the >30 000 possible new sets of classification criteria (available on request; 50 for CAPS, 45 for FMF, 44 for TRAPS, 32 for MKD and 22 for PFAPA) were retained, based on their AIC, for further evaluation at the Consensus Conference, together with 11 criteria from the literature (online supplementary figure 4).
NGT Consensus Conference for the selection of the final classification criteria
The first disease discussed was FMF. After multiple voting sessions, all three tables of experts, which worked independently from each other, showed a complete convergent validity selecting the same top definition number 38 (online supplementary figure 4, session A), including genetic and clinical variables with a positive association (table 2). After general discussion, a second set of criteria based solely on clinical criteria was selected to be used as a possible tool for the indication for molecular analysis or as classification criteria in case genetic testing is not locally available (online supplementary figure 4, session B). To this aim, definition number 12, including clinical variables with both positive and negative association with the disease, was chosen (table 3).
The same approach was followed for the other HRFs (CAPS, TRAPS, MKD), leading to the selection of criteria with genetic and clinical variables (number 32 for CAPS, number 46 for TRAPS, number 37 for MKD) (table 2, online supplementary figures 6-8). As per the process to establish FMF criteria, a set of purely clinical criteria (ie, without genetic results) was also selected for each HRF, namely definitions number 20 and number 1 for MKD and TRAPS, respectively (table 3). For CAPS classification, the experts reached consensus on a modified version of recently published criteria.14 The performance of the original Kummerle criteria (using two out of six criteria) in the context of the present study population displayed a good sensitivity (0.91), but a low specificity (0.80).14 In contrast, when the variable ‘musculoskeletal pain’ was excluded, a higher specificity (0.94, with a sensitivity of 0.80) was achieved when two out of five criteria were present (table 3). The most severe form of CAPS, chronic infantile neurological cutaneous articular (CINCA)/neonatal onset multisystemic inflammatory disorder (NOMID), displays a chronic rather than a recurrent disease course. Patients with CINCA were not included in the validation process described above. However, when the new CAPS criteria were tested in a separate set of 70 patients with CAPS with chronic disease course enrolled in the Eurofever Registry, the sensitivity was 100% for the genetic and clinical criteria and 80% for the clinical criteria (not shown).
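The trade-off described for the CAPS clinical criteria (dropping one variable from a "two out of six" rule to obtain a "two out of five" rule) can be sketched on toy data. Apart from musculoskeletal pain, the criterion names below are placeholders, and the seven patients are invented; the numbers printed are not the study's results.

```python
def meets_rule(findings, criteria, required):
    """True when at least `required` of the listed criteria are present."""
    return sum(1 for c in criteria if c in findings) >= required


def sensitivity_specificity(cohort, criteria, required):
    """cohort: list of (findings, has_disease) pairs judged against a reference standard."""
    tp = fn = tn = fp = 0
    for findings, has_disease in cohort:
        positive = meets_rule(findings, criteria, required)
        if has_disease:
            tp += positive
            fn += not positive
        else:
            fp += positive
            tn += not positive
    return tp / (tp + fn), tn / (tn + fp)


# Placeholder criteria c1..c5 plus the variable named in the text.
criteria_six = ["c1", "c2", "c3", "c4", "c5", "musculoskeletal_pain"]
criteria_five = criteria_six[:-1]

# Invented cohort: three patients with the disease, four without.
cohort = [
    ({"c1", "c2"}, True),
    ({"c3", "musculoskeletal_pain"}, True),
    ({"c1", "c4", "c5"}, True),
    ({"c2", "musculoskeletal_pain"}, False),
    ({"musculoskeletal_pain"}, False),
    ({"c5"}, False),
    (set(), False),
]

six = sensitivity_specificity(cohort, criteria_six, required=2)
five = sensitivity_specificity(cohort, criteria_five, required=2)
print(six, five)
```

In this toy cohort, removing a non-specific variable loses one true case but also removes the only false positive, mirroring the direction of the change reported above.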
Clinical classification criteria for PFAPA were discussed between the 25 clinical panellists distributed in two tables (no geneticists). After discussion (online supplementary figure 8), definition number 13 (clinical variables with both positive and negative association) was chosen (table 3). During the Consensus Conference, the panel agreed on a few suggested mandatory criteria that should be fulfilled in all the patients before the application of the new classification criteria (table 3) with a consensus of 100% for point 1 and 96% for point 2.
Globally, convergent validity among the three tables of experts was obtained for the genetic and clinical definitions of FMF and CAPS, whereas for all the other definitions a partial convergent validity (agreement in two out of three tables) was reached, with the need for a final plenary voting session (online supplementary figures 4-8 and online supplementary table 8).
Cross-validation of the final classification criteria
The ability of the new classification criteria to discriminate among the different recurrent fevers and uRF was further tested in the validation data set of 1018 patients extracted from the Eurofever Registry (online supplementary table 9) using as reference standard for each patient the diagnosis given by the treating physician. In the last column of table 4, the genotype (score 0=negative/not done; score 1=not confirmatory; score 2=confirmatory) of patients not identified by the clinical criteria for HRF is reported. Notably, almost all the patients not classified by the clinical and genetic criteria displayed a negative or not confirmatory genotype (table 4). The performance of the new classification criteria (either clinical and genetic or clinical only) was generally superior (accuracy ranging from 0.81 to 0.98; table 4) to those already available in the literature (accuracy 0.56–0.94) (online supplementary table 10).
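Accuracy against the treating physician's diagnosis can be read as a one-vs-rest comparison for each disease. The sketch below assumes that reading; the labels are invented and the function name is hypothetical.

```python
def one_vs_rest_accuracy(predicted, reference, disease):
    """Share of patients for whom 'is it this disease?' matches the reference standard."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("predicted and reference must be non-empty and the same length")
    correct = sum((p == disease) == (r == disease)
                  for p, r in zip(predicted, reference))
    return correct / len(reference)


# Invented labels for five hypothetical patients.
reference = ["FMF", "FMF", "CAPS", "uRF", "PFAPA"]   # treating physician
predicted = ["FMF", "uRF", "CAPS", "uRF", "FMF"]     # criteria output

acc_fmf = one_vs_rest_accuracy(predicted, reference, "FMF")
acc_caps = one_vs_rest_accuracy(predicted, reference, "CAPS")
print(acc_fmf, acc_caps)
```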
The present study provides new evidence-based classification criteria for the four ‘classical’ HRF (FMF, MKD, TRAPS, CAPS) and PFAPA, incorporating combined consensus expertise of clinicians and geneticists with statistical analyses in real patients from the Eurofever Registry. At variance with past work,15 these new classification criteria combine genetic and clinical variables to overcome the paradox that molecular analysis had no role in the identification of patients affected by these (mainly) genetic conditions. As defined by the American College of Rheumatology, the proposed classification criteria have selected clinical and genetic findings able to identify the defined diseases and separate them from other confounding autoinflammatory conditions.18 19 Although these criteria may at times be helpful in clinical practice, they are explicitly not meant to be employed as diagnostic criteria. The advent of the so-called next-generation sequencing era has on the one hand increased the availability of molecular analysis at reduced cost, but on the other may often lead to difficulties in the proper interpretation of this large set of bioinformatic data. In fact, besides the identification of clearly pathogenic variants, in many circumstances (ie, low penetrance variants or variants of unknown significance, monoallelic variants in autosomal recessive diseases, presence of variants in more than one gene) the genetic results are not unequivocal and should be placed in the context of a pertinent clinical setting. In these latter cases, the classification of the patient is usually problematic, as clearly shown in the process of patients’ validation in this study. For these reasons, the panel decided to introduce a distinction between a confirmatory (namely, surely or likely pathogenic variants) and a not confirmatory (variants of unknown significance) genetic test.
For the daily use of the new criteria, a parallel consensus classification effort by the genetic subcommittee of the INSAID project has established the pathogenicity of each currently known variant associated with HRF.30 A differential approach for the interpretation of the biallelic variants was chosen for the two autosomal recessive diseases, namely MKD and FMF. MKD is caused by loss-of-function mutations in the MVK gene. The panel excluded the possibility of classifying a patient as having MKD in the absence of biallelic mutations of the MVK gene. Conversely, recent evidence has clarified that FMF is secondary to gain-of-function mutations of the MEFV gene, with a clear dose effect,31 32 and therefore FMF could be classified with identification of either one or two pathogenic variants in exon 10 of MEFV in the presence of a consistent clinical phenotype. The same possibility was also considered for the two autosomal dominant diseases, CAPS and TRAPS, in the absence of confirmatory phenotype. In patients carrying variants of unknown pathogenic significance (such as R92Q and P46L for TNFRSF1A, or V198M for NLRP3),33–36 only the presence of some very specific clinical variables would support the proper disease classification. In parallel with the elaboration of the definitive criteria that include genetic/clinical variables, the panel agreed on additional clinical criteria that should be used to (1) identify patients with recurrent fevers who would need to undergo genetic testing for molecular confirmation; (2) search for possible somatic mosaicism using NGS in patients with a clear phenotype, but negative Sanger sequencing results; and (3) classify patients (eg, for epidemiological studies) even in those countries where routine genetic testing is not possible.
For PFAPA, the simultaneous evaluation of both positive (presence) and negative (absence) clinical variables yielded a much higher accuracy when compared with the classical modified Marshall’s criteria.16 Following the consensus meeting, the new sets of criteria were further validated in a large group of additional patients from the Eurofever Registry, showing a very high specificity when compared with previous literature criteria. As noted, most of the diagnoses refuted by the new criteria had been in patients with either non-confirmatory or negative genetic test results. It is therefore conceivable that the present new criteria will be more stringent in the classification of patients, by excluding a substantial proportion of patients carrying variants of unknown origin. The classification criteria we propose are accurate for the discrimination of one form of autoinflammation from another in the context of the six conditions considered herein, but have to be applied judiciously, after careful consideration of confounding diseases, as highlighted in table 2. These classification criteria are therefore intended for use in clinical, epidemiological or translational studies, but not for routine diagnostic purposes in individual patients.37 That said, the purely clinical classification criteria might guide molecular testing approaches for individual cases, although this point requires future validation. One possible limitation of the present study is the lack of comparison groups including possible confounding conditions (chronic infections, neoplasms, immune deficiencies, autoimmune disease and metabolic diseases) that sometimes present with a recurrent disease course.
In daily practice, confounding diseases with a true recurrent disease course are rather infrequent outside HRF and PFAPA, while the most challenging group of confounding conditions is the large emerging group of patients with uRF, many of whom may have a true monogenic cause other than the four genetic diseases considered herein. For these reasons, the different HRFs have been used as controls for each individual condition with PFAPA and uRF as genetically negative controls. The panel of experts unanimously decided that the elevation of acute phase reactants during disease flares (recorded on at least one occasion) should be considered a mandatory preliminary criterion for the use of the new classification criteria.14 Some other relevant pathognomonic laboratory examinations, such as urinary mevalonic acid in MKD, were not available in the Eurofever Registry, probably reflecting the fact that they are not widely available for routine testing. As such, the panel stressed the importance of the diagnostic work-up, for example, with intracellular MVK enzyme activity and/or urinary mevalonic acid in MKD,38 particularly for patients with convincing phenotypes but non-confirmatory genotype for MKD. Similarly, the response to some specific treatments (such as colchicine in FMF or anti-interleukin (IL)-1 in CAPS) or ethnic background (for FMF) could certainly be considered as additional elements in daily practice, especially for patients with non-confirmatory genotype, but are not good discriminators of the different forms of autoinflammatory disease considered herein. In conclusion, the present work allowed the proposal of novel evidence-based classification criteria for HRF and PFAPA with a high specificity.
The use of these classification criteria is highly recommended for inclusion of patients in translational and clinical studies, including clinical trials, and should not be misused as diagnostic criteria.17 The possible identification of new genetic entities in the heterogeneous group of undefined periodic fevers could require an update of the criteria in the future.
The authors are indebted to Dr Daniel Lovell for his valuable support during the Consensus Conference and to all PRINTO’s staff for their technical support.
MG and MH contributed equally.
Handling editor Josef S Smolen
Correction notice This article has been corrected since it published Online First. Table 4 has been amended.
Contributors MG, MH and NR coordinated the study, analysed the data and drafted the manuscript. SF, FV and CG analysed the data. FB and MPS performed statistical analysis. IA, JA, JIA, KB, EB-C, PAB, LC, IC, FDB, FD, ED, JF, RG-M, AG, VH, HH, TK, IK-P, JK-D, HJL, RML, AL, LO, DR, RR, YS, AS, NT, IT, YU, MvG, DK, DF and AM participated in the Delphi, patient evaluation and Consensus Conference. All authors evaluated and approved the manuscript.
Funding The project has been funded by E-Rare-3 project (INSAID, grant 003037603). Eurofever was supported by the Executive Agency For Health and Consumers (EAHC, Project No 2007332) and by Istituto G Gaslini. Novartis and SOBI provided unrestricted grants for the Consensus Conference.
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval Independent ethical committee approval for enrolling patients into the registry was obtained from the participating centres in accordance with the local requirements. The study was performed according to the principles of the Declaration of Helsinki.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. There are no data in this work. Data are available upon reasonable request. Data may be obtained from a third party and are not publicly available. No data are available. All data relevant to the study are included in the article or uploaded as supplementary information.