Background The objective was to compare different definitions of remission and low disease activity (LDA) in patients with psoriatic arthritis (PsA), based on both patients’ and physicians’ perspectives.
Methods In ReFlap (Remission/Flare in PsA; NCT03119805), adults with physician-confirmed PsA and >2 years of disease duration in 14 countries were included. Remission was defined as very low disease activity (VLDA), Disease Activity index for PSoriatic Arthritis (DAPSA) ≤4, and physician-perceived and patient-perceived remission (specific question yes/no), and LDA as minimal disease activity (MDA), DAPSA <14, and physician-perceived and patient-perceived LDA. Frequencies of these definitions, their agreement (prevalence-adjusted kappa), and sensitivity and specificity versus patient-defined status were assessed cross-sectionally.
Results Among 410 patients, the mean age (SD) was 53.9 (12.5) years, 50.7% were male, mean disease duration (SD) was 11.2 (8.2) years and 56.8% were on biologics. Remission/LDA was frequently attained: remission ranged from 12.4% (VLDA) to 36.1% (physician-perceived remission), and LDA from 25.4% (MDA) to 43.9% (patient-perceived LDA). Patient-perceived remission/LDA was frequent (65.4%). Agreement between patient-perceived remission/LDA and composite scores was moderate to good (kappa range, 0.12–0.65). When patient-perceived remission or LDA status is used as reference, DAPSA-defined remission/LDA and VLDA/MDA had a sensitivity of 73.1% and 51.5%, respectively, and a specificity of 76.8% and 88.0%, respectively. Physician-perceived remission/LDA using a single question was frequent (67.6%) but performed poorly against other definitions.
Conclusion In this unselected population, remission/LDA was frequently attained. VLDA/MDA was a more stringent definition than DAPSA-based remission/LDA. DAPSA-based remission/LDA performed better than VLDA/MDA to detect patient-defined remission or remission/LDA. Further studies of long-term outcomes are needed.
- psoriatic arthritis
- patient perspective
- disease activity
What is already known about this subject?
In psoriatic arthritis (PsA), remission or, alternatively, low disease activity (LDA) is the treatment objective.
Remission/LDA can be assessed using two main composite scores in PsA: very low disease activity (VLDA)/minimal disease activity (MDA) or Disease Activity index for PSoriatic Arthritis (DAPSA).
What does this study add?
Investigating an unselected, standard of care population of 410 patients with psoriatic arthritis, both remission and low disease activity (LDA) were frequently attained: from 12.4% to 36.1% for remission and from 25.4% to 46.8% for LDA.
Patient-perceived remission/LDA was frequent (65.4%), indicating patients often reported themselves in a low level of disease activity.
Patient-perceived remission was as frequent as remission based on composite scores (very low disease activity (VLDA)/minimal disease activity (MDA) or Disease Activity index for PSoriatic Arthritis (DAPSA)); both were less frequent than physician-reported remission using a single question.
VLDA/MDA showed a lower sensitivity than DAPSA versus the patient perspective (52% vs 73%) but had a higher specificity (88% vs 77%).
How might this impact on clinical practice or future developments?
DAPSA-based status had both sensitivity and specificity of around 75%, indicating that this score appears to better reflect patient-perceived LDA.
Psoriatic arthritis (PsA) is a complex inflammatory disease that spans a wide spectrum to include the peripheral joints, skin, entheses, spine and other adjacent tissues.
Recent management recommendations state that remission (REM), or in some cases low disease activity (LDA), is the treatment goal in PsA.1–4 Several composite disease activity measures have been developed, and the currently discussed treatment target definitions for REM/LDA are VLDA (very low disease activity)/MDA (minimal disease activity)5–7 and DAPSA (Disease Activity index for PSoriatic Arthritis) cut-offs of ≤4/≤14 (or clinical DAPSA, cDAPSA).8–10 Each of these definitions has strengths and weaknesses, which hampers consensus on a single definition.11 12 To briefly summarise some of the issues, on the one hand, VLDA/MDA includes a measure of function (Health Assessment Questionnaire, HAQ) which can be influenced by factors other than disease activity; this may be a methodological issue. On the other hand, DAPSA only assesses joints and not directly any other domain of PsA, such as entheses or skin, MDA does not assess dactylitis, and neither assesses all patient-important domains. While the Outcome Measures in Rheumatology Core Set states that all domains mentioned are of importance,13 the various multidimensional composite measures have major differences in their components and none uses all components. The question of unidimensional versus multidimensional scores has been widely addressed; however, there is currently no consensus in this respect. To this end we have used a unidimensional (DAPSA) and a multidimensional (MDA) instrument. Three recent studies have compared the VLDA/MDA outcomes with DAPSA outcomes in terms of frequency but did not assess the patient’s perspective in parallel.14–16
REM/LDA from the patient’s perspective has not been defined. The above composite measures factor in patient-reported outcomes including pain and patient global assessment.5–10 However, they were developed with little patient involvement, and cut-offs for REM/LDA were not patient-driven.17 18 This may be important since disagreements in the assessment of disease activity have a potential impact on treatment decisions and shared decision-making.19–21 The only data available regarding the patient’s assessment of REM/LDA come from studies on aspects of disease impact.22 23 However, patient-perceived LDA or REM can be approached by specific designated questions, by the ‘patient acceptable symptom state’ or using low values of patient global assessment (PGA).24–26 REM/LDA can also be defined, from the physician’s perspective, as achieving REM/LDA based on a global assessment by the physician (yes/no). Such single questions may have clinical relevance, although they have not yet been assessed formally.
Since alignment between patients and health professionals in terms of treatment targets is thought to be a key component for shared decision-making,27 28 it is of great interest to compare physician-perceived REM/LDA and composite scores with patient-perceived REM/LDA in the assessment of PsA.
The objectives of the present study were to assess the frequency of REM/LDA using different definitions according to the patient’s and physician’s perspective, and to assess agreement between these definitions.
Study population and study design
The ReFlap (Remission/Flare in PsA) study was a prospective, multicentre, international, longitudinal, observational study which took place in 21 centres in 14 countries (including 7 countries across Europe, the UK, Russia, Canada, the USA, Brazil, Turkey and Singapore) between June 2017 and August 2018 (NCT03119805). The objective of the study was to assess REM/LDA in PsA. Patients were seen twice; here, baseline data were used.
Adult patients with a diagnosis of PsA as defined by their rheumatologist and more than 2 years of disease duration were recruited. Investigators were advised to consider the Classification Criteria for Psoriatic Arthritis (CASPAR) when classifying PsA. Patients with no definite PsA or less than 2 years of disease duration, and patients who did not speak or read the local language or were not comfortable filling in a paper form in the local language, were excluded. Patients were included consecutively.
The collected data included patient demographic variables (age, gender, work status, level of education) and the following disease characteristics: disease duration, predominant type of PsA (peripheral, axial or entheseal), current treatment (conventional synthetic disease-modifying antirheumatic drugs (csDMARDs) and/or biologic disease-modifying antirheumatic drugs (bDMARDs)). The Functional Comorbidity Index and the last available result (<4 weeks) for C reactive protein (CRP) were collected.29 Physical examination included assessment with 66 swollen joint count, 68 tender joint count, tender entheseal points (by the Leeds Enthesitis Index), body surface area of psoriasis and physician global assessment (on a scale of 0–10).30
PGA with a wording focused on disease activity was collected on a 0–10 numeric rating scale, as follows—‘How active was your rheumatic disease on average during the last week?’ (from ‘Not active’ to ‘Very active’)—and was used to calculate the composite scores.31 This wording refers to the concept of disease activity and has been used in other rheumatic diseases.31 As sensitivity analyses, this wording was replaced in the composite scores by wordings referring to global joint and global skin assessments.32 Also collected were the HAQ disability index and Patient Acceptable Symptom State (PASS) (in the absence of a standardised PASS question, the following wording was used: ‘If you were to remain for the next few months as you were during the last 48 hours, would this be acceptable or unacceptable for you?’ yes/no).33 34 The PsA Impact of Disease assesses the impact of PsA on 12 aspects, with a final result between 0 and 10 (higher results indicate a worse condition).35
The patient data collection form was translated by two persons into each local language according to usual procedures.
REM and LDA definitions
VLDA/MDA, DAPSA and cDAPSA were used to define REM and LDA (table 1).
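Table 1 is not reproduced in this text. As a rough illustration only, the DAPSA-based classification could be sketched as below. The sketch assumes the commonly published DAPSA formula (68 tender joint count + 66 swollen joint count + patient global 0–10 + patient pain 0–10 + CRP in mg/dL), which this text does not itself spell out. The function names are hypothetical, and the cut-offs are left as parameters because the text quotes both ≤14 and <14 for LDA.

```python
def dapsa(tjc68, sjc66, patient_global, patient_pain, crp_mg_dl):
    """Sum of the five DAPSA components (assumed formula, see lead-in).

    patient_global and patient_pain are on a 0-10 scale; CRP is in mg/dL.
    """
    return tjc68 + sjc66 + patient_global + patient_pain + crp_mg_dl


def dapsa_status(score, rem_cutoff=4.0, lda_cutoff=14.0):
    """Classify a DAPSA score using the <=4 / <=14 cut-offs quoted in the text."""
    if score <= rem_cutoff:
        return "REM"
    if score <= lda_cutoff:
        return "LDA"
    return "not REM/LDA"


# Illustrative (invented) patients, not study data.
low = dapsa(tjc68=1, sjc66=0, patient_global=1.0, patient_pain=1.0, crp_mg_dl=0.5)
mid = dapsa(tjc68=3, sjc66=2, patient_global=3.0, patient_pain=3.0, crp_mg_dl=0.8)
high = dapsa(tjc68=8, sjc66=5, patient_global=6.0, patient_pain=6.0, crp_mg_dl=1.2)
print(dapsa_status(low), dapsa_status(mid), dapsa_status(high))
```

Pooled REM/LDA, as analysed later in the text, corresponds to any score at or below the LDA cut-off.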
Physicians were asked two separate single questions for REM/LDA, formulated by the steering committee as ‘At this time, is the psoriatic arthritis in remission, if this means: the absence of clinical and laboratory evidence of significant inflammatory disease activity?’ and ‘At this time, is the psoriatic arthritis in low or minimal disease activity?’
Of note the physicians answered these questions unblinded to other results (eg, they could consult the patient questionnaires and CRP results if they wished as in their routine clinical practice). No instructions were given as to which aspects of the disease should be considered when answering these questions, but the rheumatologists including patients into this study were all experienced in treating PsA and the question was related to PsA rather than to skin involvement, which was addressed in a separate question.
REM/LDA separate questions for patients were developed with input from four patient research partners with PsA and were based on previous work in the field of rheumatoid arthritis.36 37 The phrasing was the following: ‘At this time, is your psoriatic arthritis in remission, if this means: you feel your disease is as good as gone?’ (for REM) and ‘At this time, are you in low disease activity, if this means: your disease is in low activity but it’s not as good as gone?’ (for LDA).
From patients’ perspective, two potential definitions for REM were used: patient-perceived remission (single question as above) and PGA ≤1. Also, two definitions for LDA were used: patient-perceived LDA (single question) and PGA ≤3. The PGA cut-offs were informed, for REM, by the rheumatoid arthritis international REM criteria since no cut-off has been defined in PsA.38 For LDA, the cut-off of PGA ≤3 was selected by the steering committee. Such cut-offs are arbitrary, and given issues around circularity between PGA and the composite scores, the PGA external criterion should be considered as indicative only.
As a comparison outcome, the PASS was compared with a state of LDA.
All patients with items available to calculate REM/LDA with all definitions were analysed. Demographic, clinical and biologic variables were expressed as mean±SD for continuous variables and as frequencies (percentages) for categorical variables. No imputation of missing data was performed; data were analysed on complete cases. To obtain an overview of the meaning of patient-defined disease states, patient characteristics in each self-defined disease state were described. Proportions achieving each REM/LDA criterion were calculated, and for the composite score definitions REM and LDA groups were analysed separately and then also combined. Venn diagrams were used to represent the number of patients meeting different REM/LDA criteria. To assess the performance of the composite scores, their sensitivity and specificity were calculated versus the reference definition, which was here patient-perceived status (ie, REM or REM/LDA). Thus, sensitivity was the percentage of patients in self-reported good status who were found in good status using the composite score, and specificity was the percentage of patients in self-reported lack of good status who were found in lack of good status using the score.
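The definitions of sensitivity and specificity given above can be sketched as follows. This is a minimal illustration, not the authors' analysis code (the analyses were run in R); the function name and the toy data are assumptions made for the example, with patient-perceived status taken as the reference.

```python
def sensitivity_specificity(reference, test):
    """Return (sensitivity, specificity) of `test` against `reference`.

    Both arguments are equal-length sequences of booleans, one per patient,
    where True means "good status" (REM or REM/LDA).
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    tp = sum(1 for r, t in zip(reference, test) if r and t)
    fn = sum(1 for r, t in zip(reference, test) if r and not t)
    tn = sum(1 for r, t in zip(reference, test) if not r and not t)
    fp = sum(1 for r, t in zip(reference, test) if not r and t)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (invented): 4 patients self-report good status, 6 do not.
patient_status = [True, True, True, True, False, False, False, False, False, False]
score_status = [True, True, True, False, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(patient_status, score_status)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")

# Same arithmetic applied to counts reported in the Results:
# 62 of 180 patients in patient-perceived LDA met the MDA criteria.
print(round(100 * 62 / 180, 1))  # 34.4
```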
The agreement between the tested definitions was established using 2×2 tables and calculation of Cohen’s kappa and prevalence-adjusted bias-adjusted kappa (PABAK) where necessary, using Bennett’s method.39 40 In cases of discrepancy between Cohen’s kappa and PABAK, the paradox of the kappa may apply and PABAK should be analysed preferentially. Usual cut-offs to interpret kappas were used, namely 0.00–0.20 slight agreement, 0.21–0.40 fair, 0.41–0.60 moderate and 0.61–0.80 good agreement. R V.3.4.3 software was used for all statistical analyses.
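As an illustration of the agreement statistics described above, the sketch below computes Cohen's kappa and PABAK from a 2×2 table and applies the interpretation bands listed in the text. It is a hedged example rather than the study code: the function names and the cell counts are invented, and PABAK is computed with the standard two-category formula 2·Po − 1, which is assumed to correspond to the method cited.

```python
def kappa_and_pabak(a, b, c, d):
    """Agreement statistics for a 2x2 table.

    a: both definitions positive, d: both negative,
    b and c: the two kinds of disagreement.
    """
    n = a + b + c + d
    po = (a + d) / n  # observed agreement
    # Chance agreement from the marginal totals.
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (po - pe) / (1 - pe)
    pabak = 2 * po - 1  # prevalence-adjusted bias-adjusted kappa
    return kappa, pabak


def interpret(k):
    """Label a kappa value using the bands given in the text."""
    if k < 0:
        return "below 0 (not categorised in the text)"
    if k <= 0.20:
        return "slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "good"
    return "above 0.80 (not categorised in the text)"


# Balanced table (invented counts): kappa and PABAK coincide.
print(kappa_and_pabak(40, 10, 5, 45))

# Skewed prevalence (invented counts): observed agreement is 90%, yet
# Cohen's kappa is slightly negative while PABAK is 0.8.
k, p = kappa_and_pabak(90, 5, 5, 0)
print(round(k, 3), round(p, 3), interpret(p))
```

The second table shows the kappa paradox mentioned above: when one category dominates, chance agreement is very high and Cohen's kappa collapses, which is why PABAK is preferred when the two statistics disagree.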
Demographic and clinical characteristics
A total of 466 patients were included: 56 were ineligible (no confirmation of diagnosis, n=11; age below 18, n=1) or had missing data (mainly CRP, n=27; entheseal assessment, n=6; or HAQ, n=2; other criteria were missing in 9 patients). Thus, 410 with complete data were analysed (table 2). Of these, 50.7% were male and the mean disease duration was 11.2 years. Disease activity was moderate and the majority were receiving csDMARDs (59.3%) and/or bDMARDs (56.8%). Disease activity was lower in patients in self-defined REM or LDA, supporting validity of the questions applied in the present study (table 2).
Prevalence of REM/LDA according to the different definitions
The most frequent REM status was obtained using the physician single question: 148 (36.1%) patients. cDAPSA (25.6% REM) and both patient-defined REM definitions (single question, 21.5%; or PGA ≤1, 24.4%) were of similar frequency. DAPSA (19.0% REM) and especially VLDA (12.4%) were more stringent.
Low disease activity
This status was frequent, in particular when using the patient single question (43.9%; figure 1). The definition leading least frequently to this status was MDA (25.4%).
Remission + low disease activity
VLDA/MDA was difficult to reach, with only 37.8% in REM/LDA; DAPSA was less limiting, with 58.5% of patients. Patient-perceived REM/LDA and physician-perceived REM/LDA were also less limiting than VLDA/MDA and had similar frequencies (65.4% and 67.6%, respectively).
Of note, 269 (65.6%) patients were in PASS.
Agreement between REM/LDA definitions
Agreements between definitions are shown in table 3.
There was a very high agreement between DAPSA and cDAPSA REM, reflecting the similarity of the two definitions.12 13 The agreement between DAPSA/cDAPSA and VLDA and between PGA≤1 and VLDA, cDAPSA and DAPSA was high; however, the latter may reflect some circularity since PGA is a component of these measures.4–10 The agreement between VLDA/cDAPSA/DAPSA and patient-perceived REM was moderate to good and comparable (table 3).
Low disease activity
Excluding expected high agreement between DAPSA and cDAPSA LDA, agreements were lower for LDA than for REM (table 3).
Agreement between PASS and composite scores was moderate (kappa 0.56 and 0.59 and PABAK 0.33 and 0.58 for VLDA or MDA and DAPSA REM or LDA, respectively; data not shown).
Sensitivity/specificity of different REM/LDA definitions versus the patient’s assessment of status
When patient-perceived REM is used as a reference, the sensitivity of DAPSA-defined REM and VLDA was, respectively, 47.7% and 38.6%, and specificity was, respectively, 88.8% and 94.7% (table 4). Physician-perceived REM was less stringent, thus leading to higher sensitivity but with lower specificity (table 4).
Low disease activity
There were 180 patients in patient-perceived LDA. Of these, 62 (sensitivity, 34.4%) met the MDA criteria, 101 (56.1%) were in DAPSA-LDA and 60 (33.3%) were not in LDA according to any composite score (table 4).
When analysing as outcome, either patient-perceived REM or LDA (ie, the sum of patients in these outcomes), the sensitivity of DAPSA-defined REM/LDA and VLDA/MDA versus patient-perceived status was, respectively, 73.1% and 51.5% (figure 2). Conversely, the specificity for DAPSA-defined REM/LDA and VLDA/MDA was, respectively, 76.8% and 88.0%.
This unique cohort of unselected patients with PsA brings important information on REM/LDA concepts and adds a dimension related to the patient’s perspective. Defining a specific target for REM/LDA is of importance because a treat-to-target approach with either REM or LDA as the target is now recommended in standard care by guidelines for patients with PsA.1 8 We were able to explore patient-perceived and physician-perceived REM/LDA using novel questions. We found that patient-perceived REM/LDA was frequent (65.4%); thus, patient-perceived REM/LDA was similar in terms of prevalence to physician-perceived REM/LDA (67.6%) and to DAPSA-based REM/LDA (58.5%) compared with a lower frequency of MDA/VLDA (37.8%). When comparing patient-perceived status and composite scores, we found neither DAPSA-REM nor VLDA could detect all patients in self-reported REM, although DAPSA performed better (sensitivity 47.7% and 38.6%, respectively). When analysing the status of pooled REM/LDA, agreement with composite scores was moderate to good; sensitivity was low for VLDA/MDA (51.5%) and higher for DAPSA-based cut-offs (73.1%), whereas specificity was high for both scores, although higher using VLDA/MDA (88.0% and 76.8%, respectively). Physician-perceived status appeared too lenient when using a single question, with low agreements with other definitions of REM. Finally, agreements between definitions were moderate for LDA (when analysed alone), indicating the concept of LDA may need further exploration.
This study had strengths and weaknesses. Recruitment occurred in tertiary care centres as reflected by a high percentage of patients under biologics, which may limit external validity. Nevertheless, it is generalisable due to the international large-scale recruitment strategy of consecutive patients with PsA. Furthermore, frequencies of REM/LDA were similar to other studies, which supports the validity of the present findings.14–16 Another difficulty was to choose among many possible definitions of REM/LDA since no consensus exists. The instruments investigated in this study, DAPSA and MDA, are the ones recommended by an international task force to be applied when measuring disease activity in PsA.3 This study brings new information on these instruments. Other possible definitions of REM/LDA provided by other measures41 42 were not assessed, since they did not obtain a majority vote in the treat-to-target recommendations which were developed by a large international task force.3 However, further research may explore such other instruments.
The scores were calculated using a wording for PGA, referring to disease activity and referring more to joints than skin; however, the results were overall similar when performing the analyses with patient global questions referring to either joints or skin. It is noteworthy that missing data were low (<15%) even though no queries were sent to the investigators, which supports the feasibility of these scores in clinical practice. A potential weakness is the use of non-validated single questions to explore patient-perceived and physician-perceived REM/LDA. It was not possible to use consensual questions since none exist. Thus, questions were developed for the purpose of this study. Of note, great attention was paid to their elaboration process by involving patient research partners to ensure face and content validity, while physician-perceived REM/LDA questions were developed by the steering committee. Thus, these questions were developed with relevant input and support the REM/LDA concepts. However, they reflect more PsA concepts than skin psoriasis concepts—this ought to be taken into account when interpreting the study. It should also be recognised that the present population had limited skin involvement, as is often the case in patients with PsA seen in rheumatology clinics.43 The results may differ in patients with more severe skin disease, for example, patients with PsA seen predominantly in dermatology offices or in patients with less well-controlled disease.
This study focused on patient-perceived REM/LDA. Patients defined themselves as in REM/LDA in around 65% of cases (figure 1). This is encouraging in terms of the overall disease burden of PsA44 and should be interpreted in the context that many of our study patients were receiving biologics. These results are in line with recent efforts to identify patients’ priorities.13 45 Interestingly, similar frequencies of low activity were found using REM/LDA questions and the PASS single question; this does not mean we suggest a PASS should be used as treatment target though; this criterion was used as grounding element only. Patient-perceived status refers to the disease process but also to patient expectations.23 Considering recruitment occurred in 14 countries for the present study, it is interesting to note that patient status was self-reported as satisfactory so often, since recent data have indicated high patient expectations in countries with higher gross domestic product.46 Such notions should be further explored.
When considering REM as the treatment target, we found composite scores to be only moderately in agreement with the patient perspective. In particular, 48.8% of patients in self-reported REM were not in VLDA or DAPSA-based REM, and 33.3% of those in self-reported LDA were not considered in LDA by composite measures. These figures lead to low sensitivities of composite scores to detect patient-defined REM, although DAPSA performed better than VLDA in this respect. Concordance was higher when pooling REM and LDA concepts. This may indicate limits of the composite scores to perfectly distinguish REM from LDA, and/or difficulties for patients to distinguish these states. LDA may be a personal concept and is more likely to carry different meanings for different people depending on their disease phenotype. Another explanation is that patients’ and physicians’ opinions on REM/LDA may differ and that composite measures may not entirely consider patients’ priorities.13 47 Patients probably do not only refer to disease activity when considering the concept of REM; thus, some discordance is expected. It would be interesting to further investigate the connection between achieving different disease activity states and long-term prognosis.
In the present study, physician-perceived REM/LDA was explored using designated specific questions. We found that physicians defined patients as in REM much more often than composite scores or patients themselves. This indicates physicians’ expectations for REM may be low, as has been previously suggested.19–21 23 47
Cross-tabulation of patient-perceived and physician-perceived REM/LDA is a novel contribution of our work. Agreement between patient-perceived and physician-perceived REM was not high, and as stated physicians were more lenient in defining REM. However, the tendency was reversed for LDA: the frequency of patient-perceived LDA was 43.9% vs 31.5% for physician-perceived LDA. Perhaps the concept of LDA needs to be further defined with both patients and physicians. The considerably higher agreement and concordance of patient-perceived REM/LDA with composite REM/LDA definitions than with physician-perceived REM/LDA confirms that physicians should not base medical decisions on their global assessment/gestalt (as this may underestimate disease activity) but use validated scores instead.48
In the present study, we confirmed that the frequency of REM and LDA was very variable according to the definition used, and in particular REM and LDA were more difficult to reach using VLDA/MDA than DAPSA-based cut-offs, as has been previously reported.14–16 This may be because of the inclusion of diverse domains of PsA (and in particular skin involvement), or because of low cut-offs for each measure. The psychometric properties of VLDA/MDA with Boolean features also make them more strict.38 49 Concerning agreements between these scores, kappas were also similar to the literature, with moderate agreement for REM but fair for LDA whatever the definition used.14 15
An original feature of our study was to cross-tabulate these composite measures with the patient’s perspective as an external anchor. Providing data to support the use of one measure over another is of great importance since there is no consensus on what measure should be used in PsA. Kappa agreements were moderate to good for both of the scores and did not allow us to conclude. However, the comparison of these scores against patient-defined status, performed here for the first time, was very informative. We found that more patients in patient-perceived good status were also in DAPSA-based good status, for REM, LDA and the combination. Of note, we advocate that REM should be the treatment goal, in accordance with recommendations; however, the exploration of REM/LDA was also valuable.1–4 In our study, patient-perceived REM/LDA occurred slightly more frequently than DAPSA-based definitions, with VLDA/MDA being rarer. DAPSA-based REM or REM/LDA had much higher sensitivity than VLDA/MDA against the reference of the patient-defined status, with only a slight loss of specificity. This means that DAPSA-based definitions correctly ‘detected’ many more patients in patient-defined REM or REM/LDA than VLDA/MDA. However, there were slightly more patients in DAPSA-based good status who did not report themselves in good status than among patients in VLDA/MDA (as illustrated for REM/LDA in figure 2). Thus, each of these scores has different strengths depending on whether the objective is sensitivity (ie, to detect patient-defined good status: here DAPSA performed better) or specificity (ie, to avoid overdetecting patients who did not self-report as doing well: here, VLDA/MDA performed better). Overall, however, DAPSA-based cut-offs seemed to align better with the patient’s perspective. These results suggest that DAPSA-based status is closer to patients’ expectations than VLDA/MDA.
In conclusion, this international study of PsA disease activity highlights several important concepts regarding REM and LDA, including the aspect of truthfulness of the measures evaluated. Further studies of patients’ expectations and studies demonstrating the prognostic value of different disease states/definitions for long-term outcomes are needed to inform treatment targets.
We wish to thank all the patients who participated in the study and the medical staff of all participating centres (in particular K Fedorov, Germany). We gratefully acknowledge the help of our patient research partners: Heidi Bertheussen (Norway), Laurence Carton (France) and Jim Walker (Scotland).
Copyright © 2019 BMJ Publishing Group Ltd & European League Against Rheumatism. Medical professionals may print copies for their and their patients and students non commercial use. Other individuals may print a single copy for their personal, non commercial use. For other uses please contact our Rights and Licensing Team.
Handling editor David S Pisetsky
Contributors All authors except MdW were responsible for acquisition of data. CG, DP-Z, A-MO, LCC, JSS, MdW and LG contributed to study conception and design and data analysis. All authors contributed to data interpretation. CG and LG take responsibility for the integrity of the data and the accuracy of the data analysis. All authors were involved in drafting the article or revising it critically for important intellectual content, and approved the final version to be submitted for publication.
Funding The study received financial support from Pfizer through an unrestricted research grant. The fellow CG was additionally supported by a master’s bursary from Societe Francaise de Rhumatologie. LCC is funded by a National Institute for Health Research Clinician Scientist Award. Her research was supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC). A-MO is a Jerome L Greene Foundation Scholar and is supported in part by a research grant from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) of the National Institutes of Health (NIH) under award number P30-AR070254 (Core B), a Rheumatology Research Foundation Scientist Development Award, and a Staurulakis Family Discovery Award.
Disclaimer All the statements in this report including its conclusions are the opinions of the authors and do not necessarily reflect those of NIH or NIAMS or the Foundation. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
Competing interests None declared.
Patient consent Obtained.
Ethics approval Ethics approval was sought and obtained in each country or centre.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The dataset of ReFlap study is available upon request from LG.