Use of non-steroidal anti-inflammatory drugs and risk of death from COVID-19: an OpenSAFELY cohort analysis based on two cohorts
  1. Angel YS Wong1,
  2. Brian MacKenna2,
  3. Caroline E Morton2,
  4. Anna Schultze1,
  5. Alex J Walker2,
  6. Krishnan Bhaskaran1,
  7. Jeremy P Brown1,
  8. Christopher T Rentsch1,
  9. Elizabeth Williamson1,
  10. Henry Drysdale2,
  11. Richard Croker2,
  12. Seb Bacon2,
  13. William Hulme2,
  14. Chris Bates3,
  15. Helen J Curtis2,
  16. Amir Mehrkar2,
  17. David Evans2,
  18. Peter Inglesby2,
  19. Jonathan Cockburn3,
  20. Helen I McDonald1,
  21. Laurie Tomlinson1,
  22. Rohini Mathur1,
  23. Kevin Wing1,
  24. Harriet Forbes1,
  25. Rosalind M Eggo1,
  26. John Parry3,
  27. Frank Hester3,
  28. Sam Harper3,
  29. Stephen JW Evans1,
  30. Liam Smeeth1,
  31. Ian J Douglas1,
  32. Ben Goldacre2
  33. The OpenSAFELY Collaborative
  1. 1Department of Non-Communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
  2. 2The DataLab, Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, Oxfordshire, UK
  3. 3TPP, Leeds, UK
  1. Correspondence to Dr Angel YS Wong, London School of Hygiene & Tropical Medicine, London, UK; angel.wong{at}

Abstract


Objectives To assess the association between routinely prescribed non-steroidal anti-inflammatory drugs (NSAIDs) and deaths from COVID-19 using OpenSAFELY, a secure analytical platform.

Methods We conducted two cohort studies from 1 March to 14 June 2020. Working on behalf of National Health Service England, we used routine clinical data in England linked to death data. In study 1, we identified people with an NSAID prescription in the last 3 years from the general population. In study 2, we identified people with rheumatoid arthritis/osteoarthritis. We defined exposure as current NSAID prescription within the 4 months before 1 March 2020. We used Cox regression to estimate HRs for COVID-19 related death in people currently prescribed NSAIDs, compared with those not currently prescribed NSAIDs, accounting for age, sex, comorbidities, other medications and geographical region.

Results In study 1, we included 536 423 current NSAID users and 1 927 284 non-users in the general population. We observed no evidence of difference in risk of COVID-19 related death associated with current use (HR 0.96, 95% CI 0.80 to 1.14) in the multivariable-adjusted model. In study 2, we included 1 708 781 people with rheumatoid arthritis/osteoarthritis, of whom 175 495 (10%) were current NSAID users. In the multivariable-adjusted model, we observed a lower risk of COVID-19 related death (HR 0.78, 95% CI 0.64 to 0.94) associated with current use of NSAID versus non-use.

Conclusions We found no evidence of a harmful effect of routinely prescribed NSAIDs on COVID-19 related deaths. Risks of COVID-19 do not need to influence decisions about the routine therapeutic use of NSAIDs.

  • arthritis
  • rheumatoid
  • COVID-19
  • epidemiology
  • osteoarthritis

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:

Key messages

What is already known about this subject?

  • There have been concerns that non-steroidal anti-inflammatory drugs (NSAIDs) may increase the risk of COVID-19 disease. Recent observational studies reported no evidence of a harmful effect of NSAID use on COVID-19 severity among patients with COVID-19.

  • However, most studies were of much smaller sample size, not general population based or did not specifically investigate individual NSAIDs (eg, naproxen and ibuprofen).

  • In addition, limited clinical data are available to advise patients using long-term NSAID treatment (including people with rheumatoid arthritis and osteoarthritis) whether the treatment should be continued or stopped in the context of the COVID-19 pandemic.

What does this study add?

  • We identified two study populations (2 463 707 people who ever used NSAIDs in the past 3 years from the general population and 1 708 781 people with rheumatoid arthritis/osteoarthritis) in England using the OpenSAFELY platform. We then grouped each study population into current users and non-users.

  • In both populations, no association between NSAIDs and COVID-19 related death was found.

How might this impact on clinical practice or future developments?

  • This study does not support the hypothesis of any harmful effect of NSAIDs on COVID-19 related deaths among regular NSAID users.

  • Treatment decisions about the routine use of NSAIDs do not need to be influenced by fears of an effect on COVID-19 outcomes.

Introduction


COVID-19, caused by SARS-CoV-2, has been diagnosed in approximately 18 million patients with >690 000 deaths in >200 countries as of 5 August 2020.1

Non-steroidal anti-inflammatory drugs (NSAIDs) are widely prescribed for relief of pain and inflammation, with nearly 11 million NSAID prescriptions dispensed in primary care in England in the last 12 months.2 Additionally, some NSAIDs (eg, ibuprofen and aspirin) are available for sale without a prescription, with a single brand of ibuprofen alone having sales of approximately £100 million per annum.3 Nine non-interventional studies have suggested that NSAIDs may be associated with increased risk of complications of lower respiratory tract infections,4–12 although a single animal study reported that indometacin may have protective antiviral effects.13

There is now a debate over whether NSAIDs may worsen the prognosis of COVID-19. On 14 March, it was recommended in France that patients should avoid NSAID use due to an apparent worsening of COVID-19 in those taking ibuprofen, based on unpublished reports.14 This gained worldwide attention and resulted in the National Health Service (NHS) England medical director issuing a directive that paracetamol should be used in preference to NSAIDs14 for symptoms of COVID-19. Subsequent reviews by USA, UK and EU drug regulators15–17 recommended that individuals currently using NSAIDs for the management of chronic diseases should continue the treatment while calling for more evidence of the impact of NSAIDs in patients with COVID-19. Two systematic reviews highlighted a lack of studies investigating the effect of NSAIDs on COVID-19, demonstrating the urgent need for new studies.18 19 One cohort study was recently conducted to investigate this association, but individual NSAIDs were not specifically investigated.20

We therefore investigated the association between NSAID use and deaths from COVID-19 using linked data from >17 million patients in England. We further examined whether the association varied by types of NSAID.

Methods


Study design

We conducted two cohort studies using primary care electronic health record data linked to death data from the Office for National Statistics between 1 March 2020 and 14 June 2020.

Data source

Primary care records managed by the software provider The Phoenix Partnership (TPP) were linked to Office for National Statistics death data through OpenSAFELY, a data analytics platform created by our team on behalf of NHS England.21 The dataset analysed within OpenSAFELY is based on 24 million people currently registered with primary care practices using The Phoenix Partnership SystmOne software, representing 40% of the English population. It includes pseudonymised data such as coded diagnoses, prescribed medications and physiological parameters.

Study populations

We identified two cohorts, anticipating that underlying factors influencing NSAID use and therefore potential biases would differ between them. The first cohort was all people with ≥1 oral NSAID prescription within the 3 years before study start (1 March 2020), identified from the general population. It was chosen to minimise confounding by restricting to people who were currently prescribed NSAIDs and those who recently stopped NSAIDs, as their characteristics were likely to be more comparable than those of never-users. The second cohort was all people with a diagnosis of rheumatoid arthritis (RA)/osteoarthritis (OA) before study start. It was chosen because its members were potential NSAID users with similar underlying diseases, which reduces confounding by indication. A patient could be included in both cohorts.

In both cohorts, we excluded people with missing data for gender or index of multiple deprivation, with <1 year of primary care records, or aged <18 or >110 years. Aspirin is used at lower doses as an antiplatelet to prevent cardiovascular disease,22 indicating that aspirin users constitute a different population from other NSAID users. We therefore excluded people prescribed aspirin in the 10 years before study start or with a record of either stroke or myocardial infarction before study start. We also excluded people with a record of gastrointestinal bleeding or current asthma before study start, as these are contraindications to NSAIDs.22
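
The exclusion rules above can be read as a single eligibility filter. The following Python sketch is illustrative only: the Person record and all of its field names are hypothetical and do not reflect the OpenSAFELY data model or the study's actual code, which was written in Python, SQL and Stata.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # All field names are hypothetical, chosen for this illustration only.
    age: int
    sex: Optional[str]            # None means missing
    imd: Optional[int]            # index of multiple deprivation; None means missing
    years_registered: float       # length of primary care record before study start
    aspirin_last_10y: bool
    prior_stroke_or_mi: bool
    prior_gi_bleed: bool
    current_asthma: bool


def is_eligible(p: Person) -> bool:
    """Apply the exclusion criteria described in the text, in order."""
    if p.sex is None or p.imd is None:
        return False              # missing demographic data
    if p.years_registered < 1:
        return False              # <1 year of primary care records
    if p.age < 18 or p.age > 110:
        return False              # outside the adult age range used
    if p.aspirin_last_10y or p.prior_stroke_or_mi:
        return False              # likely antiplatelet aspirin population
    if p.prior_gi_bleed or p.current_asthma:
        return False              # contraindications to NSAIDs
    return True
```

Keeping each criterion as a separate early return makes it straightforward to count how many people each rule removes, as a flow chart of inclusion would require.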


In the main analysis, we defined current NSAID users as those prescribed an NSAID in the 4 months before study start, and non-users as those with no record of an NSAID prescription in the same period.
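
As a minimal sketch of this exposure definition, the function below classifies a person from a list of prescription dates. It assumes that "4 months" means calendar months (so the window opens on 1 November 2019) and that the window excludes the study start date itself; neither detail is stated in the text, and the function name is hypothetical.

```python
from datetime import date

STUDY_START = date(2020, 3, 1)
# Assumption: "4 months before study start" is read as calendar months.
WINDOW_START = date(2019, 11, 1)


def is_current_user(prescription_dates):
    """True if any NSAID prescription falls in [WINDOW_START, STUDY_START)."""
    return any(WINDOW_START <= d < STUDY_START for d in prescription_dates)
```

For example, a person whose only prescription was issued on 15 January 2020 would be a current user, whereas a person last prescribed an NSAID in June 2019 would be a non-user.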

We examined whether the association varied by types of NSAID, specifically: (1) naproxen dose (categorised as non-use, high-dose naproxen (500 mg), low-dose naproxen (250 mg) and other NSAIDs based on the strength of the formulation), (2) COX-2 specific NSAIDs (categorised as non-use, COX-2 specific (celecoxib/etoricoxib) and non-specific NSAIDs) and (3) ibuprofen (categorised as non-use, ibuprofen and other NSAIDs).


Follow-up for each cohort began on 1 March 2020 and ended on either the date of death or the study end date (14 June 2020), whichever came first. If people in the non-user group received an NSAID prescription after 1 March 2020, they were censored at the date of this prescription (online supplemental figure S1).
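
The follow-up rules can be sketched as a function that returns the time at risk and an event indicator for one person. This is an illustration under stated assumptions, not the study code: the function and parameter names are hypothetical, time is measured in whole days, and a death from another cause ends follow-up without counting as an event.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_START = date(2020, 3, 1)
STUDY_END = date(2020, 6, 14)


def follow_up(current_user: bool,
              death_date: Optional[date] = None,
              covid_death: bool = False,
              new_nsaid_date: Optional[date] = None) -> Tuple[int, bool]:
    """Return (days at risk, COVID-19 related death observed)."""
    end, event = STUDY_END, False
    # Death within the study period ends follow-up; only COVID-19
    # related deaths count as the event.
    if death_date is not None and death_date <= STUDY_END:
        end, event = death_date, covid_death
    # Non-users are censored at a new NSAID prescription after study start.
    if (not current_user and new_nsaid_date is not None
            and STUDY_START < new_nsaid_date < end):
        end, event = new_nsaid_date, False
    return (end - STUDY_START).days, event
```

A non-user first prescribed an NSAID on 1 April 2020 therefore contributes 31 days and no event, even if they later die of COVID-19.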

The outcome was COVID-19 related death as registered in Office for National Statistics data using International Classification of Diseases (ICD)-10 codes U07.1 (‘COVID-19, virus identified’) and U07.2 (‘COVID-19, virus not identified’) listed either as the underlying or any contributing cause of death. The latter ICD-10 code is used when laboratory testing is inconclusive or unavailable.23
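
The outcome definition amounts to checking every cause-of-death field for either of the two ICD-10 codes. A minimal sketch follows; the function names and the normalisation of codes (removing the dot and ignoring case) are assumptions for illustration, since the format of codes in the death data is not described here.

```python
COVID_CODES = {"U071", "U072"}  # U07.1 and U07.2 with the dot removed


def normalise(code: str) -> str:
    """Strip whitespace and dots so 'U07.1' and 'u071' compare equal."""
    return code.replace(".", "").strip().upper()


def is_covid_related_death(underlying, contributing=()):
    """True if U07.1 or U07.2 appears as the underlying or any contributing cause."""
    codes = [underlying, *contributing]
    return any(normalise(c) in COVID_CODES for c in codes if c)
```

Under this reading, a death with pneumonia as the underlying cause and U07.1 as a contributing cause counts as COVID-19 related.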


Figure 1 presents the final list of potential confounders. Our methodology for creating codelists for variables has been previously described.21 All codelists for identifying exposures, covariates and outcomes are openly shared at

Figure 1

Prespecified hypothetical confounders. A&E, accident & emergency; DMARD, disease-modifying antirheumatic drugs; HbA1c, hemoglobin A1c; GP, general practice.

Statistical methods

Baseline characteristics in each cohort were summarised using descriptive statistics, stratified by exposure status. Time to COVID-19 related death was displayed in Kaplan-Meier plots. We present adjusted cumulative mortality curves and the difference between curves using the Royston-Parmar model. We estimated HRs with 95% CIs for the association between current NSAID use and COVID-19 related death using Cox regression with time since cohort entry as the underlying timescale. We accounted for competing risk by modelling the cause-specific hazard (ie, censoring non-COVID-19 deaths). We used graphical methods and tests based on Schoenfeld residuals to explore violations of the proportional hazards assumption.
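
To illustrate how censoring non-COVID-19 deaths enters a time-to-event estimate, the sketch below implements a plain Kaplan-Meier estimator in which the event flag is True only for COVID-19 related deaths; deaths from other causes, end of study and treatment switches are all passed as False. It is a teaching example with a hypothetical function name, not the Stata routines used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  : follow-up time for each person
    events : True for the event of interest, False for any kind of censoring
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Group everyone whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # People censored at t were still at risk at t, so remove them afterwards.
        n_at_risk -= deaths + censored
    return curve
```

One caveat of this design: one minus the Kaplan-Meier estimate treats competing deaths as non-informative censoring, so it should be read as a cause-specific quantity rather than a real-world cumulative incidence.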

Unadjusted models, models adjusted for age (using restricted cubic splines) and sex and multivariable-adjusted models including covariates listed in figure 1 were fitted. We stratified the multivariable-adjusted models by geographical regions, defined by Sustainability and Transformation Partnerships,24 to account for between-region variations. We evaluated the variation by age (under 70 and 70+ years old) and performed likelihood ratio tests to analyse effect modification.
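
A likelihood ratio test for effect modification compares the log-likelihood of a model with an exposure-by-age interaction against the same model without it. The sketch below assumes a single interaction term (one degree of freedom, as for a binary exposure and a binary age group); the function name is hypothetical and any log-likelihood values supplied to it would come from fitted models, which are not reproduced here.

```python
import math


def lr_test_1df(loglik_reduced: float, loglik_full: float):
    """Likelihood ratio statistic and p-value for one extra parameter.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A statistic of about 3.84 corresponds to p = 0.05 on one degree of freedom; smaller statistics give no evidence that the HR differs between age groups.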

Quantitative bias analysis

We used e-value formulae to calculate the minimum necessary strengths of association between an unmeasured confounder and exposure or outcome, conditional on measured covariates, to fully explain observed non-null adjusted associations (ie, to move the observed non-null association to the null).25
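
As a worked illustration, the standard E-value formula for a risk ratio RR is RR + sqrt(RR × (RR − 1)), applied to 1/RR when the estimate is below 1. The sketch below applies it to the multivariable-adjusted HR of 0.78 reported for the RA/OA population, on the assumption that the HR approximates a risk ratio because the outcome is rare. The function name is hypothetical, and because the inputs are rounded published values, results need not match the published figures to two decimal places.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio (or a hazard ratio for a rare outcome)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))


point_estimate = e_value(0.78)  # approximately 1.88
```

An estimate exactly at the null (RR = 1) gives an E-value of 1, meaning no unmeasured confounding is needed to explain it.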

Sensitivity analyses

Table 1 shows the list of sensitivity analyses.

Table 1

List of sensitivity analyses

Software and reproducibility

Data management was performed using Python V.3.8 and SQL, with analysis carried out using Stata V.16.1. All study analyses were preplanned unless otherwise stated. All code for data management and analyses in addition to the prespecified protocol are archived at:

Patient and public involvement

Patients were not formally involved in developing this specific study design that was developed rapidly in the context of a global health emergency. We have developed a publicly available website through which we invite any patient or member of the public to contact us regarding this study.

Results


Online supplemental figure S3 shows the flow chart of inclusion of participants. A total of 561 027 (13%) individuals were included in both study populations. Of them, 175 495 (25%) were current NSAID users and 385 532 (11%) were non-users.

Main analysis

Study population 1: general population

Patient characteristics

We included 536 423 current NSAID users and 1 927 284 non-users (table 2). Median age was 53 years (IQR 42–64) among current users and 49 years (IQR 36–60) among non-users. More women were current users (59.2%) than non-users (56.7%).

Table 2

Demographic and clinical characteristics

Current users were more likely to be obese, former smokers and have a medical history of hypertension, diabetes, other respiratory diseases, cancer, chronic kidney disease, OA and RA than non-users. Current users were also more likely to have a prescription for statins, proton pump inhibitors and disease-modifying antirheumatic drugs and to have had more primary care consultations and vaccinations than non-users.

Unadjusted and multivariable results

Online supplemental figures S4 and S5 present time to COVID-19 related death in Kaplan-Meier plots and adjusted cumulative mortality plots. We identified 832 COVID-19 related deaths in the general population (online supplemental table S1). The unadjusted HR for current NSAID use, compared with non-use, was 1.26 (95% CI 1.08 to 1.47) (figure 2). In the multivariable-adjusted model, we observed no evidence of difference in risk (HR 0.96, 95% CI 0.80 to 1.14). There was no evidence suggesting that the HR differed by age in all adjusted models (online supplemental table S2). We did not detect deviations from the proportional hazards assumption (online supplemental table S3 and figure S6).

Figure 2

HRs of the association between current use of NSAIDs and COVID-19 related deaths in the general population. NSAIDs, non-steroidal anti-inflammatory drugs.

Study population 2: RA/OA population

Patient characteristics

We included 175 495 current NSAID users and 1 533 286 non-users (table 2). A higher proportion of people aged 70+ years were included in this population than the general population. Median age was 63 years (IQR 55–71) among current users and 68 years (IQR 58–76) among non-users. Relative to current users, non-users were older at study start date. Approximately 60% of individuals were women in both groups.

Current users were more likely to be obese, more deprived, former/current smokers and to have had more primary care consultations and a prescription for proton-pump inhibitors and disease-modifying antirheumatic drugs than non-users. However, non-users were more likely to have comorbidities than current users.

Unadjusted and multivariable results

Online supplemental figures S7 and S8 present time to COVID-19 related death in Kaplan-Meier plots and adjusted cumulative mortality curves, respectively. We identified 2573 COVID-19 related deaths in the RA/OA population (online supplemental table S1). The unadjusted HR for current use was 0.43 (95% CI 0.36 to 0.52), compared with non-use (figure 3). In the multivariable model, we observed a lower risk of COVID-19 related death associated with current use (HR 0.78, 95% CI 0.64 to 0.94). Post hoc analyses, after adjustment for age and sex, showed most variables had minimal impact, though adjustment for PPI moved the estimate away from the null (online supplemental table S4). There was no evidence suggesting that HR differed by age in all adjusted models. We did not detect deviations from the proportional hazards assumption (online supplemental table S3 and figure S9).

Figure 3

HRs of the association between current use of NSAIDs and COVID-19 related deaths in the rheumatoid arthritis or osteoarthritis population. NSAIDs, non-steroidal anti-inflammatory drugs.
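
To make the reading of these intervals concrete, the sketch below back-calculates an approximate standard error on the log scale from a published HR and 95% CI, and from it a Wald z statistic and two-sided p-value. This is a rough reconstruction from rounded figures under a normal approximation, not a calculation reported by the authors; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_se_from_ci(lower: float, upper: float) -> float:
    """Approximate SE of log(HR) from the limits of a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def wald_test(hr: float, lower: float, upper: float):
    """Return (z, two-sided p) for the null hypothesis HR = 1."""
    z = math.log(hr) / log_se_from_ci(lower, upper)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z_raoa, p_raoa = wald_test(0.78, 0.64, 0.94)        # RA/OA population
z_general, p_general = wald_test(0.96, 0.80, 1.14)  # general population
```

With these inputs the RA/OA estimate lies more than 1.96 standard errors below the null, whereas the general population estimate does not, matching the intervals that respectively exclude and include 1.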

Analyses investigating different types of NSAIDs

Online supplemental tables S5–S10 present the baseline characteristics, stratified by different types of NSAIDs. Online supplemental figures S10 and S11 present time to COVID-19 related deaths by types of NSAIDs in Kaplan-Meier plots. There was no evidence that the association with COVID-19 death varied by: (1) naproxen dose, (2) COX-2 specific status and (3) ibuprofen versus other NSAIDs in either study population (figures 2 and 3 and online supplemental tables S11–S13).

Sensitivity analyses

After we excluded people who were ever prescribed aspirin, we observed no difference in risk of COVID-19 related death associated with current use compared with non-use (HR 0.84, 95% CI 0.69 to 1.02) in the RA/OA population (online supplemental table S14). In the post hoc analysis, when we used a directed acyclic graph (DAG) approach to select covariates, we observed a marginally decreased risk of COVID-19 related death in the complete case analysis, additionally adjusted for ethnicity (HR 0.79, 95% CI 0.64 to 0.99) (online supplemental table S15). The results of all other sensitivity analyses were broadly similar to those of the main analyses (online supplemental tables S16–S21).

Quantitative bias analysis

To fully explain the multivariable-adjusted HR (0.78) or the upper bound of the 95% CI (0.94) in the RA/OA population, an unmeasured confounder would need to be associated (conditional on measured covariates) with either non-use (relative to current use) or COVID-19 mortality by a risk ratio (RR) of at least 1.88 (effect estimate) or 1.29 (upper bound), and with both non-use and COVID-19 mortality by an RR of at least 1.28 (effect estimate) or 1.06 (upper bound) (online supplemental figure S12).

Discussion



Based on routinely collected data, our study showed no overall increased risk of COVID-19 related death associated with current NSAID use in adults, compared with non-use. This was consistently seen across all analyses.

In this study, we used two different populations to explore the potential impact of confounding. Current users were generally older and had more comorbidities than non-users in the general population cohort. As expected, this was associated with an increased risk of COVID-19 related death in current users compared with non-users in the unadjusted model. In contrast, current NSAID users were younger and had fewer comorbidities than non-users in the RA/OA population, associated with a decreased risk of COVID-19 related death in the unadjusted model. Notably, both associations were largely removed on adjustment for age. We observed a small decreased risk of COVID-19 related death among current users in the RA/OA population but not in the general population in the multivariable-adjusted models. In a post hoc analysis informed by a DAG that captures the complexity of relationships between variables, this protective effect was somewhat attenuated, suggesting it is not a robust finding and is subject to model variable selection. Moreover, our main analysis in the RA/OA population might also be subject to residual confounding. As demonstrated in quantitative bias analysis, an unmeasured confounder of only moderate strength could potentially fully explain this observed association. As we consistently found no evidence of harmful effect of NSAIDs on COVID-19 related death, using two populations provides a useful context for result interpretation.

Findings in context

It was postulated that NSAIDs might delay diagnosis and thus clinical care by masking the symptoms of a worsening infection.4 8–10 26 In vivo and in vitro cellular studies show that NSAIDs weaken the immune response to pathogens by limiting the local recruitment of innate immune cells and reducing antibody synthesis, but the immunomodulatory effects of NSAIDs are not fully understood.27 28 Notably, these proposed mechanisms are not specific to COVID-19. Recently, it has been suggested that ibuprofen upregulates ACE 2,29 which has a role in binding SARS-CoV-2 to target cells and could increase the risk of developing severe COVID-19 disease through this route.30 Some animal studies reported that administration of soluble recombinant ACE 2 might alleviate lung injury in people with respiratory infection.31 32 It remains unknown whether the findings can be generalised to humans.

In line with our results, five observational studies reported no evidence of a harmful effect of NSAID use on COVID-19 severity among patients with COVID-1933–36 but most were of much smaller sample size and not all were general population based, limiting generalisability.34 A case–control study that investigated the association between renin–angiotensin–aldosterone system blockers and COVID-19 diagnosis found no association between NSAIDs and COVID-19 diagnosis.37 In contrast, a US cohort study reported a lower odds of mortality associated with NSAID use prior to hospitalisation among patients with COVID-19 (adjusted OR 0.56, 95% CI 0.40 to 0.82).38 However, patient characteristics, stratified by NSAID exposure and the covariates adjusted for, were not clear. A recent cohort study demonstrated that NSAIDs were not associated with 30-day mortality or other severe COVID-19 outcomes in Danish people who tested positive for SARS-CoV-2.20 This study was well conducted with robust methodology and of large sample size but it might still be subject to potential issues around selective testing for COVID-19. Furthermore, specific types of NSAIDs were not explored in the analyses, limiting the interpretation of the results.

Notably, we assessed exposure as NSAID use prior to the outbreak in England to establish who were current users, but we did not evaluate any potential therapeutic role of NSAIDs to treat patients with COVID-19. While our study mainly focused on current NSAID use for routine clinical care, there are some ongoing clinical trials investigating the role of NSAIDs in management of COVID-19. They are due to complete later this year or next year (NCT04325633,39 NCT04382768,40 NCT04334629,41 and NCT04344457).42

Strengths and limitations

The greatest strength of this study was the power we had to examine the association between NSAIDs and COVID-19 death, particularly on types of NSAID as our dataset included medical records from 24 million individuals. We also used two different study populations for comparisons to understand the impact of confounding by indication. The breadth of data available in primary care allows us to account for a wide range of potential confounders. We prespecified our analysis plan and have openly shared all analytical code.

We recognise possible limitations. First, we do not know whether patients truly took the medications as prescribed. Second, the supply of NSAIDs ‘over the counter’ is not captured. However, ‘over the counter’ purchases are likely to be for ibuprofen, used for acute, irregular conditions and may mean some non-users were in fact taking ibuprofen. This would tend to bias results towards the null. However, this is unlikely to impact the result in the RA/OA population as GPs in England prescribe NSAIDs for long-term conditions such as RA/OA.43 In our study, information on indications is not readily available; therefore, we cannot distinguish whether the NSAID use was for long-term or short-term conditions for further investigation. Notably, our results from the RA/OA population can be generalised to long-term NSAID users as these people receive prescriptions regularly to manage their medical condition. Additionally, we do not capture all additional medicines commonly used in the treatment of RA. In England, a small number of medicines for long-term conditions are supplied routinely by hospitals directly to patients.44 This includes biological treatments such as adalimumab and infliximab, and we have advocated for the release of these data but access remains restricted.45 46 Access to these data is important, as biological treatments might be preferentially prescribed in patients with more comorbidities, resulting in unmeasured confounding in our RA/OA population. Notably, our outcome reflected the probability of both COVID-19 infection and, once infected, COVID-19 mortality. If there was a strong harmful effect of NSAIDs on either of these endpoints, we would have observed a higher hazard of COVID-19 mortality among current users compared with non-users. However, we acknowledge that behavioural differences between our comparison groups may have led to a difference in the risk of infection, for example, if the NSAID exposed group were more risk avoidant. This could have attenuated any increased risk of harmful outcomes if differences in risk behaviour were substantial.

Conclusion


We found no evidence of a harmful effect of routinely prescribed NSAIDs on COVID-19 related death. People currently prescribed NSAIDs for their long-term conditions should continue their treatment as part of their routine care.

Information governance

NHS England is the data controller; TPP is the data processor; and the key researchers on OpenSAFELY are acting on behalf of NHS England. This implementation of OpenSAFELY is hosted within the TPP environment, which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant47 48; patient data have been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network connection, restricted to a small group of researchers, their specific machine and IP address; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts.49 The OpenSAFELY research platform adheres to the data protection principles of the UK Data Protection Act 2018 and the EU General Data Protection Regulation 2016. In March 2020, the Secretary of State for Health and Social Care used powers under the UK Health Service (Control of Patient Information) Regulations 2002 to require organisations to process confidential patient information for the purposes of protecting public health, providing healthcare services to the public and monitoring and managing the COVID-19 outbreak and incidents of exposure.50 Taken together, these provide the legal bases to link patient datasets on the OpenSAFELY platform. General practices (GP), from which the primary care data are obtained, are required to share relevant health information to support the public health response to the pandemic and have been informed of the OpenSAFELY analytics platform.


We are very grateful for all the support received from The Phoenix Partnership (TPP) Technical Operations team throughout this work and for generous assistance from the information governance and database teams at National Health Service England/NHSX.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Handling editor Josef S Smolen

  • AYW, BM and CEM contributed equally.

  • Contributors BG is the guarantor. LS, BG and ID contributed to conceptualisation of the study. CB, JP, JC, SH, SB, DE, PI and CM contributed to data curation. AYSW, BM, CM and JB did the formal analysis. BG and LS acquired funding for the study. AM, BG, CB and JP were responsible for work relating to information governance. ID, AYSW, AS, LT, KW, KB, CR, EW, SE, LS, JB, CM, AJW, BM, SB and BG contributed to the study methods. BM, CM, AJW, RC, AS, CR, PI, SB, DE, CB, JC, JP, SH, HD, HC, KB, SB, AM, LT, ID, HM, RM, HF and RE contributed to disease category conceptualisation and codelists. HC, EW, LS and BG acquired ethics approval for this study. AYSW, BM, CM, AS, AJW, CR, WH, CB, SB, AM, LS and BG contributed to project administration. BG and LS acquired resources. SB, DE, PI, AJW, CM, CB, FH, JC and SH created and maintained software. ID, LS and BG supervised the study. AYSW, JB and KB did the visualisation. AYSW, BM, CM, ID and JB wrote the original manuscript draft. All authors contributed to reviewing and editing of the manuscript. All authors were involved in design and conceptual development and reviewed and approved the final manuscript. ID and BG are joint principal investigators.

  • Funding TPP provided technical expertise and infrastructure within their data centre pro bono in the context of a national emergency. BG’s work on better use of data in healthcare more broadly is currently funded in part by: National Institute for Health Research (NIHR) Oxford Biomedical Research Centre, NIHR Applied Research Collaboration Oxford and Thames Valley, the Mohn-Westlake Foundation, NHS England and the Health Foundation; all DataLab staff are supported by BG’s grants on this work. LS reports grants from Wellcome, MRC, NIHR, UKRI, British Council, GlaxoSmithKline, British Heart Foundation, and Diabetes UK outside this work. AYSW holds a fellowship from British Heart Foundation. JPB is funded by a studentship from GlaxoSmithKline. AS is employed by London School of Hygiene and Tropical Medicine on a fellowship sponsored by GlaxoSmithKline. KB holds a Sir Henry Dale fellowship jointly funded by Wellcome and the Royal Society (107731/Z/15/Z). HIM is funded by the National Institute for Health Research (NIHR) Health Protection Research Unit in Immunisation, a partnership between Public Health England and London School of Hygiene and Tropical Medicine. RM holds a Sir Henry Wellcome fellowship (201375/Z/16/Z). EW holds grants from MRC. RG holds grants from NIHR and MRC. ID holds grants from NIHR and GlaxoSmithKline. HF holds a UKRI fellowship.

  • Disclaimer The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, Public Health England or the Department of Health and Social Care.

  • Competing interests BG has received research funding from Health Data Research UK, the Laura and John Arnold Foundation, the Wellcome Trust, the NIHR Oxford Biomedical Research Centre, the NHS National Institute for Health Research School of Primary Care Research, the Mohn-Westlake Foundation, the Good Thinking Foundation, the Health Foundation and the World Health Organisation; he also receives personal income from speaking and writing for lay audiences on the misuse of science. ID has received unrestricted research grants and holds shares in GlaxoSmithKline.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the LSHTM Ethics Board (reference 21863).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information. All code for data management and analyses in addition to the prespecified protocol are archived at: All codelists for identifying exposures, covariates and outcomes are openly shared at Access to the platform is via a virtual private network connection, restricted to a small group of researchers.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
