Mining social media data to investigate patient perceptions regarding DMARD pharmacotherapy for rheumatoid arthritis
  1. Chanakya Sharma1,
  2. Samuel Whittle2,
  3. Pari Delir Haghighi3,
  4. Frada Burstein3,
  5. Roee Sa'adon4,
  6. Helen Isobel Keen5
  1. 1 Rheumatology, Fiona Stanley Hospital, Murdoch, Western Australia, Australia
  2. 2 Rheumatology, The Queen Elizabeth Hospital, Woodville South, South Australia, Australia
  3. 3 Information Technology, Monash University, Clayton, Victoria, Australia
  4. 4 Treato Ltd, Or Yehuda, Israel
  5. 5 Medicine and Pharmacology, UWA, Murdoch, Western Australia, Australia
  1. Correspondence to Dr Chanakya Sharma, Rheumatology, Fiona Stanley Hospital, Murdoch, WA 6009, Australia; chanakya_s{at}


Objectives We hypothesise that patients have a positive sentiment regarding biological/targeted synthetic disease modifying anti-rheumatic drugs (b/tsDMARDs) and a negative sentiment towards conventional synthetic agents (csDMARDs). We analysed discussions on social media platforms regarding DMARDs to understand the collective sentiment expressed towards these medications.

Methods Treato analytics were used to download all available posts on social media about DMARDs in the context of rheumatoid arthritis. Strict filters ensured that user generated content was downloaded. The sentiment (positive or negative) expressed in these posts was analysed for each DMARD using sentiment analysis. We also analysed the reason(s) for this sentiment for each DMARD, looking specifically at efficacy and side effects.

Results Computer algorithms analysed millions of social media posts and included 54 742 posts about DMARDs. We found that both classes had an overall positive sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs (1.210) than for csDMARDs (1.048). Efficacy was the most commonly mentioned reason in posts with a positive sentiment and lack of efficacy was the most commonly mentioned reason for a negative sentiment. These were followed by the presence/absence of side effects in negative or positive posts, respectively.

Conclusions Public opinion on social media is generally positive about DMARDs. Lack of efficacy followed by side effects were the most common themes in posts with a negative sentiment. There are clear reasons why a DMARD generates a positive or negative sentiment. As sentiment analysis technology becomes more refined, targeted studies could analyse these reasons and allow clinicians to tailor DMARDs to match patient needs.

  • rheumatoid arthritis
  • patient perspective
  • DMARDs (biologic)
  • DMARDs (synthetic)

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Key messages

What is already known about this subject?

  • Clinicians’ views regarding the disease modifying anti-rheumatic drugs (DMARDs).

What does this study add?

  • First study conducted that analyses public opinion on DMARDs used in rheumatoid arthritis at such a massive scale.

  • This study shows that:

    • Social media analysis can improve our understanding of patient beliefs towards DMARDs.

    • Public sentiment is positive towards the biological DMARDs.

    • Public sentiment is slightly negative towards methotrexate primarily due to side effects.

    • Efficacy is the strongest cause of a positive sentiment followed by lack of side effects.

How might this impact on clinical practice or future developments?

  • More and more patients are researching their medications online and discussing them on social media. This is having a significant impact on their beliefs and compliance. This study educates clinicians about the prevailing sentiment towards various DMARDs and the specific concerns that patients have about them, thus allowing clinicians to better counsel their patients and prepare them for what they might encounter in their search online.


Rheumatoid arthritis (RA) is an incurable disease with an incidence of ~1%.1 It is characterised by inflammation leading to irreversible destruction of the joints. It is associated with considerable morbidity, mortality and health related costs.2 Current management of RA involves the early institution of disease modifying anti-rheumatic drugs (DMARDs), initially with conventional synthetic agents (csDMARDs), followed by biological DMARDs/targeted synthetic DMARDs (bDMARDs/tsDMARDs) if required. The number of b/tsDMARDs available for RA has rapidly increased over the past few years, as have the total healthcare costs associated with them. Despite there being an improvement in outcomes for RA patients, medication adherence rates, especially with csDMARDs, have been poor with some studies showing full adherence in as few as 30% of patients.3 4 Evidence is emerging that some patients are progressing to b/tsDMARDs without using csDMARDs as prior or cotherapy, in contrast to guidelines and typical regulatory rules.5

Patient concordance with medications is associated with improved outcomes in RA.6 7 One of the biggest factors affecting concordance is the patient’s personal belief about the disease and medications.8 Studies have shown that in order to improve adherence with DMARDs, clinicians should focus less on provision of medical information and be more aware of patients’ beliefs.9 Understanding patient beliefs, however, is difficult and often relies on qualitative studies, which are excellent at providing an in-depth thematic analysis of a specific issue but are traditionally conducted on a small scale.

Social media is widely used by patients to discuss medical issues10; in 2012, 26% of internet users were using social media for health issues, making it a rich source of information about patient beliefs.11 A common technique for analysing social media content is sentiment analysis (SA), which involves analysing the sentiment expressed in textual content.12 Such analysis has already been shown to have utility in industries such as entertainment and the stock market.13 14

We aimed to understand patient perceptions about DMARD therapy as expressed on social media. Our primary objective was to undertake SA of all available DMARDs to assess the aggregate sentiment towards each category. Our secondary objective was to identify themes within the positive and negative sentiment that could shed light on patient beliefs.


We used the services of the web analytics firm Treato. The Treato platform automatically identifies, collects and analyses publicly available user-generated content on health-related topics from over 10 000 sources. These sources include the publicly available data on social networks such as Facebook and Twitter, discussion forums and blogs. Over 3 billion posts were analysed from these sources. The data are then analysed using a patented algorithm that applies natural language processing to this content to identify medical concepts mentioned in text, and extract patients’ self-reported descriptions of their experiences with various health conditions and medications. These medical experiences were then mapped on to formal concepts in a medical ontology. Treato’s algorithms combine various medical ontologies including those used by the Food and Drug Administration for coding. This process includes resolving conceptual synonyms of medical terms (eg, ‘fatigue’ and ‘tired’ were assigned the same concept code); resolution of patient-specific phrases (eg, ‘pain in my joints’ and ‘my joints hurt’) to medical terms; word-sense disambiguation algorithms (eg, ‘BP’ could refer to bi-polar disorder, blood pressure or a bisphosphonate medication); and medication synonyms (eg, generic and brand names for the same medication).
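The synonym-resolution step described above can be illustrated with a toy lookup. This is a minimal sketch only and not Treato's patented algorithm: the concept codes, the phrase table and the function name `extract_concepts` are hypothetical, and word-sense disambiguation (the 'BP' example) is not attempted.

```python
import re

# Hypothetical phrase-to-concept table. Real systems draw on full medical
# ontologies; the codes below are invented for illustration only.
CONCEPT_CODES = {
    "fatigue": "C_FATIGUE",
    "tired": "C_FATIGUE",
    "pain in my joints": "C_JOINT_PAIN",
    "my joints hurt": "C_JOINT_PAIN",
    "adalimumab": "C_ADALIMUMAB",  # generic name
    "humira": "C_ADALIMUMAB",      # brand name for the same medication
}


def extract_concepts(post: str) -> set:
    """Return the concept codes whose phrases occur as whole words in the post."""
    text = post.lower()
    found = set()
    for phrase, code in CONCEPT_CODES.items():
        # Word boundaries stop 'tired' from matching inside 'retired'.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(code)
    return found


if __name__ == "__main__":
    print(extract_concepts("I'm so tired since starting Humira"))
```

Mapping several surface forms to one code is what allows posts that use different wording, or a brand name rather than a generic name, to be counted under the same concept.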

The data used in this study were limited to posts written in the English language. The unit of analysis for this study was an individual post. In order for a post to be included in the final analysis it needed to be user generated content mentioning at least one of the thirteen current DMARDs (methotrexate, leflunomide, sulfasalazine, hydroxychloroquine, adalimumab, etanercept, certolizumab, golimumab, tocilizumab, tofacitinib, rituximab, abatacept and infliximab) in the context of RA.
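The inclusion rule can be sketched as a simple filter. The drug list is taken from the text above; the rule for detecting an RA context, the `language` argument and the function names are assumptions made for illustration, since the actual filters are not described in detail.

```python
import re

# The thirteen DMARDs named in the inclusion criteria.
DMARDS = (
    "methotrexate", "leflunomide", "sulfasalazine", "hydroxychloroquine",
    "adalimumab", "etanercept", "certolizumab", "golimumab", "tocilizumab",
    "tofacitinib", "rituximab", "abatacept", "infliximab",
)


def mentioned_dmards(post: str) -> list:
    """Return the DMARDs (generic names only) mentioned in the post."""
    text = post.lower()
    return [d for d in DMARDS if re.search(rf"\b{re.escape(d)}\b", text)]


def in_ra_context(post: str) -> bool:
    """Hypothetical context rule: the post names the disease or uses 'RA'."""
    return "rheumatoid arthritis" in post.lower() or bool(re.search(r"\bRA\b", post))


def is_included(post: str, language: str = "en") -> bool:
    """Apply the three criteria: English, RA context, at least one DMARD."""
    return language == "en" and in_ra_context(post) and bool(mentioned_dmards(post))


if __name__ == "__main__":
    print(is_included("Started methotrexate for my RA last month"))
```

A production filter would also need brand names and misspellings, which this sketch omits.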

Included posts were then subjected to Treato’s SA algorithms for further categorisation into posts with positive or negative sentiment. The two most common reasons for a positive post were DMARD efficacy and lack of side effects. Conversely, the most common reasons for a negative post were lack of efficacy and side effects. The positive and negative tagging is not mutually exclusive, since a post may contain both positive and negative experiences about the same medication. Treato also compiled data on the concerns most frequently listed by patients on various DMARDs. These data were then provided to us for interpretation.

The overall sentiment for each DMARD was expressed as the ratio of the positive to negative posts for that DMARD. A ratio greater than one indicated an overall positive sentiment. Demographic information was collected where available.
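The ratio defined above is simple to compute. The sketch below assumes hypothetical post counts (the per-drug counts are in table 1, which is not reproduced here); the function names and the 'neutral' label for a ratio of exactly one are illustrative choices rather than part of the study.

```python
def sentiment_ratio(positive_posts: int, negative_posts: int) -> float:
    """Ratio of positive to negative posts for one DMARD or DMARD class."""
    if negative_posts <= 0:
        raise ValueError("at least one negative post is needed to form a ratio")
    return positive_posts / negative_posts


def overall_sentiment(ratio: float) -> str:
    """A ratio greater than one is read as an overall positive sentiment."""
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "neutral"  # assumption: the text does not define a ratio of exactly 1


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    ratio = sentiment_ratio(605, 500)
    print(round(ratio, 2), overall_sentiment(ratio))
```

Because a post can be tagged as both positive and negative, the two counts are not guaranteed to sum to the number of included posts.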

While the algorithms were able to assign sentiment and extract information regarding efficacy and side effects for all the DMARDs, the final numbers were not available for hydroxychloroquine and abatacept, which were then manually extracted. In order to ensure that the results were valid for hydroxychloroquine and abatacept, this process of manual extraction was repeated for all the other DMARDs. There were negligible differences (0%–3%) between the algorithm and manual extraction across the categories of the DMARDs, which likely reflect the difference in dates between when the data were provided by Treato’s algorithms and when they were manually extracted (additional posts on social media). This difference was not felt to be large enough to have a significant impact on the overall interpretation of the results.


We used Cohen’s kappa coefficient to assess inter-rater agreement between Treato and manual assessment of sentiment. A comparison of proportions test was conducted to search for significant differences in positive sentiment for efficacy across b/tsDMARDs and in concerns raised by patients on both csDMARDs and bDMARDs. Statistical significance was assumed at p<0.05.
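Cohen's kappa corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch of the calculation follows; the label lists are hypothetical and are not the study's validation data.

```python
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)

    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical labels: algorithm versus human reviewer on 100 posts.
    algorithm = ["pos"] * 50 + ["neg"] * 50
    human = ["pos"] * 40 + ["neg"] * 10 + ["pos"] * 10 + ["neg"] * 40
    print(round(cohens_kappa(algorithm, human), 2))
```

In this made-up example the raters agree on 80 of 100 posts, but half of that agreement would be expected by chance, which is why kappa is lower than the raw agreement.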


Treato collected data prospectively from July 2017 until October 2018, and also analysed available data retrospectively. We collected 28 261 posts on b/tsDMARDs and 26 841 posts on csDMARDs, with some overlap. The individual breakdown of the DMARDs and the positive and negative posts is shown in table 1. Treato’s algorithms identified the majority (89.6% and 88.8%, respectively) of the posts on b/tsDMARDs and csDMARDs as being written by patients. As a validation exercise, 200 posts were manually assessed and assigned a sentiment. This was compared with the sentiment assigned by Treato’s algorithms for these posts. Agreement between sentiment assessed by machine and human was moderate (csDMARDs k=0.49 and b/tsDMARDs k=0.52).15

Table 1

Aggregate sentiment


Content about b/tsDMARDs was collected from 497 publicly available forums. The greatest proportion (7969/28 261 posts) was obtained from Facebook. The 10 most popular social media platforms used to publish these posts are shown in table 2. Geolocation data were available on 1837 posts, which identified users from 34 countries. The majority of the posts (95.4%) were from USA (1349), UK (162), Canada (155), Australia (55) and Mexico (15).

Table 2

Social media platforms

The ratio of total positive to negative posts was 1.21, thus indicating an overall positive sentiment. Each of the b/tsDMARDs had a greater number of positive than negative posts. Efficacy was the most common theme identified within posts assigned a positive sentiment (>80% of positive posts), followed by lack of side effects (13% of positive posts) (table 3). Comparing the b/tsDMARDs with each other in terms of the proportion of positive posts that cited efficacy showed etanercept to be the most popular, with a significantly higher proportion than three other b/tsDMARDs (rituximab, infliximab and tofacitinib) (table 4).

Table 3

bDMARD positive and negative sentiment for efficacy and side effects

Table 4

Comparison of proportion of positive sentiment for efficacy among biological disease modifying anti-rheumatic drugs

While lack of efficacy was also the most common theme in posts with a negative sentiment, side effect concerns were a more prominent cause of negative sentiment posts than lack of side effects were for positive sentiment posts (table 3).

The most common concerns raised by patients who wrote a negative post on b/tsDMARDs are depicted in table 5. Joint pain was the most common but the next three reasons for a negative sentiment were due to side effects (‘rash’, ‘nausea’ and ‘itching’). Infections were also a prominent reason for a negative sentiment, with four of the top 20 reasons being occupied by infectious causes (‘fever’, ‘pneumonia’, ‘common cold’ and ‘sinus infections’).

Table 5

Concerns: percentage of posts with a negative sentiment


Posts about csDMARDs were collected from 515 social media sites. Ten websites contributed 69% (18 503) of all the posts (table 2). Geolocation was only available for 5% (1441) of the posts. Among these, however, 36 countries were represented. The majority of the posts (93.3%) came from USA (904), UK (174), Canada (142), Australia (90) and New Zealand (35).

The ratio of total positive to negative posts was 1.048, indicating an overall positive sentiment. The individual ratios revealed a negative sentiment for sulfasalazine (0.97) and methotrexate (0.995), and positive for leflunomide (1.09) and hydroxychloroquine (1.26) (table 1).

Efficacy was the most common theme in posts with a positive sentiment for all the csDMARDs (table 6). While lack of efficacy was the most common theme in posts with a negative sentiment, its overall share was lower than what was seen in posts with a positive sentiment. Approximately half of the negative posts regarding methotrexate discussed lack of efficacy (50.08%) and slightly fewer discussed side effects (44.94%). For hydroxychloroquine and sulfasalazine, a higher proportion of negative posts discussed lack of efficacy (56.42% and 53.81%, respectively) versus side effects (40.28% and 31.68%, respectively). Leflunomide saw a slightly larger share of negative sentiment posts discussing side effects (18.15%), with discussions on lack of efficacy accounting for 16.86% of the negative sentiment posts. Of the patients who gave methotrexate an overall negative sentiment, 7.18% still felt that it was effective; these numbers were lower for sulfasalazine (4.96%) and leflunomide (3.2%) (table 6).

Table 6

Positive/negative sentiment csDMARDs reasons

The most common concerns associated with a negative sentiment are shown in table 5. ‘Nausea’ was the most common, closely followed by ‘joint pain’. The remainder of the list was strongly populated with side effect mentions including ‘hair loss’, ‘allergy’, ‘rash’ and ‘stomach problems’.

b/tsDMARDs versus csDMARDs

Patients on b/tsDMARDs were significantly more likely to post positively due to efficacy (85.74%) than those on csDMARDs (78.71%), a difference of 7.03% (95% CI 6.15% to 7.91%; p<0.0001). However, patients on csDMARDs were significantly more likely to assign a positive sentiment due to lack of side effects (17.47%) than those on b/tsDMARDs (13.14%), a difference of 4.33% (95% CI 3.5% to 5.16%; p<0.0001).
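A difference in proportions with its 95% CI and a two-sided p value can be obtained with a standard two-proportion z-test. The sketch below shows one common way of doing this and is not necessarily the exact test the authors ran; the post counts are hypothetical, since the denominators are not given in the text.

```python
import math


def compare_proportions(x1: int, n1: int, x2: int, n2: int, z_crit: float = 1.96):
    """Two-proportion z-test.

    Returns (difference, (ci_low, ci_high), z, two_sided_p) for p1 - p2,
    where p1 = x1 / n1 and p2 = x2 / n2.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Wald confidence interval uses the unpooled standard error.
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    # The test statistic uses the pooled proportion under H0: p1 == p2.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se_pooled == 0:
        raise ValueError("test is undefined when all outcomes are identical")
    z = diff / se_pooled

    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, ci, z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 857 of 1000 versus 787 of 1000 positive posts
    # citing efficacy.
    diff, ci, z, p = compare_proportions(857, 1000, 787, 1000)
    print(f"difference={diff:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f}), p={p:.5f}")
```

With tens of thousands of posts, even small differences in proportion can reach statistical significance, so the size of the difference matters as much as the p value.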

Concerns about medications were broadly similar in posts about either csDMARDs or b/tsDMARDs (table 5). However, posts about b/tsDMARDs were significantly more likely to contain descriptions of joint pain, drug reactions (rash and itching) and cancer, whereas posts about csDMARDs contained more descriptions of weight loss, hair loss and nausea. Posts on csDMARDs were more likely to be on gastrointestinal issues such as ‘stomach problems’, ‘diarrhoea’ and ‘vomiting’. Allergic reactions to the medications were also a common reason for negative sentiment with csDMARDs, particularly sulfasalazine (10.1% of all negative posts, vs 3.66% for all other csDMARDs). Infections were mentioned more frequently in posts on b/tsDMARDs (10.54% vs 5.76%; p<0.0001). Among the b/tsDMARDs, shingles was more frequently mentioned in association with tofacitinib than the other b/tsDMARDs combined (5.4% vs 0.7% of negative posts; p<0.0001).


Our study supports our hypothesis that the collective sentiment was skewed positively in favour of the b/tsDMARDs over the csDMARDs. While all the b/tsDMARDs had a positive sentiment, this was only true for hydroxychloroquine and leflunomide among the csDMARDs.

We found efficacy and side effects to be the most commonly discussed topics in posts with positive and negative sentiment. These findings mirror those of a recent study that investigated the reasons for bDMARD discontinuation in RA patients and found lack of efficacy followed by side effects to be the two biggest factors.16 The ratio of positive to negative posts for b/tsDMARDs ranged from 1.71 for tofacitinib to 1.08 for adalimumab. Tofacitinib had 81.21% of its positive posts discussing efficacy, which was lower than the other b/tsDMARDs and methotrexate. However, tofacitinib also had the highest percentage of positive posts discussing lack of side effects (20.23%), which contributed to its overall high ratio of positive to negative posts. At the same time, side effects were the most common theme in posts with a negative sentiment towards tofacitinib, with 50% of negative posts describing side effects, the highest across both categories of DMARDs. Tofacitinib appears to have a polarising effect on patients with regard to side effects, with both significant positivity and negativity associated with it. The literature regarding side effects with tofacitinib, however, does not reveal any such polarising factors.17–19 Tofacitinib had the fewest posts (548) across both categories of DMARDs, which likely played a role in the occurrence of such diverse results.

All the b/tsDMARDs had at least 80% of their positive posts discussing efficacy. While etanercept had a significantly higher proportion of posts commenting positively due to efficacy than some of the other b/tsDMARDs, the absolute difference in proportions was small and unlikely to be clinically meaningful. It is interesting to note that the three b/tsDMARDs that had a lower proportion of efficacy posts than etanercept (rituximab, tofacitinib and infliximab) all had a different mechanism of action to one another and a different mode of administration. This comparison also highlights a powerful potential use of SA technology. Despite the ever-increasing number of bDMARDs, there are few head to head trials that directly compare these agents. The use of SA provides us with a large scale, real-world summary measure of effectiveness and tolerability that acts as an (in)direct comparison.

While methotrexate did have over 80% of its positive posts discussing efficacy, only marginally below the b/tsDMARDs, it still generated an overall negative sentiment ratio due to the high incidence of posts mentioning side effects. Almost half of the negative posts about methotrexate discussed side effects, which was one of the highest proportions across both categories of DMARDs. Our study demonstrates that the majority of patients find methotrexate to be efficacious yet have assigned it a negative sentiment primarily due to gastrointestinal side effects. While clinical trial data have shown that less than 10% of patients stop methotrexate due to side effects, longer term studies have demonstrated that over a third of the patients who take methotrexate for more than 2 years will discontinue the medication.20 21 Sulfasalazine also had a high percentage of patients posting about side effects, with allergic reactions being the most frequently mentioned; however, the percentage of positive posts discussing efficacy was lower than that of methotrexate or the bDMARDs. It was a combination of poor (perceived) efficacy along with side effect concerns that generated the overall negative sentiment for sulfasalazine. Trials that have previously compared sulfasalazine to methotrexate have demonstrated comparable efficacy and side effects.22 23

One of the most common concerns raised by patients on b/tsDMARDs was injection site reactions. Studies have shown that patients have a strong preference for orally administered medications over injectables and this likely contributed towards the reduced side effect related sentiment.24 Frequency of administration might also explain the relatively fewer negative posts due to side effects for golimumab, which has a monthly dosing interval. Studies of RA patients have shown this to be the preferred frequency of administration. While other drugs such as infliximab, tocilizumab and rituximab had similar or longer dosing intervals, their intravenous route of administration is known to be less desired by patients.25

The most common concerns raised by patients on csDMARDs were hair loss, gastrointestinal issues and allergic reactions. Shingles was a more frequent cause of negativity in patients on tofacitinib than in those on the other b/tsDMARDs, which mirrors the findings of clinical studies.26

More patients posted a positive comment regarding efficacy for b/tsDMARDs than for csDMARDs. This is consistent with a network meta-analysis, which showed that 16% more patients on biological/DMARD combination achieved an American College of Rheumatology 50 (ACR50) response than those on csDMARDs.27

The most important limitations of this study are reflective of the nascent state of the technology. The first is the quality of the data. Despite using strict filters, without conducting a manual analysis of the 3 billion posts it is impossible to know how relevant the information contained within each post is to the topic being studied. Second, SA itself is evolving with no current gold standard approach. There are various methods by which SA can be conducted, with each having certain advantages and disadvantages and none providing an absolute guarantee of accuracy. Due to these issues, it would not be surprising to have similar studies produce different results based on the platforms being analysed (as some allow patients to post large amounts of information and others, like Twitter, only allow small amounts, thus influencing the accuracy of the algorithms) and the technique used to conduct SA. We also excluded posts made in languages other than English as SA is not as well developed for other languages. Therefore, the results of this study might not be applicable to countries where English is not the primary language.


This study is the first to conduct an SA of all available social media posts generated by RA patients for 13 DMARDs. Our study has been able to capture unprompted sentiment as directly expressed by the patient. The sentiment was positive for all the b/tsDMARDs with efficacy being the primary driver of this, followed by lack of side effects. Methotrexate and sulfasalazine had an overall negative sentiment, and descriptions of side effects were particularly common for methotrexate.

As big data analytic technology becomes more advanced, there is potential for this methodology to rapidly capture broad-spectrum patient sentiment towards medications. This may act as a valuable addition to existing qualitative methods, which allow for a more nuanced assessment than is currently possible with SA. This complementary approach will generate novel insights and improve various aspects of patient–physician interaction, from shared decision-making regarding DMARD selection, to patient adherence.


The authors would like to acknowledge the patient research partners who provided valuable input by reviewing the findings of this study in the light of their lived experience.



  • Handling editor Josef S Smolen

  • Contributors CS was responsible for the study design and drafting the manuscript. RS collected the data. All authors were responsible for interpretation of the data and for revising and approving the final submitted manuscript.

  • Funding This study has been funded by a grant from Arthritis Australia.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

  • Patient consent for publication Not required.

  • Ethics approval Ethics approval was obtained from the Human Research Ethics Committees at Monash University and the University of Western Australia.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available upon reasonable request.