Background Sjögren’s syndrome (SS) is an autoimmune disease of unknown aetiology. SS shares clinical and molecular features with other autoimmune diseases such as systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA). Our objective was to use bioinformatics tools to identify both unique and common genes involved in the pathogenesis of SS, SLE and RA.
Materials and methods The literature mining tool PubTator was used to identify genes associated with each disease. Python scripts were used to extract information from PubTator and load it into a MySQL database. Three salivary gland (SG) and one peripheral blood mononuclear cell (PBMC) gene expression datasets from SS patients (NCBI-GEO) were analysed using GEO2R. MySQL tables were joined and organised to retrieve gene sets, which were used for molecular network analysis with STRINGdb. A web interface was designed and implemented to provide a unique platform for search and retrieval of genes associated with SS.
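The table join described above can be illustrated with a minimal sketch. It is not the authors' code: Python's built-in sqlite3 module stands in for MySQL so that the example runs without a server, and the table names (text_mined, de_genes), column names and gene symbols are hypothetical placeholders.

```python
import sqlite3

# Assumption: sqlite3 replaces MySQL here purely so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE text_mined (gene TEXT, disease TEXT);
    CREATE TABLE de_genes (gene TEXT, dataset TEXT, log_fc REAL);
""")

# Hypothetical text-mining hits: one row per (gene, disease) association.
conn.executemany(
    "INSERT INTO text_mined VALUES (?, ?)",
    [("GENE_A", "SS"), ("GENE_A", "SLE"), ("GENE_A", "RA"),
     ("GENE_B", "SS"), ("GENE_B", "RA"),
     ("GENE_C", "SS"), ("GENE_C", "SLE"), ("GENE_C", "RA")],
)

# Hypothetical differential-expression results for one dataset.
conn.executemany(
    "INSERT INTO de_genes VALUES (?, ?, ?)",
    [("GENE_A", "SG_1", 1.8), ("GENE_B", "SG_1", -1.2), ("GENE_D", "SG_1", 2.1)],
)

# Genes linked to all three diseases in the literature AND differentially expressed.
query = """
    SELECT t.gene
    FROM text_mined AS t
    JOIN de_genes AS d ON d.gene = t.gene
    GROUP BY t.gene
    HAVING COUNT(DISTINCT t.disease) = 3
    ORDER BY t.gene
"""
shared_de_genes = [row[0] for row in conn.execute(query)]
print(shared_de_genes)
```

In this toy data only GENE_A satisfies both conditions; GENE_B lacks an SLE association and GENE_C has no expression record.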
Results Text mining yielded 847 genes common to SS, SLE and RA. A total of 15 990 differentially expressed (DE) genes were represented in at least one dataset, including 683 of the 847 text-mined genes. Of these 683, 86 genes were differentially expressed in all three SG datasets. To identify DE genes unique to the SGs of SS patients, the PBMC dataset was used to subtract DE genes common to SGs and PBMCs from the 86 genes. The remaining 71 genes were integrated into a molecular interaction network, in which 44 genes represented 22 SS-related biological processes. These processes included cytokine signalling, herpes virus infection and toll-like receptor signalling. A web interface was established to aggregate and correlate gene expression and text mining data.
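The filtering steps reported above amount to a sequence of set operations, which the following sketch illustrates on toy data. The gene identifiers and dataset labels are hypothetical, and treating "at least one dataset" as the union of all four datasets is an assumption; the sketch shows the logic only and does not reproduce the reported counts.

```python
# Hypothetical genes found by text mining for all three diseases.
text_mined_common = {"G1", "G2", "G3", "G4", "G5"}

# Hypothetical DE gene sets, one per salivary gland (SG) dataset.
sg_datasets = {
    "SG_1": {"G1", "G2", "G3", "G9"},
    "SG_2": {"G1", "G2", "G3", "G4"},
    "SG_3": {"G1", "G2", "G3", "G8"},
}

# Hypothetical DE gene set from the PBMC dataset.
pbmc_de = {"G2", "G7"}

# Step 1: genes DE in at least one dataset (assumed: union of all four).
de_any = set().union(*sg_datasets.values(), pbmc_de)

# Step 2: text-mined genes that are also DE somewhere.
text_and_de = text_mined_common & de_any

# Step 3: of those, keep genes DE in every SG dataset.
de_all_sg = text_and_de & set.intersection(*sg_datasets.values())

# Step 4: subtract genes that are also DE in PBMCs to leave SG-specific genes.
sg_specific = de_all_sg - pbmc_de

print(sorted(sg_specific))
```

With the study's figures, step 2 corresponds to the 683 genes, step 3 to the 86 genes, and step 4 to the 71 genes passed on to network analysis.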
Conclusions By combining text mining and gene expression analysis, we identified a molecular interaction network containing DE genes that represent candidate biomarkers for SS. For some of the identified genes, previous studies had reported no clear link to SS. These genes may be further investigated in experimental models for their role in the progression of SS. Further, the web interface provides a dynamic platform for investigators to correlate different data sources, which may provide additional insights into the development of SS.