Quantitative planar array screen of 1000 proteins uncovers novel urinary protein biomarkers of lupus nephritis
  1. Kamala Vanarsa1,
  2. Sanam Soomro1,
  3. Ting Zhang1,
  4. Briony Strachan1,
  5. Claudia Pedroza2,
  6. Malavika Nidhi1,
  7. Pietro Cicalese1,
  8. Christopher Gidley1,
  9. Shobha Dasari1,
  10. Shree Mohan1,
  11. Nathan Thai1,
  12. Van Thi Thanh Truong2,
  13. Nicole Jordan3,
  14. Ramesh Saxena4,
  15. Chaim Putterman3,5,6,
  16. Michelle Petri7,
  17. Chandra Mohan1
  1. 1 Department of Biomedical Engineering, University of Houston, Houston, Texas, USA
  2. 2 Center for Clinical Research and Evidence-based Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, Texas, USA
  3. 3 Division of Rheumatology, Albert Einstein College of Medicine, Bronx, New York, USA
  4. 4 Division of Nephrology, Department of Medicine, UT Southwestern Medical, Dallas, Texas, USA
  5. 5 Azrieli Faculty of Medicine, Bar-Ilan University, Zefat, Israel
  6. 6 Research Institute, Galilee Medical Center, Nahariya, Israel
  7. 7 Division of Rheumatology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  1. Correspondence to Dr Chandra Mohan, Department of Biomedical Engineering, University of Houston, Houston, TX 77204, USA; cmohan{at}


Objective The goal of these studies is to discover novel urinary biomarkers of lupus nephritis (LN).

Methods Urine from systemic lupus erythematosus (SLE) patients was interrogated for 1000 proteins using a novel, quantitative planar protein microarray. Hits were validated in an independent SLE cohort with inactive, active non-renal (ANR) and active renal (AR) patients, in a cohort with concurrent renal biopsies, and in a longitudinal cohort. Single-cell renal RNA sequencing data from LN kidneys were examined to deduce the cellular origin of each biomarker.

Results Screening of 1000 proteins revealed 64 proteins to be significantly elevated in SLE urine, of which 17 were ELISA validated in independent cohorts. Urine Angptl4 (area under the curve (AUC)=0.96), L-selectin (AUC=0.86), TPP1 (AUC=0.84), transforming growth factor-β1 (TGFβ1) (AUC=0.78), thrombospondin-1 (AUC=0.73), FOLR2 (AUC=0.72), platelet-derived growth factor receptor-β (AUC=0.67) and PRX2 (AUC=0.65) distinguished AR from ANR SLE, outperforming anti-dsDNA, C3 and C4, in terms of specificity, sensitivity and positive predictive value. In multivariate regression analysis, urine Angptl4, L-selectin, TPP1 and TGFβ1 were highly associated with disease activity, even after correction for demographic variables. In SLE patients with serial follow-up, urine L-selectin (followed by urine Angptl4 and TGFβ1) were best at tracking concurrent or pending disease flares. Importantly, several proteins elevated in LN urine were also expressed within the kidneys in LN, either within resident renal cells or infiltrating immune cells, based on single-cell RNA sequencing analysis.

Conclusion Unbiased planar array screening of 1000 proteins has led to the discovery of urine Angptl4, L-selectin and TGFβ1 as potential biomarker candidates for tracking disease activity in LN.

  • autoimmunity
  • cytokines
  • lupus nephritis
  • systemic lupus erythematosus

Key messages

What is already known about this subject?

  • As opposed to previous urine biomarker studies in lupus nephritis (LN), a comprehensive, quantitative screen of 1000 specific proteins has been conducted using urine from patients with LN.

What does this study add?

  • Based on array-based screening and ELISA-based validation, urine Angptl4, L-selectin, TPP1, transforming growth factor-β1 (TGFβ1), thrombospondin-1 (TSP-1), FOLR2, platelet-derived growth factor receptor-β (PDGF-Rβ) and PRX2 emerged as novel biomarkers of LN.

  • Urine Angptl4, L-selectin, TPP1, TGFβ1, TSP-1, FOLR2 and PDGF-Rβ successfully distinguished active LN patients from active non-renal lupus patients, despite both groups having comparable Systemic Lupus Erythematosus (SLE) Disease Activity Index scores.

  • Urine Angptl4, L-selectin and TPP1, in combination, offered the best discrimination of active LN from active non-renal lupus, with an area under the receiver operating characteristic curve of 0.97.

How might this impact on clinical practice or future developments?

  • In SLE patients with serial follow-up, urine L-selectin (followed by urine Angptl4 and TGFβ1) shows the most promise at tracking concurrent or pending disease flares.


Systemic lupus erythematosus (SLE), a chronic autoimmune disease, has been widely investigated for biomarkers in recent years.1–3 Of importance is discovering biomarkers from biological samples that do not require an invasive procedure to acquire and that can differentiate between SLE patients with or without lupus nephritis (LN), a leading cause of morbidity and mortality in these patients. This could potentially result in the gold standard of a renal biopsy being replaced with urine testing, as this could greatly facilitate longitudinal monitoring at frequent intervals in order to closely track disease progression and tailor therapy accordingly.4

Most methods used previously for biomarker detection have adopted a biased philosophy, based on exploring established pathophysiological pathways associated with SLE, for example, inflammation5 or growth factor pathways.6 While useful, these approaches limit the discovery of novel biomarkers and their associated pathways. In recent years, several approaches have been used for unbiased biomarker screens, including mass spectrometry7 8 and electrospray ionisation quadrupole time-of-flight tandem mass spectrometry.9

In contrast to the above mass-spectrometry-based approaches, which are typically overwhelmed with high abundance proteins, affinity-based approaches using various ligands (such as antibodies) have the potential to uncover lower abundance proteins, due to the use of specific, high-affinity ligands. A few screening studies in lupus have used affinity-based techniques such as antibody-based or aptamer-based assays.10–13

In this study, we have used a novel glass-slide-based protein microarray to screen and quantify 1000 proteins in order to identify potential urinary biomarkers for renal involvement in patients with SLE. Unlike most planar arrays, which only provide relative expression between samples, this array uses an 8-point standard curve for each protein, allowing a concentration to be derived for each of the 1000 proteins interrogated. This platform has been successfully used in previous investigations including the study of breast cancer14 and cell culture secretomes.15 16 As of writing, this paper is the first to use this platform for the unbiased screening of an autoimmune disease, and the first to use this assay as a biomarker screening technique.

Using this platform, an initial cohort of SLE patients was interrogated for the concentrations of 1000 proteins in the urine. From this initial screening, 17 urine proteins were selected for ELISA validation in an independent cohort. Eight urine proteins exhibited the ability to significantly distinguish active renal (AR) from active non-renal (ANR) SLE: L-selectin, Angptl4, transforming growth factor-β1 (TGFβ1), TPP1, thrombospondin-1 (TSP-1), PRX2, FOLR2 and platelet-derived growth factor receptor-β (PDGF-Rβ). The combination of urine Angptl4 and TPP1 offered the best discrimination of active LN from ANR lupus, with an area under the curve (AUC) of 0.98. Angptl4, L-selectin, TGFβ1 and TPP1 also showed potential in monitoring disease activity during longitudinal follow-up. Published single-cell RNA sequencing (scRNA-seq) data from LN renal biopsies shed further light on potential cellular sources of each of these urinary biomarker candidates within the kidneys.

Materials and methods

Patients, sample collection and sample preparation

Urine samples from four cohorts of SLE patients were used in this study, including: (1) a discovery SLE cohort for the 1000-plexed protein array screen, (2) a cross-sectional cohort of SLE patients for primary validation, (3) a concurrent cohort of LN with urine samples obtained at the time of renal biopsy and (4) a longitudinal cohort of SLE patients who had been serially followed up five times or more, with at least one intervisit interval being ≤2 months, for further validation. An additional cohort of patients with chronic kidney diseases (CKD) was used as controls, in addition to healthy controls. Urine samples were obtained from Johns Hopkins University School of Medicine (Baltimore, Maryland, USA), Albert Einstein College of Medicine (Bronx, New York, USA) and UT Southwestern Medical Center (UTSW, Dallas, Texas, USA) with informed consent. Neither patients nor the public were involved in the design, conduct, reporting or dissemination of this research. Lupus patients with renal failure and paediatric patients were excluded from this study. ‘Clean-catch midstream’ urine samples were collected in sterile containers and either placed on ice or refrigerated within 1 hour of sample collection. The samples were then aliquoted and stored at −80°C.

During the sample collection visit, the lupus patients were assessed by the clinician and the following data were documented: SLE Disease Activity Index (SLEDAI), renal domains of SLEDAI (rSLEDAI), physician global assessment (PGA), weight, blood pressure, complete blood count, erythrocyte sedimentation rate, creatinine, cholesterol, C3, C4, anti-dsDNA, urinalysis and urine protein/creatinine ratio (uPCR). For all lupus patients, the hybrid SLEDAI was used, where proteinuria was scored if >0.5 g/24 hours. The rSLEDAI summates the renal components of the SLEDAI, including hematuria (>5 red blood cells/high-power field), pyuria (>5 white blood cells/high-power field), proteinuria (>0.5 g/24 hours) and urinary casts. In the cross-sectional SLE cohort for primary validation, SLE patients were classified into three groups. Active LN (AR) was defined as biopsy-proven LN with rSLEDAI ≥4. None of the active LN patients in this study had isolated hematuria or pyuria. ANR SLE patients had SLEDAI ≥5 and rSLEDAI=0. The inactive or low disease activity group included SLE patients with SLEDAI ≤2, all of whom had no clinical activity, except for one patient with alopecia and one with thrombocytopenia. In the concurrent cohort of LN patients where urine samples were obtained at the time of renal biopsy, patients were categorised according to the LN classes.

1000-plexed protein array screen

A discovery cohort of 24 subjects (HC=9; SLE=15 (SLEDAI >0)) was used for the initial screen. Please see online supplementary table S1 for demographics and clinical features of these subjects. All 15 SLE patients used for screening had active SLE, 12 of whom had renal involvement and rSLEDAI ≥4. All urine samples were clarified by centrifugation before application to the arrays. All urine samples were screened using the Kiloplex Quantibody protein array platform purchased from Raybiotech Life (QAH-CAA-X00, Norcross, Georgia, USA). The list of the 1000 proteins assayed is available at: The capture antibody for each protein was spotted in quadruplicate onto a glass surface to create the array. Each protein concentration measurement is therefore based on n=4 spots, with an 8-point standard curve used for each of the 1000 proteins. Briefly, all samples were diluted to yield a total protein concentration within the working range (500 µg/mL to 1 mg/mL). Protein standards and urine samples were incubated on the array for 2 hours, to allow the proteins to bind to the antibodies. Following washing, a biotinylated antibody cocktail (comprising 1000 detection antibodies) was added and left to incubate for 2 hours. Finally, streptavidin-Cy3 was added and left to incubate for 1 hour. After a final wash and dry, the slides were read with a fluorescent scanner. Data were then extracted from the image using the vendor-provided GAL file and compatible microarray analysis software. All data were creatinine normalised before analysis (KGE005, R&D Systems, Minneapolis, Minnesota, USA).
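How a quadruplicate spot readout might be turned into a creatinine-normalised concentration can be sketched as follows. This is a minimal illustration only: the log-log linear interpolation, the function names and the example standard values are assumptions, since the vendor's actual curve-fitting procedure is not described here.

```python
import math
from statistics import mean


def interpolate_concentration(signal, standards):
    """Estimate a concentration from a signal by log-log linear
    interpolation between the two bracketing standard points.

    standards: list of (concentration, signal) pairs, all values > 0.
    Returns None when the signal lies outside the standard curve.
    (Assumed approach; the vendor software may fit the curve differently.)
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            frac = (math.log(signal) - math.log(s_lo)) / (
                math.log(s_hi) - math.log(s_lo)
            )
            return math.exp(
                math.log(c_lo) + frac * (math.log(c_hi) - math.log(c_lo))
            )
    return None


def creatinine_normalised(spot_signals, standards, creatinine_mg_per_ml):
    """Average the replicate spots, convert to pg/mL, then express the
    result as pg of protein per mg of urine creatinine."""
    concentration = interpolate_concentration(mean(spot_signals), standards)
    if concentration is None:
        return None
    return concentration / creatinine_mg_per_ml


# Hypothetical 8-point standard curve: (pg/mL, fluorescence units).
STANDARDS = [(c, 2 * c) for c in (10, 30, 100, 300, 1000, 3000, 10000, 30000)]
```

Signals outside the standard range return None rather than an extrapolated value, which keeps out-of-range spots from being reported as concentrations.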

Validation studies using ELISA

For the primary validation study, urine from 78 subjects in the cross-sectional cohort was included, composed of 16 healthy controls and 17 inactive or low disease activity (SLEDAI ≤2), 16 ANR (SLEDAI ≥5, rSLEDAI=0) and 29 AR (rSLEDAI ≥4) SLE patients, as described above (online supplementary table S2). Importantly, both the ANR and AR groups had comparable SLEDAI scores. Identified biomarkers were validated using ELISA assays, following manufacturer instructions. Briefly, to assay each protein, urine samples at an optimal dilution were added to a microplate precoated with capture antibody, incubated and washed, followed by the addition of detection antibody, horseradish peroxidase and substrate. The absolute levels of urine protein biomarkers were determined using standard curves run on each ELISA plate, and normalised by urine creatinine concentration. Promising biomarkers were further validated using ELISA in additional patient cohorts, including a renal-biopsy concurrent cohort, a longitudinal cohort and a CKD control cohort.


All data were plotted and analysed using GraphPad Prism V.7 (GraphPad, San Diego, California, USA), Microsoft Excel, or R. All data in this study were analysed using the Mann-Whitney U test, as several datasets were not normally distributed. Likewise, the Spearman and Pearson methods were used for the correlation analysis. All biomarker measurements were incremented by one and then log2-transformed. To examine the relationship between an individual biomarker and outcomes, we fitted logistic regression models for active LN, and linear regression models for continuous outcomes including PGA scores and rSLEDAI. For each outcome, models were run to control for race, and for race and age. For each model, q values were computed for each biomarker. Sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), positive predictive value (PPV) and negative predictive value (NPV) were calculated using the easyROC software.
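The transformation and the discrimination metrics described above can be illustrated with a short sketch. It relies on the standard identity that the AUC equals the Mann-Whitney U statistic divided by the number of case-control pairs; the function names and the cut-off handling are illustrative assumptions and do not reproduce the easyROC software, and no p value is computed.

```python
import math


def log2_plus_one(values):
    """Add one to each measurement, then take log base 2."""
    return [math.log2(v + 1) for v in values]


def mann_whitney_u(cases, controls):
    """U statistic for the cases: count of (case, control) pairs in which
    the case value is higher, with ties counted as one half."""
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc_from_u(cases, controls):
    """AUC = U / (n_cases * n_controls)."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


def diagnostic_metrics(cases, controls, cutoff):
    """Sensitivity, specificity, PPV and NPV when values at or above the
    cutoff are called positive (an assumed convention)."""
    tp = sum(x >= cutoff for x in cases)
    fn = len(cases) - tp
    fp = sum(y >= cutoff for y in controls)
    tn = len(controls) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }
```

Because the AUC depends only on the rank order of the values, it is unchanged by the log2 transformation; the transformation matters for the regression models rather than for the ROC analysis.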

For analysis of longitudinal data, two approaches were taken. To determine which urine biomarker best tracked with disease activity metrics serially, we first ran multilevel linear models with patient IDs as random intercepts regressing on each individual biomarker and compared the Akaike information criterion values among models, using the lme4 and bbmle packages in R. Next, we also ran LASSO for multilevel linear models with patient IDs as random intercepts for biomarker selection, using glmmLasso and lmmen packages in R.

Heatmap, cluster analysis, volcano plot, random forest classification and Bayesian network analysis

Data from the protein array screening assay were used to generate a heatmap which clustered proteins with similar expression patterns together. The data from each group were imported into R for clustering analysis and heatmap generation. For clustering, proteins were clustered in an unsupervised manner based on Euclidean distance. R was then used to generate a volcano plot of log2 fold change (FC) of expression vs the -log10 p value, as determined by Mann-Whitney U test. Three hundred and two urine proteins were elevated at FC >2, at p<0.05, and 82 proteins were found to be elevated at FC >5, at p<0.001, comparing SLE samples to healthy control urine.
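The volcano plot coordinates and the two threshold tiers quoted above can be sketched as follows. The fold change is taken here as the ratio of group means, which is an assumption, and the p value is supplied by the caller (for example from a Mann-Whitney U test run elsewhere); the function name is hypothetical.

```python
import math
from statistics import mean


def volcano_point(sle_values, hc_values, p_value):
    """Return (log2 fold change, -log10 p, tier) for one protein.

    Fold change is the ratio of group means (assumed definition).
    Tiers follow the thresholds quoted in the text.
    """
    fold_change = mean(sle_values) / mean(hc_values)
    if fold_change > 5 and p_value < 0.001:
        tier = "FC>5, p<0.001"
    elif fold_change > 2 and p_value < 0.05:
        tier = "FC>2, p<0.05"
    else:
        tier = "below thresholds"
    return math.log2(fold_change), -math.log10(p_value), tier
```

The thresholds are checked on the original scale, so the plotted coordinates can be rounded or rescaled without changing which tier a protein falls into.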

Random Forest Classification analysis, a machine learning algorithm for dimensionality reduction, was executed using R, with 1000 bootstrap sampling iterations, in order to identify the relative importance of each biomarker in disease classification, using the GINI index. For the top 20 urine biomarker proteins identified by random forest classification, the FC in SLE versus healthy controls was plotted as a radial plot, using Excel.

Ingenuity pathway analysis (IPA) was used to determine which known pathway networks were enriched by ranking the proteins based on FC and p values that were significantly over-expressed in SLE urine compared with controls. The ranked genes were searched using the QIAGEN Knowledge Base to find pathways that contained these proteins. This knowledge base was curated from various literature sources and included direction of effect of molecule in a network as documented in the referencing publications. The Canonical Pathways were ranked by p values of overlap, calculated using right-tailed Fisher’s exact test.
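The right-tailed Fisher's exact test used for the overlap p values is equivalent to an upper-tail hypergeometric probability, which can be sketched as follows. The argument names are illustrative, and the choice of background set is an assumption; the reference set used by IPA is not stated here.

```python
from math import comb


def fisher_right_tailed(overlap, pathway_size, hits, background):
    """P(X >= overlap) for X ~ Hypergeometric(background, pathway_size, hits).

    overlap:      elevated proteins that belong to the pathway
    pathway_size: pathway members present in the background set
    hits:         total elevated proteins
    background:   total proteins considered (assumed reference set)
    """
    max_k = min(pathway_size, hits)
    total = comb(background, hits)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, hits - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total
```

For instance, `fisher_right_tailed(20, 40, 302, 1000)` would give the probability of seeing 20 or more members of a hypothetical 40-protein pathway among 302 elevated proteins out of 1000 by chance alone.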

Bayesian network analysis was performed using the BayesiaLab software (Bayesia, V.7.0.1), and the following parameters: the urine levels of nine protein biomarkers, demographics, disease features or measures (proteinuria, pyuria, hematuria, SLEDAI, rSLEDAI, PGA), as well as various laboratory measures. Continuous data were discretised into three bins using the R2-GenOpt algorithm and the Maximum Weight Spanning Tree algorithm (α=0.45) was used for unsupervised learning of the network.

Querying scRNA-seq data from LN renal biopsies

Publicly available scRNA-seq data from patients with biopsy-proven LN and healthy controls were obtained from ImmPort using accession number SDY997 EXP15077 and from the NCBI database of Genotypes and Phenotypes under the accession number phs001457.v1.p1. For both datasets, post quality control expression datasets contained both skin and kidney cells and were subsetted to only include kidney/immune cells for downstream analysis, yielding 1401 cells from 21 LN patients and 3 healthy control biopsies,17 or renal-infiltrating immune cells from 24 LN biopsies.18 Graph-based clustering, t-distributed stochastic neighbor embedding (tSNE) and uniform manifold approximation and projection (UMAP) were performed on the kidney single-cell profiles using the Seurat package for R. Principal component analysis yielding 50 principal components was used to derive the clusters. Cluster identity was generated by comparing differentially expressed genes between the clusters to canonical markers. Feature and violin plots were also created using the Seurat package for R.


Protein array screen

Urine samples from 24 subjects (HC=9, active SLE=15, female, black, age range 23–33 years, see online supplementary table S1) were used for the screening of 1000 distinct human proteins using a prefabricated, commercially available capture-antibody coated protein array. All 15 SLE patients used for screening had active SLE, 12 of whom had renal involvement and rSLEDAI ≥4. The expression of all 1000 urine proteins in these subjects, normalised by urine creatinine, was used to generate a heatmap, comparing HC and SLE urine, using R (figure 1A). The heatmap shows that some urine proteins are overexpressed in SLE patients compared with the controls and vice versa. To better visualise these increases, a volcano plot was generated in R, plotting FC in SLE versus HC against the corresponding p value (figure 1B). Three hundred and two proteins met the significance threshold of p<0.05 and FC>2.

Figure 1

Interrogation of 1000 urinary proteins in SLE patients and healthy controls. Urine samples from healthy controls and SLE patients (total=24; HC=9; SLE=15, all with active disease) were interrogated for the levels of 1000 proteins, using a quantitative array platform, and creatinine normalised. (A) Heatmap of patient-group-supervised clustering (the columns) reveals the landscape of protein expression across the 24 urine samples (HC vs SLE), as determined from the protein array. The yellow-blue colour scheme indicates the expression of each of the 1000 proteins (each row representing one protein), with yellow indicating overexpression and blue indicating underexpression, compared with the median expression level for that protein. (B) Volcano plot showing expression differences of 1000 proteins in the urine, when comparing log2 fold change of protein expression versus the negative log10 p value, that is, biological significance versus statistical significance. Each dot represents a protein and its average value for that subset (SLE vs HC). Horizontal lines depict significance with p<0.05 (red) and <0.001 (green). Vertical lines depict fold change of 2 (orange) and 5 (yellow), comparing the levels in SLE to the corresponding levels in HC. All biomarker data were normalised by creatinine concentration and analysed using a two-tailed Mann-Whitney U test. (C–E) Protein expression pathways encompassed by the 302 significantly upregulated urinary proteins in SLE versus healthy controls (FC >2, at p<0.05) as determined using Ingenuity Pathway Analysis included molecular networks regulated by NFkB signalling, p38 mitogen-activated protein kinase, and Akt signalling, depicted in C–E, respectively, as well as other pathways (not shown). Proteins that are coloured red were upregulated in SLE urine, with the intensity of the redness being proportional to the fold change. HC, healthy control; SLE, systemic lupus erythematosus.

IPA generated 22 networks from these 302 elevated urine proteins, with 8 networks incorporating a minimum of 20 elevated proteins each. These eight networks of elevated urine proteins encompassed pathways relating to NFKB activation, p38 mitogen-activated protein kinase activation and AKT activation (figure 1C–E), among others. The NFKB regulated proteins elevated in LN urine included multiple members of the TNF family including TNFRSF- 11B (OPG), 17 (BAFF), 1A, 1B (TNFα), and 8 (CD30), and TNFSF- 14 (LIGHT, HVEML) and 18 (GITRL), interleukin-1 (IL-1) family members, IL-17, and several chemokines, many of which have been implicated in different autoimmune diseases, including lupus. The AKT/PI3K regulated network included various cell adhesion molecules, PDGF/R and vascular endothelial growth factor (VEGF) family members, selectins, many of which have also been implicated in lupus pathogenesis. The observation that several proteins previously implicated in lupus biology were ‘rediscovered’ in LN urine in this study offers independent support for the validity of the novel screening approach used; these rediscovered proteins (numbering >40) and the supporting literature are detailed in online supplementary table S3.

More stringent criteria were used to determine which urinary proteins should be selected for further validation using orthogonal platforms. Sixty-four proteins were elevated in SLE urine at a FC > 5, at p<0.05, with the marker concentration in SLE urine exceeding 2000 pg/mg, as plotted in figure 2A. Of these, 54 proteins exhibited a q<0.05 after multiple testing correction.
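The multiple testing correction behind the q values is not named in the text. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, a common choice for this kind of screen; it should not be read as the correction actually applied in this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values), in the
    same order as the input. Assumed method, for illustration only."""
    m = len(p_values)
    # Indices of the p values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

A protein would then pass the filter when its q value is below 0.05, in addition to the fold change and concentration criteria.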

The mean/median and FC of protein biomarkers in SLE versus HC urine, as well as the multiple testing correction results, are summarised in online supplementary table S4. To generate a more selective set of candidate biomarkers, only those with FC>15 were considered for further validation, which reduced the list of candidates for validation to 34. Random forest analysis (RFA), a machine learning algorithm, was also used to select urinary proteins that best discriminate between HC and SLE samples (figure 2B). Of the top 20 proteins identified using RFA, glypican-5 was solely identified using this analytical approach, but not by exercising FC and q value cutoffs (as listed in figure 2A).

Figure 2

Elevated urine proteins in SLE, as ascertained by fold change, statistical significance and machine learning algorithms, based on screening of 1000 proteins. (A) Horizontal dot plot depicting the top-most 64 proteins elevated in SLE urine, all of which exhibit a fold change >5 in SLE urine, p<0.05, and an average concentration >2000 pg/mg in SLE urine, based on the array-based screen of 1000 proteins. Blue dots indicate urine protein levels in healthy controls and red dots indicate levels in SLE patients. (B) Radar chart depicting the top 20 urine proteins based on random forest analysis comparing urine from SLE and healthy controls, again based on the array-based screen of 1000 proteins in 15 SLE patients, all of whom had active disease. Each point in the graph indicates the FC of each protein in SLE vs HC. IFN-g, interferon-γ; IL-1, interleukin-1; SLE, systemic lupus erythematosus.

Thirty-five protein candidates were considered for further validation, composed of 34 proteins arising from FC>15 and q value cut-offs, and glypican-5 arising from the RFA results. Of these 35 candidates, 18 were not pursued further for various reasons, as detailed in online supplementary table S5. For example, when two or more urine proteins belonged to the same cluster based on expression profile across the screening cohort (ie, correlation coefficient >0.95), only one representative protein was selected for ELISA validation. In addition, urine proteins that have already been accorded biomarker potential in previous LN studies (eg, angiostatin, ferritin, VCAM-1, etc) were not selected for further validation. ELISA kits were purchased for the remaining 17 proteins. Eight of these 17 ELISA kits failed to yield quantifiable results with the urine samples (BMP5, BMP7, Cathepsin H, CLEC-2, htPAPPA-A, MSCF-R, Siglec-5 and Siglec-11). This is not uncommon, as most commercial ELISA assays are not rated for urine and the protein array platform has a lower limit of detection (LOD) than conventional ELISA. The remaining nine urinary proteins were pursued further to validate the protein array screening results (Angptl4, FOLR2, GPC-5, L-selectin, PDGF-Rβ, PRX2, TGFβ1, TPP1 and TSP-1). The kits and urine dilutions used are listed in online supplementary table S5.
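The rule of keeping one representative per cluster of highly correlated proteins can be sketched as below. Two assumptions are made for illustration: the correlation coefficient is taken to be Pearson's, and the representative is simply the first protein encountered in the supplied order; the text does not specify either choice.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pick_representatives(profiles, threshold=0.95):
    """Greedy filter: keep a protein unless its profile correlates above
    the threshold with a protein that has already been kept.

    profiles: dict mapping protein name -> list of values across subjects,
              iterated in insertion order (for example, ranked by FC).
    """
    kept = []
    for name, values in profiles.items():
        if all(pearson(values, profiles[k]) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Ordering the input by fold change before filtering would make the highest-ranked member of each cluster the one that survives.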

Validation of proteomic hits in the primary cross-sectional cohort

An independent cohort of n=78 subjects (HC=16, inactive and low disease activity SLE=17, ANR=16, AR=29) was used for ELISA validation (figure 3). Of note, this validation cohort included AR SLE as well as ANR SLE patients. Importantly, both the ANR and AR groups had comparable SLEDAI scores. The demographic attributes, clinical features and treatment history of these subjects are presented in online supplementary table S2. The mean and median values of each urine biomarker candidate in the four subject groups, and the FC and statistical comparisons between groups are summarised in online supplementary table S6. The ELISA results showed that urine Angptl4, FOLR2, GPC-5, L-selectin, PDGF-Rβ, PRX2 and TSP-1 were all significantly upregulated in AR patients compared with the HCs. Furthermore, urine Angptl4, FOLR2, L-selectin, TGFβ1, TPP1 and TSP-1 were all significantly elevated in AR patients compared with ANR SLE, indicating that each of these urinary proteins was indicative of renal lupus in patients with active SLE. Urine concentrations of these proteins without urine creatinine normalisation are also displayed in online supplementary figure S1, showing similar performance in discriminating between the groups. We also examined whether these urine proteins could discriminate inactive patients (or ANR patients) with a previous history of renal involvement from those who never had any renal involvement. However, there was no difference between the groups when dichotomised based on previous renal involvement (data not shown). Hence, these urine proteins are likely to be reflective of active, not previous, renal disease in lupus patients.

Figure 3

ELISA validation of array-based screening studies in an independent cross-sectional cohort of SLE patients. (A) Cohort of 78 urine samples: 16 healthy controls (black), 17 inactive SLE (blue), 16 active non-renal (ANR, green) and 29 active renal (AR, red) were tested by ELISA for the levels of Angptl4, FOLR2, GPC-5, L-selectin, PDGF-Rβ, PRX2, TGFβ1, TPP1 and TSP-1. Patients in the AR and ANR groups had comparable SLEDAI scores. All data are creatinine normalised. The asterisks designate the level of significance: *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001, using a Mann-Whitney U test. HC, healthy control; SLEDAI, Systemic Lupus Erythematosus Disease Activity Index; TGFβ1, transforming growth factor-β1; TSP-1, thrombospondin-1.

The performance of these urinary proteins in differentiating between ANR and AR SLE or between AR and inactive SLE is further highlighted in figure 4, using ROC analysis. Urine Angptl4 and L-selectin both exhibited AUC values of ≥0.86. The AUCs for TPP1, TGFβ1 and TSP-1 were 0.84, 0.78 and 0.73, respectively (online supplementary table S6). Each of these urine proteins outperformed traditional clinical parameters such as anti-dsDNA (AUC=0.58), C3 (AUC=0.48) and C4 (AUC=0.34) in distinguishing AR from ANR SLE, as displayed in figure 4A. In terms of discriminating AR SLE from inactive SLE patients, L-selectin was comparable to anti-dsDNA, but superior to urine Angptl4 and TPP1 (online supplementary table S7). Next, multiple proteins were combined into panels, using LASSO regression analysis, to evaluate whether any particular combination of urine proteins further improved diagnostic potential. As displayed in figure 4B, the combination of urine L-selectin, Angptl4 and TPP1 exhibited the best diagnostic potential, with an AUC of 0.97, in distinguishing AR from ANR SLE.

Figure 4

Urine L-selectin, Angptl4, TGFβ1, PDGF-Rβ and TPP1 are best at discriminating active renal (AR) SLE from active non-renal (ANR) SLE. Receiver operating characteristic curves (ROC) for distinguishing ANR SLE from AR SLE (A) and inactive SLE from AR SLE (B) using urine Angptl4, L-selectin or TPP1, all determined using ELISA (as in figure 3), and the corresponding ROC curves for anti-dsDNA, C3 and C4. The five urine proteins that were most discriminatory of AR SLE from ANR SLE were urine Angptl4, L-selectin, TPP1, TGFβ1 and TSP-1; the last two are not plotted; see online supplementary table S6. The discriminatory abilities of proteins distinguishing inactive SLE from AR SLE are demonstrated in online supplementary table S7. (C) When multiple proteins were combined into panels, to evaluate if any particular combination of urine proteins exhibited further improvement in diagnostic potential, using LASSO regression analysis, the indicated combination of urine proteins exhibited the best diagnostic potential, with an AUC of 0.97, in distinguishing AR from ANR SLE. AUC, area under the curve; SLE, systemic lupus erythematosus; TGFβ1, transforming growth factor-β1; TSP-1, thrombospondin-1.

Multivariate regression analysis was next performed to assess how well the assayed urine proteins predicted various clinical metrics. As depicted in figure 5A, most of the assayed urine proteins displayed significant association with rSLEDAI, proteinuria and/or PGA, based on linear regression analysis. Since the AR group was younger and included more African American patients (100% vs 81%) than the ANR group (online supplementary table S2), we next adjusted this analysis by age and ethnicity. Indeed, most of these associations maintained statistical significance, even after adjusting for demographic variables, as indicated by the bolded entries in figure 5A. Likewise, these same proteins also exhibited significant potential to discriminate AR from ANR SLE, based on logistic regression analysis (figure 5A). Of note, urine Angptl4, L-selectin, TGFβ1 and TPP1 maintained this potential even after correcting for demographic variables.

Figure 5

Graphical representation of correlations between clinically measured parameters and the nine urine biomarkers, as assayed by ELISA. (A) Multivariate regression analysis was performed to assess how well each assayed urine protein predicted SLEDAI, rSLEDAI, proteinuria and/or PGA, based on linear regression analysis, and active renal SLE versus active non-renal (ANR) SLE disease status, by logistic regression analysis. Indicated in each cell is the regression coefficient, with an indication of statistical significance (*p<0.05; **p<0.01; ***p<0.001; ****p<0.0001). If an association maintained statistical significance after adjusting for demographic variables, this is indicated using bolded entries. (B) Bayesian network analysis of urine biomarkers in LN. Directed acyclic graph depicting correlation between variables, created by using maximum spanning tree algorithm. Size of nodes depicts node force, which is an estimate of the impact of that variable on all other variables in this network. Numbers indicate correlation between neighbouring nodes. Colours of nodes indicate type of variables: green is a clinical index, purple is a biomarker molecule, brown represents disease status, white is other. (C) Correlation plot of clinical/laboratory parameters with biomarkers (proteins). Each square represents a correlation. A darker background indicates a lower p value, as determined by Pearson correlation. The size of the dot in each square represents the magnitude of the correlation, with a bigger dot representing higher correlation. Blue dots indicate negative/inverse correlation. Orange dots indicate positive/direct correlation. Plot was drawn in R using the ggplot and ggraster functions.
HCT, hematocrit; HGB, hemoglobin; LN, lupus nephritis; PDGF, platelet-derived growth factor receptor; PGA, physician global assessment; RBC, red blood cell; rSLEDAI, renal domains of Systemic Lupus Erythematosus Disease Activity Index; TGF-β1, transforming growth factor-β1; TSP-1, thrombospondin-1; WBC, white blood cell.

Next, we analysed how the clinical indices and assayed urine biomarkers were related to each other, using unsupervised Bayesian network analysis, which can uncover the interdependencies of all variables in a model, through probability distributions. These interdependencies are presented as a directed acyclic graph (figure 5B). The observed positive correlation of proteinuria with ‘renal involvement’ and negative correlation of C3 and C4 with SLEDAI offer independent validation of this networking approach. These Bayesian networks demonstrate that renal involvement was, predictably, most strongly correlated to rSLEDAI and PGA. None of the assayed urine proteins were associated with demographic variables or haematological parameters. Interestingly, the standard clinical parameters of anti-dsDNA, C3 and C4 were not as correlated to renal involvement as the assayed candidate biomarkers, TPP1, Angptl4, L-selectin, PRX2 and FOLR2. Of the assayed biomarkers, urinary Angptl4 exhibited the strongest ‘node force’ or impact on this network, being directly correlated to rSLEDAI, while urine TPP1 had the strongest impact on PGA (figure 5B).
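To make the maximum spanning tree step in figure 5B concrete, the following is a minimal sketch, assuming the tree is built over pairwise correlations ranked by absolute strength. The variable names and correlation values are hypothetical and are not data from this study; edge direction and the 'node force' estimate are not modelled here, and this is not the software used to generate the figure.

```python
# Illustrative sketch only: node names and correlation values are
# hypothetical, not data from the study.

def maximum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm, taking the strongest links first.

    weighted_edges: iterable of (weight, node_a, node_b) tuples.
    Returns the list of edges kept in the tree.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the set representative, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    # Rank candidate links by absolute correlation, strongest first.
    for weight, a, b in sorted(weighted_edges, key=lambda e: abs(e[0]), reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # this link joins two separate groups, so no cycle
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree


nodes = ["rSLEDAI", "proteinuria", "marker_X", "C3"]
edges = [
    (0.80, "rSLEDAI", "proteinuria"),
    (0.70, "rSLEDAI", "marker_X"),
    (0.60, "proteinuria", "marker_X"),
    (-0.40, "C3", "rSLEDAI"),
    (-0.20, "C3", "marker_X"),
]
tree = maximum_spanning_tree(nodes, edges)
```

In this toy example the weaker proteinuria to marker_X link is dropped because both variables are already connected through rSLEDAI, which is why such a graph shows only the strongest non-redundant relationships.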

To corroborate the Bayesian network analysis results, a correlation plot between each of the nine potential biomarkers and the clinical parameters was created, as displayed in figure 5C. The levels of urine Angptl4, FOLR2, L-selectin, TGFβ1, TPP1 and TSP-1 were strongly, significantly and directly correlated with rSLEDAI. This observation holds true for these same proteins in relation to proteinuria. Conversely, most (seven of nine) of the investigated biomarkers (Angptl4, FOLR2, Glypican5, L-selectin, TGFβ1, TPP1 and TSP-1) were inversely correlated with C3 and C4 levels. Combined, these analyses demonstrate that the candidate urinary biomarkers identified are strong indicators for predicting renal involvement in lupus patients.
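The direct and inverse correlations summarised in figure 5C rest on the Pearson correlation coefficient. A minimal sketch of that calculation is given below; the input values are hypothetical and chosen only to show one perfectly direct and one perfectly inverse relationship, and the p value calculation is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, not study data.
marker = [1.0, 2.0, 3.0, 4.0]
rsledai = [2.0, 4.0, 6.0, 8.0]  # rises with the marker (direct correlation)
c3 = [8.0, 6.0, 4.0, 2.0]       # falls as the marker rises (inverse correlation)

r_direct = pearson_r(marker, rsledai)
r_inverse = pearson_r(marker, c3)
```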

Further validation in LN patients with concurrent renal biopsy

To address whether these biomarker candidates reflect the pathological class of renal biopsy, we further assayed urine Angptl4, L-selectin, TGFβ1 and TPP1 in an independent cohort of 20 LN patients with concurrent urine samples obtained at the time of renal biopsy (online supplementary table S8). As noted earlier, all four biomarkers were elevated in LN patients when compared with healthy controls. When LN patients were further categorised by LN class, urine L-selectin and TPP1 were elevated in each LN class compared with healthy controls (online supplementary figure S2). Urine Angptl4 was elevated in class I/II, class III(±V), class IV(±V) and class V, while TGFβ1 was elevated in class III(±V), class IV(±V) and class V, compared with healthy controls. There was a trend for these proteins to be higher in LN IV(±V) compared with the other LN classes, but no statistical significance was observed (online supplementary figure S2).

Performance of urinary biomarker candidates in tracking disease activity in LN patients

To interrogate the performance of these biomarker candidates during serial follow-up, urine Angptl4, L-selectin, TGFβ1 and TPP1 were measured in 7 LN patients with multiple hospital visits. The serial tracking plots in figure 6 demonstrate the fluctuations of these proteins along with SLEDAI, rSLEDAI and uPCR over time. In some visits, these proteins elevated simultaneously (marked with ‘@’) with the increase of disease activity as reflected by the SLEDAI or rSLEDAI indices or uPCR, and in other visits, these proteins preceded the clinical flares (marked using ‘p’), as plotted in figure 6. Of the four urine proteins tested, urine L-selectin preceded or coincided with worsening of SLEDAI or rSLEDAI in all seven patients, as marked by the ‘@’ and ‘p’ symbols in figure 6 (row 2). Urine Angptl4 also performed well, preceding or coinciding with worsening of SLEDAI or rSLEDAI in six of seven patients. Urine TGFβ1 ranked third, as it preceded or coincided with worsening of SLEDAI or rSLEDAI in five of seven patients (figure 6). In all cases, the fluctuations in urine biomarkers were substantially more pronounced than the subtle changes in proteinuria. In some instances, the urine biomarker showed evidence of a rise even before proteinuria rose, and tracked better with disease activity, compared with proteinuria. For example, in patients 2, 4 and 5, urine Angptl4 kept rising with worsening disease activity, compared with the lacklustre performance of proteinuria (row 1 in figure 6).

Figure 6

Performance of urine Angptl4, L-selectin, TGFβ1 and TPP1 in tracking disease activity during serial follow-up of LN patients. The visit month is shown on the X-axis, while the disease activity index and biomarker levels are indicated on the left and right vertical axes, respectively. The serial tracking plots demonstrated the fluctuations of urine Angptl4, L-selectin, TGFβ1 and TPP1 along with SLEDAI, rSLEDAI and uPCR over time. In some visits, these proteins elevated simultaneously (marked with ‘@’) with the increase of disease activity as reflected by the SLEDAI or rSLEDAI indices or uPCR and in other visits, these proteins preceded the clinical flares (marked using ‘p’). Statistical analyses of these data highlighted urine L-selectin, Angptl4, and TGFβ1 as proteins that best tracked disease activity over time, as detailed under results. LN, lupus nephritis; rSLEDAI, renal domains of Systemic Lupus Erythematosus Disease Activity Index; TGFβ1, transforming growth factor-β1; uPCR, urine protein/creatinine ratio.

Multilevel linear models with L-selectin were superior in predicting concurrent SLEDAI (AIC=261.5) and rSLEDAI (AIC=241.1), compared with the remaining biomarkers, all of which exhibited higher AIC values, suggesting weaker performance. By LASSO analysis, urine L-selectin and TPP1 were independent predictors of SLEDAI, whereas urine L-selectin, Angptl4 and TGFβ1 were independent predictors of rSLEDAI. Urine L-selectin (AIC=116.2) was best at predicting an oncoming rise in rSLEDAI (within the following 1–2 months), with Angptl4 (AIC=116.8) and TGFβ1 (AIC=116.8) being almost as predictive. Urine Angptl4 (AIC=128.6) was best at predicting oncoming increases in SLEDAI (within the following 1–2 months), with L-selectin (AIC=129.1) and TGFβ1 (AIC=130.2) being almost as predictive (data not plotted). Hence, consistent with the visual inspection of figure 6, statistical analysis of the serial data affirms that urine L-selectin is the best predictor of disease activity, followed by Angptl4 and TGFβ1, when LN disease activity is tracked serially.
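As a brief illustration of how the AIC values above are compared, the sketch below ranks the three AIC values reported for predicting an oncoming rise in rSLEDAI and computes each model's difference from the best one. The multilevel model fits themselves are not reproduced; the `aic` helper simply states the standard definition, and the log-likelihood and parameter count passed to it are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fit: log-likelihood of -50 with 3 estimated parameters.
example_aic = aic(log_likelihood=-50.0, n_params=3)

# AIC values reported in the text for predicting an oncoming rise in rSLEDAI.
reported_aic = {"L-selectin": 116.2, "Angptl4": 116.8, "TGFβ1": 116.8}

best = min(reported_aic.values())
# Delta-AIC: how far each model sits from the best-scoring model.
delta = {name: round(value - best, 1) for name, value in reported_aic.items()}
# Models ordered from lowest (best) to highest AIC.
ranking = sorted(reported_aic, key=reported_aic.get)
```

The differences of 0.6 between L-selectin and the other two markers are small, which is consistent with the description of Angptl4 and TGFβ1 as almost as predictive.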

Assessing the specificity of urinary biomarker candidates using CKD controls

To examine the specificity of these urinary proteins for LN, urine Angptl4, L-selectin, TGFβ1 and TPP1 were assayed in 47 CKD patients, including 14 patients with diabetic nephropathy, 11 patients with hypertensive nephropathy, 9 patients with focal segmental glomerulosclerosis (FSGS) and 13 with other causes of CKD (online supplementary table S9). Urine Angptl4, L-selectin and TPP1 were significantly elevated in the CKD controls when compared with healthy controls, while urine TGFβ1 was increased only in FSGS (figure 7). These findings indicate that these four proteins are not specific for LN. Interestingly, urine Angptl4 correlated with CKD stage (correlation coefficient 0.56, p<0.0001) and was significantly higher in patients with CKD stages 4 and 5, compared with patients with CKD stages 1–3 (figure 7B; p<0.001). The other urine markers did not show any association with the CKD stage.

Figure 7

Urine Angptl4, L-selectin, TGFβ1 and TPP1 in other CKD controls. Urine Angptl4, L-selectin, TGFβ1 and TPP1 were assayed in 47 CKD patients, including 14 patients with diabetic nephropathy (DN), 11 patients with hypertensive nephropathy (HTN), 9 patients with focal segmental glomerulosclerosis (FSGS), and 13 with other causes of CKD. Shown in (B) are urine Angptl4 levels in the same CKD controls, parsed by CKD stage (stages 1 to 5). All data are creatinine normalised. The asterisks designate the level of significance: *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001, using a Mann-Whitney U test. CKD, chronic kidney disease; TGFβ1, transforming growth factor-β1.

Expression of urinary biomarker candidates within the kidneys in LN

To further explore the potential origins of candidate biomarkers at a cellular level within LN kidneys, renal scRNA-seq data obtained from 21 LN biopsies (figure 8A)17 and scRNA-seq data from renal-infiltrating immune cells from 24 independent LN biopsies (figure 8B)18 were queried. As is apparent from the violin plots in figure 8, all nine of the proteins were expressed within LN kidneys, either within resident renal cells or infiltrating leukocytes. Some proteins such as TSP-1 (corresponding gene name is THBS1) and TPP1 were found to be expressed in multiple resident renal cell types, and isolated infiltrating immune cell types, notably B-cells or plasma cells. Other proteins exhibited more selective expression profiles. PDGF-Rβ was strongly expressed on mesangial cells, Angptl4 showed the strongest expression within infiltrating plasma cells, while FOLR2 was expressed predominantly within macrophages, most notably M2 macrophages. L-selectin exhibited the strongest expression on infiltrating B-cells (notably on activated and ISG-high B-cells), while TGFβ1 was expressed at high levels on all infiltrating T-cells. The strong expression profiles of these proteins within LN kidneys strongly supported the hypothesis that the proteins elevated in LN urine were likely to be renal in origin.

Figure 8

Violin plots of scRNA-seq data for the nine candidate proteins in lupus nephritis renal tissue drawn from 45 LN biopsy tissues. Renal scRNA-seq data obtained from 21 LN biopsies (A)17 and scRNA-seq data from renal-infiltrating immune cells from 24 independent LN biopsies (B)18 were analysed for the expression profile of each of the nine biomarker proteins. Cells were divided into clusters based on the expression of canonical genes. Dots represent individual cells with their respective log expression level for each candidate biomarker. The profile for every cell type is shown for each biomarker. Each colour represents a particular cell type, for example, B-cells are purple, and mesangial cells are orange, in (A). For the renal-infiltrating cells in (B), the cluster annotations are as follows18: CM0: CD16+ macrophage, inflammatory; CM1: CD16+ macrophage, phagocytic; CM2: tissue-resident macrophage; CM3: cDCs; CM4: CD16+ macrophage, M2-like; CT0a: effector memory CD4+ T cells; CT0b: central memory CD4+ T cells; CT1: CD56_dim CD16+ NK cells; CT2: CTLs; CT3a: Tregs; CT3b: TFH-like cells; CT4: GZMK+CD8+ T cells; CT5a: resident memory CD8+ T cells; CT5b: CD56_bright CD16- NK cells; CT6: ISG-high CD4+ T cells; CB0: activated B cells; CB1: plasma cells/plasmablasts; CB2a: naive B cells; CB2b: pDCs; CB3: ISG-high B cells; CD0: dividing cells; CE0: epithelial cells. LN, lupus nephritis; scRNA-seq, single-cell RNA sequencing.


The ability to screen a large library of proteins simultaneously has changed the landscape of biomarker discovery research. Mass spectrometry has been used in recent SLE studies; however, these data can often be noisy and may fail to detect low abundance proteins due to overshadowing signals from high-abundance proteins (albumin, urea, immunoglobulins, etc). In contrast, targeted proteomic assays, such as the protein array used in this study, rely on ligand-protein interactions resulting in the potential to accurately detect and quantify low abundance proteins. The array used in this study is one of the largest proteomic screening platforms available, providing quantitative information on 1000 different proteins, due to the inclusion of 1000 standards. Our screen of LN urine using this platform rediscovered several proteins already known to be upregulated in SLE or LN, including Adiponectin,19 Angiostatin,20 CD36, Ferritin,21 Galectin 7, ICAM-1,22 IGFBP family proteins,23 MIF,24 Resistin, S100A8,25 Siglec-511 and VCAM-1(23), thus offering independent validation of this platform. These reconfirmations are further discussed in online supplementary table S3. In the present study, we did not pursue these particular markers, as most of them have already been validated in the literature. Instead, we focused on novel proteins not previously implicated in LN.

Of the nine ELISA-validated proteins, urine Angptl4, L-selectin, TPP1, TGFβ1, TSP-1, FOLR2 and PDGF-Rβ exhibited the best ROC AUC values of 0.96, 0.86, 0.84, 0.78, 0.73, 0.72 and 0.67 respectively, in terms of distinguishing lupus patients with AR involvement from those with ANR disease, despite both groups having comparable SLEDAI scores. Angptl4, which is produced under hypoxic conditions, is correlated with the expression of hypoxia-inducible factor (HIF)−1a. Studies have recently shown that HIF-1a is involved in the generation of T helper, T regulatory and dendritic cells, all of which are well documented to be important in autoimmunity.26 Expression of HIF-1a in the glomerulus has been linked to SLE, with correlation to SLEDAI and renal pathology activity index.27 Angptl4 has been shown to regulate acute inflammation in several organs through both TTP-dependent and independent signalling pathways. As a candidate biomarker, urinary Angptl4 easily differentiated between ANR and AR states (p<0.001) with a specificity of 87.5%, a PPV of 93.5%, and an outstanding NPV of 100%. Similar performance was also observed in discriminating AR from inactive SLE patients. Its expression was strongly correlated with rSLEDAI. It was found mostly in renal infiltrating plasma cells, where its functional role remains a black box. These data strongly suggest urine Angptl4 is an excellent biomarker for identifying AR SLE. Moreover, it was the only marker that correlated with CKD stage, independent of the primary renal disease, suggesting that it may be a universal marker of deteriorating renal function. In diabetic nephropathy, circulating Angptl4 was significantly increased compared with controls, and was correlated with albumin-to-creatinine ratio, serum creatinine and estimated glomerular filtration rate (eGFR).28 In FSGS and membranous nephropathy (MN), urine Angptl4 was increased in patients with massive proteinuria, and was associated with relapse.29
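For readers less familiar with the performance metrics quoted for urine Angptl4 (specificity, PPV and NPV), the sketch below shows how they follow from a 2×2 classification table at a chosen cutoff. The counts are hypothetical, chosen only for illustration, and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp/fn: active renal patients above/below the biomarker cutoff.
    fp/tn: comparator patients above/below the biomarker cutoff.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, not the study's data.
metrics = diagnostic_metrics(tp=18, fp=2, tn=14, fn=0)
```

Note that an NPV of 100% arises only when no active renal patient falls below the cutoff (zero false negatives), as in this toy table.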

L-selectin (CD62L), a leucocyte adhesion molecule, has a well-established association with acute inflammation,30 autoimmune diseases,31 specifically SLE,32 as well as other renal diseases.31 32 In this study, urinary L-selectin distinguished between AR versus HC or ANR patients with strong PPV (92.5%) and acceptable NPV (70.1%). L-selectin levels exhibited significant linear regression with rSLEDAI, PGA and proteinuria, even after correcting for demographic variables. Interestingly, scRNA-seq analysis revealed that L-selectin was most highly expressed on renal infiltrating B cells in LN, particularly on activated B-cells and interferon-I signature expressing B-cells, indicating that L-selectin may be mediating the migration of these lymphocyte subsets into or within the inflamed kidneys.33

TPP1 (tripeptidyl-peptidase 1) plays a pivotal role in mediating telomere capping and length control,34 and is found abundantly in the bone marrow, placenta, lungs, and lymphocytes. Its expression has been shown to be lower in PBMCs of patients with SLE compared with healthy counterparts.35 However, to date, no study has reported urinary TPP1 levels in patients with kidney disease or autoimmunity. It was strongly associated with rSLEDAI and PGA. It also exhibited excellent specificity (100%) and acceptable PPV (64.2%) in distinguishing AR from ANR SLE patients. Within LN kidneys, TPP1 was strongly and broadly expressed in multiple resident cell types, as well as in renal infiltrating B-cells and plasma cells. Importantly, the combination of urine Angptl4 and TPP1 exhibited the best diagnostic potential, with an AUC of 0.98, in distinguishing AR from ANR SLE. Clearly, this panel of markers needs further validation in additional patient cohorts to substantiate its diagnostic performance.

TGFβ1, a TGF and a potent regulatory cytokine,36 has been reported as a biomarker in renal SLE studies previously,37 38 with significantly lower levels of TGFβ1 documented in the serum of SLE patients.39 It has been shown to be upregulated in a discoid lupus erythematosus microarray study,40 but unaltered in rheumatoid arthritis.41 Genetic variations of TGFβ1 have been associated with other diseases, including coeliac disease42 and multiple sclerosis.43 In our study, urinary TGFβ1 was highly expressed in AR LN samples and strongly associated with rSLEDAI and PGA, even after correction for demographics. TGFβ1 was broadly expressed within LN kidneys, with the strongest expression being noted in infiltrating T-cells, based on scRNA-seq data. It is perhaps not surprising that both TSP-1 (gene name THBS1) and TGFβ1 demonstrated similar linear association with rSLEDAI and PGA, and similar expression profiles within LN kidneys, since TSP-1 is involved in the regulation of TGFβ1 signalling44 and has been demonstrated to be a significant activator of TGFβ1 in fibrotic renal disease.45 TGFβ1 is not specific for LN. In patients with diabetic nephropathy, urinary TGFβ1 was significantly higher compared with controls, and modestly correlated with urinary protein.46 47 In patients with essential hypertension, serum TGFβ1 was highly elevated and strongly associated with urinary albumin.48

Urine FOLR2 was another protein that exhibited the ability to distinguish AR from ANR SLE, with significant linear regression with SLEDAI, rSLEDAI and PGA, even after correction for demographics. The folate receptor is known to be expressed by M2 macrophages,49 50 which suppress inflammation,51 and this was also borne out by the scRNA-seq analysis of LN kidneys. It is conceivable that the expanded M2 macrophages in LN kidneys may play a role in reining in inflammation; on the other hand, at later stages, they may be contributing to the fibrosis in this disease. While FOLR2 has been reported as a diagnostic target in nonalcoholic steatohepatitis,52 it has not been examined in SLE patients. While urine FOLR2 may be a convenient marker for intrarenal M2 macrophages, the precise role of these innate immune cells in LN warrants further analysis.

Urine PDGF-Rβ also showed excellent ability to distinguish AR from ANR SLE patients. PDGF/PDGF-receptors are involved in the regulation of cell migration and proliferation. Importantly, PDGF-Rβ has been identified as a candidate gene for SLE through machine learning approaches,53 and has been implicated as an autoantigen target in the serum of patients with SLE.54 The PDGF family has also been shown to be involved in renal fibrosis,55 which concurs with the elevated expression seen in AR LN patients in our study. Indeed, the inhibitor of PDGF-Rβ, Gleevec, used for the treatment of leukaemia, has been shown to be effective as an anti-fibrotic agent.56 PDGF-Rβ has also been reported to be upregulated on CD4+ T cells of SLE patients.57 In our data, urine PDGF-Rβ was upregulated in active LN patients, with most expression seen within mesangial cells and vascular smooth muscle cells. We hypothesise that increased PDGF-mediated signalling in mesangial cells may drive proliferative LN. These findings pinpoint urine PDGF-Rβ as an excellent biomarker of active LN, and a potential predictor of LN patients who might respond to Gleevec-based therapies.

The top four proteins, Angptl4, L-selectin, TGFβ1 and TPP1, were further validated in an LN cohort where seven patients were examined serially over multiple visits. Urine L-selectin and Angptl4 preceded or coincided with worsening of SLEDAI or rSLEDAI in almost all of the LN patients tracked, and exhibited a more dramatic and often more reliable indication of clinical disease activity, compared with proteinuria. Clearly, these markers warrant independent validation in additional serial cohorts where the LN patient visits are more closely timed. The current findings raise hope that urine L-selectin and Angptl4 could potentially be used for clinical monitoring of disease status in patients with LN.

Currently, there is no consensus as to what ‘AR lupus’ is. In this study, active LN included biopsy-proven LN with rSLEDAI ≥4, and excluded patients with isolated hematuria or pyuria. In other words, all patients in the ‘active LN’ group had proteinuria. This is consistent with the observation that proteinuria is the best individual predictor of long-term renal outcome in LN, while inclusion of urinary red blood cells undermines the predictive value.58 However, this leaves us with the following question: could the emergence of these markers simply be due to proteinuria? We do not believe this to be the case based on several observations. First, most of the elevated urinary biomarkers are not elevated in circulation (data not shown). Second, when we plotted the FC of urine biomarker (active LN vs controls) versus the molecular weights of the 1000 proteins screened, dozens of other proteins that shared similar molecular weights as the biomarker candidates were not elevated in the urine. Third, in the small number of patients with multiple visits that we studied (figure 6), these proteins rose simultaneously with or preceded the increase in disease activity, more prominently and sometimes earlier than proteinuria. Clearly, more closely timed serial urine collections are needed to definitively establish if the implicated biomarker proteins can be detected in the urine before proteinuria sets in.

This study could be improved in several respects. Inclusion of additional ethnic groups, and a larger sample size would provide added power to validate the urinary molecules reported here. Concurrent renal biopsy samples (with renal pathology data) should be expanded to better assess the relationship between the identified urinary molecules and concurrent renal pathological attributes. With these biopsy concurrent samples, it would be ideal to perform the urine biomarker assays and renal scRNA-seq studies using the same set of patients. In addition, as discussed above, an expanded longitudinal study is warranted to investigate how these urinary molecules relate to renal pathology, disease progression, treatment response over time and long-term renal and patient outcome. Mechanistic studies are also needed to confirm the cellular origins of the identified biomarkers and to dissect out their respective roles in disease pathogenesis.



  • MP and CM are joint senior authors.

  • Handling editor Josef S Smolen

  • Contributors KV, SS, BS, TZ and MN undertook experiments needed for this study, and also read and approved the final manuscript. ClP, PC, CG, SD, SM, NT and VTTT undertook data analysis, and also read and approved the final manuscript. NJ, RS, ChP and MP contributed samples, discussed clinical data interpretation, and also read and approved the manuscript. CM conceived the study, helped design the experiments, interpret data, and read and approved the manuscript. In addition, KV, BS, TZ, SD and CM also wrote the manuscript.

  • Funding This work is supported by NIH funding AR074096, AR69572, and the George M. O’Brien Kidney Research Core Center P30DK079328. The Hopkins Lupus Cohort is supported by AR69572.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Patient consent for publication Not required.

  • Ethics approval The study was approved by the institutional review boards of Johns Hopkins University School of Medicine, Albert Einstein College of Medicine, UTSW and the University of Houston.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement All data relevant to the study are included in the article or uploaded as online supplementary information.