Technical validation of cDNA based microarray as screening technique to identify candidate genes in synovial tissue biopsy specimens from patients with spondyloarthropathy
M Rihl1,3, D Baeten2, N Seta3, J Gu3, F De Keyser2, E M Veys2, J G Kuipers1, H Zeidler1, D T Y Yu3

1 Hannover Medical School, Department of Rheumatology, Hannover, Germany
2 Ghent University Hospital, Department of Rheumatology, Ghent, Belgium
3 University of California Los Angeles, UCLA School of Medicine, Division of Rheumatology, Los Angeles, CA, USA

Correspondence to:
    Dr M Rihl
    Hannover Medical School (MHH), Department of Rheumatology (OE 6850), Carl-Neuberg-Str. 1, 30625 Hannover, Germany; Rihl.Markus@MH-Hannover.de

Abstract

Objectives: To validate the use of cDNA based microarray on synovial biopsies by analysing the experimental variability due to amplification of RNA, reproducibility of the assay, heterogeneity of the tissue, and statistical analysis.

Methods: Total RNA was extracted from three spondyloarthropathy (SpA) and three osteoarthritis (OA) synovial tissue biopsy specimens and from the peripheral blood mononuclear cells (PBMC) of four healthy donors. Exponential RNA amplification by SMART-PCR was compared with linear amplification. Reproducibility was tested by comparing different microarray systems and by performing duplicate experiments. Sample heterogeneity was assessed by comparing overall gene expression profiles, histopathology, and analysis of genes expressed in the synovium and normal PBMC. Statistical analysis using t test and Bonferroni adjustment was verified by permutation of class labels.

Results: Gene expression was concordant in 12/14 (86%) cytokine/chemokine genes between both microarrays and different RNA amplification systems. When one microarray system was used, expressed genes were 78–95% concordant in duplicate experiments. Gene expression profiles had a higher degree of similarity between SpA synovium than between PBMC or OA synovium despite clear histopathological differences between synovial samples. Comparison of SpA synovium with OA synovium and with PBMC yielded 11 and 18 expressed transcripts, respectively; six were shared in both comparisons. Permutations of SpA and OA samples yielded only one expressed gene in 19 comparisons.

Conclusions: These data provide evidence that microarrays can be used for analysis of synovial tissue biopsies with high reproducibility and low variability of the generated gene expression profiles.

  • microarray
  • SMART-PCR
  • spondyloarthropathies
  • synovial tissue
  • peripheral blood mononuclear cells
  • DMARDs, disease modifying antirheumatic drugs
  • HC, healthy controls
  • LPS, lipopolysaccharide
  • OA, osteoarthritis
  • PBMC, peripheral blood mononuclear cells
  • RT-PCR, reverse transcriptase-polymerase chain reaction
  • SMART, Switch Mechanism At the 5′ end of the RNA Template
  • SpA, spondyloarthropathy

Microarrays are increasingly used for random screening of mRNA transcripts from cells and tissues. The technology is based on the principle that the biology of an organism corresponds with its gene expression profile disclosing preferentially expressed genes by comparative analysis. It allows profiling and simultaneous monitoring of a vast number of genes in one single hybridisation experiment. An increasing body of evidence suggests that microarrays are a valuable tool in the study of complex diseases, even though one needs to keep in mind that the technique is merely a screening approach contributing to hypothesis driven research. One basic potential lies in the detection of differentially expressed genes that would not have been expected by the current knowledge base, eventually leading to generation of new hypotheses about mechanisms in physiology and pathology. Recently, it has been shown that molecular profiling of various cancer tissues by microarray provides better recognition of new disease entities and prediction of prognosis or response to treatment than standard clinical or histological methods.1–3

We have previously used microarray technology with high technical reproducibility to screen for potentially disease mediating mRNA transcripts in peripheral blood and synovial fluid mononuclear cells from patients with active spondyloarthropathy (SpA).4,5 Because the synovium is probably the primary site of inflammation within the arthritic joint,6 it naturally is the relevant target structure for further study of specific disease mechanisms by microarray. However, to date there are no published reports of gene expression profiles from SpA synovial tissue biopsy specimens and very few reports are available about rheumatoid arthritis synovium.7,8 This is probably because the use of microarray on synovial tissue biopsy specimens raises a number of important questions about the technical aspects and the resulting scientific validity, even before starting to confirm microarray results by independent methods such as reverse transcriptase-polymerase chain reaction (RT-PCR) or immunohistochemistry. As clearly pointed out in a recent editorial,9 these aspects include reproducibility of methods, sample heterogeneity, statistical analysis with thresholds of arbitrary nature, and the apprehension that comparison of different patients and heterogeneous tissue composition (resident versus migrated inflammatory cells) reflecting the individual disease process might account for a great deal of experimental variability. Moreover, a technical aspect of specific interest for small synovial tissue biopsy specimens as obtained by needle arthroscopy is the quality and quantity of RNA/cDNA and the requirement for its amplification, possibly causing a bias in amplification of high or low abundance mRNA molecules, leading to a major distortion of gene expression profiles.

This study aimed at exploring and validating the technical aspects of microarray for analysis of synovial tissue biopsies, using SpA as a prototype of chronic inflammatory arthritis. Firstly, we analysed the reproducibility of the technique by studying exponential RNA amplification and the variability between different arrays on peripheral blood mononuclear cells (PBMC), as well as the run-to-run variability on synovial tissue biopsies. Secondly, we assessed the effect of sample heterogeneity on microarray results by analysing the gene expression profiles of SpA synovium, osteoarthritis (OA) synovium, and PBMC, and by comparing the results with the histopathology of the SpA and OA samples. Finally, we analysed the validity of the classical statistical approach (analysis of variance, Student’s t test with Bonferroni adjustment) and the appropriateness of the chosen thresholds by permutation testing of the data from the SpA and OA synovial tissue samples. This approach was undertaken to demonstrate that the particular analysis can yield a low number of false positive results and that an increase in the number of false positive genes inevitably occurs when less stringent thresholds are used.

METHODS

PBMC and synovial biopsy samples

PBMC from four healthy donors (HC) were separated by a standard Ficoll-Histopaque procedure. PBMC from one healthy male donor were resuspended in serum-free RPMI medium (Gibco Invitrogen, Carlsbad, CA) and incubated in a tissue culture plate for 20 minutes at room temperature. The adherent cell fraction contained more than 80% of cells with macrophage-like appearance, as determined by inverted microscopy and described earlier.5 These cells were then incubated with 10 ng/ml of Salmonella-lipopolysaccharide (LPS; Sigma, St Louis, MO) for 4 hours. This stimulus ensured enhanced expression of a considerable number of genes encoding proinflammatory molecules to be analysed.10 The model of LPS stimulated monocytes uses the subtractive potential of the microarray analysis, which enables determination of comparability of amplified versus unamplified cDNA and reproducibility of two different microarray assays.

Synovial tissue biopsy samples were obtained by needle arthroscopy in three patients with SpA and three with OA as previously described11; six small biopsy pieces of the synovial membrane were collected from each patient. All patients with SpA fulfilled the European Spondylarthropathy Study Group (ESSG) classification criteria.12 These patients were chosen for analysis because they had active and early disease, with an average time from the beginning of symptoms to the date of biopsy of 5.6 (range 3–8) months. All tested positive for the HLA-B27 allele. Patients had not received treatment with disease modifying antirheumatic drugs (DMARDs) at the time of biopsy. Both the synovial tissue biopsy specimens from the three patients with OA and the resting PBMC from four healthy donors were used as control samples (table 1 gives detailed characteristics of the patients and control subjects). All patients had given their informed consent, and the procedures followed were in accordance with the ethical standards of the University of California, Los Angeles.

Table 1

Characteristics of patients and healthy subjects

Synovial histopathology

Synovial tissue sections were available in two of three patients with SpA and two of three patients with OA. Histological and immunohistochemical analyses were performed as reported in a previous publication.6 Briefly, the histological measures included synovial lining hyperplasia, degree of vascularity, degree of infiltration, presence of lymphoid aggregates, plasma cells, and polymorphonuclear cells. Immunohistochemical peroxidase staining (LSAB+kit, Dako, Glostrup, Denmark) was performed for T cells (anti-CD3, Dako), B cells (anti-CD20, Dako), macrophages (anti-CD68, Dako), plasma cells (anti-CD138, Dako), and endothelial cells (anti-CD146, Dako). The parameters were scored semiquantitatively as described earlier.6

RNA extraction and RNA/cDNA amplification

Before RNA extraction, synovial tissues were ground by a pellet pestle set and homogenised in denaturing medium (solution D; Stratagene, La Jolla, CA). PBMC/monocytes were resuspended in solution D. Total RNA was extracted by four serial applications of phenol:chloroform 5:1 (pH 4.5; Ambion, Austin, TX), followed by two precipitations in isopropanol at −80°C for 1 hour and incubation with RQ1 DNase (Promega, Madison, WI) at 37°C for 1 hour. RNA was extracted again with phenol:chloroform and precipitated in 100% ethanol at −80°C for 30 minutes. Total RNA from monocytes before and after stimulation with LPS was split into aliquots of 150 ng, 300 ng, and 5 µg samples. The 150 ng and 300 ng aliquots were subjected to SMART-PCR (BD Biosciences Clontech, Palo Alto, CA) in a total reaction volume of 50 µl. The technology (Switch Mechanism At the 5′ end of the RNA Template) allows reverse transcription of small amounts of total RNA and subsequent amplification of the entire cDNA.13 For our studies, conditions of the PCR reaction were determined as follows: after reverse transcription using cDNA-specific primer and SMART II-oligonucleotide for annealing, 10 µl of single stranded cDNA dissolved in TE buffer (5 mM Tris, 1 mM EDTA, pH 7.5) was mixed with 32 µl of molecular biology grade water, 5 µl of 10× reaction buffer, 1 µl of 50× dNTP, 1 µl of specific PCR primer, and 1 µl of 50× polymerase mix. After denaturation at 95°C for 2 minutes, one PCR reaction cycle was performed at 95°C for 1 minute, 65°C for 1 minute, and 68°C for 6 minutes. Termination mix (2 µl) was added after completion of the reaction. PCR product (5 µl) was subjected to gel electrophoresis (1.5% agarose in 0.5× TE buffer) followed by ethidium bromide staining. Cycle numbers of the PCR reaction were conservatively optimised in order to restrict the amplification reaction to the exponential phase, strictly avoiding any over- or underamplified templates. 
This was achieved by choosing a cycle number two cycles fewer than the one that showed no more increase in band intensity—that is, the start of the plateau phase.
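
The cycle selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `choose_cycle_number`, the `tolerance` parameter, and the band intensity readings are hypothetical and do not come from the study, which judged band intensity on ethidium bromide stained gels.

```python
def choose_cycle_number(band_intensity, tolerance=0.0, offset=2):
    """Pick the PCR cycle number `offset` cycles before the plateau starts.

    band_intensity maps cycle number -> gel band intensity (arbitrary units).
    The plateau start is taken as the first sampled cycle whose intensity
    exceeds that of the previous sampled cycle by no more than `tolerance`.
    """
    cycles = sorted(band_intensity)
    for previous, current in zip(cycles, cycles[1:]):
        gain = band_intensity[current] - band_intensity[previous]
        if gain <= tolerance:
            # `current` shows no further increase: start of the plateau phase.
            return current - offset
    return None  # no plateau reached within the cycles sampled


# Hypothetical densitometry readings, not data from the study.
readings = {15: 10.0, 17: 25.0, 19: 60.0, 21: 95.0, 23: 96.0, 25: 96.0}
optimal = choose_cycle_number(readings, tolerance=2.0)
print(optimal)
```

With these invented readings the band stops increasing at cycle 23, so the sketch selects cycle 21.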

Microarrays

After amplification with the optimal cycle number, cDNA was purified using QIAquick PCR purification kit (Qiagen, Santa Clarita, CA) and mixed with the rediprime II dry beads random prime labelling system, followed by incubation with 32P at 37°C for 10 minutes (both Amersham Pharmacia Biotech, Piscataway, NJ). Probes were purified from unincorporated nucleotides using NucleoSpin purification kit (BD Biosciences Clontech), heat denatured for 5 minutes at 99°C, and hybridised for 20 hours at 68°C to the cDNA based nylon membrane Atlas Human 1.2 array containing 1185 genes (BD Biosciences Clontech). Microarray experiments of 5 µg total RNA aliquots were performed using the standard linear procedure as described in a previous publication.5 To validate findings generated with the Atlas array, we used another microarray system (cDNA based nylon membrane GeneFilters, GF211, containing 4136 genes, Research Genetics Invitrogen, Carlsbad, CA). Again 150 ng and 300 ng RNA aliquots were applied by using the described SMART-PCR amplification, whereas the 5 µg sample was subjected to the linear conventional procedure according to the instructions of Research Genetics Invitrogen as outlined in the protocol. After exposure to a phosphor screen cassette for 2 days, all membranes were scanned through a STORM 860 scanner (Molecular Dynamics, Sunnyvale, CA). The software packages “AtlasImage 1.5” (BD Biosciences Clontech) and “Pathways 2.01” (Research Genetics) retrieved signal intensities of each gene. The local background intensity measured around each gene was then subtracted from all signal intensities. The local background intensities of all individual genes were subsequently averaged, resulting in the mean background intensity of a particular membrane. 
Because normalisation and all further calculations were performed with data from which the local background had already been subtracted, thresholds for analysis were derived from the mean background of all membranes used in the analysis. Microarray experiments from all synovial tissues and healthy PBMC used 150 ng of total RNA amplified by SMART-PCR as described above. Labelling and hybridisation of probes to Human Atlas 1.2 membranes were performed also as described above. For a complete list of genes see http://www.clontech.com/atlas/genelists/index.shtml.
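
As a rough illustration of the background handling described above, the following Python sketch subtracts the local background from each spot and averages the local backgrounds into a membrane level value. The function name, gene labels, and intensities are hypothetical; the study itself used the AtlasImage and Pathways software for this step.

```python
from statistics import mean


def background_correct(spots):
    """Subtract local background per spot and average it per membrane.

    spots maps gene name -> (raw signal, local background), arbitrary units.
    Returns (corrected signals, mean background of the membrane).
    """
    corrected = {gene: raw - local for gene, (raw, local) in spots.items()}
    membrane_background = mean(local for _, local in spots.values())
    return corrected, membrane_background


# Hypothetical spot readings for one membrane, not data from the study.
membrane = {
    "gene_a": (52000, 12000),
    "gene_b": (90000, 14000),
    "gene_c": (15000, 13000),
}
corrected, membrane_background = background_correct(membrane)

# Thresholds for an analysis are then derived from the mean over all membranes used.
analysis_background = mean([membrane_background, 11000])
print(corrected, membrane_background, analysis_background)
```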

Data analysis and statistics

The G3PDH signal was used as an indirect quality marker for the RNA: only experiments with a G3PDH signal lying within 1.5 times the standard deviation of all membranes evaluated were accepted. Normalisation was performed as previously described.4,5 A gene was considered to be LPS induced when the signal intensity of the particular mRNA transcript was at least twice as high as the average background signal of all membranes used in that particular experiment, and when it showed at least a twofold increase compared with non-LPS stimulated monocytes. Because we were screening for LPS induced genes, we preferred the sensitivity in these comparisons to be high, and accordingly the chosen ratio cut off point was 2. However, it must be taken into account that a low threshold inevitably decreases specificity, risking a higher number of false positive genes.

For the comparison of synovial tissue samples with normal PBMC, classification of gene expression data was achieved by hierarchical cluster analysis, implementing a pairwise similarity function (average linkage analysis, Cluster-TreeView14) for clustering genes and between-groups linkage analysis with Pearson correlation (SPSS 10.0 for Windows) for clustering samples. Results of clustered samples were visualised by dendrogram, reflecting the degree of similarity between the various expression datasets. Before statistical testing, two criteria had to be fulfilled in this study. Firstly, the difference of the mean signal intensities of each transcript between the two comparison groups (SpA v OA, SpA v HC) was required to be more than twofold higher than the background level of all membranes used. Secondly, signal intensity ratios of disease versus control (SpA v OA, SpA v HC) were required to be higher than 4 in order to achieve a more stringent cut off point for this particular analysis. However, it should be noted that thresholds for signal to background ratio and signal increase in microarray studies are arbitrary. A signal increase four times higher than the control group can raise specificity by lowering the number of false positive genes, on the one hand, but will also lead to a lower sensitivity, on the other.
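
The two criteria applied before statistical testing can be written as a simple filter. The sketch below is a minimal Python illustration; the function name `prefilter`, the gene labels, and the intensities are hypothetical, and the treatment of a non-positive control mean (counted as passing the ratio criterion) is an assumption, since the text does not say how such cases were handled.

```python
from statistics import mean


def prefilter(disease, control, background, diff_factor=2.0, min_ratio=4.0):
    """Return the genes that pass both criteria applied before the t test.

    disease, control: gene -> list of normalised signal intensities,
    one value per sample. background: mean background of all membranes.
    """
    selected = []
    for gene in disease:
        d, c = mean(disease[gene]), mean(control[gene])
        # Criterion 1: difference of means above diff_factor x background.
        if d - c <= diff_factor * background:
            continue
        # Criterion 2: disease/control ratio above min_ratio.
        # Assumption: a non-positive control mean counts as passing.
        if c > 0 and d / c <= min_ratio:
            continue
        selected.append(gene)
    return selected


# Hypothetical intensities, not data from the study.
spa = {"gene_a": [40000, 42000, 44000], "gene_b": [30000] * 3, "gene_c": [10000] * 3}
oa = {"gene_a": [5000, 6000, 7000], "gene_b": [12000] * 3, "gene_c": [1000] * 3}
candidates = prefilter(spa, oa, background=6700)
print(candidates)
```

Here only `gene_a` passes: `gene_b` fails the ratio criterion (2.5) and `gene_c` fails the difference criterion.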

As for the generation of statistically valid data, genes that met the above mentioned criteria were subjected to Student’s t test (two tailed, unpaired) in combination with Hartley’s Fmax test for homogeneity of variance. Bonferroni adjustment of α was performed in relation to the number of genes fulfilling the above criteria. A value of p<0.05 after Bonferroni correction was considered to be significant—that is, p<0.0017 in the comparison of SpA v OA synovium and p<0.00063 in the comparison of SpA synovium v healthy PBMC. We also calculated the p values for all possible rearrangements of signal intensity data from all samples in both of the comparisons (19 permutations in SpA v OA and 34 permutations in SpA v HC). To confirm the validity of our analysis, we reanalysed (difference of mean higher than twofold background signal; ratio—that is, signal increase higher than 4; analysis of variance, Student’s t test, Bonferroni adjustment) the SpA and OA synovial tissue samples by shuffling the complete set of all signal intensity data (permutation of class labels). The same permutational analysis was performed with less stringent criteria (difference of mean higher than only onefold instead of twofold background signal; ratio higher than 2 instead of 4; analysis of variance, Student’s t test without Bonferroni adjustment—that is, a p value <0.05 was considered to be significant) in order to test to what extent the number of false positively expressed genes would increase.
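
The significance thresholds and permutation counts quoted above follow from simple arithmetic, which the Python sketch below reproduces. The function names are illustrative; the gene counts of 29 and 79 are the numbers of transcripts that met both criteria in the SpA v OA and SpA v HC comparisons, as reported in the Results.

```python
from math import comb


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


def n_relabellings(n_group_a, n_group_b):
    """Ways to pick group A from all samples, excluding the true labelling."""
    return comb(n_group_a + n_group_b, n_group_a) - 1


alpha_spa_oa = bonferroni_alpha(0.05, 29)  # SpA v OA, 29 genes tested
alpha_spa_hc = bonferroni_alpha(0.05, 79)  # SpA v HC, 79 genes tested
perms_spa_oa = n_relabellings(3, 3)        # three SpA v three OA samples
perms_spa_hc = n_relabellings(3, 4)        # three SpA v four HC samples
print(f"{alpha_spa_oa:.4f} {alpha_spa_hc:.5f} {perms_spa_oa} {perms_spa_hc}")
```

The values agree with the thresholds of p<0.0017 and p<0.00063 and with the 19 and 34 permutations given in the text.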

RESULTS

Validation and reproducibility of the exponential RNA amplification method

This experiment was performed to demonstrate that RNA aliquots with different amounts of starting material amplified either linearly or exponentially yield reproducible results.

Complementary DNA templates were generated by exponential RNA amplification of two different RNA aliquots (150 and 300 ng) from monocytes of one healthy subject before and after LPS stimulation and were then analysed by microarray. To test the global reproducibility of this technique, correlation matrices using Pearson correlation were calculated between the 150 ng and the 300 ng Atlas array datasets: coefficients were 0.94 and 0.99 for adherent PBMC before and after LPS stimulation, respectively, with both correlations being significant at the 0.01 level (see fig 1A for XY scatter plots). Because this global reproducibility appeared to be high, further analysis of the reproducibility of individual gene expression was performed. As a number of cytokines and chemokines are known to be strongly up regulated after LPS stimulation of monocytes,10 these experiments mainly focused on this particular family of mRNA transcripts. Cytokine/chemokine genes were found to be expressed on a threefold higher level than all other LPS induced genes. The 150 ng and 300 ng exponentially amplified samples and the 5 µg linearly amplified samples were analysed by two different arrays: the Atlas 1.2 array containing a total number of 68 cytokines/chemokines and the GF211 array containing a total number of 59 cytokines/chemokines, with 32 overlapping cytokines/chemokines between both arrays. Table 2 shows that with the Atlas array nine of 68 identical cytokines/chemokines were detected in both the 150 ng and the 300 ng RNA aliquot samples obtained by exponential amplification. Moreover, six of these genes were also detected in the conventionally prepared sample containing 5 µg of total RNA, whereas only one additional cytokine was detected in this sample but in neither of the SMART-PCR samples (IL1β). 
The same analysis using the GF211 array yielded similar results: eight of 59 cytokines/chemokines were detected in the two 150 ng and 300 ng SMART-PCR samples as well as in the conventionally prepared sample. Again, only two chemokines were detected in the 5 µg sample but not in the two SMART-PCR samples, while one cytokine (IL1β) was detected in the 5 µg sample and in only one of the two SMART-PCR samples.
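
For readers who wish to reproduce the global reproducibility measure, the Pearson correlation between two paired signal intensity datasets can be computed as in the following Python sketch. It is a plain textbook implementation with invented example values, not the software used in the study, and it does not compute the significance level.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either sequence is constant


# Hypothetical signal intensities for the same genes in two aliquots.
aliquot_150ng = [1200.0, 15000.0, 43000.0, 800.0, 26000.0]
aliquot_300ng = [1500.0, 14000.0, 45000.0, 600.0, 27500.0]
r = pearson(aliquot_150ng, aliquot_300ng)
print(round(r, 3))
```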

Table 2

Gene expression profiles of cytokines/chemokines in LPS stimulated monocytes as assessed by different RNA amplification methods and array systems

Figure 1

XY scatter plots for signal intensity data as generated by Atlas 1.2 microarray from different RNA aliquots. (A) displays the XY scatter plot of signal intensity data as generated from LPS stimulated monocytes of a healthy donor. Two different RNA aliquots from this donor (150 and 300 ng) were amplified by SMART-PCR and subsequently assayed by the Human Atlas 1.2 microarray system. The two datasets yield reproducible signal intensities as can be seen by their almost identical distribution with very few outliers. Calculation showed that the correlation coefficient between the two datasets was 0.99 (significance level 0.01; Pearson correlation). The local background was subtracted from the signal intensity data, which were normalised. The bold lines depict the onefold background level at 13 400 (SI, signal intensity; arbitrary units). (B) Signal intensities from a duplicate experiment using identical aliquots from a synovial tissue biopsy sample are shown. Total RNA (150 ng) was amplified by SMART-PCR and hybridised to the Atlas 1.2 array membrane. In comparison with LPS stimulated monocytes, most genes are expressed on a lower level, which is typically found in synovial tissue samples. Accordingly, the onefold background level was lower at 6700. Also, the distribution of signal intensities was more scattered as compared with fig 1A but still shows a reasonably high reproducibility. Here, the correlation coefficient was 0.98 (significance level 0.01; Pearson correlation).

In the SMART amplified RNA samples, seven LPS induced genes other than cytokine/chemokine genes were expressed in both the Atlas Array and the GF211 systems: metalloproteinase inhibitor 1, urokinase inhibitor, IEX-1L anti-death protein, guanine nucleotide binding protein, intercellular adhesion molecule-1, epidermal growth factor receptor, and protease inhibitor 19. Of those seven transcripts, only one transcript (IEX-1L anti-death protein) was detected in all samples irrespective of the mode of amplification and the array system used.

Variability between different microarrays

This experiment was performed to demonstrate that RNA aliquots hybridised to two different nylon based array systems yield reproducible results.

Using the same dataset (table 2), the three different samples were compared between the two microarray systems focusing on the 32 cytokines/chemokines immobilised on both of the two nylon membranes used. In seven of the eight genes that were expressed in at least one of the six experiments performed, the results were concordant in at least five experiments: four cytokines/chemokines were expressed in all six experiments, two genes were expressed in five of all six experiments. For one cytokine (IL1β) and one chemokine (MCP-4), the expression profile was discordant. Based on these results, it was decided to use 150 ng of total RNA because this would allow synovial tissue biopsy samples with very low yields of RNA to be studied. The SMART-PCR and Atlas array were considered to be sufficiently sensitive for the desired approach and were therefore used for the following experiments.

Run-to-run variability of microarray on synovial tissue biopsies

Two RNA aliquots from two synovial tissue samples each were tested by duplicate experiments in order to demonstrate a low run-to-run variability.

Although exponential amplification of RNA appeared to be reproducible, microarray data obtained from synovial tissue biopsies might still be biased owing to variability of the microarray procedure itself rather than owing to variability of the RNA amplification. To examine this issue, SMART-PCR amplified cDNA templates from synovial tissue biopsy samples of two patients with ankylosing spondylitis were each tested twice by microarray. Correlation coefficients of signal intensities of all genes showed high values between the two datasets of both patient samples: 0.98 and 0.95, respectively, at a significance level of 0.01 (see fig 1B for XY scatter plots). Genes identified as being expressed in this analysis were defined as genes with a signal intensity higher than twice the background level of the membranes used. When we focused on individual genes, analysis of the first patient sample indicated 78 expressed transcripts in the first array and 64 expressed transcripts in the second array, with 61 expressed in both testings (61/78 (78%)). Among those 61 expressed genes, the highest ratio of expression values between duplicate experiments (that is, between RNA aliquots from the identical sample) was 1.8. When the same analysis was carried out on the second patient sample, 21 expressed genes were found in each of the two arrays, of which 20 were detected in both testings (20/21 (95%)). Here, the highest ratio of expression values between duplicate experiments was 1.5. These data indicate a low variability of gene expression as measured by the array and led us to perform only one experiment for each probe in the following study.
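
The concordance figures above can be recomputed from the sets of genes called expressed in each run. The Python sketch below assumes, as the reported fractions 61/78 and 20/21 suggest, that the denominator is the larger of the two gene counts; the function name and the integer gene identifiers are placeholders, not study data.

```python
def duplicate_concordance(first_run, second_run):
    """Return (number of shared genes, shared fraction of the larger run)."""
    first_run, second_run = set(first_run), set(second_run)
    shared = first_run & second_run
    denominator = max(len(first_run), len(second_run))
    if denominator == 0:
        return 0, 0.0
    return len(shared), len(shared) / denominator


# Placeholder gene identifiers arranged to match the reported counts:
# 78 and 64 expressed genes with 61 shared; 21 and 21 with 20 shared.
sample_1 = duplicate_concordance(range(78), range(17, 81))
sample_2 = duplicate_concordance(range(21), range(1, 22))
print(f"{sample_1[0]} shared, {sample_1[1]:.0%}")
print(f"{sample_2[0]} shared, {sample_2[1]:.0%}")
```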

Analysis of synovial tissue sample heterogeneity

Gene expression profiles of SpA synovial tissue samples were compared with two different control groups: (a) heterogeneous OA synovial tissue samples, and (b) PBMC from healthy donors, constituting a rather homogeneous cell population.

To study the impact of sample and/or patient heterogeneity on the microarray data, we analysed synovial tissue biopsy samples from three patients with SpA and three with OA as well as PBMC from four healthy controls. Organisation of gene expression data from synovial biopsies by hierarchical cluster analysis indicates the degree of similarity between samples as displayed by the dendrogram in fig 2. Thus, all six synovial samples were clustered under one node, whereas healthy PBMC were clustered separately under a different node. The degree of similarity was particularly high between the three patients with SpA, even higher than between the PBMC of four healthy controls. In contrast, the OA samples had a more heterogeneous expression profile.

Figure 2

Dendrogram depicting results of hierarchical cluster analysis from gene expression values of six synovial tissue samples and four normal PBMC. Hierarchical cluster analysis of gene expression data of all 10 samples as assessed by between-groups linkage analysis. The two columns on the left list the patient diagnoses and attribute numbers to the 10 samples used in this analysis. The horizontal scale (“rescaled distance cluster combine”) uses arbitrary units from 0 to 25 in order to measure the similarity between the various samples that are clustered. The length of the multiple horizontal tree branches measured on the scale above reflects the degree of similarity between the various datasets. Each sample (horizontal line) is connected to the adjacent sample by a vertical line, which is extended by the thin dashed line and can be read off the scale. The three SpA synovial samples are classified as most similar among the three different groups (SpA, OA synovium, and normal PBMC) because their horizontal lines merge at 2 as compared with the four normal PBMC samples, which merge at 6. When all six synovial samples are taken together, they demonstrate the lowest degree of similarity with a vertical line merging at 20 on the scale above. Note the separation of samples (three SpA and three OA synovial tissue samples (SpA1−3, OA1−3) and four healthy PBMC (HC1-4)) into two different nodes (O) of the dendrogram. Homogeneity of gene expression pattern is highest among SpA samples (quantified as 2) and lowest among OA samples (quantified as 20).

Interestingly, histological analysis of synovial tissue samples from four of the six patients indicated a quite different profile. Both of the patients with SpA studied were histopathologically heterogeneous, with clear lining hyperplasia, hypervascularity, and inflammatory infiltration in SpA3 and a nearly normal synovium in SpA2 (table 3, fig 3). On the other hand, SpA2 resembled histologically OA2, whereas OA3 showed a different pattern of inflammatory cell infiltration, with mainly macrophages. Accordingly, the histopathological heterogeneity of the SpA as well as the OA samples contrasted with the homogeneity of the gene expression profiles within each group, suggesting that the microarray data are not merely biased by sample heterogeneity within one disease group as assessed by histopathology.

Table 3

Evaluation of various histological and immunohistochemical parameters of synovial tissue sections available from four of the six patients studied by microarray

Figure 3

Microscopic picture of anti-CD3 staining on frozen synovial sections of two patients with SpA and two with OA (original magnification ×160). SpA3 shows clear signs of chronic inflammation with synovial lining hyperplasia, hypervascularity, and perivascular infiltration of lymphoid cells, whereas SpA2, OA2, and OA3 show only minimal signs of hypervascularity and inflammatory infiltration (see also table 3). Peroxidase staining against CD3 is strongly positive in SpA3, showing mainly a perivascular pattern, but is negative in all other patients.

Considering the high degree of global similarity of the microarray profiles of the three SpA synovial tissue samples, the influence of sample heterogeneity was further analysed by identifying genes that might be expressed by microarray assay in SpA synovium compared with OA synovium, on the one hand, and with PBMC on the other (table 4). When SpA and OA synovium were compared, 31 transcripts were found to have a difference of mean signal intensities higher than twice the background level (SpA v OA). Twenty nine of the 31 genes had a signal intensity ratio (SpA/OA) of more than 4. They were subjected to statistical testing. All data had equal variance. According to t test, 11 transcripts were found to be significantly up regulated in SpA when compared with OA (table 4, genes 1–11). Comparison of SpA with HC retrieved 86 genes with a difference twofold higher than the background signal (SpA v HC). Seventy nine of the 86 transcripts had a ratio (SpA/HC) of more than 4. Here, t test identified 18 mRNA transcripts which were significantly up regulated in SpA compared with HC (table 4, genes 1–6 and 12–23). Six genes were identical in both comparisons (6/11 (55%) and 6/18 (33%), respectively), indicating an important overlap of the microarray results when using either a heterogeneous sample such as OA synovium or a more homogeneous sample such as PBMC for comparison. Moreover, only one gene was expressed in OA synovium compared with PBMC, indicating that the increase in the level of gene expression in SpA synovium was not due solely to differences in the composition of the cell population. When average linkage analysis was used for clustering 1185 transcripts of the 10 arthritis samples, one homogeneous SpA cluster containing 16 genes was identified (fig 4), possibly suggesting similar function or regulation of transcripts present in this particular cluster.14 Nine of those 16 genes were also identified by t test when comparing either SpA with OA or SpA with HC.

Table 4

Genes found to be significantly expressed as assessed by Student’s t test and Bonferroni adjustment in SpA synovium when compared with OA synovium and with normal PBMC

Figure 4

Average linkage analysis of 1185 transcripts from all 10 samples as assayed by microarray. The left column (A) displays all 10 assayed samples with 1185 genes placed in rows. Each gene is represented by a single row of coloured boxes with correlation between level of expression and intensity of colour ranging from black (no expression) to bright red (high expression). The yellow lines in column A frame one identified homogeneous cluster of genes highly expressed in all three SpA samples. Column B is a magnified segment of the left column showing 16 genes within that particular cluster. Nine of the 16 genes, underlined in red, were also identified by t test in either SpA v OA or SpA v HC, and five of nine genes within that cluster were found to be among the genes expressed significantly in both the t test comparison groups.

Validation of the statistical approach by permutation of class labels

This test was performed in order to verify that the statistical analysis is specific.

The relatively small number of genes that were expressed in SpA synovium (11 and 18 out of 1185 genes for comparison with OA synovium and with PBMC, respectively) indicated that the commonly used approach to statistical analysis of microarray data implemented in the present study is quite stringent (difference of signal intensities higher than twofold of background, fourfold increase of signal intensities, Fmax/t test, Bonferroni adjustment). To further validate that the results obtained reflect gene expression and are not random results due to multiple comparisons, the three SpA and three OA datasets were analysed by permutation of samples. As shown in table 5, three samples were assigned to one group, independent of diagnosis, and subsequently tested for expressed genes compared with the three reciprocal samples. Of major interest, the comparison of three SpA samples with three OA samples yielded 11 expressed genes as reported above, whereas all 19 remaining possible combinations together yielded only one expressed gene. When the thresholds were chosen to be less stringent (difference of mean higher than onefold background signal, signal increase higher than 2, analysis of variance, Student’s t test without Bonferroni adjustment, that is, a p value <0.05 was considered to be significant), the number of genes identified as being expressed in SpA compared with OA was 61, compared with 11 previously. However, when permutation analysis was performed using the less stringent criteria, the number of false positive genes in all other possible combinations of samples rose to seven, compared with one under the stringent criteria.
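The enumeration behind this permutation test can be sketched in a few lines of Python. Six samples split into a test group of three and its reciprocal group give 20 possible assignments, one of which is the true SpA v OA labelling, leaving 19 shuffled ones. The sample names and the helper functions below are hypothetical, and `select_genes` stands for whatever gene selection procedure is applied to each split; it is not defined here.

```python
from itertools import combinations

def label_permutations(samples, group_size):
    """Yield every split of samples into a test group and its complement."""
    for test_group in combinations(samples, group_size):
        rest = tuple(s for s in samples if s not in test_group)
        yield test_group, rest

def count_selected(splits, select_genes):
    """Total number of genes a caller-supplied selector returns over all splits."""
    return sum(len(select_genes(test, rest)) for test, rest in splits)

# Hypothetical sample labels for the three SpA and three OA biopsies.
samples = ("SpA1", "SpA2", "SpA3", "OA1", "OA2", "OA3")
true_split = (("SpA1", "SpA2", "SpA3"), ("OA1", "OA2", "OA3"))

splits = list(label_permutations(samples, 3))
shuffled = [s for s in splits if s != true_split]

print(len(splits), len(shuffled))  # 20 19
```

Passing `shuffled` and a selector to `count_selected` would give the total number of genes called in the shuffled comparisons, which is the quantity reported as one (stringent criteria) or seven (less stringent criteria) above.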

Table 5

Confirmation of results by permutation tests: statistical analysis of all 20 possible combinations (shuffling) of six synovial tissue samples studied by microarray

DISCUSSION

In this study we initially evaluated the technical reproducibility of the combined RNA amplification and microarray assay using both PBMC and synovial tissue biopsies. SMART-PCR used in combination with Atlas array has recently been shown by a Clontech research group to yield representative gene expression results when compared with the conventional linear procedure.15 However, an independent group reported slight discrepancies in signal intensity ratios between SMART amplified probes and those conventionally prepared.16 Nevertheless, that study also described a higher sensitivity of SMART amplified probes, in agreement with our own data, at least when used in combination with Atlas array. Our present analysis of variability due to different methods of RNA amplification confirms and extends these findings by indicating a high concordance between the standard linear procedure and the exponential SMART-PCR in combination with two different array systems. For the Atlas array, SMART-PCR showed a higher sensitivity, yielding three more cytokines/chemokines than the linear method, which is most likely due to a higher specific activity of the SMART amplified and 32P labelled cDNA probes used for hybridisation. Another reason might be the use of a random prime labelling system for SMART amplified cDNA templates, further contributing to a higher sensitivity.

When the two different aliquots prepared by the SMART-PCR assay were compared (150 ng and 300 ng), there was complete concordance for all the cytokines and chemokines identified by the Atlas 1.2 array, indicating a low variability of the RNA amplification and this particular microarray assay. For the GF211 array, one cytokine (IL1β) was detected in the 5 µg Atlas array sample as well as in the 150 ng but not in the 300 ng sample, whereas MCP-4 was exclusively detected in the 5 µg sample. Nevertheless, four of eight cytokine/chemokine genes were detected in all of the six experiments performed across the two different array systems and six of eight cytokine/chemokine genes were detected in at least five of six experiments, indicating a low variability not only between the different RNA amplification techniques but also between the two different microarray systems. However, when the seven LPS induced non-cytokine/chemokine genes were analysed, reproducibility between exponential and linear amplification as well as between the two different array systems was low. This is probably because these genes are expressed on a threefold lower level than the cytokine/chemokine genes—a finding which supports the application of stringent thresholds in microarray data mining in order to avoid detection of false positive events. Assessing the reproducibility on synovial tissue samples of the microarray itself, we performed duplicate experiments on samples from two patients with SpA. The high correlation coefficients between the two duplicate datasets and little variation in signal intensity ratios of expressed genes calculated from duplicate data suggest a high experimental reproducibility. These data together with the aforementioned results indicate that the low technical variability of microarray applied to homogeneous cell populations5,17 is also applicable to synovial tissue samples.

A possible major problem of microarray experiments on synovial tissue biopsies is the sample heterogeneity, which might be another cause of experimental variability. However, no concise explorations have been published on this point. The results of our study indicate low variability due to tissue and/or patient heterogeneity when comparing the gene expression profiles of SpA synovium with OA synovium and with PBMC. Firstly, as expected, hierarchical cluster analysis could separate clearly gene expression profiles from synovium and PBMC. However, there was also a high degree of similarity between gene expression data from the three SpA synovial tissue samples, which were unambiguously separated from OA synovium. In fact, the similarity between SpA synovial tissue samples was even higher than between the PBMC samples. Secondly, the histopathological analysis showed clear differences between two SpA samples but similarity between one SpA and one OA sample, indicating that the gene expression clusters are more dependent on the disease than on the cellular composition of the tissues. Thirdly, the relative homogeneity of the SpA data is further indicated by the significant expression of 11 and 18 genes compared with heterogeneous tissue such as OA synovium and a rather homogeneous cell population such as PBMC, respectively. Six genes were shared between both analyses and we found just one expressed gene in OA synovium compared with PBMC, which emphasises that these results are not merely biased by heterogeneity of the tissues. It also demonstrates that RT-PCR and array technology only mirror actively up regulated genes, whereas immunohistochemistry and proteomics potentially can reflect the entire proteome. 
Finally, the relatively low number of samples studied here (three v three v four) probably leads to an underestimation of the total number of differentially expressed genes, though they indicate a high homology between patients within one disease group in the expression of various genes. Further studies will have to determine whether this is mainly biased by the particular phenotype of patients studied here (early disease, no DMARD treatment, HLA-B27 positivity, peripheral joint inflammation).

Apart from reproducibility of the technique and sample heterogeneity, the scientific validity of microarray data is strongly dependent on the statistical methods used for analysis. Applied methods should be stringent in order to select a relatively small number of differentially expressed genes out of thousands but, more importantly, should also avoid false positive results.

This study did not compare different statistical methods, but attempted to evaluate the validity of a widely used approach: difference and ratio of signal intensities higher than predetermined cut off points, test for homogeneity of variance of data followed by t test with Bonferroni adjustment.17–21
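The final step of this approach, a t test followed by Bonferroni adjustment, can be sketched as follows for the three v three design. This is an illustration under stated assumptions, not the authors' implementation: the test is taken to be two sided, the p value uses the closed form of the t distribution that holds only for 4 degrees of freedom, the Bonferroni factor is taken to be the number of genes tested, and all intensity values are hypothetical. The test for homogeneity of variance is not covered.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Pooled-variance (Student's) t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    if pooled == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df

def two_sided_p_df4(t):
    """Two-sided p value of Student's t, valid for exactly 4 degrees of freedom."""
    scale = 1 + t * t / 4
    cdf = 0.5 + 0.375 * (abs(t) / sqrt(scale)) * (1 - (t * t / scale) / 12)
    return 2 * (1 - cdf)

def bonferroni_hits(p_values, alpha=0.05):
    """Genes whose p value, multiplied by the number of tests, stays below alpha."""
    m = len(p_values)
    return [gene for gene, p in p_values.items() if p * m < alpha]

# Hypothetical intensities: (group 1, group 2), three samples each.
data = {
    "geneA": ([10.0, 11.0, 12.0], [1.0, 2.0, 3.0]),
    "geneB": ([5.0, 6.0, 7.0], [4.0, 6.0, 5.0]),
}

p_values = {}
for gene, (group1, group2) in data.items():
    t, df = student_t(group1, group2)
    assert df == 4  # the closed form below only applies to 3 v 3 samples
    p_values[gene] = two_sided_p_df4(t)

hits = bonferroni_hits(p_values)
```

With these toy values geneA remains significant after adjustment and geneB does not. With only 4 degrees of freedom the adjusted threshold becomes very demanding as the number of tested genes grows, which is one reason the pre-filters that reduce the number of tests matter.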

The validity is also confirmed by recalculation of p values for all possible combinations of signal intensity data from the 10 samples. Moreover, permutation of class labels (that is, a complete reanalysis of datasets between SpA and OA groups) provides evidence of a low rate of false positive results, thus supporting the specificity of the SpA v OA synovium comparison. Using the permutation analysis, we additionally demonstrate that the approach can be considered feasible for microarray data analysis because less stringent threshold criteria inevitably lead to a higher number of false positive results.

In addition, commonly used average linkage analysis identified one homogeneous cluster of genes expressed in all three SpA samples. Nine of 16 genes present in this cluster were also identified by t test when SpA was compared with OA and SpA with HC (fig 4). Five of the nine transcripts were also found among the group of six shared t test genes expressed in both the statistical comparisons—a finding which helps to confirm our t test genes by another analytical tool.
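For readers unfamiliar with average linkage analysis, the principle can be sketched in plain Python: each gene starts as its own cluster, and the two clusters with the smallest mean pairwise distance are merged repeatedly. The distance used here (1 minus the Pearson correlation between expression profiles) is an assumption, since the metric is not stated in this section, and the gene names and values are hypothetical.

```python
from math import sqrt

def pearson_distance(x, y):
    """1 - Pearson correlation between two expression profiles.

    Raises ZeroDivisionError for a constant profile (zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1 - sxy / sqrt(sxx * syy)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of genes until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        # Average linkage: mean distance over all cross-cluster gene pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(pearson_distance(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        index_pairs = ((i, j) for i in range(len(clusters))
                       for j in range(i + 1, len(clusters)))
        i, j = min(index_pairs,
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical profiles across six samples (three SpA, then three OA).
profiles = {
    "g1": [9, 8, 9, 1, 2, 1],
    "g2": [8, 9, 8, 2, 1, 2],
    "g3": [1, 2, 1, 9, 8, 9],
    "g4": [2, 1, 2, 8, 9, 8],
}

clusters = average_linkage(profiles, 2)
```

In this toy example the two genes that are high in the first three samples end up in one cluster and the two genes that are high in the last three samples in the other, which mirrors how a homogeneous SpA cluster can emerge from the full dataset.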

In summary, our results indicate that the approach reported here can narrow down the spectrum of potentially expressed genes from more than 1000 to fewer than 20. Microarray technology is emerging as a useful screening tool for identification of molecular mediators in complex diseases. We have technically evaluated this tool for analysis of synovial tissue samples obtained by needle arthroscopy. It should be clearly emphasised that the aim of our study was not to identify and confirm genes relevant to SpA pathogenesis but to explore and validate microarray technology for studying synovial tissue samples from patients with arthritis.

Our data suggest that microarray technology can be used reliably to detect molecular differences between SpA and OA synovium. Moreover, future studies applying molecular profiling, combined with confirmatory techniques such as RT-PCR or immunohistochemistry, to a larger number of patients with distinct phenotypical features might help to identify distinct gene expression patterns (“signatures”) and thus diagnostic and prognostic subgroups of SpA.2,3 In this context, predicting the response to particular forms of newly emerging, expensive treatments could indeed be useful. In conclusion, microarray on synovial tissue biopsy samples might be a valuable tool for enhancing the search for clues to SpA pathogenesis.

Acknowledgments

The authors thank Mrs Jenny Vermeersch for excellent technical assistance.

This work was supported by the Nora Eccles Treadwell Foundation. Dr M Rihl and Dr JG Kuipers are supported by the Deutsche Forschungsgemeinschaft DFG (RI 1119/1-1 and KU 1182/1-3). Dr M Rihl is also supported by the Rheumatology Competence Network, Berlin. Dr D Baeten is supported by the Fund for Scientific Research-Flanders (FWO-Vlaanderen).

REFERENCES
