Introduction

Systemic lupus erythematosus (SLE) [MIM 152700] is a multisystem autoimmune disease with high morbidity and mortality. The overall estimated prevalence in the United States is between 12 and 64 cases per 100 000 individuals.1, 2 Significant gender differences (female:male=9:1) are observed in prevalence, age at onset, premorbid conditions, clinical expression, course of illness, response to treatment, and morbid risk.1, 2 Although the etiopathogenesis of SLE remains largely unknown, dysregulation of self-reactive B cells leading to immune complex activation and complement consumption is well documented. The production of autoantibodies against nuclear components is the hallmark of SLE. The genetic basis for SLE is well established, although various environmental factors, perhaps interacting with certain genes, may also play a significant role in the development of SLE. The relative risk ratio to the sibs of an affected proband (λs) is about 20–40.3

Recently, in an effort to identify novel susceptibility genes for SLE, we performed a genome-wide scan on 37 Hispanic families with multiple affected individuals.4 We identified evidence of linkage to SLE at two regions on chromosome 16 (16p13 and 16q12–13). The evidence of linkage at 16q12–13 exceeded the recommended threshold for genome-wide suggestive linkage5 (Zlr=3.06, LOD=2.01, P=0.001 at 66.6 cM), and the evidence of linkage at 16p13 was nearly suggestive (Zlr=2.68, LOD=1.13, P=0.004 at 20.0 cM).

The main purpose of this study is to evaluate the linkage at these two genomic locations on chromosome 16 in two large independent data sets. Here we report the confirmation of linkage at 16q12–13 in one of the two ethnically diverse data sets. We also report the presence of linkage heterogeneity at the 16q12–13 susceptibility locus.

Materials and methods

All SLE patients met the 1997 revised criteria for classification of SLE.6, 7 The detailed ascertainment procedure for the SLE families is given elsewhere.4, 8, 9 Two independent replication samples consisting of 172 families (group-1) and 120 families (group-2) were used to verify our previously identified linkage effects. Group-1 consists of 110 European-American (EA) and 62 African-American (AA) families, while group-2 consists of 82 EA and 38 AA families (Table 1). Most of the families from group-1 were used in our previous genome scans, whereas the families from group-2 are used here for the first time, for the chromosome 16 replication study.

Table 1 Selected clinical and demographic features of the affected SLE individuals

All linkage analyses were performed using families multiplex for SLE (at least two affected individuals). Genomic DNA was isolated from peripheral blood cells, buccal cell swabs, mouthwash specimens, or EBV-transformed cell lines using standard methods. A total of 11 microsatellite markers with an average marker spacing of 11 cM (range 8–15 cM) and an average heterozygosity of 76% (range 56–94%) were typed for chromosome 16. We used the genetic map and intermarker distances from the Marshfield map.10 Positions of markers not in the Marshfield map were set by interpolation on the basis of physical distances. Major family and demographic characteristics were compared between the two groups. We assessed the significance of the quantitative variables using one-way ANOVA and post hoc Tukey's test for multiple pair-wise comparisons between groups.
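The interpolation step for unmapped markers can be sketched as follows. This is a minimal Python illustration that assumes simple linear interpolation between the two flanking mapped markers; the text does not state which interpolation scheme was used, and the function name and the example coordinates are hypothetical.

```python
def interpolate_cm(bp, left, right):
    """Estimate a genetic position (cM) from a physical position (bp).

    `left` and `right` are (bp, cM) pairs for the flanking markers that
    do have map positions. Linear interpolation is an assumption here.
    """
    (bp0, cm0), (bp1, cm1) = left, right
    if not bp0 < bp < bp1:
        raise ValueError("marker must lie strictly between the flanking markers")
    fraction = (bp - bp0) / (bp1 - bp0)  # share of the physical interval covered
    return cm0 + fraction * (cm1 - cm0)


# Hypothetical coordinates, for illustration only.
position = interpolate_cm(15_000_000, (10_000_000, 20.0), (20_000_000, 30.0))
```

A marker halfway between the flanking markers in physical distance is placed halfway between them in genetic distance.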

The evidence of linkage was assessed by both a nonparametric, penetrance-independent, allele-sharing method and a parametric method using the program GENEHUNTER-PLUS.11, 12 GENEHUNTER-PLUS reports allele-sharing LOD scores and maximum Z scores for the likelihood ratio (Zlr). Parametric linkage analysis was performed assuming both linkage homogeneity (LOD) and linkage heterogeneity (HLOD). Significance of the LOD was determined by χ²=4.6 × LOD with one d.f. Since the HLOD follows a complex statistical distribution, the observed HLOD was first converted to a χ², where χ²=4.6 × HLOD. A P-value (P1) was then derived for this χ² using the χ² distribution with one d.f. The P-value for the HLOD score is then 0.5 × [1 − (1 − P1)(1 − P1)].13 We used race-specific marker allele frequencies estimated from family founders when analyzing families with a specific racial background.
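The conversions described above can be written out as a short Python sketch. It uses only the standard library and relies on the identity that, for one d.f., the upper tail of the χ² distribution equals erfc(√(x/2)). The function names are hypothetical. The one-sided P-value for Zlr assumes that Zlr is approximately standard normal in the absence of linkage, which is an assumption added here rather than a statement from the methods.

```python
import math

LOD_TO_CHI2 = 4.6  # factor quoted in the text; 2 * ln(10) is about 4.605


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with one d.f."""
    # For one d.f., P(X > x) = erfc(sqrt(x / 2)); negative input is clamped to 0.
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def lod_p_value(lod: float) -> float:
    """P-value for a homogeneity LOD via chi2 = 4.6 * LOD with one d.f."""
    return chi2_sf_1df(LOD_TO_CHI2 * lod)


def hlod_p_value(hlod: float) -> float:
    """P-value for an HLOD: 0.5 * [1 - (1 - P1)(1 - P1)]."""
    p1 = chi2_sf_1df(LOD_TO_CHI2 * hlod)
    return 0.5 * (1.0 - (1.0 - p1) * (1.0 - p1))


def zlr_p_value(zlr: float) -> float:
    """One-sided P-value, ASSUMING Zlr is standard normal under no linkage."""
    return 0.5 * math.erfc(zlr / math.sqrt(2.0))
```

Under the normal assumption, `zlr_p_value(2.45)` is about 0.007 and `zlr_p_value(3.06)` is about 0.001. A non-positive LOD maps to a P-value of 1 because the χ² statistic is clamped at zero.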

Results

The comparison of major family and demographic characteristics shows that overall the four groups are similar (Table 1). We assessed the significance of the quantitative variables using one-way ANOVA and post hoc Tukey's test for multiple pair-wise comparisons between groups. The mean age at onset for affected individuals does not vary significantly among the four groups (F=1.54, P=0.20). Similar results were found using the post hoc pair-wise comparisons between the groups.
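For readers unfamiliar with the F statistic reported above, the following Python sketch shows how a one-way ANOVA F value is computed from grouped observations. The function name and the ages in the example are hypothetical and are not study data. Converting F to a P-value requires the F distribution and is not done here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ages at onset for four groups, for illustration only.
example_groups = [[28, 31, 35], [30, 33, 29], [27, 36, 32], [31, 30, 34]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
```

With four groups the between-group degrees of freedom are three, matching the design of the comparison described above.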

The results of multipoint nonparametric and parametric linkage analyses at 16q12–13 are given in Table 2. Virtually no evidence of linkage was found in group-1. In contrast, moderate evidence of linkage was found in group-2 by the nonparametric analysis (Zlr=2.45, P=0.007). Once the evidence of linkage at 16q12–13 was found, we performed parametric multipoint analyses employing simple dominant and recessive models. Greenberg et al14 have shown that, as long as both dominant and recessive models are used, simple genetic models are nearly as powerful for detecting linkage as analyses in which the true parameters are known. While no evidence of linkage is found by the parametric analysis assuming linkage homogeneity (LOD=−19.83), significant evidence of linkage is found by assuming genetic heterogeneity (HLOD=4.85, P=4.5 × 10⁻⁶) under a dominant model, and about 35% of families are estimated to be linked to this locus. Although the two methods identify two different peaks, these peaks are very close to each other, between markers D16S3253 and D16S2624. Similar patterns of peak location and linkage heterogeneity are found when linkage analyses are performed separately in the individual groups (Table 2 and Figure 1). The results of parametric linkage analysis under recessive models are significantly inferior to those under the dominant model (data not shown).
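The contrast between a strongly negative homogeneity LOD and a positive HLOD can be illustrated with the standard admixture formulation, in which a proportion α of families is assumed to be linked: HLOD(α) = Σ log10[α × 10^LODi + (1 − α)], maximized over α. The Python sketch below is a minimal illustration under that assumption. It is not GENEHUNTER-PLUS code, it evaluates a single map position, and the per-family LOD values and function names are hypothetical.

```python
import math


def hlod(family_lods, alpha):
    """Admixture heterogeneity LOD for a given proportion alpha of linked families."""
    return sum(math.log10(alpha * 10.0 ** lod + (1.0 - alpha)) for lod in family_lods)


def maximize_hlod(family_lods, steps=100):
    """Grid-search alpha over [0, 1]; return (best_alpha, best_hlod)."""
    best_alpha, best_value = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(steps + 1):
        alpha = i / steps
        value = hlod(family_lods, alpha)
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value


# Hypothetical per-family LOD scores at one position, for illustration only.
family_lods = [0.9, 0.8, -1.5, -2.0, 0.7, -1.8]
total_lod = sum(family_lods)              # homogeneity LOD (alpha = 1)
alpha_hat, max_hlod = maximize_hlod(family_lods)
```

With these made-up values the summed LOD is negative while the maximized HLOD is positive at an intermediate α, which mirrors the qualitative pattern reported for group-2. Setting α = 1 recovers the homogeneity LOD.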

Table 2 Multipoint parametric and nonparametric linkage analyses on chromosome 16 for group-1 and group-2. For comparison, the linkage results from Hispanic (Nath et al, 2004) families are also provided
Figure 1

Multipoint linkage analysis in four groups of families (EA=European-American (n=82), HIS=Hispanic (n=37), AA=African-American (n=38) and AA+EA=AA and EA combined (n=120)). The marker names (‘D16S’ is removed from all the markers) and their relative positions and inter-marker distances (cM) are indicated by an arrow, vertical line and numbers.

Discussion

Several genome screens have been performed to find susceptibility genes for lupus. Importantly, evidence of linkage and association in and around 16q12–13 has previously been identified by at least four independent studies. First, the USC group15 found modest evidence (P=0.017) for linkage to marker D16S3136 at 62.1 cM in Mexican-American and EA families. Second, the Minnesota group16 reported significant evidence of SLE linkage at 16q12–13. They found maximum evidence of linkage (LOD=3.85, P=1.3 × 10⁻⁵) at 60.45 cM at marker D16S415 in mostly EA families. Third, the UCLA group17 confirmed the evidence of SLE linkage spanning markers D16S753 and D16S757 at 16q11–q13 in mostly non-Caucasian families. They also found a positive epistatic interaction between this locus and the 1q23 SLE susceptibility locus. Fourth, a Chinese group18 found an association between D16S517 (16q12) and SLE at 58.46 cM in a family-based association study of the Chinese population. These four independent studies found maxima over a range of 9 cM (58 and 69 cM), which is a typical shift in linkage peak location for a complex disease.

Our previously identified second linkage signal on chromosome 16, at 16p13 in Hispanics,4 has not been replicated in either of these replication groups. Several reasons may contribute. First, a factor contributing to the lack of consistency across studies is locus heterogeneity, which can weaken or even eliminate evidence for linkage that is present only in a subset of families. This phenomenon is clearly exemplified by the linkage at 16q12–13, for which we did not find any evidence of linkage in any of the ethnic groups from group-1. Although significant evidence for linkage is found at 16q12–13 in group-2, it also showed substantial genetic heterogeneity. Approximately 65% genetic heterogeneity is estimated in group-2, which is consistent among the families and is irrespective of their ethnic origin. Second, current sample sizes allow the reliable identification of only major genes that account for a large proportion of the overall familial risk. In practice, this means that studies need ‘luck’, in addition to a high standard of data quality, to detect loci with relatively small effect sizes. Third, this linkage might be a false positive and is therefore not detected in the replication groups.

In conclusion, our results provide strong evidence that confirms the existence of a major SLE susceptibility gene on 16q12–13. We initially showed evidence of linkage in Hispanics,4 and in the present study we show this SLE linkage in families with AA and EA ethnic backgrounds. This susceptibility locus also demonstrates substantial genetic heterogeneity. Nevertheless, the highly significant evidence in the present study, together with the previously published results from four independent studies, provides powerful evidence of linkage to this chromosomal region and shows that this linkage is reproducible and ‘consistent’. Several independent confirmations justify further and more intensive approaches to identify the actual SLE susceptibility gene(s) at 16q12–13.