Introduction

Primary osteoarthritis (OA, MIM 165720) is a common late-onset arthritis characterised by degeneration of the hyaline cartilage of synovial joints. Twin-pair, sibling risk and segregation studies have demonstrated a significant genetic component to OA, which is transmitted as a multifactorial, oligogenic trait.1 These studies also reported large differences in the level of OA heritability between different joint sites (ie hip vs knee) and between the genders. They have therefore revealed a high degree of heterogeneity in the nature of the encoded susceptibility and highlighted the need for the judicious application of phenotypic stratification as a means of reducing heterogeneity in OA gene mapping studies.

We previously carried out a genome-wide linkage scan on OA affected sibling pair families collected in the UK. The affected individuals had all undergone one or more replacements of the total hip (THR), the total knee (TKR), or both, for primary OA. Stratification of the families by gender and by joint replaced (hip or knee) revealed suggestive susceptibility loci for hip OA on chromosomes 2q and 6, and for female-hip OA on chromosomes 4q, 11q and 16p.2,3,4 Since our genome scan employed only a low density of microsatellite markers (one every 15 cM), the intervals for these five linkages were broad. We have now finer linkage mapped these loci in an expanded cohort of OA families and report here the results for chromosome 6.

In the genome scan we had genotyped 15 chromosome 6 markers in 297 OA families.3 Multipoint linkage analysis gave an MLS of 1.0 within a large interval between markers D6S1610 and D6S262 (52.4–129.3 cM from the 6p-telomere). When we stratified these data, we obtained an MLS of 2.9 in the 194 families from the 297 that contained sibling pairs concordant for hip OA (THR-families). The linkage peak was centred between D6S462 and D6S262 (96.0–129.3 cM), with over 50 cM of the chromosome having a multipoint-LOD score >2.0. There was no evidence for linkage in families containing sibling pairs concordant for knee OA. In the first step of the refined analysis reported here, we genotyped 20 chromosome 6 markers in an expanded cohort of 378 THR-families. We obtained an MLS of 2.2 between markers D6S1610 and D6S460 (52.4–86.8 cM), an interval of 34.4 cM. We subsequently genotyped six additional markers from within this interval and obtained an MLS of 2.8 between markers D6S452 and 509-8B2 (70.5–81.9 cM), an interval of 11.4 cM. Stratification by gender revealed that the linkage was completely accounted for by the female THR-families (n=146), with an MLS of 4.0 and with the highest two-point LOD score being 4.6 at D6S1573 (75.9 cM).

The COL9A1 gene resides just within the finer-linked interval, at 81.9 cM. This gene encodes the α1 polypeptide chain of the heterotrimeric type IX collagen, a quantitatively minor cartilage collagen that is involved in skeletal development and is required for maintaining cartilage integrity.5 We identified and then genotyped 20 COL9A1 single nucleotide polymorphisms (SNPs) in the 146 probands from our female THR-families and in 215 age-matched female controls. No SNP allele, genotype or haplotype was associated with OA.

Overall, by genotyping a larger cohort of OA families for a higher density of microsatellite markers, and by applying gender stratification, we have finer linkage mapped the hip OA susceptibility locus on chromosome 6 to an 11.4 cM interval at 70.5–81.9 cM (6p12.3–q13). We have demonstrated that this locus is particularly relevant to female disease, whilst a comprehensive association analysis does not support COL9A1 as the chromosome 6 OA susceptibility gene.

Families and methods

Hip OA families

For the linkage analysis we examined 378 UK Caucasian families that each contained two or more siblings who had undergone one or more total hip replacements for primary OA (THR-families). Since OA is a late-onset disease, no parents of the siblings were available for study. We therefore collected additional siblings who had not undergone THR to assist in determining identical-by-descent allele transmission (in the linkage analysis, these siblings were given an affection status of unknown). Table 1 lists the family structures. By clinical examination and by reviewing X-rays and histological samples, we excluded individuals who had undergone hip replacement because of prior joint trauma or other forms of arthritis. Our ascertainment procedure has been described in detail elsewhere.2 All families were genotyped for 50 unlinked microsatellite markers to confirm full-sib status using the RELATIVE program (ftp://linkage.rockefeller.edu/software/relative/). The average age of the affected individuals at the time of their first operation was 66 years (SD=8.9 years), with an average age of 66 years (SD=9.1 years) in affected women and 65 years (SD=8.6 years) in affected men.

COL9A1 association analysis was performed on a case–control cohort comprising the 146 probands from our female THR-families (cases) and 215 age-matched, unrelated female controls. The controls were UK Caucasians who had not undergone any joint-replacement surgery or required clinical treatment for OA. The average age of the controls when recruited into the study was 73 years.

Ethical approval for the study was obtained from the Central Oxford Research Ethics Committee and informed consent was obtained from all subjects.

Table 1 THR-family structures

Microsatellite markers and genotyping

Microsatellite markers were PCR amplified and the alleles sized using Genescan and Genotyper software (Applied Biosystems), as described previously.2 We initially genotyped all 20 chromosome 6 markers from the Applied Biosystems LMS–MD10 panels 8–10 (http://www.appliedbiosystems.com). These markers have an average spacing of 10 cM. We then genotyped four additional markers from the LMS–HD5 panels (D6S1549, D6S282, D6S452 and D6S1573) and the two COL9A1 microsatellite markers 509-8B2 (located in intron 5 of COL9A1) and 509-12B1 (located 14.3 kb upstream of exon 1 of COL9A1).6 These additional six markers map between LMS–MD10 markers D6S1610 and D6S460.

A quality control test was performed on the markers in which excess homozygosity, due to the preferential or missed amplification of one allele, was examined by RECODE v1.4 (ftp://watson.hgen.pitt.edu/pub/). All markers passed this check. Furthermore, Mendelian inconsistencies were examined in families containing at least three individuals by use of the PEDCHECK program (http://watson.hgen.pitt.edu/register/soft_doc.html). These tests, along with the RELATIVE analysis described above, enabled us to confirm the full-sib status of the siblings and to ensure that the chromosome 6 markers genotyped were reliable.

COL9A1 SNP discovery and genotyping

The COL9A1 gene structure was obtained from published sources.7 Two methodologies were employed for SNP discovery: (1) heteroduplex analysis, with DNA variants then characterised by direct sequencing (Finnish group) and (2) direct sequencing only (UK group). Nine hundred base pairs of promoter sequence, the 5′ and 3′ UTRs, all 38 COL9A1 exons (including acceptor and donor splice sites), the differentially spliced exon 1*, and all of intron 1 were scanned in at least 35 unrelated individuals. All of intron 1 was screened since several collagen genes are known to contain cis elements in the first intron that can regulate gene expression.8 SNPs were genotyped by PCR-restriction enzyme analysis, with the digestion fragments separated by electrophoresis through 3% agarose. Further information regarding primer sequences and the PCR and digestion conditions can be obtained from the authors.

Statistical analysis – microsatellites

In the linkage analysis, microsatellite allele frequencies were calculated from all scored genotypes by use of the GAS program (http://users.ox.ac.uk/ayoung/gas.html). Two-point and multipoint linkage analyses were then performed using ANALYZE (ftp://linkage.cpmc.columbia.edu) and ASPEX (ftp://lahmed.stanford.edu/pub/aspex), respectively. The two COL9A1 markers 509-8B2 and 509-12B1 were tested for evidence of transmission disequilibrium in our families using the TRANSMIT sibling–TDT program, version 2.5.2.9 This program is appropriate to our study as it incorporates data from families with missing parents.
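
For orientation, a LOD score is the base-10 logarithm of a likelihood ratio, so it can be converted to an approximate nominal P-value. The sketch below is an illustration only and is not part of the analysis described above. It assumes a LOD score maximised over a single parameter, for which 2·ln(10)·LOD is conventionally treated as a 50:50 mixture of a point mass at zero and a χ² distribution with 1 degree of freedom. The function name `lod_to_p_value` is hypothetical, and the affected-sib-pair MLS reported by a program such as ASPEX may follow a different null distribution.

```python
import math


def lod_to_p_value(lod):
    """Approximate nominal P-value for a one-parameter LOD score.

    Assumption (not from the source): under the null, 2*ln(10)*LOD follows
    a 50:50 mixture of a point mass at 0 and a chi-square with 1 df.
    """
    if lod <= 0:
        # Every outcome is at least this extreme under the mixture null.
        return 1.0
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 df, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    upper_tail = math.erfc(math.sqrt(chi2 / 2.0))
    return 0.5 * upper_tail


if __name__ == "__main__":
    for lod in (1.0, 2.0, 3.0, 4.0):
        print(f"LOD {lod:.1f} -> nominal P of about {lod_to_p_value(lod):.2e}")
```

Under these assumptions a LOD score of 3 corresponds to a nominal P-value on the order of 1 in 10 000, which is why it is a traditional reference point in linkage studies.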

Statistical analysis – SNPs

Allelic association with OA was evaluated for each COL9A1 SNP by χ² using standard contingency table analysis. We further stratified our families using their individual non-parametric linkage (NPL) scores taken from the GENEHUNTER two-point linkage analysis (http://waldo.wi.mit.edu/ftp/distribution/software/genehunter/gh2/). Probands from the female THR-families that showed a positive contribution to the linkage for the two COL9A1 markers (positive NPL scores) were compared with female controls.
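
The allelic contingency-table test can be sketched as follows. This is a minimal illustration, assuming a 2×2 table of allele counts (cases vs controls, minor vs major allele) and a Pearson χ² statistic with 1 degree of freedom and no continuity correction; the source does not state whether a correction was applied. The function name and the allele counts are hypothetical and are not data from the study.

```python
import math


def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, uncorrected) on a 2x2 table of allele counts."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 df, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 146 cases give 292 chromosomes, 215 controls give 430.
    chi2, p = allelic_chi_square(70, 222, 86, 344)
    print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

Counting alleles rather than individuals doubles the sample size per person, which is valid only if genotypes are in Hardy–Weinberg equilibrium.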

Pair-wise linkage disequilibrium coefficients (D′) were calculated using the GOLD program (http://www.well.ox.ac.uk/asthma/GOLD) and displayed graphically on a physical map to reveal the SNP haplotype structure.10 The positions of the COL9A1 SNPs on the map were defined using published sources.7 Haplotype frequencies amongst blocks of SNPs in linkage disequilibrium were calculated using the haplotype-estimation program EH-PLUS (http://www.iop.kcl.ac.uk/IoP/Departments/PsychMed/GEpiBST/software.stm). Haplotype frequency differences between the probands and the controls were then compared using χ² and contingency table analysis.
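
As an illustration of the D′ coefficient, the sketch below computes Lewontin's |D′| for two biallelic loci. It assumes the two-locus haplotype frequency is already known; in practice, programs such as GOLD estimate it from unphased genotypes, a step not reproduced here. The function name and input frequencies are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        # At least one locus is monomorphic; D' is undefined, so this
        # sketch returns 0.0 by convention.
        return 0.0
    return abs(d) / d_max


if __name__ == "__main__":
    print(d_prime(0.5, 0.5, 0.5))   # alleles always travel together
    print(d_prime(0.25, 0.5, 0.5))  # alleles assort independently
```

Values near 1 indicate that little or no historical recombination has occurred between the two sites, whereas values near 0 indicate independent assortment.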

Results

Linkage analysis

Table 2a lists the two-point LOD scores and Figure 1a shows the multipoint linkage plot for the 20 LMS–MD10 markers genotyped in the 378 THR-families. The highest two-point LOD score was 1.6 for marker D6S257, whilst the maximum multipoint-LOD score (MLS) was 2.2, located between markers D6S1610 and D6S460, a 34.4 cM interval. Although the LMS–MD10 markers have an average spacing of 10 cM, there is a relatively large gap of 25.9 cM between markers D6S1610 and D6S257. Since our linkage peak was centred on D6S257, we genotyped four additional markers from within this gap. These were the LMS–HD5 markers D6S1549, D6S282, D6S452 and D6S1573. We also genotyped the two COL9A1 markers,6 509-8B2 and 509-12B1, which map between D6S257 and D6S460. Table 2b lists the two-point LOD scores for these six additional markers. Four had LOD scores ≥1.4, with the highest being 2.3 for marker D6S1573. Figure 1b shows the multipoint linkage plot for all 26 chromosome 6 markers. The MLS had increased to 2.8 in an 11.4 cM interval defined by markers D6S452 and 509-8B2.

Table 2 Two-point LOD score in the 378 THR-families
Figure 1 Multipoint linkage analysis of chromosome 6. (a) For the 20 chromosome 6 markers from the Applied Biosystems LMS–MD10 panels in the 378 THR-families. (b) With six additional markers between D6S1610 and D6S460. (c) For all 26 chromosome 6 markers with the 378 THR-families stratified by gender. (d) With the 146 female THR-families stratified into the 79 families from our original genome-wide linkage scan and the 67 families that were subsequently recruited.

As noted earlier, several studies have reported differences in the degree of heritability of OA between males and females.1 We therefore stratified our linkage data by gender. The highest two-point LOD score was 4.6 for marker D6S1573 in the 146 female THR-families (Table 3). Four markers flanking D6S1573 had LOD scores ≥2.1. In the 83 male THR-families, D6S1573 and its flanking markers showed no evidence for linkage. An MLS of 4.0 was obtained for the female THR-families in the 11.4 cM interval flanked by D6S452 and 509-8B2 (Figure 1c). The male THR-families were unlinked to this interval.

Table 3 Two-point LOD scores for all 26 of the chromosome 6 markers genotyped, with the THR-families stratified by gender

The 146 female THR-families are composed of 79 families used in our original genome-wide linkage scan3 and 67 families that were subsequently recruited. To demonstrate the contribution that each collection of families made to the linkage, we carried out the linkage analysis on each cohort (Figure 1d). The MLS for the original 79 families was 1.6, the MLS for the additional 67 families was 2.6, whilst the combined MLS was 4.0. This analysis demonstrates that an increase in the sample size has increased the evidence for linkage, strongly suggesting that the chromosome 6 linkage is robust.

Association analysis of COL9A1

Twenty common SNPs (minor allele frequencies >0.05) were identified in COL9A1 (Table 4). One of these SNPs is located in the promoter, two are exonic (both non-synonymous), and the remaining 17 are intronic. None of the intronic SNPs was located in a donor or acceptor splice site, or affected the conserved (A) nucleotide of any putative branch-point consensus site.

Table 4 Single nucleotide polymorphisms in COL9A1 and association study of probands from the female THR-families and of controls

The 20 common SNPs were genotyped in the 146 probands from our female THR-families and in 215 age-matched female controls (all SNPs were in Hardy–Weinberg equilibrium in the control cohort). None of the SNP alleles or genotypes were associated with OA at P≤0.05. Three SNP alleles gave P-values ≤0.1: intron 16 (+143) (P=0.09); intron 25 (−24) (P=0.10); and exon 28 (+44) (P=0.10) (Table 4). To determine whether any of these three SNPs were suggestive of an association, we stratified our female THR-families into those that showed a contribution to the linkage (positive GENEHUNTER NPL scores). We stratified against the COL9A1 markers 509-8B2 and 509-12B1. Seventy-four of the 146 female THR-families showed linkage to 509-8B2 and 67 showed linkage to 509-12B1. When the SNP allele and genotype frequencies were compared between the female controls and the probands from the families contributing to the linkage at each marker, none of the three SNPs tested showed a significant allele or genotype frequency difference at P≤0.05 (data not shown).
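
A Hardy–Weinberg check of the kind applied to the control genotypes can be sketched as below. The source does not state which test was used, so this sketch assumes a χ² goodness-of-fit test with 1 degree of freedom for a biallelic SNP. The function name and the genotype counts are hypothetical and are not data from the study.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts for 215 controls.
    chi2, p = hwe_chi_square(120, 80, 15)
    print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

A large deviation in controls usually points to genotyping error or population stratification rather than to disease association, which is why the check is made in the control cohort.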

We next determined the COL9A1 SNP haplotype structure and its diversity. We focused on the 16 SNPs from the 20 that had minor allele frequencies >0.10. Pair-wise linkage disequilibrium D′ values were calculated using the GOLD program and displayed graphically against the physical distance between the SNPs (Figure 2). This revealed a high degree of linkage disequilibrium between the 14 SNPs encompassed by Intron 1 (−54) to Intron 34 (+32), inclusive, with D′ never falling below 0.7. This represents a haplotype block of 65 kb. Haplotype blocks of this length have been reported in other studies.11,12 The two remaining SNPs tested [Promoter (−364) and Intron 1 (+39)] approached linkage equilibrium with each other (D′=0.16), suggesting that a recombination hotspot resides immediately upstream of SNP Intron 1 (−54).13 Using EH-PLUS, we constructed the haplotypes for the 14 SNPs in linkage disequilibrium. Of the 16 384 possible haplotypes, only 57 were observed, confirming the high degree of linkage disequilibrium between the SNPs. Of the 57 haplotypes, only six had frequencies >5% and these accounted for 72% of the chromosomes in our probands and controls. There was no significant difference in the frequency of these common haplotypes between the probands and the controls (P=0.60).
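
The figure of 16 384 possible haplotypes follows from each of the 14 biallelic SNPs contributing two possible alleles. The short sketch below works through the arithmetic; the function name is hypothetical.

```python
def possible_haplotypes(n_biallelic_snps):
    """Number of distinct haplotypes that n biallelic SNPs could form."""
    return 2 ** n_biallelic_snps


n_possible = possible_haplotypes(14)    # 2**14
n_observed = 57                         # haplotypes reported in the text
fraction_observed = n_observed / n_possible

if __name__ == "__main__":
    print(n_possible)
    print(f"{fraction_observed:.2%} of possible haplotypes were observed")
```

Fewer than 1% of the theoretically possible haplotypes being observed is what would be expected when the SNPs are in strong linkage disequilibrium.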

Figure 2 Graphical overview of pair-wise linkage disequilibrium (D′) values across COL9A1 for the 16 SNPs that have minor allele frequencies >0.10. The axis of the gene runs along the diagonal from the bottom left to the top right. The D′ values are colour coded from red (linkage disequilibrium) to blue (linkage equilibrium).

Finally, we tested the two COL9A1 microsatellite markers 509-8B2 and 509-12B1 for association with OA by assessing the evidence of transmission disequilibrium in the 146 female THR-families using the TRANSMIT program. Neither marker demonstrated transmission distortion (data not shown).

Discussion

In this study, we have finer linkage mapped a susceptibility locus for severe hip OA in females to an 11.4 cM interval at 6p12.3–q13 (70.5–81.9 cM from the 6p-telomere). The COL9A1 gene, which encodes the α1 polypeptide chain of type IX collagen, maps within this interval. COL9A1 is a logical candidate for OA susceptibility. A mutation in the gene was recently reported in the affected members of a pedigree with multiple epiphyseal dysplasia, a rare chondrodysplasia that has early-onset (secondary) OA as one of its phenotypic components.14 However, our comprehensive association analysis did not produce any evidence supporting COL9A1 as the primary OA susceptibility locus that we have mapped to chromosome 6. We cannot exclude, however, the possibility that several rare alleles of COL9A1 account for the chromosome 6 susceptibility, with our study lacking the necessary power to detect this association.

We have previously excluded a number of other genes that are mutated in the chondrodysplasias as candidates for primary OA susceptibility.15 The fact that the genes underlying such secondary OA phenotypes are not necessarily those involved in primary OA susceptibility reinforces the view that these two forms of OA are distinct entities that do not share all causal genetic factors.

The finer linkage mapped region of chromosome 6 contains other potential candidates. Two other collagen genes map close to COL9A1: COL19A1, which encodes the α1 polypeptide chain of the homotrimeric type XIX collagen, and COL12A1, which encodes the α1 chain of the homotrimeric type XII collagen.16 Type XII collagen is present in cartilage, tendon and bone,5 making COL12A1 a plausible candidate. However, this gene maps approximately 5 cM distal to COL9A1, placing it approximately 11 cM from the centre of the linked interval. A candidate that maps closer to the interval centre is the bone morphogenetic protein 5 (BMP5) gene. BMPs are a class of secreted signalling molecules that are involved in a number of different biological processes including skeletal patterning.17 The choice of candidates for subsequent linkage disequilibrium analysis should take into account the female specificity of the chromosome 6 linkage: genes that are under hormonal control should be studied first.

Overall, we have finer linkage mapped an OA susceptibility locus on chromosome 6. This was achieved by genotyping a large number of families for a high density of microsatellite markers, and by stratifying the data by gender. The dramatic effect of gender stratification highlights the heterogeneity of OA susceptibility that has been reported in a number of epidemiological studies and which is now being confirmed by genetic analysis. The linked interval, at 11.4 cM, is sufficiently narrow for targeted and more systematic linkage disequilibrium/association analysis to be undertaken.