Radiographic endophenotyping in hip osteoarthritis improves the precision of genetic association analysis

Objective Osteoarthritis (OA) has a strong genetic component but the success of previous genome-wide association studies (GWAS) has been restricted due to insufficient sample sizes and phenotype heterogeneity. Our aim was to examine the effect of clinically relevant endophenotyping according to site of maximal joint space narrowing (maxJSN) and bone remodelling response on GWAS signal detection in hip OA. Methods A stratified GWAS meta-analysis was conducted in 2118 radiographically defined hip OA cases and 6500 population-based controls. Signals were followed up by analysing differential expression of proximal genes for bone remodelling endophenotypes in 33 pairs of macroscopically intact and OA-affected cartilage. Results We report suggestive evidence (p<5×10−6) of association at 6 variants with OA endophenotypes that would have been missed by using presence of hip OA as the disease end point. For example, in the analysis of hip OA cases with superior maxJSN versus cases with non-superior maxJSN we detected association with a variant in the LRCH1 gene (rs754106, p=1.49×10−7, OR (95% CIs) 0.70 (0.61 to 0.80)). In the comparison of hypertrophic with non-hypertrophic OA the most significant variant was located between STT3B and GADL1 (rs6766414, p=3.13×10−6, OR (95% CIs) 1.45 (1.24 to 1.69)). Both of these associations were fully attenuated in non-stratified analyses of all hip OA cases versus population controls (p>0.05). STT3B was significantly upregulated in OA-affected versus intact cartilage, particularly in the analysis of hypertrophic and normotrophic compared with atrophic bone remodelling pattern (p=4.2×10−4). Conclusions Our findings demonstrate that stratification of OA cases into more homogeneous endophenotypes can identify genes of potential functional importance otherwise obscured by disease heterogeneity.


INTRODUCTION
Osteoarthritis (OA) is a phenotypically heterogeneous disease characterised by cartilage degeneration and bone remodelling within synovial joints, 1 and affects 9 million people in the UK alone. 2 The heritable component to OA susceptibility is estimated at 40%-65%, [3][4][5] however fewer than 20 robustly associated genetic loci have been identified to date explaining <10% of this heritability. [6][7][8][9][10][11][12][13][14][15][16][17][18] These candidate gene and genome-wide association studies (GWAS) used a simple dichotomous description of joint involvement that does not reflect the heterogeneity of OA biology, 19 with the exception of the DOT1L association with hip OA in men that was made by measurement of cartilage thickness. 20 OA endophenotypes may be characterised on plain radiographs by the pattern of joint involvement and by the bone remodelling response to the disease. [21][22][23] At the hip, the site of maximum joint space narrowing (maxJSN) can be classified as superior, medial, axial or concentric. 22 24 25 The superior pattern is most common in men and is a risk factor for disease progression. 26 27 Medial migration is more common in women and more likely to occur bilaterally. [26][27][28][29][30] The bone remodelling response to OA also varies between individuals, and may be classified as atrophic (bony attrition, cysts and a lack of osteophytes), normotrophic (osteophytes <grade 3) or hypertrophic (osteophytes ≥grade 3±increased femoral head size). 23 Atrophic OA is associated with thinner trabeculae, lower bone volume and lower bone mineral density than the hypertrophic or normotrophic patterns of OA. 31 32 Atrophic OA is also associated with faster disease progression, osteoporosis and with femoral head collapse. 28 33 The risk of severe hip OA is threefold greater in siblings of index cases with an atrophic pattern of hip OA compared with siblings whose index case had a normotrophic or hypertrophic bone remodelling response. 27 Here, we applied these clinically relevant radiographic endophenotypes to identify novel variants that associate with hip OA in a subset of subjects from the arcOGEN Study, 14 the largest hip and knee OA GWAS to date. We calculated the differences in statistical power achieved using this approach versus a simple dichotomous description of joint involvement. We investigated the effect of using these endophenotype definitions on signal detection for previously established hip OA loci. Finally, we examined the expression levels of genes highlighted in the association analysis in primary Panoutsopoulou  human chondrocytes taken from patients classified using the same bone remodelling endophenotypes.

PATIENTS AND METHODS Study populations and radiographic classification
The hip OA subjects comprised 2118 unrelated individuals of UK European ancestry participating in the arcOGEN Study, an ethically approved study in which all subjects provided written, informed consent prior to participation. 14 All had radiographic evidence of hip OA (Kellgren-Lawrence (KL) Score ≥2) on a digitised anteroposterior radiograph of the affected joint. 21 Hip OA pattern was classified by the site of maxJSN as superior, axial, medial or concentric using the Osteoarthritis Research Society International (OARSI) atlas and the KL Score (table  1). 21 22 The bone remodelling response was classified according to the Bombelli and OARSI atlases as atrophic, normotrophic or hypertrophic. 22 23 Each radiograph was read independently by two clinically trained observers (ST and EZen). Where discrepancy existed between the observers, adjudication was made by the senior clinical author ( JMW). The correlation between the two primary observers was 'very good' for the axial and medial JSN phenotypes (κ=0.85, 95% CI 0.82 to 0.88) 34 and 'good' for the superior JSN phenotype (κ=0.7, 95% CI 0.67 to 0.73); 'good' for the atrophic remodelling phenotype (κ=0.64, 95% CI 0.58 to 0.69) and 'moderate' for the hypertrophic phenotype (κ=0.54, 95% CI 0.50 to 0.58).

Genotyping, imputation and association analyses
Genotypes were extracted from 1817 and 301 hip OA cases previously genotyped on Human 610-Quad and HumanOmniExpress (Illumina, San Diego, USA) platforms, respectively (figure 1). The controls comprised 6500 subjects previously included in the arcOGEN GWAS: The Wellcome Trust Case Control Consortium 2 Study 1958 Birth Cohort, the UK National Blood Donor Service and the Type 1 Diabetes Genetics Consortium. 14 A 1000 Genomes Project-based imputation was conducted, 35 36 and was followed by stringent quality control excluding variants with minor allele frequency (MAF)<0.01 and minor allele count <10 given the different sample sizes of the GWAS case cohorts, 37 imputation information score <0.4 and by inspecting genotype intensity plots 100 kb on either side of the top signals and re-imputation of these regions of association.
Association analyses at ∼7 million variants were made under an additive model (maximum likelihood ratio test, SNPTESTv2.5). Our primary analyses comprised fixed-effect meta-analyses 38 within OA cases stratified by site of maxJSN or by bone remodelling response to identify variants exclusively associated with these endophenotypes (table 1 and figure 1). The axial and medial maxJSN groups were collapsed into one (axial/medial) due to the small sample size in each of these categories. Promising signals were independently validated by de novo genotyping using the MassARRAY iPLEX Gold assay (Agena Bioscience GmbH, Hamburg, Germany).
In our secondary analyses the GWAS outputs for the stratified hip OA cases against population-based controls were compared with that achieved using a non-stratified GWAS of all hip OA cases versus population-based controls. Power was calculated using Quanto V.1.2.4 under the additive model 39 at α=5×10 −8 , and based on study-specific effect sizes and case-control ratios identified in the Human 610-Quad GWAS. Similarly, the sample size required to have 80% power to detect the associated signals with genome-wide significance ( p=5×10 −8 ) was estimated. To calculate the sample size fold reduction afforded by the use of precise phenotype definitions over an all hip OA versus controls approach we fixed the estimated sample size of hip OA cases from infinite to n=1 000 000.

Gene expression analysis
Gene expression profiles were examined in samples from the RAAK Study (Research Arthritis and Articular Cartilage) from individuals undergoing joint replacement surgery, as described previously. 40 Isolated RNA was prepared for hybridisation using the Illumina TotalPrep RNA Amplification Kit (Illumina, San Diego, USA) onto Illumina HumanHT-12 v3 microarrays, with each pair (OA-affected and intact) measured on the same chip. Microarrays were read using an Illumina Beadarray 500GX scanner and data were analysed in R statistical programming language after quality checks using Illumina Beadstudio analysis software (Illumina). Intensity values were normalised and absence of large-scale between-chip effects was confirmed using the Globaltest-package. 41 Probes that were not optimally measured were removed (detection p>0.05 in more than 50% of the samples). All comparisons of the differential normalised probe expression levels between intact and diseased OA chondrocytes were made in SPSS (V.23, IBM, New York, USA) using the generalised estimation equations and corrected for age and sex. A random effect for patient identity (ID) was included to account for the structure of the data and correct for putative correlations between intact and OA-affected articular cartilage from the same joint.
Preoperative plain radiographs of the affected joint were scored independently by two experienced readers ( JMTAM, KH) to determine the OA subtype. Subsequently, osteophytes and JSN were graded 0-3 for lateral and medial femur and tibia (knee joints) and for inferior and superior femur and acetabulum (hip joints) by consensus opinion using the OARSI atlas. 22

Site of maxJSN association analysis
In the GWAS meta-analysis of hip OA cases with superior maxJSN versus all other hip OA cases (see online supplementary figure S1a), the most significant association was observed at rs754106 (MAF=0.46) in intron 1 of the LRCH1 (leucine-rich repeats and calponin homology domain containing 1) gene (T allele OR (95% CIs) 0.70 (0.61 to 0.80), p=1.49×10 −7 (table 2 and see online supplementary figure S2a)), and was consistent across both cohorts (see online supplementary table S1). In the GWAS meta-analysis of hip OA cases with axial/medial maxJSN versus all other hip OA cases (see online supplementary figure S1b), rs754106 was also the most significantly associated variant ( p=3.66×10 -7 ) but with the effect of the T allele in the opposite direction, in keeping with expectation (OR (95% CIs) 1.47 (1.27 to 1.70), table 2 and see online supplementary table S1).
In the association analyses of hip OA cases with superior maxJSN versus population-based controls (see online supplementary figure S3a) and of patients with hip OA with axial/medial maxJSN versus population-based controls (see online supplementary figure S3b), opposite directions of effect were also observed for allele T: OR (95% CIs) 0.91 (0.84 to 0.99), p=0.034 compared with OR (95% CIs) 1.32 (1.14 to 1.52), p=1.39×10 −4 , respectively (see online supplementary table S1). When all hip OA cases were analysed together versus the same population-based control set, the signal was fully attenuated (OR (95% CIs) 1.01 (0.93 to 1.08), p=0.89) (see online supplementary table S1 and figure S3e).
In the association analysis of hip OA with axial/medial maxJSN versus all other hip OA cases the second most significant single nucleotide polymorphism (SNP) was rs17050727 (MAF=0.14), located between PRDM5 (PR domain containing 5) and MAD2L1 (mitotic arrest deficient-like 1 (yeast), T allele OR (95% CIs) 1.65 (1.35 to 2.01), p=6.87×10 −7 (table 2, see online supplementary table S1 and figure S2c). rs17050727 exhibited a similar pattern of association with the converse stratum (superior maxJSN vs all other hip OA cases) as for the variants described above; and its association was also fully attenuated in the hip OA analysis versus population-based controls analysis (OR (95% CIs) 1.00 (0.90 to 1.12), p=0.98, see online supplementary table S1).
Adjustments for sex, height and body mass index (BMI) had little effect on the association strength of the reported signals  (see online supplementary  table S3).
In the GWAS of atrophic versus non-atrophic OA (see online supplementary figure S1d), the most significant association was observed at rs16869403, which resides within the gene encoding the G protein-coupled receptor 98 (GPR98) (MAF=0.10, allele G OR (95% CI) 2.11 (1.61 to 2.75), p=1.37×10 −7 , see online supplementary figure S2f and table S1). The strength of association was similar in the analysis of atrophic OA cases versus population-based controls (OR (95% CI) 1.93 (1.52 to 2.46), p=4.14×10 −7 , see online supplementary table S1 and figure S3d). This association was fully attenuated in the nonstratified hip OA versus population-based controls analysis (OR (95% CI) 1.05 (0.93 to 1.19), p=0.43; see online supplementary table S1). Genotype and minor allele concordance between typed and imputed genotypes at this SNP were 99% and 97%, respectively (see online supplementary table S3).

Power gains
Power increases substantially in the within-OA analyses compared with the non-stratified hip OA versus population-based controls analysis (table 3). Small to modest power increases are observed in the stratified OA cases versus population-based controls analyses compared with the non-stratified hip OA versus population-based controls analysis. The decrease in sample size required to reach genome-wide significance for an analysis within OA cases over the non-stratified hip OA versus population-based controls analysis ranged from 41-fold to 286-fold. The decrease in sample size between stratified hip OA versus population-based controls analyses over the non-stratified hip OA versus population-based controls analysis ranged from 1-fold to >240-fold.  1.27)). This suggests that the sex disparity of effects at the DOT1L locus is not driven by the differences in prevalence of different JSN patterns between the two sexes. Five other index SNPs (rs9350591, rs10948172, rs11177, rs4836732, rs6094710) also exhibited similar or lower p values and higher ORs in endophenotype-based association analyses than in the hip OA versus population-based controls analysis, despite the decrease in case sample size (ranging from 30% to 77%) (see online supplementary table S4). However, unlike the DOT1L variant, none of these showed an association that was strengthened enough to compensate for the multiple comparisons performed.

Gene expression analyses in articular cartilage
The expression levels of 10 genes located near variants highlighted in the bone remodelling stratified GWAS (GRP98 and ADGVR1 for rs16869403; STT3B, GADL1, OSBPL10, THRAP3 and TGFBR2 for rs6766414; and C1orf65, CAPN8 and SUSD4 for rs61837881) were examined in macroscopically intact and OA-affected articular cartilage (n=33 pairs from the RAAK Study, 6 with an atrophic, 19 with a normotrophic and 8 with a hypertrophic phenotype). We observed robust expression for SUSD4, STT3B, OSBPL10 and TGFBR2, with mean normalised probe level values of 7.9, 7.8, 8.4 and 11.5, respectively. Of these genes, only STT3B showed differential expression with respect to the disease process, with significant upregulation in OA-affected versus intact cartilage independent of joint site, particularly in patients with hypertrophic/normotrophic OA (fold change=1.30, p=4.23×10 −4 ; figure 2). When the genotypes of rs6766414 (TT and TG) were tested, a significantly lower expression of STT3B in lesioned OA cartilage was observed among carriers of allele G (n=9 heterozygotes-no homozygotes for the G allele were present in our data) compared with  figure 3).

DISCUSSION
We applied clinical radiographic endophenotypes to stratify genome-wide association analysis of hip OA and identify six putative associations ( p<5×10 −6 ). For the strongest signals in both endophenotypes we observe consistent effects across complementary strata both in the primary analysis and against population-based controls in the secondary analyses. These signals were all fully attenuated in the non-stratified analyses against the population-based controls, consistent with our hypothesis that this stratification approach increases the sensitivity of signal detection. rs754106, which showed the strongest association with pattern of maxJSN, resides in intron 1 of the LRCH1/CHDC1 gene whose function has not been well characterised. However, replicating associations with knee OA at variants spanning intron 1 of LRCH1 have been identified in samples of European descent. 42 rs6766414, which showed the strongest association with bone remodelling response, is located between STT3B and GADL1. STT3B mediates both co-translational and posttranslational N-glycosylation of proteins and GO terms associated with this gene include glycoprotein catabolic process. 43 STT3B encodes a catalytic subunit of the N-oligosaccaryl transferase complex, 44 suggesting that altered glycosylation in OA-affected cartilage mitigates protein folding and affects intermolecular interactions. 45 Thus we hypothesise that the upregulation of STT3B is a beneficial response among hypertrophic cases.
This study also has limitations. While the interobserver agreement of the radiographic classifiers supports the approach taken, differences in calling between observers does impact on the overall sensitivity of both the genetic association and expression analyses. Although the identified loci did not meet the Bonferroni-adjusted genome-wide significance threshold of p<5×10 −9 used because of the multiple discovery association analyses conducted, the putative association of variants with specific endophenotypes corroborated by substantial or full attenuation in other strata warrants replication and further functional investigation.
The hypothesis that stratification of cases into distinct endophenotypic categories provides a more direct link to the genetic basis of OA has not been previously tested systematically. However, this approach has been successfully applied to better understand disease biology in related complex traits, including joint shape and pain. [46][47][48][49][50] The results of our analyses show that, if further replication is targeted to the specific OA endophenotypes, study power may improve from nearly 0% in the traditional approach to 80% in a sample size of <5000 cases. We used post hoc power calculations, applying the observed variances from the study population of interest to exemplify the difference in the statistical power of the tests we performed. Such calculations have limitations, but are useful in designing follow-up studies and in conducting meta-analyses. We address one limitation in interpreting post hoc power calculations, ascertainment bias, by examining the actual summary statistics of the previously established genome-wide significant locus (DOT1L) associated with JSN. We find that the index SNP is more strongly associated in hip OA cases with superior JSN than in the non-stratified hip OA versus controls analyses ( p=8×10 −6 and p=3×10 −4 , respectively). As more precise phenotype definitions are accrued in the field and future replication is targeted to the specific endophenotype, the strength of the observed associations may increase and the power burden of multiple testing may be resolved. Our findings demonstrate that stratification of OA cases into more homogeneous endophenotypes can identify genes of potential functional importance otherwise obscured by disease heterogeneity.