Introduction

Keratolytic winter erythema (KWE, OMIM 148370) is an autosomal dominant skin disorder of unknown etiology. It is also known as ‘Oudtshoorn skin disease’ in reference to the South African district where it was first described.1 KWE is characterised by cyclical erythema, hyperkeratosis, and peeling of the skin of the palms and soles, arresting at major skin creases. The symptoms are especially noticeable during winter. The prevalence of KWE in the South African Afrikaans-speaking Caucasoid population is 1/7200,2 and a founder effect has been confirmed with haplotype analysis.3 In Germany, several cases have been reported.3 The disorder usually manifests within the first 5 years of life, and the age of onset seems to be related to the severity of the phenotype, with individuals who display an early age of onset being more severely affected.2 The gene causing KWE was localised to chromosome 8p22-p23 in a 5-cM interval between markers at D8S550 and D8S552 by linkage analyses, with a proposed ancestral recombination event pointing to a 1-cM interval between markers at D8S550 and D8S265.3 The identification of the gene causing KWE will contribute to a better understanding of the process of epidermal differentiation.

The chromosomal region 8p22-p23 has also been associated with frequent loss of heterozygosity (LOH) in different types of cancer. There are several reports of LOH of chromosome 8p in breast cancer,4,5,6,7,8,9 lung cancer,10,11 head and neck cancer,12 prostate cancer,13 hepatocellular carcinoma,14 intrahepatic cholangiocarcinoma,15 and urinary bladder cancer.16,17

This study reports on the construction of a transcript map between the markers at D8S550 and D8S1759 in the KWE critical region using exon trapping, cDNA selection, genomic sequencing, and sequence analyses.

Materials and methods

Construction of a BAC contig

The CITB Human BAC Libraries B and C (Invitrogen, Inchinnan, Scotland, UK) were initially screened by PCR using the primers for D8S550, D8S265, and D8S1593, and then using STS primers derived from the BAC ends (164D9T7F: 5′-AAA TCA AGG TGT TGG CTG GG/164D9T7R: 5′-ACA TCT CAC TCT GTC ACG CA; 169O5SP6F: 5′-GTG GAA GGA TAC TGA TGT GG/169O5SP6R: 5′-CAG CAG CCT CTC GTG TCT TA; 169O5T7F: 5′-CCT GGT CAG CTA ATT TCT GG/169O5T7R: 5′-AAA CAG ACT CCT TGG CAG CA).

Exon trapping

Exon trapping experiments18 were performed following the recommendations of the manufacturer (Exon Trapping System; Life Technologies, Karlsruhe, Germany) with the BAC clones CTC-306G11, CTC-493P15, CTB-164D9, CTB-169O5, CTB-65D4, CTC-271O15, and CTC-367I24. In brief, BAC DNA was digested with either PstI or BamHI and BglII and subcloned into the exon trapping vector pSPL3B. The vector constructs were transformed into E. coli. The DNA pool of different plasmid clones was then isolated and transfected into COS-7 cells using LIPOFECTACE Reagent (Life Technologies). After 2 days of incubation at 37°C, total RNA was isolated with TRIzol Reagent (Life Technologies). Reverse transcription with the vector primer SA2 (5′-ATC TCA GTG GTA TTT GTC AGC) was followed by a PCR with vector primers SA2 and SD6 (5′-TCT GAG TCA CCT GGA CAA CC). To eliminate vector-derived products, the reaction mix was then digested with BstXI, and a second PCR with the primers dUSA4 (5′-CUA CUA CUA CUA CAC CTG AGG AGT GAA TTG GTC G) and dUSD2 (5′-CUA CUA CUA CUA GTG AAC TGC ACT GTG ACA AGC TGC) was performed. The dUMP residues in these primers enabled a further subcloning into the UDG cloning vector pAMP10 after incubation of the PCR products with Uracil DNA Glycosylase (UDG). The size of the inserts of all transformants was determined by PCR followed by agarose gel electrophoresis. Products longer than the spliced vector were further analysed. All putative exons were sequenced using standard protocols for an A.L.F. automated DNA sequencer (Amersham Pharmacia Biotech, Freiburg, Germany) and an ABI 377 automated DNA sequencer (Applied Biosystems, Foster City, CA, USA). BLAST searches19 with the generated exon sequences identified several ESTs, and the corresponding clones were obtained from the Resource Center of the German Human Genome Project (http://www.rzpd.de) or the HGMP Resource Center (http://www.hgmp.mrc.ac.uk) and completely sequenced.

cDNA direct selection

cDNA direct selection was performed essentially as described by Morgan et al.20 Four overlapping YAC clones, 770E9, 915H4, 737E5 and 773G4, from contig WC8.1, were identified from the Whitehead Institute website (http://www.genome.wi.mit.edu). Pulsed field gel electrophoresis was used to separate the YAC clones from individual yeast chromosomes and the sizes were compared to the expected sizes (770E9 (1400 kb); 915H4 (740 kb); 737E5 (800 kb); 773G4 (1570 kb)). The YACs were excised, purified and biotinylated using both random priming and nick translation. Salivary gland (Sal) and foetal brain (Fb) cDNA pools were constructed using both oligo dT and random priming from total cytoplasmic RNA. Linkers (Fb: 5′-CGA GAA TTC TGG ATC CTC; Sal: 5′-CCA CTG AAT TCT CAG TGA) were annealed to each of the pools separately. Repeat suppressed cDNA pools were hybridised for 48 h in solution to the biotin labelled genomic DNA to an intermediate C0t1/2 of between 100 and 200. Streptavidin beads were added to the solution and three washes, 2×SSC/0.1% SDS, 1×SSC/0.1% SDS, and 0.1×SSC/0.1% SDS, at 65°C were performed to elute any cDNAs not bound to the genomic DNA within the YACs. An alkali wash was used to elute the selected cDNAs. Two rounds of hybridisation selection were performed to ensure that quasi-normalisation occurred to increase the chance of identifying low abundance transcripts. Selected cDNAs were amplified using the library specific linker primers. The two tissue specific cDNA pools were separately subcloned non-directionally into pAMP10 (Life Technologies). Nine hundred and sixty clones from each of the libraries were picked into microtiter dishes and gridded on a Biomek 1000 station (Beckman, Fullerton, CA, USA) at a high density, in duplicate, onto nylon filters. To eliminate clones containing repetitive sequences, the filters were hybridised with total human DNA and colonies showing positive hybridisation were excluded from further analysis. 
Redundancy screening was done by hybridisation. DNA from the isolated non-redundant cDNA clones was cycle sequenced with the M13R primer (5′-CAG GAA ACA GCT ATG AC) using fluorescent DyeDeoxy Terminators (FSkit) on an ABI 377 automated DNA sequencer (Applied Biosystems).

Genomic sequencing and analyses

The BAC clones CTC-306G11, CTC-493P15, CTB-164D9, CTB-169O5, CTB-65D4, CTC-271O23, and CTC-367I24 (CITB Human BAC Libraries, Invitrogen) were completely sequenced using a combination of shotgun and directed methods as described.21 Data were collected on ABI 377 automated sequencers (Applied Biosystems) and assembled and edited using the GAP package.22 The generated contiguous genomic sequence of 507 616 bp (acc. nos. AF131215, AF131216) and the draft sequence of BAC clone RP11-148O21 between D8S1695 and D8S1759 (acc. no. AC022239) were analysed using RUMMAGE23 and several other programmes (http://www.hgmp.mrc.ac.uk/NIX) including BLAST and FASTA.19,24 For analysis, the translated proteins were submitted to the GeneQuiz system (http://jura.ebi.ac.uk:8765) and the ExPASy molecular biology server (http://www.expasy.ch).

RT–PCR

About 150 ng of total RNA from each of skin, primary keratinocytes, HeLa cells, and lymphoblastoid cell lines was used for cDNA synthesis with either oligo(dT)15 or random hexamers as primers in a 10 μl reaction. One microlitre was then used in a 15 μl PCR. The PCR products were either sequenced directly or cloned into the TA vector pCR2.1 (Invitrogen) and sequenced using standard protocols for A.L.F. automated DNA sequencers (Amersham Pharmacia Biotech) or ABI 377 automated DNA sequencers (Applied Biosystems) with primers −21M13F (5′-TGT AAA ACG ACG GCC AGT), M13R (5′-CAG GAA ACA GCT ATG AC), or gene specific primers. The sequences were analysed using BLASTN,19 and corresponding cDNA clones25 were ordered from either the German Resource Center (http://www.rzpd.de) or the HGMP Resource Center (http://www.hgmp.mrc.ac.uk) and completely sequenced.

Expression studies

Patterns of transcription of the isolated cDNAs were studied by RT–PCR experiments using commercially available cDNA pools from 12 different tissues (set 1 and set 2 (Origene, Rockville, MD, USA)), and cDNA from skin, primary keratinocytes, HeLa cells, and lymphoblastoid cell lines. The primers used were as follows: BLK: BLK4F (5′-TTG CTC CAA TCA ACA AGG CC) and BLK4R (5′-ACA TGG TTC CCT CCT TCA GC); MTMR8: S8aG4F (5′-GAT GAA GCT CTT CGG AAG GT) and S9A10R (5′-CTC TTT GAT GTG AGT CAG CC); TDH: vir17F (5′-GTT CAT TAG GAT GCT GAG GC) and vir17R (5′-ATA GAT GGT CCT GGG TCT CT); C8orf13: vir3F (5′-GAG ACC CAC TGG AGT AAC TT) and vir3R (5′-GAA GTG GGT GCA GAA GAG GG); AMAC: vir8F (5′-CTG AGT TGG AGT TGT GTG GG) and vir8R (5′-CGG GCT CCC AAG TTC TAT CT); C8orf12: vir33F (5′-AGG TCT CCC TGC TTC TTC AA) and vir33R (5′-TCC CTT GCC AAT GTA ATC GG); C8orf14: vir35F (5′-AAG CTC TCA TCC AAT GTC CC) and vir35R (5′-ATT AGT CCA GGG TGA GTC TG); C8orf7: vir11F (5′-TGT AGT CCT GCA CGA ACC AG) and vir11R (5′-TCG ATG CAC ATC TGC CAC TG); C8orf8: vir32F (5′-GCA GAC ATG ATG GCA TCT TA) and vir32R (5′-GAA GGT CCT TGT GCA TGA AG); C8orf5: vir15F (5′-TGA GCA CAA ATG AAA GCG AC) and vir15R (5′-TGA GCC ACA TAT CCA TTC AG); C8orf9: 205798_5rev (5′-TGC TGG AGC CCA CAA CAA CT) and 205798pr (5′-CCT GGA GGA TTC CGT AAG GT); C8orf6: vir25F (5′-TTT CTA CAA TGA CCC ACC AC) and vir25R (5′-TGA CCA CAT ACA GCT CTT CT).

Mutation analyses

Mutational analyses were performed with genomic DNA of at least one healthy German control individual and one or more patients from the German KWE pedigree investigated in the original linkage study.3 Each exon was amplified by PCR, directly sequenced using the BigDye terminator cycle sequencing kit (Applied Biosystems), and electrophoresed on an ABI 377 automated DNA sequencer (Applied Biosystems).

Results and discussion

Construction of a BAC contig and genomic sequencing

As an essential step towards identifying the KWE gene, a complete physical map has been generated spanning the KWE critical region between the markers at D8S550 and D8S265.3 These two markers were initially used to screen a BAC library. Additional STSs generated by end-sequencing of BACs were used in a subsequent screening to close remaining gaps (Figure 1). Seven BAC clones covering the region were completely sequenced and a contiguous genomic sequence of 507 616 bp was generated (acc. nos. AF131215, AF131216). Within this sequence, markers at D8S550, D8S1755, D8S265, and D8S1695 were identified. In addition, the contig could be extended to D8S1759 by the overlapping BAC clone RP11-148O21 (acc. no. AC022239) reported to consist of six unordered contigs. Three of these were ordered based on overlaps with BAC CTC-367I24 (acc. no. AF131216), another two contigs were ordered by alignments to human BLK mRNA (acc. no. S76617) which also defined their orientation relative to the latter contig, and the orientation of the remaining contig is given by the clone end. Thus, a genomic sequence of 634 404 bp comprised of four contigs of 518 835, 18 713, 77 179, and 19 677 bp was generated and analysed. In contrast to the Généthon linkage map,26 in which the order of markers is tel-D8S1755-(D8S1695-D8S550)-D8S265-D8S1759-cen, we found the order of markers as tel-D8S550-D8S1755-D8S265-D8S1695-D8S1759-cen. In addition, we identified 13 new polymorphic microsatellite loci (D8S2619 through D8S2631) between D8S550 and D8S1759 (Figure 1).

Figure 1

Physical and transcript map of the KWE gene critical region on human chromosome 8p22-p23. The order of previously identified microsatellite markers (bold) is given as determined by the genomic sequence, and the positions of 13 new microsatellite markers (marked by asterisks) are shown at the top. The transcript map is shown underneath. Twelve transcripts, depicted as filled boxes, were identified using various approaches. The EST locus D8S1593 is part of the novel myotubularin-related protein gene MTMR8. Partial overlaps of genes are shown as hatched boxes. The direction of transcription is presented by arrows. Local analysis of the G+C content of the genomic sequence suggested three different G+C content domains. The average G+C content of these regions is shown by the dotted line. BAC clones were identified by PCR-screening of a human BAC library. The BAC clones that cover the whole region between D8S550 and D8S1695 were completely sequenced. The contig was extended to D8S1759 by the partially sequenced BAC clone RP11-148O21, which was identified by database searches.

The average G+C content of the 634 kb region is 45% (Figure 1). Local analysis revealed compositionally distinct regions having average G+C contents of 49% (50 kb in the distal portion, positions 1–50 000), 42% (300 kb representing the central portion, positions 50 001–350 000) and 48% (284 kb representing the proximal part of the contig). These differences suggest that the analysed region spans different G+C content domains: tel-H2-L-H2-cen. Seven CpG islands were identified in the genomic region (length ≥500 bp, G+C ≥50%, ratio of observed to expected CpGs ≥0.6).27 Five of these are associated with genes or transcription units. Overall, about 41% of the 634 kb segment is composed of repetitive elements, such as SINEs (12%), LINEs (16%), and LTR elements (7%). Locally, the repeat content is low in the distal region of high G+C content with an average of 36%, and rises to 42% in the central and proximal portions of the region analysed.
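The regional average quoted above can be checked as a length-weighted mean of the three domain values. The following is a minimal sketch using only the domain sizes and G+C percentages given in the text; the function name `weighted_gc` is illustrative and not part of the original analysis.

```python
def weighted_gc(domains):
    """Length-weighted mean G+C percentage over (length_kb, gc_percent) pairs."""
    total_length = sum(length for length, _ in domains)
    return sum(length * gc for length, gc in domains) / total_length


# Domain sizes (kb) and average G+C contents (%) as quoted in the text:
# distal, central and proximal portions of the 634 kb region.
domains = [(50, 49.0), (300, 42.0), (284, 48.0)]

total_kb = sum(length for length, _ in domains)
average_gc = weighted_gc(domains)

print(f"{total_kb} kb, mean G+C {average_gc:.1f}%")
```

The three domains add up to the 634 kb analysed, and their weighted mean rounds to the 45% reported for the whole region.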

Identification of transcripts by exon trapping, cDNA selection, and sequence analysis

Various approaches were taken to identify positional candidate genes in the KWE critical region. Exon trapping was used to generate novel ESTs from the seven BAC clones of the sequenced genomic contig. Five different cDNAs (corresponding to parts of C8orf7, C8orf8, MTMR8, TDH, C8orf13) were initially derived from 12 trapped exons by RT–PCR (Table 1). BLAST searches of their sequences identified several ESTs extending the cDNAs, which were then confirmed by RT–PCR. Moreover, the transcription of 21 predicted exons contained in the transcripts C8orf7, TDH, MTMR8, and C8orf13 could be verified by RT–PCR. The cDNA selection using four YAC clones from the contig WC8.1 of the Whitehead Institute led to the initial isolation of 790 non-redundant cDNA clones (acc. nos. AW408850 to AW409398, AW441142 to AW441186, and AW737018 to AW737061).28 The resulting EST sequences were compared to the genomic sequence, and of 128 clones with at least 90% sequence identity, 26 contained exonic sequences. Four additional transcripts (corresponding to C8orf5, C8orf6, AMAC, BLK) were detected using selected EST sequences. These sequences also confirmed C8orf7, MTMR8, and C8orf13. Another three transcripts (C8orf9, C8orf12, C8orf14) were identified by BLAST searches of the genomic sequence between D8S550 and D8S1759 (Table 1).

Table 1 Transcripts isolated from the region between D8S550 and D8S1759

The great majority of the exons of the transcripts described could be identified by direct analysis of the genomic sequence using various techniques (79%; Table 1). The total number of exons identified by exon trapping (91) is relatively small; however, 61 of these were confirmed as transcribed sequences and mapped to the original region (data not shown). In contrast, 378 exons were predicted in the genomic sequence by at least one algorithm, but only 23% of the exons analysed could be confirmed experimentally. At least in the tissues tested, the proportion of false positive sequences obtained by exon trapping is thus rather small (33%) compared with cDNA selection (80%) and genomic sequence analysis by exon prediction (77%). In particular, it became obvious that combining different exon prediction tools23 helped to reduce false positive results. Although there is considerable overlap between the exons found by the different approaches described, a number of transcripts would not have been isolated with one method alone. These findings show that it is still useful to combine the analysis of the available human genomic sequence with at least one other method of searching for transcribed sequences.
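The false positive proportions compared above follow from the counts reported in this section. The sketch below recomputes them. It assumes that the cDNA selection figure refers to the 26 exon-containing clones among the 128 clones with at least 90% identity to the genomic sequence, which the text does not state explicitly; the function name is illustrative.

```python
def false_positive_percent(candidates, confirmed):
    """Percentage of candidates that were not confirmed."""
    return 100.0 * (candidates - confirmed) / candidates


# Exon trapping: 91 trapped exons, 61 confirmed as transcribed sequences.
exon_trapping = false_positive_percent(91, 61)

# cDNA selection (assumption: 26 exonic clones out of 128 matching clones).
cdna_selection = false_positive_percent(128, 26)

# Exon prediction: only the confirmed share (23%) is given in the text.
exon_prediction = 100.0 - 23.0

for name, value in [("exon trapping", exon_trapping),
                    ("cDNA selection", cdna_selection),
                    ("exon prediction", exon_prediction)]:
    print(f"{name}: {value:.0f}% false positives")
```

Under these assumptions the three values round to the 33%, 80% and 77% given in the text.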

Transcript map between D8S550 and D8S1759

Altogether, 12 transcripts were identified in the region between D8S550 and D8S1759 (Figure 1; Table 1). Transcription units as well as corresponding genes were localised on the genomic sequence and their exon/intron structures determined. The transcription was analysed by RT–PCR with cDNAs from different tissues. The expression patterns are shown in Table 2.

Table 2 Expression patterns of the transcripts based on RT–PCR

One known gene, BLK, encoding B-lymphocyte specific tyrosine kinase, was identified and localised between D8S1695 and D8S1759. The gene comprises 13 exons spanning about 70 kb. It harbours an open reading frame of 1518 bp starting in exon 2 and terminating in exon 13. In contrast to Drebin et al.,29 the transcription of BLK could be detected in lymphoblastoid cell lines, spleen, liver, leukocytes, ovary, muscle, and testis (Table 2). As has been described for genes having tissue specific expression,27 the 3′ end of BLK is associated with a CpG island of 1097 bp in length and a G+C content of 70%. BLK is a member of the SRC family, which is thought to play an important role in the signalling pathways controlling cell proliferation and differentiation.30 BLK may also be involved in membrane attachment and thymopoiesis.29,30 This gene is therefore a good positional candidate for the cancers mapping to this region.

A putative novel myotubularin-related protein gene31 (MTMR8, acc. no. AJ297823), containing D8S1593 in its 3′ UTR, was identified between D8S1755 and D8S265. It is comprised of 10 exons spanning 43 kb. The 5′ end of the gene is associated with a CpG island of 792 bp and a G+C content of 70%. The island also shows transcriptional activity in the opposite direction. The cDNA corresponding to MTMR8 is 7081 bp in length as confirmed by Northern blot analysis. The open reading frame predicts a primary structure of 549 residues. It starts in exon 1 and terminates in the last exon. Based on the similarity of the polypeptide to members of the myotubularin protein family (32–40%), the gene was named myotubularin-related protein gene 8. The gene is expressed in all 16 tissues investigated by RT–PCR. So far, eight other proteins belonging to the myotubularin family have been identified, and two of these have been associated with diseases. Myotubularin (MTM1) is mutated in X-linked myotubular myopathy, a severe recessive congenital muscle disorder32 (XLMTM, OMIM 310400), and mutations in myotubularin-related protein 2 (MTMR2) have recently been reported to cause Charcot-Marie-Tooth disease type 4B, an autosomal recessive demyelinating neuropathy with myelin outfoldings.33

The gene product of TDH (acc. no. AJ301562) shows similarities to murine L-threonine 3-dehydrogenase (79% identity in 200 residues, acc. no. AF134346), to the CG5955 gene product of D. melanogaster (55% identity in 194 residues, acc. no. AAF51607) and the hypothetical protein F08F3.4 of C. elegans (51% identity in 187 residues, acc. no. T29433). The human gene consists of nine exons spanning 30 kb, and its 5′ end is associated with a 70% GC-rich CpG island of 1195 bp in length. The cDNA encoding TDH is 1357 bp in length and harbours an open reading frame of 250 codons, which starts in the first exon and ends in exon 7. TDH is expressed in only six of the investigated tissues (Table 2). There are at least three different splice forms of TDH in lymphoblastoid cells. We detected an alternatively spliced exon of 111 bp between exons 1 and 2. The resulting transcript contains an open reading frame of 231 codons that starts in exon 2. A further alternatively spliced exon of 59 bp was detected between exons 6 and 7. This splice variant contains an open reading frame of more than 280 codons.

C8orf13 (acc. no. AJ301564) contains D8S265 in its 3′ UTR. The gene is comprised of three exons spanning 45 kb. Its 5′ end is associated with a CpG island of 1097 bp in length and a G+C content of 70%. The transcript was detected in 10 of the 15 tissues tested (Table 2). The open reading frame predicts a primary structure of 214 amino acids and shows similarities to the human SEC oncogene (37% identities in 83 residues, acc. no. X52259), which is involved in human breast, colon, and prostate carcinomas.34 This is noteworthy with respect to the frequent LOH reported for this region. In particular, Wang et al.5 and Richard et al.8 observed LOH in 8p22-pter in invasive ductal breast carcinoma, and Sigbjörnsdottir et al.9 noticed significantly more frequent chromosomal deletions between D8S550 and D8S1752 in patients with BRCA2 999del5 linked breast cancer than in patients with sporadic breast cancer. Kawaki et al.15 reported LOH in intrahepatic cholangiocarcinoma between markers D8S265 and D8S258. In bladder cancer, Takle and Knowles16 observed deletions between markers D8S264 and D8S133. Also, the region between markers D8S277 and D8S258 is reported as a ‘hot spot’ of allelic loss in small cell and non-small cell lung cancer,11 which was observed early in the pathogenesis of lung cancer.10 Therefore, C8orf13 is a good candidate for a tumour suppressor gene thought to be located in this chromosomal region.

AMAC (acc. no. AJ291677) exhibits high similarity to murine acyl-malonyl condensing enzyme. In contrast to the mouse gene (Amac1, acc. no. NM_019871), however, the human transcript AMAC consists of a single exon. The longest open reading frame starts at position 122 and is 1017 bp in length. Remarkably, all three possible forward open reading frames, which were verified by RT–PCR experiments and genomic sequencing, show similarities to different parts of the mouse protein (in detail, positions 845–1135, reading frame +2, 82% identity in 97 residues; positions 237–530, reading frame +3, 56% identity in 99 residues; positions 574–732, reading frame +1, 64% identity in 53 residues). This indicates that AMAC might be a non-functional copy of the gene and hence a transcribed pseudogene. Moreover, a 3′-polyadenylation tract, which is present in the genomic DNA adjacent to AMAC, and the absence of introns are characteristic of pseudogenes that have arisen by retrotransposition.35 In the human genome, three additional sequences with high similarities are located on 17p13.1 (acc. no. AC007732.3), 18p11.2 (acc. no. AC022030.2), and 17q11.2 (acc. no. AC022706.4). Functional analyses, however, are still necessary to investigate whether AMAC is the homologue of murine Amac1 or a transcribed pseudogene.
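The reading frames quoted for the three AMAC segments can be derived from their start positions. The following is a minimal sketch, assuming 1-based nucleotide positions on the transcript as given in the text; the function name `forward_frame` is illustrative.

```python
def forward_frame(start):
    """Forward reading frame (+1, +2 or +3) for a 1-based start position."""
    return (start - 1) % 3 + 1


# Segment coordinates and reading frames as quoted in the text.
segments = {
    (845, 1135): 2,
    (237, 530): 3,
    (574, 732): 1,
}

for (start, end), reported in segments.items():
    print(f"{start}-{end}: frame +{forward_frame(start)} (reported +{reported})")

# Longest open reading frame: starts at position 122, 1017 bp long.
orf_start, orf_length = 122, 1017
orf_end = orf_start + orf_length - 1
orf_frame = forward_frame(orf_start)
```

With these coordinates the longest open reading frame lies in frame +2 and ends at position 1138, so it shares its frame with the segment at positions 845–1135, while the other two similarity segments fall in frames +3 and +1.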

The remaining seven transcripts do not show similarities to known genes (Table 1). The gene encoding C8orf12 (acc. no. AJ301563) spans 70 kb and consists of seven exons. The longest open reading frame of 105 codons starts in exon 5 and ends in the last exon. This transcript was detected in 11 of the 16 tissues tested by RT–PCR. C8orf14 (acc. no. AJ291678) is only transcribed in HeLa cells, lymphoblastoid cell lines, primary keratinocytes, skin, and testis. It is 1927 bp in length and consists of three exons. The last exon harbours the longest open reading frame of 93 codons. C8orf7 (acc. no. AJ301560) comprises two exons with an open reading frame of 288 codons and spans about 7 kb. The transcript was detected in five of the 16 tissues investigated. C8orf8 (acc. no. AJ301561) consists of two exons which are separated by an intron of 20 kb in length. It was isolated by exon trapping and shown to be transcribed in lymphoblasts and testis. In contrast, C8orf5 (acc. no. AJ305312) could be detected in almost every tissue tested. This transcript is generated by only one exon. The same is true for C8orf9 (acc. no. AJ291676) and C8orf6 (acc. no. AJ307469).

Interestingly, the genes corresponding to transcripts TDH and C8orf12 partly overlap in their 3′ and 5′ UTRs, respectively. Since they are transcribed in the same direction, one might assume that they are controlled by the same mechanisms. However, this is apparently not a cell-type specific regulation, as they have different expression patterns (Table 2), and there is so far no indication of a functional relationship between the two genes. The overlapping genes C8orf12 and C8orf13 are transcribed from different DNA strands. The last four exons of C8orf12, which harbour the open reading frame, are positioned in the second intron of C8orf13. C8orf9 and the gene encoding MTMR8 also overlap and are transcribed from opposing DNA strands, resulting in mRNA molecules sharing a complementary 5′ UTR sequence. As C8orf9 is an intronless gene with a short open reading frame, it might function as an antisense control element of MTMR8. Although overlapping genes occur frequently in viral genomes as well as in genomes of cellular prokaryotes and prokaryote-derived organelles such as mitochondria,36 they are less frequent in eukaryotes. Generally, the overlap of two genes transcribed from different DNA strands is more common37,38,39,40,41,42,43,44 than the overlap of genes transcribed from the same DNA strand.45,46 The functional role of the arrangement of such antisense genes has not been determined in most cases. Several functions have been proposed, including transcriptional inhibition by steric hindrance when the two genes are transcribed at the same time, interference with splicing and processing, inhibition of translation by formation of RNA duplexes, and effects on mRNA stability.

Mutation analyses

In order to determine whether one of these transcripts was indeed the KWE gene, each exon belonging to a transcript was screened for mutations by direct sequencing of genomic DNA from unaffected individuals and KWE patients of the German pedigree linked to this region.3 A total of 46 single nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms were identified (for details, see acc. nos. AF131215 and AF131216). Thirty-nine of these are located within the genes described here (Table 1), including seven coding SNPs. In C8orf6, the exchange of T to C at position 1058 leads to the putative substitution of tryptophan by leucine at position 38 (W38L), and in MTMR8 an exchange of A to T leads to the alteration of methionine to leucine (M200L). A further coding SNP, I104R, caused by a T to G exchange at position 352, was discovered in TDH, and in C8orf12 the substitution of T to A at position 1105 affecting the stop codon results in an elongation of the predicted peptide by five residues, KCLSP. In C8orf13, two further coding SNPs were identified, H56Q, caused by an exchange of T to G at position 616, and T107S, caused by a C to G substitution at position 768. The exchange of A to C at position 1654 in C8orf14 leads to a potential exchange of threonine to proline (T67P). All the polymorphisms were validated by the analysis of 10 healthy individuals from five independent German pedigrees. The KWE-causing mutation, though, was not identified in any of the transcripts, making it very unlikely that one of these is the KWE gene. As promoters and introns were not analysed exhaustively, it cannot be excluded that one of these harbours the pathogenic mutation. However, it now seems necessary to confirm the refined localisation of the KWE gene. As the proximal boundary of the region at D8S265 is based on only one family with a proposed ancestral recombination,3 further members of this family should be recruited and new families analysed to ensure the exact localisation of the gene. Nevertheless, the identification of the novel genes described here and the physical mapping of new microsatellite markers and SNPs will facilitate the further analysis of disease loci that map to chromosome 8p22-p23.
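The coding SNPs reported in the mutation analysis pair a single-base exchange with an amino acid substitution. The sketch below shows how such a pairing can be related to the standard genetic code by listing the codon changes compatible with it. The reference codons are not given in the text, so the codons returned here are inferences from the standard code only, and the function name `compatible_codons` is hypothetical. Three of the reported substitutions (M200L, H56Q, I104R) serve as examples.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compatible_codons(ref_aa, alt_aa, ref_base, alt_base):
    """Return (ref_codon, alt_codon) pairs in which a single ref_base to
    alt_base exchange changes ref_aa into alt_aa (one-letter amino acid codes)."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa:
            continue
        for i, base in enumerate(codon):
            if base != ref_base:
                continue
            mutated = codon[:i] + alt_base + codon[i + 1:]
            if CODON_TABLE[mutated] == alt_aa:
                hits.append((codon, mutated))
    return hits


m200l = compatible_codons("M", "L", "A", "T")  # MTMR8, A to T
h56q = compatible_codons("H", "Q", "T", "G")   # C8orf13, T to G
i104r = compatible_codons("I", "R", "T", "G")  # TDH, T to G

print(m200l, h56q, i104r)
```

For each of these three examples exactly one codon change is compatible with the reported base exchange, which narrows down the affected codon position without access to the sequence itself.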