Question

1. This exercise illustrates a very commonly encountered task in molecular biology: finding a gene sequence in a database, designing PCR primers for its amplification, and cloning it into a plasmid vector. Here, our goal is to amplify the gene encoding hexokinase from bakers' yeast (Saccharomyces cerevisiae). The first step is to find the correct DNA sequence from the National Center for Biotechnology Information (NCBI). These data are shown below in the 5' to 3' direction. Next, identify the start and end of the protein coding region (begins at nucleotide 718 and ends at nucleotide 2175).

LOCUS       YSCHXKA      2778 bp    mRNA    linear   PLN 27-APR-1993
DEFINITION  Yeast (S.cerevisiae) hexokinase PI (HXK1) gene, complete cds.
ACCESSION   M14410 M11184
VERSION     M14410.1
KEYWORDS    hexokinase; hexokinase PI.
SOURCE      Saccharomyces cerevisiae (baker's yeast)
  ORGANISM  Saccharomyces cerevisiae
            Eukaryota; Fungi; Dikarya; Ascomycota; Saccharomycotina;
            Saccharomycetes; Saccharomycetales; Saccharomycetaceae;
            Saccharomyces.
REFERENCE   1 (bases 1 to 2778)
  AUTHORS   Kopetzki,E., Entian,K.D. and Mecke,D.
  TITLE     Complete nucleotide sequence of the hexokinase PI gene (HXK1)
            of Saccharomyces cerevisiae
  JOURNAL   Gene 39 (1), 95-101 (1985)
   PUBMED   3908224
COMMENT     Original source text: Yeast mRNA, clone YRp/HXK1-2.
FEATURES             Location/Qualifiers
     source          1..2778
                     /organism="Saccharomyces cerevisiae"
                     /mol_type="mRNA"
                     /db_xref="taxon:4932"
     CDS             718..2175

Explanation / Answer

1.

The NCBI record gives a DNA sequence of 2778 base pairs, written in the 5' to 3' direction.

The CDS (coding sequence) feature marks the part of the nucleic acid that codes for protein. It gives the coordinates of the coding region within the sequence and identifies the protein that it encodes.

The CDS of the hexokinase PI (HXK1) gene of Saccharomyces cerevisiae spans nucleotides 718 to 2175.

The coding region of a gene is an open reading frame (ORF), the part of the gene that can be translated into protein. It starts with the start codon (ATG) and ends with one of the three stop codons (TAA, TAG, TGA). Between them lies a continuous, in-frame stretch of codons with no internal stop codon.

2.

In the NCBI record above, nucleotides 718, 719 and 720 are ATG, the start codon, which codes for methionine.

The ORF extends to nucleotide 2175. Nucleotides 2173, 2174 and 2175 are TAA, which is a stop codon.

Hence 718 is the starting point and 2175 is the ending point of the protein coding region.
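
The coordinates above can be cross-checked with a few lines of Python. This is only a sketch: it uses the two short sequence fragments quoted in this answer rather than the full 2778 bp record, and the helper name `codon_at` is made up for illustration.

```python
# Coordinates from the CDS feature of the record (1-based, inclusive).
CDS_START, CDS_END = 718, 2175

# Fragments copied from the answer text. The *_POS values are the 1-based
# positions of the first nucleotide of each fragment.
START_FRAGMENT_POS = 708
START_FRAGMENT = "".join("A AAA AAT AAG ATG GTT CAT TTA GGT CCA AAG AAA C".split())
END_FRAGMENT_POS = 2151
END_FRAGMENT = "".join("G TCT CTT GGT ATC ATT GGC GCT TAA TGA AAA AAA T".split())

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_at(fragment, fragment_pos, position):
    """Return the three nucleotides starting at a 1-based sequence position."""
    offset = position - fragment_pos
    return fragment[offset:offset + 3]


cds_length = CDS_END - CDS_START + 1
num_codons, remainder = divmod(cds_length, 3)
start_codon = codon_at(START_FRAGMENT, START_FRAGMENT_POS, CDS_START)
stop_codon = codon_at(END_FRAGMENT, END_FRAGMENT_POS, CDS_END - 2)

print(f"CDS length: {cds_length} nt ({num_codons} codons, remainder {remainder})")
print(f"Start codon at {CDS_START}: {start_codon}")
print(f"Stop codon at {CDS_END - 2}: {stop_codon}")
print(f"Encoded protein length: {num_codons - 1} amino acids")
```

The CDS is 1458 nucleotides long, which divides evenly into 486 codons: 485 amino acids plus the stop codon.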

3.

The double-stranded DNA sequence around the start of the hexokinase coding region is as follows (the first nucleotide shown is 708; the ATG start codon occupies 718 to 720):

5' ... A AAA AAT AAG ATG GTT CAT TTA GGT CCA AAG AAA C ... 3'
3' ... T TTT TTA TTC TAC CAA GTA AAT CCA GGT TTC TTT G ... 5'

The double-stranded DNA sequence of the last 25 nucleotides of the coding region (2151 to 2175, ending in the TAA stop codon), followed by 10 nucleotides of downstream sequence, is as follows:

5' ... G TCT CTT GGT ATC ATT GGC GCT TAA TGA AAA AAA T ... 3'
3' ... C AGA GAA CCA TAG TAA CCG CGA ATT ACT TTT TTT A ... 5'
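
The bottom strand in each pair is the base-by-base complement of the top strand. A minimal Python sketch of that rule, assuming an unambiguous A/C/G/T alphabet (the function names are illustrative, not from any library):

```python
# Translation table for Watson-Crick base pairing (A-T, G-C).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Complement of seq, read in the same left-to-right order (3' to 5')."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand written in the conventional 5' to 3' direction."""
    return complement(seq)[::-1]


# Last 25 nucleotides of the coding region, copied from the answer text.
top = "".join("G TCT CTT GGT ATC ATT GGC GCT TAA".split())

print("top    5'", top, "3'")
print("bottom 3'", complement(top), "5'")
print("bottom 5'", reverse_complement(top), "3'")
```

The second printed line matches the bottom strand as drawn under the top strand. The third line is the same strand read 5' to 3', which is the orientation in which a reverse primer is written.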

4.

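NdeI is a restriction endonuclease derived from Neisseria denitrificans.

It recognises the palindromic sequence

5' CATATG 3'
3' GTATAC 5'

The last three bases of this site are ATG. To create an NdeI site at the start of the gene, the forward primer therefore only needs CAT added immediately upstream of the ATG start codon at its 5' end.

Because the NdeI site contains the start codon ATG, the hexokinase coding sequence can be joined to a vector's NdeI site directly at its own start codon, which makes NdeI convenient for building heterologous expression constructs.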
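
A forward primer of this kind might be assembled as sketched below. The annealing length of 21 nucleotides and the `GCGCGC` spacer placed 5' of the site are illustrative assumptions, not values taken from the question, and `forward_primer` is a hypothetical helper.

```python
NDEI_SITE = "CATATG"

# First nucleotides of the coding region (from position 718), copied from the answer.
CDS_5PRIME = "".join("ATG GTT CAT TTA GGT CCA AAG AAA C".split())


def forward_primer(cds_start, anneal_len=21, spacer="GCGCGC"):
    """Build spacer + CAT + gene-specific sequence starting at ATG.

    The spacer is an arbitrary placeholder (assumption): a few extra bases
    5' of the site are commonly added so the enzyme can cut near the end
    of a PCR product.
    """
    if not cds_start.startswith("ATG"):
        raise ValueError("coding sequence must begin with the ATG start codon")
    # CAT + ATG... recreates CATATG, so the start codon is part of the site.
    return spacer + "CAT" + cds_start[:anneal_len]


primer = forward_primer(CDS_5PRIME)
print("Forward primer 5'", primer, "3'")
```

Only the gene-specific 3' portion anneals to the template in the first PCR cycles; the added CAT and spacer become part of the product as amplification proceeds.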

5.

BamHI binds to the palindromic recognition sequence

5' GGATCC 3'
3' CCTAGG 5'

It acts as a symmetric homodimer and cuts after the first G on each strand (G^GATCC), leaving 5' GATC overhangs.

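Amplification of the gene by PCR requires a forward primer and a reverse primer. The forward primer has the same sequence as the top (5' to 3') strand at the start of the gene, and the reverse primer is the reverse complement of the top strand at the end of the gene. Restriction sites are added to the 5' end of each primer.

Nowadays, primer design tools such as Primer3 and NCBI Primer-BLAST are available to help design primers.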
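
The reverse primer described above can be sketched in Python as follows, using the BamHI recognition site GGATCC. As before, the 21-nucleotide annealing length and the `GCGCGC` spacer are assumptions made for illustration, and `reverse_primer` is a hypothetical helper. Whether the stop codon is kept in the primer depends on the vector, for example on whether a C-terminal tag is wanted; this sketch keeps it.

```python
BAMHI_SITE = "GGATCC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Last 25 nucleotides of the coding region (2151 to 2175), copied from the answer.
CDS_3PRIME = "".join("G TCT CTT GGT ATC ATT GGC GCT TAA".split())


def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_primer(cds_end, anneal_len=21, spacer="GCGCGC"):
    """Build spacer + BamHI site + reverse complement of the end of the gene.

    The spacer is an arbitrary placeholder (assumption), as in many
    cloning primers that carry a few extra bases 5' of the site.
    """
    anneal = reverse_complement(cds_end[-anneal_len:])
    return spacer + BAMHI_SITE + anneal


primer = reverse_primer(CDS_3PRIME)
print("Reverse primer 5'", primer, "3'")
```

The gene-specific part of this primer begins with TTA, the reverse complement of the TAA stop codon, because the primer anneals to the top strand and is extended back toward the start of the gene.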