Bioinformatics course
Assignment #1
- Open the file pSKII+.doc and try to find the sequences for the sequencing primers M13-forward (5' gtaaaacgacggccagt 3') and M13-reverse (5' ggaaacagctatgaccatg 3') as well as the RNA polymerase promoters T3 (5' aattaaccctcactaaaggg 3') and T7 (5' gtaatacgactcactatagggc 3').
- Which primers are identical to the single-stranded SKII sequence as given, and which are complementary to it?
Software: Word, JaMBW (Reverse, Complement, Inverse)
Download: pSKII+.doc
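The strand check in this assignment can also be sketched in a few lines of Python, as an alternative to the Word and JaMBW workflow. This is only an illustration: the helper names `reverse_complement` and `locate` and the toy template are assumptions, not part of the course material. The real pSKII+ sequence would have to be pasted in as a plain string.

```python
# Sketch: locate a primer on either strand of a template sequence.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def locate(primer: str, template: str):
    """Return (strand, 1-based position) hits of primer in template.

    '+' : the primer is identical to the given strand.
    '-' : the primer is complementary to it (its reverse complement is found).
    """
    template = template.lower()
    primer = primer.lower()
    hits = []
    for strand, query in (("+", primer), ("-", reverse_complement(primer))):
        start = template.find(query)
        while start != -1:
            hits.append((strand, start + 1))
            start = template.find(query, start + 1)
    return hits

# Toy template (hypothetical, NOT the pSKII+ sequence): M13-forward sits on
# the given strand, the T7 promoter is present as its reverse complement.
m13_fwd = "gtaaaacgacggccagt"
t7 = "gtaatacgactcactatagggc"
toy = "ccc" + m13_fwd + "aaaa" + reverse_complement(t7) + "ggg"

print(locate(m13_fwd, toy))  # [('+', 4)]
print(locate(t7, toy))       # [('-', 25)]
```

A '+' hit corresponds to a primer that reads the same as the displayed strand, a '-' hit to one that anneals to it.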
Assignment #2
You received a cDNA clone and the sequence of its insert (prc1edvkurs.doc) from a colleague, who told you that the start codon is the "atg" at position 79. To synthesize an antisense RNA for use as a Northern blot probe, you have to subclone the insert into another vector.
The vector you have in the lab is Bluescript (pSKII+.doc). Bluescript contains a multiple cloning site flanked by sequences for the sequencing primers M13-forward and M13-reverse and the RNA polymerase promoters T3 and T7.
- Find the multiple cloning site in the vector.
- Find the best cloning strategy using only one restriction enzyme.
- Use directed cloning to ensure that all clones could be used to produce an antisense probe with the RNA polymerase T3.
- Define a strategy to use the T7 RNA polymerase. Which enzymes would you use?
Software: Word and Webcutter
Download: prc1edvkurs.doc
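What Webcutter reports can be approximated with a small script that lists where common enzymes cut and which ones leave the insert intact. The following is a minimal sketch: the enzyme table is an illustrative subset of well-known recognition sites, and whether each site actually occurs in the Bluescript multiple cloning site must be checked against pSKII+.doc. The toy insert is hypothetical.

```python
# Sketch: map restriction sites and list enzymes that do not cut an insert.
# Recognition sequences of a few common enzymes (all palindromic, so
# searching one strand is sufficient).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "PstI": "CTGCAG",
    "KpnI": "GGTACC",
}

def site_positions(seq: str, site: str):
    """Return the 1-based start positions of a recognition site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)
    return positions

def non_cutters(seq: str):
    """Return the enzymes from SITES that have no site in seq."""
    return sorted(name for name, site in SITES.items()
                  if not site_positions(seq, site))

# Hypothetical toy insert, not prc1edvkurs.doc.
toy_insert = "atgGAATTCccaaGGATCCtt"
print(site_positions(toy_insert, SITES["EcoRI"]))  # [4]
print(site_positions(toy_insert, SITES["BamHI"]))  # [14]
print(non_cutters(toy_insert))  # ['HindIII', 'KpnI', 'PstI', 'XhoI']
```

Enzymes that cut inside the insert are poor candidates for excising it, so the non-cutter list is a reasonable starting point for choosing sites on either side.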
Assignment #3
Microsatellites are highly polymorphic markers, which are extensively used for paternity testing, genome walking, provenance studies and analysis of population structures.
They consist of tandemly repeated simple sequences of di-, tri- and tetranucleotides such as (AT)n, (CT)n, (CA)n, (GA)n, (GT)n or (CCT)n, ...
Their length variation results from DNA slippage, a mechanism that increases or decreases the repeat number. The repeats are flanked by unique sequences, which allow specific primers to be designed for amplification of the microsatellite.
Please design primer pairs for the amplification of a microsatellite using the following criteria:
- product length: 100 - 300 bp
- annealing temperature: higher than 55 °C
- primer length: 20 - 24 bases
Software: Word and Primer3
Download: microsatellite.doc
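Before running Primer3 it can help to locate the repeat itself and to get a rough feeling for primer melting temperatures. The sketch below uses a regular expression for di- to tetranucleotide repeats and the simple Wallace rule, Tm = 2 x (A+T) + 4 x (G+C). Primer3 applies a more accurate thermodynamic model, so these values are only a first check. The function names and the toy sequence are assumptions made for illustration.

```python
import re

def find_microsatellites(seq: str, min_repeats: int = 5):
    """Return (motif, repeat count, 1-based start) for di- to tetranucleotide repeats."""
    seq = seq.lower()
    # A 2-4 nt motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([acgt]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as aaaaaaaaaa
        hits.append((motif, len(match.group(0)) // len(motif), match.start() + 1))
    return hits

def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    primer = primer.lower()
    at = primer.count("a") + primer.count("t")
    gc = primer.count("g") + primer.count("c")
    return 2 * at + 4 * gc

# Hypothetical toy sequence, not microsatellite.doc.
toy = "ttgagt" + "ca" * 8 + "ggatcc"
print(find_microsatellites(toy))        # [('ca', 8, 7)]
print(wallace_tm("atgcatgcatgcatgcatgc"))  # 60
```

Primers are then chosen in the unique flanks on either side of the reported repeat so that the product stays within the requested length range.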
Assignment #4
A 32 bp deletion in the C-C chemokine receptor type 5 (CCR5) gene results in a premature stop codon. This mutation has been shown to confer resistance to HIV-1 infection.
Given the continuing threat this virus poses to the human population, you have decided to develop a CRISPR-Cas9-mediated strategy for the targeted deletion of CCR5 in human populations.
- develop a suitable guide RNA with PCR primers
- develop a guide RNA without polymorphisms
Download: CCR5 gene
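Candidate guide RNAs can be enumerated by scanning for a 20-nt protospacer directly followed by an NGG PAM, the motif recognized by the commonly used Cas9 from Streptococcus pyogenes. The sketch below is a simplification: it scans only the given strand, assumes that specific Cas9, and does not score off-targets. The list of variant positions is a hypothetical input that would have to come from a polymorphism database.

```python
import re

def find_guides(seq: str):
    """Return (1-based start, protospacer, PAM) for every 20-nt + NGG site.

    Only the given strand is scanned; run the function on the reverse
    complement as well to cover the other strand.
    """
    seq = seq.lower()
    guides = []
    # The lookahead allows overlapping candidates to be reported.
    for match in re.finditer(r"(?=([acgt]{20})[acgt]gg)", seq):
        start = match.start()
        guides.append((start + 1, match.group(1), seq[start + 20:start + 23]))
    return guides

def avoid_variants(guides, variant_positions):
    """Keep guides whose 23-nt target window contains no listed variant position."""
    return [g for g in guides
            if not any(g[0] <= pos < g[0] + 23 for pos in variant_positions)]

# Hypothetical toy sequence, not the CCR5 gene.
toy = "aaaaa" + "gattacagattacagattac" + "tgg" + "ttttt"
guides = find_guides(toy)
print(guides)                        # [(6, 'gattacagattacagattac', 'tgg')]
print(avoid_variants(guides, [10]))  # []
```

The second function addresses the "without polymorphisms" task in the simplest possible way: any guide overlapping a known variable position is discarded.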
Assignment #5
You isolated a cDNA clone (PlecDNA.doc) and would like to know how many introns the gene contains. Fortunately, you are working with a fully sequenced organism, so it is easy to retrieve the full genomic region (Plegenomisch.doc).
- How many introns does the gene contain?
- What are the sequences (10 bp) around the 5' and 3' splice sites of the 3rd and 4th introns?
- If you have time, expand your analyses to further introns.
Software: Word and Dotlet
Download: PlecDNA.doc, Plegenomisch.doc
Assignment #6
The previous analysis showed that a dot matrix program allows useful interpretations of DNA sequences. You have recently isolated a genomic fragment (Test.doc) and, encouraged by the earlier results, decide to analyze it with the dot matrix program.
- How can you explain the pattern you see in the dot matrix?
- Delete an internal portion of the sequence and compare the full sequence with the deleted one.
- What is the pattern on the dot matrix?
Software: Word and Dotlet
Download: Test.doc
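The principle behind a dot matrix can be illustrated with a very small exact-match version. Dotlet itself uses a scoring matrix and an adjustable threshold, so the sketch below is a simplified stand-in rather than a reimplementation. The function name and the toy sequence are assumptions.

```python
def dot_matrix(seq_x: str, seq_y: str, window: int = 3):
    """Return the rows of a dot plot as strings.

    A '*' marks positions where the window-length words of seq_x (columns)
    and seq_y (rows) are identical; '.' marks everything else.
    """
    rows = []
    for j in range(len(seq_y) - window + 1):
        word_y = seq_y[j:j + window]
        rows.append("".join(
            "*" if seq_x[i:i + window] == word_y else "."
            for i in range(len(seq_x) - window + 1)))
    return rows

# Hypothetical toy sequence containing a direct repeat, compared with itself.
toy = "acgtacgt"
for row in dot_matrix(toy, toy, window=4):
    print(row)
# *...*
# .*...
# ..*..
# ...*.
# *...*
```

The main diagonal is the sequence matching itself. Dots off the diagonal indicate that a word occurs more than once, which is how repeats show up in a self-comparison. A deletion in one of two compared sequences appears as a shift of the diagonal.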
Assignment #7
You have obtained a peptide sequence (ASFPCLNGGTCNDQVNGYVCVCAQDTSVSTCET) and would like to find its position in the full length protein.
Software: Word and Blast2 Sequences
Download: UEGF1.doc
Assignment #8
You have isolated a number of proteins by their interaction with a protein known to interact with RING finger proteins. Sequencing the proteins yielded the following peptides:
from human cell lines:
 msvdmnsqgsdsneedydpnceeeeeeeeddpgdie
from C. elegans
 mnsddeiymegsasseddmddeclsd
 and
 mddedmsctsgddyagygdedyyneadv
from Drosophila melanogaster
 mdsdndndfcdnvdsgnvssgddgdddfg
 and
 mdsdiemdmesdndgeydddydyyntgedcd
from Saccharomyces cerevisiae
 mssgtendqfysfdesdsssielyeshntseftihglv
from Arabidopsis thaliana
 mdnnsvigsevdaeadesyvnaaledgqtgkks
 and
 mddyfsaeeeacyyssdqdsldgidneeselqpl
- Find the complete protein sequence for each peptide and align the sequences to assess their overall homology.
- Are there RING finger motifs in your proteins and, if so, how many and where?
- RING-Finger proteins share a common protein motif of C-X2-C-X9-29-C-X1-3-H-X2-3-C/H-X2-C-X4-48-C-X2-C.
- Are there other remarkable protein motifs?
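The RING finger consensus given above translates directly into a regular expression, which makes it easy to scan the full-length proteins once they have been retrieved. The sketch below is a plain pattern search without any tolerance for deviations from the consensus. The function name and the synthetic test peptide are assumptions for illustration.

```python
import re

# C-X2-C-X9-29-C-X1-3-H-X2-3-C/H-X2-C-X4-48-C-X2-C as a regular expression.
RING_PATTERN = re.compile(
    r"C.{2}C.{9,29}C.{1,3}H.{2,3}[CH].{2}C.{4,48}C.{2}C")

def find_ring_fingers(protein: str):
    """Return (1-based start, end) of non-overlapping RING consensus matches."""
    protein = protein.upper()
    return [(m.start() + 1, m.end()) for m in RING_PATTERN.finditer(protein)]

# Synthetic peptide built to fit the consensus (not a real protein).
toy = ("MK" + "C" + "AA" + "C" + "A" * 10 + "C" + "AA" + "H" + "AA"
       + "C" + "AA" + "C" + "A" * 10 + "C" + "AA" + "C" + "GG")
print(find_ring_fingers(toy))       # [(3, 40)]
print(find_ring_fingers("MKAAAA"))  # []
```

Because the spacers are variable, a protein may satisfy the pattern in more than one way. Dedicated motif databases are the better choice for a final answer, and the regular expression serves as a quick first pass.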
Assignment #9
You received a manuscript submitted for publication. The authors claim that they have discovered a gene involved in abnormal muscle growth in salmon (hs = heavy salmon). Decide whether the paper should be published.
- What gene is it?
- Is it really a novel gene?
- Do you support the authors' claim that this is a salmon gene?
- Could the authors' claim be true?
Software: Word, BLAST, FastA and Pubmed
Download: hs_gene.doc
Assignment #10
Inspired by the manuscript you reviewed, you decide to look for the gene in whales.
- Make a sequence alignment to design primers for cross species amplification (do not use available sequence data from whales, use only cDNA sequences).
- Design primers that have a fair chance to amplify the gene from whales.
- You know that human contamination is a problem in your lab. What would you do to minimize the risk of contamination with human DNA?
Software: Word, BLAST, FastA and Clustal, Taxonomy Browser (NCBI)
Assignment #11
Alzheimer disease, the most common cause of dementia, is inherited as an autosomal dominant trait in some families.
- Find out which gene(s) are associated with this disease
- Determine the chromosomal positions (locus)
- How many allelic variants are deposited in the databank?
- In which model organisms do you find similar proteins?
- Determine the neighboring markers and genes (in humans)
Software: NCBI-GENOME, OMIM, UCSC, ENSEMBL
Assignment #13
You received one pair of microsatellite primers, ran a PCR and found a highly interesting pattern in one population (no variability). Inspired by this result, you would like to know more about the locus. Unfortunately, you found the sequence of only one of the primers (ttttgtcgttttcgttatg) and your friend has left for a six-month holiday. Fortunately, you are working with one of the best-studied organisms, Drosophila melanogaster, so you have every possibility to investigate!
- What is the repeat motif of your microsatellite?
- Which gene is in close proximity to the microsatellite?
- On which chromosome is the gene located?
- Determine the number of available transposon insertions in the gene.
- What would you do to obtain a fly stock in which the gene is deleted?
Assignment #14
You have transformed an Arabidopsis thaliana mutant with a genomic sequence (Annotierungssequenz.doc), and the presumptive gene is sufficient to restore the function of the mutant gene.
- Find the coding sequence.
- Find the PolyA signal.
- Where is the TATA box motif located?
- Locate the gene on the A. thaliana map.
- Are cDNA clones available for this gene?
- Where is the gene expressed?
- Predict the protein sequence.
- Does this protein share homologies with other proteins?
- Are there any related proteins in other plants/animals?
- Do these homologies indicate a possible function?
- Does the protein have any interesting domains?
- Is there a transmembrane domain?
- Predict the subcellular localization.
Software: TAIR, GENSCAN, Geneid, Softberry, ExPasy, MIPS
Download: Annotierungssequenz.doc
Assignment #15
You obtained the sequences of two genomic clones that are supposed to contain the orthologs of your two favorite genes in D. pseudoobscura, a distant relative of D. melanogaster.
- Predict the protein sequence using ab initio bioinformatic tools.
- Predict the protein sequence taking advantage of the known protein sequence of D. melanogaster.
- Make a multiple sequence alignment to compare the different predictions to the D. melanogaster protein.
- How do you interpret the results?
