27611 Introduction to Bioinformatics

27611 targets students in their 4th semester; 27622 targets more advanced bachelor students.

Danish title: Introduktion til bioinformatik


Lectures and computer exercises

13 weeks

F4A, E4A

General course objectives:

Students should become familiar with the use of computers for molecular structure and sequence analysis, with special emphasis on applications in microbiology, biotechnology, and the biotech industry.

Introduction to Bioinformatics is a practically oriented course that focuses on hands-on use of the methods. A large part of the course consists of computer-based exercises.

Learning objectives:

A student who has met the objectives of the course will be able to:
  • Explain how the information in biological macromolecules, such as DNA and proteins, can be represented in electronic format.
  • Explain how DNA and protein sequences from related organisms are influenced by a common evolutionary history.
  • Search for sequence and structure data in publicly available databases, such as GenBank, UniProt and PDB.
  • Visualize protein 3D structure using computer software.
  • Generate and critically evaluate DNA and peptide alignments.
  • Query sequence databases using alignment-based methods (BLAST) and critically evaluate the results.
  • Predict the most probable biological function of a novel gene or protein product by comparison to already characterized genes/proteins.
  • Generate multiple sequence alignments of sets of related sequences, using both globally and locally optimizing algorithms.
  • Generate phylogenetic trees from multiple alignments.
  • Generate and interpret visualizations of the information content of sets of related sequences (“logo plots”).


Evolution at the DNA level. Taxonomy. Practical use of taxonomy databases.

Biological information. Information content in biological macromolecules. DNA sequencing, including error sources. DNA sequences in electronic format. How to use the GenBank database.

Protein sequences. Protein structure levels. Protein sequences in electronic format. Sources of protein sequences (direct sequencing and computer-based translation). How to use the UniProt database.
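
As an illustration of computer-based translation, the following is a minimal sketch in Python, not course material. It assumes the standard genetic code, an uppercase DNA sequence read in frame from its first base, and no ambiguity codes; the function name `translate` is chosen here for illustration only.

```python
# Standard genetic code, laid out with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (e.g. "ATG") to its one-letter amino acid ("*" = stop).
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of an in-frame DNA sequence.

    Trailing bases that do not form a full codon are ignored, and stop
    codons are reported as "*" rather than ending the translation.
    """
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("ATGGCCTAA"))
```

A real gene prediction would also have to consider the other reading frames, the reverse strand and alternative genetic codes, which this sketch leaves out.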

Protein structure. How protein structures are determined. Quality of protein structure data. How to use the PDB database. Computer-based visualization of protein structure.

Pairwise alignment. Alignment scores, gaps, substitution matrices. Global and local alignment.
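
The difference between global and local alignment can be sketched with two small dynamic-programming functions in Python. This is an illustrative sketch only: it assumes a simple match/mismatch score in place of a real substitution matrix and a linear gap penalty, and the function names and default values are chosen here, not taken from the course.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global (Needleman-Wunsch style) alignment score of a and b."""
    # Row 0: aligning a prefix of b against nothing costs one gap per letter.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local (Smith-Waterman style) alignment score of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never go below 0.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


print(global_alignment_score("ACGT", "AGT"))
print(local_alignment_score("AAACGTAAA", "CCCCGTCCC"))
```

The global score is read from the last cell of the table because both sequences must be aligned end to end, whereas the local score is the highest cell anywhere in the table.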

BLAST. How to use BLAST for searching sequence databases. Critical evaluation of results. Iterative BLAST.

Multiple alignments. The use of heuristic methods due to data complexity. Globally and locally optimizing algorithms.

Generation and interpretation of phylogenetic trees from multiple alignments. The NJ algorithm for tree construction. Rooted versus unrooted trees.

Weight-matrix-based methods. How to search using weight matrices. Generation and interpretation of logo plots.
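
The column heights in a logo plot reflect information content, which can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a gap-free DNA alignment (alphabet size 4), Shannon entropy in bits, and no small-sample correction. The function names are illustrative and not taken from the course.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed letter frequencies. Assumes the column contains no gaps.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(sequences, alphabet_size=4):
    """Information content per column for equal-length aligned sequences."""
    return [column_information(col, alphabet_size) for col in zip(*sequences)]


# Hypothetical toy alignment, made up for illustration.
print(alignment_information(["ACGT", "ACGA", "ACTC", "AGCG"]))
```

A fully conserved DNA column reaches the maximum of 2 bits, while a column with all four bases equally frequent carries 0 bits; for protein alignments the alphabet size would be 20.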


Lecture notes and exercise manuals handed out during the course

Bent Petersen, bent@cbs.dtu.dk
Henrik Nielsen, Building 208, room 005, Ph. (+45) 4525 6124, hnielsen@cbs.dtu.dk
Rasmus Wernersson, raz@cbs.dtu.dk


27 Department of Systems Biology

