Description
The field of structural bioinformatics is shifting towards generating artificial proteins and designing them for specific target functions. In this seminar, we will dive into the topics that earned David Baker the 2024 Nobel Prize in Chemistry (shared with Demis Hassabis and John Jumper, who were recognized for their contributions to protein structure prediction).
We will cover the latest advances in protein generation and design, with a focus on deep learning-based methods. The seminar will be structured as a block seminar in late September/early October, where students will present recent papers on the topic and work on a hands-on project related to their presentation topic.
Requirements
There are no strict prerequisites, but we will give preference to students who have taken at least one of the following courses:
- Structural Bioinformatics
- Neural Networks
- Elements of Machine Learning
- Machine Learning
Team
- Prof. Dr. Olga Kalinina
- Prof. Dr. Dietrich Klakow
- M.Sc. Roman Joeres
- M.Sc. Anastasiia Kolchina
- M.Sc. Qingyuan Liu
- M.Sc. Hanqing Liu
What Do You Need To Do?
- Read and present one of the assigned seminar topics (assignment made at the kick-off meeting).
- Presentations are 30 minutes with an additional 10-minute Q&A from the audience.
- Attend all talks in the seminar and participate actively in discussions.
- Submit a hands-on project on your topic; details will be announced at the kick-off meeting.
How Are the Grades Computed?
Presentation (40%)
Assessed on clarity, depth of understanding, and the quality of your answers to audience questions.
Hands-On (40%)
You will try to reproduce a figure from the assigned paper. Which figure you reproduce will be agreed individually with your supervisor, depending on the paper. Your submission should include the following two components:
- Either a Google Colab notebook (preferred), a local Jupyter notebook, or Python file(s). If you choose a local option, please provide clear instructions on how to install the required dependencies and on how and in which order to run your scripts.
- A written report (max. 2 pages excluding tables and figures) on the reproducibility of the assigned paper. Here you need to describe what worked and what did not, and what exactly you did to reproduce the results, together with the generated tables/figures. We provide an Overleaf template for the report.
If, for some reason, you find it impossible to reproduce a figure, this is fine. Some papers publish results that cannot be reproduced, and this is an important finding in itself. In this case, you will be asked to write a short outline analyzing the reasons for the irreproducibility.
Participation (20%)
Active engagement during other students' presentations. Ask questions — it counts toward your grade and improves the seminar for everyone.
Registration
This seminar is open to all students from computer science and bioinformatics. To apply, please send an email to roman.joeres@helmholtz-hips.de by 03.05.2026, 23:59. Please attach your current transcript of records (available via LSF/HISPOS).
After the registration deadline, all applicants will be informed of their participation status by email. We will then find a date for the kick-off meeting, where we will assign the presentation topics and discuss organizational details.
Organisational Details
📅 The seminar will be held as a block seminar in late September/early October.
🎓 Successful participants will earn 7 credit points (CP).
👥 Maximum number of participants: 12 (4 CS students and 8 bioinformatics students).
🌐 Seminar language: English.
📋 Registration in LSF/HISPOS is due on 02.06.2026.
For questions, contact roman.joeres@helmholtz-hips.de.
Important Dates
| Date and time | Event |
| --- | --- |
| 03.05.2026, 23:59 | Registration deadline |
| 12.05.2026, 15:00 | Kick-off meeting — room 0.01, building ZBI (E2.1) |
| [DD.MM.YYYY, HH:MM] | Presentation day(s) |
Topics
- ProtGPT2 is a deep unsupervised language model for protein design (assigned to: Deekshitha Poobalan; supervisor: Qingyuan Liu)
- All-atom protein generation with latent diffusion (assigned to: Klea Sinjari; supervisor: Qingyuan Liu)
- Understanding protein function with a multimodal retrieval-augmented foundational model (assigned to: Nicolas Pham; supervisor: Qingyuan Liu)
- PB-GPT: An innovative GPT-based model for protein backbone generation (assigned to: Bhavyashree Vishwanatha; supervisor: Roman Joeres)
- Zero-shot prediction of mutation effects with multimodal deep representation learning guides protein engineering (assigned to: Malik Saad Wazir; supervisor: Roman Joeres)
- Structure-informed language models are protein designers (assigned to: Mariia Landau; supervisor: Anastasiia Kolchina)
- ProteinBERT: a universal deep-learning model of protein sequence and function (assigned to: Ioannis Mamalis; supervisor: Anastasiia Kolchina)
- MULAN: multimodal protein language model for sequence and structure encoding (assigned to: Ben Sievers; supervisor: Anastasiia Kolchina)
- Scaling unlocks broader generation and deeper functional understanding of proteins (assigned to: Ibrahim Nadiyan; supervisor: Roman Joeres)
- Simulating 500 million years of evolution with a language model (assigned to: Regina Nitsch; supervisor: Hanqing Liu)
- Robust deep learning-based protein sequence design using ProteinMPNN (assigned to: Susan Maria Bino; supervisor: Hanqing Liu)
- Atomic context-conditioned protein sequence design using LigandMPNN (assigned to: Aya Ahmed; supervisor: Hanqing Liu)