Seminar:
Explainability in AI for Drug Discovery

Description

The idea of this seminar is to familiarize students with basic algorithms in the field of artificial intelligence (AI). We will not cover neural networks, machine learning, or deep learning in depth, as this is beyond the scope of this seminar. Alongside basic AI, we also want to introduce explainability. Explainable AI (XAI) has gained importance in recent years as AI systems are adopted in more and more fields. Two prominent examples are self-driving cars and medical diagnosis. Both require a high degree of confidence in AI, so it is essential to be able to understand the decisions made by AI systems.

Unlike in regular proseminars, you will get practical experience with the algorithms and their implementations. You will implement one of the XAI algorithms for an AI algorithm (based on the sklearn pipeline) and present both, together with the results, in a 20-minute presentation. Additionally, you will write a 5-page report about your findings.

Although the seminar is officially called "Explainability in AI for Drug Discovery", we will not discuss papers that apply explainability techniques to AI models used in drug discovery. Instead, we will dive deep into the explainability techniques themselves to discuss and understand them. A good overview of XAI in drug discovery and of this seminar is provided in Jiménez-Luna et al., published in Nature Machine Intelligence.

Requirements

The following requirements are not strictly enforced but will make your life much easier in the seminar.

Basic knowledge in

Python coding,
Jupyter notebooks,
sklearn, and
Elements of [Statistical|Machine] Learning

What Do You Need To Do In The Seminar?

Read and present one of the seminar papers (as assigned in the kick-off meeting).
The presentations are 20 minutes long, with an additional 10 minutes for questions from the audience.
Implement your XAI method in a Jupyter notebook to get hands-on experience (no worries, you can use libraries, etc.).
Write a 5-page report on your findings.
You need to attend all talks.

How Are The Grades Computed?

30% for the presentation and answers to questions on the topic
30% for the notebook. We will check for
  ○ Completeness
  ○ Documentation
  ○ Executability (can I run the notebook from top to bottom without raising errors?)

30% for the report. The report should include
  ○ A short explanation of how the model you explained works
  ○ An explanation of how the XAI technique works. What are its (dis-)advantages?
  ○ Comparison to other methods in this seminar or beyond
  ○ Your own thoughts on this method

10% for your participation in the discussion of other presentations. Therefore: Ask questions!

Plagiarism

We will check every submission for plagiarism with TurnItIn, an online tool that automatically checks submissions for plagiarism. You are free (and encouraged) to use it before submitting your final report. Following the link above, you can log in with your UdS credentials (the same ones you use for your student email) and use TurnItIn for free. By attending this seminar, you agree that we upload your report to TurnItIn.
If we detect plagiarism in your work, you will have the chance to explain yourself. If your explanation is not convincing, you will fail this seminar.

Registration

Please register for this seminar by writing an email to Roman Joeres before 30.04.2023 23:59. Please also attach your transcript of records, which can be downloaded from LSF/HISPOS. The seminar has a limited number of slots. After the registration deadline, we will inform you whether you have been admitted.

Other Organizational Things

The seminar will be held as a block seminar in the last week of September 2023 (25.09.-29.09.2023).
Passing participants will earn 7 CP.
The number of participants is limited to a maximum of 9.
The seminar language is English.
Bioinformatics master's students and students who have not yet taken a seminar will be given preference.
Registration in LSF/HISPOS will be available later and communicated in time.
In case of questions, contact Roman Joeres (roman.joeres[at]helmholtz-hips.de) using "[XAI Seminar]" in the subject.

Important Dates

30.04.2023 23:59 - Registration deadline
08.05.2023 3 pm - Kick-Off Meeting in room 106 in E2.1 (CBI building)
15.09.2023 23:59 - Submission deadline for the final draft of the slide set
29.09.2023 9 am to 1 pm - Presentation day
13.10.2023 23:59 - Submission of report and notebook

Milestones

We decided to introduce a partially voluntary milestone system to support students who want to make use of it.
All deadlines in this milestone section end at 23:59 CEST (UTC+2).

Notebook revision: Friday, August 4th, 2023
First version of slides: until September 1st. Feedback will be provided one workday after I receive the slides.
Second version of slides: until September 15th

You are allowed to implement changes based on the feedback after the "second version of slides"-milestone.
Please send me the final version of your presentation by September 22nd.

Topics

All notebooks will use the Breast Cancer Dataset from the sklearn package, so that the performance of the algorithms can be compared and the features are both interpretable and comparable across topics. We will mainly rely on the sklearn Python library.
Topics 1 and 2 will be explained based on their implicitly computed feature importance.
Topics 3 and 4 apply their XAI technique to a linear regression model.
From topic 5 on, you will explain a neural network as a true black box.
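
To give a feeling for the common setup, the following is a minimal sketch of how the Breast Cancer Dataset can be loaded and how a random forest exposes its implicitly computed (Gini-based) feature importance. The train/test split, the number of trees, and the random seed are assumptions chosen for illustration only; they are not prescribed by the seminar.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the Breast Cancer dataset shipped with sklearn (30 numeric features).
data = load_breast_cancer()

# Assumption: a stratified 75/25 split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0, stratify=data.target
)

# Tree ensembles compute an impurity-based feature importance
# as a by-product of training (Gini impurity is the default criterion).
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Pair each feature name with its importance and sort in descending order.
ranking = sorted(
    zip(data.feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, importance in ranking[:5]:
    print(f"{name:25s} {importance:.3f}")
print(f"test accuracy: {forest.score(X_test, y_test):.3f}")
```

The importances are normalized to sum to one, so they can be read as relative shares. They are computed on the training data only, which is one reason why comparing them with model-agnostic methods is instructive.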

1. Decision Trees and Random Forest with Gini Index - Jis Kochuniravathu Saji
   sklearn docu | sklearn DT | sklearn RF
2. Support Vector Classifier with Linear Kernel - Kriti Maurya
   ESLII pp. 417 | sklearn
3. Shapley Regression Values
   Paper | requires the student's own implementation
4. SHAP - Shilpa Sharma
   Paper | PyPI | GitHub | Notebook
5. LIME - Annmariya Elayanithottathil Sebastian
   Paper | PyPI | GitHub | Notebook
6. Permutation Feature Importance - Evgenia Khodzhaeva
   Paper | sklearn | Notebook
7. DiCE - Jibin Varghese
   Paper | PyPI | GitHub
8. NiCE - Muhammad Hamid
   Paper | PyPI | GitHub | Notebook
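
As an illustration of the black-box setting used for the later topics, the sketch below trains a small neural network on the Breast Cancer Dataset and explains it with sklearn's permutation feature importance. The network size, the scaling step, the split, and the number of repeats are assumptions made for this example and not requirements of the seminar.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()

# Assumption: a stratified 75/25 split with a fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0, stratify=data.target
)

# Black-box model: a small neural network behind a scaling step.
# Assumption: one hidden layer with 32 units, chosen only for illustration.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
)
model.fit(X_train, y_train)

# Shuffle one feature at a time on held-out data and record
# how much the accuracy drops compared to the unshuffled baseline.
result = permutation_importance(
    model, X_test, y_test, scoring="accuracy", n_repeats=10, random_state=0
)

# Indices of the five features with the largest mean accuracy drop.
top = result.importances_mean.argsort()[::-1][:5]
for i in top:
    print(
        f"{data.feature_names[i]:25s} "
        f"{result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}"
    )
```

Because the method only needs predictions, the same call works for any fitted sklearn estimator. Note that strongly correlated features, as present in this dataset, can share importance and therefore each appear less important than they are.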

Additional reading: papers that help to understand the bigger picture.
