Open Positions

  1. PhD projects
  2. Student projects

PhD projects

Transfer learning methods for personalised cancer treatment

Project Description 

How can we find better diagnostic markers for early stages of cancer? How can we better predict which therapy is best for which patient? The aim of the ‘COMPUTE CANCER’ project is to develop a novel methodological and computational framework that improves the selection of diagnostic and predictive biomarkers from cancer genomics data. Our strategy is to adapt known prediction methods (e.g. penalized regression, random forests) to allow for the inclusion of ‘co-data’: auxiliary information on the genomic variables derived from other data sources. Examples include p-values from related external studies or genomic and drug response data from cancer cell lines.
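To make the co-data idea concrete, the sketch below fits a ridge regression in which each genomic variable gets its own penalty, derived from an external p-value. It is an illustration only, not the project's method: the mapping from p-values to weights (`penalty_weights`) is a hand-picked assumption, whereas the project aims to determine such weights automatically and objectively, and all function names and numbers are hypothetical.

```python
import math


def penalty_weights(p_values, floor=1e-300):
    """Map external p-values (co-data) to penalty multipliers.

    Assumed mapping for illustration: w = 1 / (1 - log10(p)), so p = 1
    gives weight 1 and smaller p-values give a weaker penalty.
    """
    return [1.0 / (1.0 - math.log10(max(p, floor))) for p in p_values]


def weighted_ridge(X, y, weights, lam=1.0, n_iter=200):
    """Minimise sum_i (y_i - x_i . beta)^2 + lam * sum_j w_j * beta_j^2
    by cyclic coordinate descent. X is a list of rows, y a list of floats."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(d):
            col = [X[i][j] for i in range(n)]
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n))
            denom = sum(c * c for c in col) + lam * weights[j]
            new_beta = rho / denom
            delta = new_beta - beta[j]
            for i in range(n):
                residual[i] -= col[i] * delta
            beta[j] = new_beta
    return beta


# Toy example: two features with identical data support, but the first one
# has a small external p-value and is therefore shrunk less.
weights = penalty_weights([0.001, 1.0])
beta = weighted_ridge([[1.0, 0.0], [0.0, 1.0]], [2.0, 2.0], weights, lam=1.0)
```

In this toy case both features explain the response equally well, yet the feature supported by the co-data keeps a larger coefficient, which is the intended effect of a weighted learner.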

Interdisciplinary collaboration 

The group of Mark van de Wiel (Biostatistics, VUMC) will focus on developing ‘weighted learners’, where the challenge is to determine weights from co-data in an automatic and objective way. The groups of Marco Loog, Marcel Reinders and Lodewyk Wessels (TUDelft and NKI) will develop transfer learning approaches that exploit large cell line response data sets to predict therapy response in tumors. We will focus on response to chemotherapy in neo-adjuvant breast cancer and metastatic colorectal cancer. The project is embedded within broad oncological collaborations within the VUMC as well as the NKI. The successful candidate will be employed at the TUDelft and will spend at least two days per week at the NKI. Regular meetings between project partners and oncologists will be held to generate sufficient cohesion and momentum.

Candidate profile 

We are seeking a highly motivated PhD candidate with:

  • A degree in bioinformatics, computer science or a related discipline
  • Proficiency in bioinformatics programming languages (e.g. R, Python)
  • Good cross-disciplinary collaborative and communication skills
  • Experience in machine learning and statistical methods is a plus
  • Experience in analysing high-throughput molecular data is a plus
  • Experience in cancer biology and clinical applications is a plus

Please send your CV and motivation letter to Magali Michaut and Lodewyk Wessels. Please include the names and contact information of at least two references.

Deadline: 17 April 2017

Student projects

If you want to do your master project in the group, you are always welcome to send an open application to Lodewyk Wessels.

Master student project: Understanding drug response via data integration


Most cancer treatments are effective only in a subset of patients. One of the major current challenges in providing effective treatment is to predict which patients will respond to a given treatment and to explain the (epi)genomic alterations associated with these responses. State-of-the-art methods for drug response prediction employ various data types. Although they can have good predictive value, the results are typically not readily interpretable (Jang et al. 2012). Our goal is to develop a new method to predict drug response while additionally providing novel insights into the determinants of drug response. In this project, we will leverage existing methods and data sets that have not yet been exploited to their full extent.

What needs to be done

The proposed approach will combine two existing methods: aNMF (Schlicker et al., in preparation) and ModuleNetworks (Segal et al., 2003). First, we plan to use aNMF to integrate heterogeneous data types, both from tumor samples and cell lines, and produce sample clusters and associated features. Next, ModuleNetworks will bring explanatory value by defining potential regulatory programs associated with these clusters. The method will be developed on lung cancer data (TCGA and Sanger cell line panel) and validated using leave-one-out cross-validation on the cell lines. We will perform independent functional validation using the knockdown screens of the Achilles project. Then, the method will be applied to a private melanoma data set. A potential further investigation is the extension of aNMF to include additional data types in cluster generation.


We are looking for a motivated student with a strong background in computational biology, statistics and the R programming language to work on this project. There will be ample opportunity to bring forward your own ideas. During the project a report has to be written about the work performed and an implementation of the method should be provided.

The research group

The project will be carried out in Amsterdam at the Netherlands Cancer Institute (NKI-AVL), a dynamic and inter-disciplinary research institute. The Computational Cancer Biology group headed by Prof. Dr. Lodewyk Wessels consists of about 19 scientists including postdocs, PhD students, bioinformaticians and MSc students. The research is focused on the development of novel computational approaches exploiting a wide variety of data sources in order to improve cancer diagnosis and treatment. Supervision will be performed by Magali Michaut and Lodewyk Wessels.

Additional information

For further information, you can contact Magali Michaut or Lodewyk Wessels.


Bachelor / Master student project: Assessing tumor heterogeneity in silico


Tumors originate from cells that have accumulated genetic mutations. Subsequently, tumors evolve over time, leading to genetically distinct subpopulations of cells within a single tumor. This so-called intra-tumor heterogeneity (ITH) poses several challenges in cancer research and therapy: How should such heterogeneous mixtures of cells be treated? How can all of them best be targeted? Does a more heterogeneous cancer have a higher chance to ‘escape’ a treatment? If we know the path of tumor evolution, can we stop it?



Many bioinformatics tools have been developed that try to quantify ITH and/or to reconstruct the tumor’s evolution based on copy number changes, mutations and other information from DNA sequencing data. This project involves a literature search to catalogue applicable tools and their requirements, and subsequently installing and benchmarking a selection of tools on real clinical samples and in silico simulated data. The clinical data comprises multiple sequencing files from a small number of patients, for instance from different tumor locations or before and after treatment. The simulated data would involve mixing samples from different patients in various ratios to mimic cell populations. In addition, TCGA data can be used to validate methods on a large scale.
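As a rough illustration of the in silico mixing step, the sketch below computes the variant allele frequencies (VAFs) one would expect when samples are combined in given proportions. It is a simplified sketch under stated assumptions: equal sequencing depth and copy number across samples, so that the mixture VAF is a proportion-weighted average. The function name, variant labels and numbers are hypothetical, and an actual benchmark would more likely mix sequencing reads rather than summary frequencies.

```python
def mix_vafs(sample_vafs, proportions):
    """Expected VAF of each variant in a mixture of samples.

    sample_vafs: list of dicts mapping variant id -> VAF in that sample.
    proportions: mixing fraction of each sample; must sum to 1.
    A variant absent from a sample is treated as VAF 0 in that sample.
    """
    if len(sample_vafs) != len(proportions):
        raise ValueError("need one proportion per sample")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    variants = set().union(*sample_vafs)  # all variant ids seen in any sample
    return {
        v: sum(p * s.get(v, 0.0) for s, p in zip(sample_vafs, proportions))
        for v in sorted(variants)
    }


# Hypothetical samples from two patients, mixed 75% / 25%.
patient_a = {"variant_1": 0.5, "variant_2": 0.4}
patient_b = {"variant_2": 0.2, "variant_3": 0.3}
mixture = mix_vafs([patient_a, patient_b], [0.75, 0.25])
```

Because the true proportions are known by construction, mixtures like this give a ground truth against which the output of ITH tools can be compared.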

Project description

The project is carried out in close collaboration with both the department of Pathology and the Computational Cancer Biology group at the Netherlands Cancer Institute (NKI) in Amsterdam. As an end product, we hope to have a pipeline that can be used in multiple research projects and can be run routinely on new clinical samples. Potential findings in our clinical samples can be included in a scientific publication. Experience with working in a Unix environment / command-line programming is a big plus, but not a strict requirement. For more information, please contact Marlous Hoogstraat.