Humboldt University

Founded in 1810 according to the concept of Wilhelm von Humboldt, the Humboldt-Universität zu Berlin was the “Mother of all modern universities”. It continues to honor its reform-oriented founding tradition by bringing innovative ideas into study, research and organization. Organized into 11 faculties, HU has more than 400 full professors and approximately 40,000 students in 170 study programs. The department of computer science comprises 22 faculty members with a strong focus on databases, distributed systems, software engineering, and algorithms and logic. The department is a member of several publicly funded national graduate schools, collaborative research centres, and large EU-funded international research projects, for instance in the areas of distributed sensor networks (graduate school METRIK), service-oriented software systems for medical applications (graduate school SOAMED), and information management in cloud infrastructures (research unit STRATOSPHERE). It is located on the rapidly growing Adlershof campus, a major focal point of innovative research and development in Berlin.

The group on “Knowledge Management in Bioinformatics” has a proven record of successful research and applications, especially in biomedical databases, scientific workflows, statistical analysis of high-throughput experiments, and biomedical text mining. It has more than 15 years of experience in managing and analyzing biomedical data sets, especially from high-throughput experiments. Previous and current projects in this area are concerned with the analysis and management of proteins and their function, phenotype data, cancer-related knowledge bases, protein-protein interaction networks, and sequence / microarray / ChIP-chip / ChIP-seq data sets. The group has developed several popular integrated biomedical databases (e.g. IXDB, COLUMBA, ALIBABA). The group participates in the DFG-funded German collaborative project Stratosphere, which develops a cloud-based system for large-scale distributed data management. Because it is based on database technology, this system provides an excellent basis for extensions that also plan, optimize, execute and monitor the more liberal data flows described in scientific workflow systems. Within the BMBF-funded project ColoNet, the group is already involved in the analysis of large amounts of exome data derived from next-generation sequencing. The group has also developed efficient algorithms for generating and indexing large sets of sequence data (PETER, PEARL).