NGS/Hadoop Workshop: Next Generation Processing for Next Generation Sequencing

Dates: 19-20 February 2015.
Location: KTH - Royal Institute of Technology, Kista (Stockholm), Sweden


Videos from the workshop are available on YouTube.


A huge wave of NGS data has descended on bioinformatics researchers, and the volume and velocity of this data are still increasing at a fantastic rate. Recent developments in the Hadoop ecosystem, including MapReduce, Apache Spark, and security frameworks (Kerberos, tokens, Knox, Sentry, and Rhino), have provided tools with which we can potentially store this flood of NGS data securely and process it efficiently. The Hadoop project and its associated frameworks provide a platform that can scale to petabytes or even exabytes of data.


The workshop will bring together bioinformaticians and systems researchers who share an interest in, or experience with, genomics and Hadoop. We will have speakers with experience building Hadoop NGS pipelines (MapReduce SEAL, SeqPig, and Cuneiform/HiWAY), new formats for representing NGS data in Hadoop (Hadoop-BAM), and support for security in Hadoop (BiobankCloud). We will also have speakers on the systems and operations issues of building and scheduling Hadoop pipelines in frameworks such as Spark and MapReduce on YARN. The second day of the workshop will be practical: small working groups will tackle specific problems in NGS data on Hadoop, including a hackathon on a platform such as ADAM/Cuneiform/SEAL/SeqPig.

Speakers will include:

  • Jim Dowling (BiobankCloud, Hops)
  • Keijo Heljanko (Hadoop-BAM, SeqPig)
  • Ulf Leser (Cuneiform/HiWAY)
  • Gianluigi Zanetti (Hadoop/SEAL, SeqPig)
  • Jan Fostier (Halvade)
  • Other invited speakers to be announced.


The workshop will be held at KTH Royal Institute of Technology in Kista.
Address: Isafjordsgatan 22, Kista (Electrum building).
Room: Sal A (ground floor)
This is the same location as SICS, which has excellent travel and hotel information on its website.
Scroll down a little on that page to reach the travel information.

The workshop will be free of charge, as it is sponsored by SeRC and the EU FP7 project BiobankCloud.

