Dear Colleague,
 
The 2015 Workshop on Compression, Text and Algorithms (WCTA, http://www.dcs.kcl.ac.uk/events/spire2015/workshops.html) will take place at King's College, London, the day after the Symposium on String Processing and Information Retrieval (SPIRE), i.e., September 4.
 
We invite abstracts for presentations (15--25 minutes) of recent results and surveys of interest to the string-processing community.  We particularly encourage submissions from junior members of our community.  Since WCTA has no proceedings, presenting results there should not preclude submitting them to other conferences or publishing them in journals.  Results already accepted (or even recently presented) elsewhere are also welcome.

Please submit abstracts by emailing copies (preferably PDF) to *both* WCTA co-chairs (addresses below).

Submission deadline: August 7th, 2015 (anywhere on Earth)
Notification: August 21st, 2015

WCTA will feature an invited talk on "Using Suffix Array Based Data Structures in Computational Genomics" by Richard Durbin, FRS, from the Wellcome Trust Sanger Institute, and a tutorial on "Compact and Succinct Data Structures -- From Theory to Practice" by Simon Gog from the Karlsruhe Institute of Technology (abstracts below).  WCTA will be free for SPIRE attendees; there may be a small fee for those attending only the workshop.

Best regards,
 
Travis Gagie,
University of Helsinki
<travis.gagie@gmail.com>
 
Tatiana Starikovskaya,
University of Bristol
<tat.lastname@gmail.com>
 
(WCTA co-chairs)

==========

Title:
Using Suffix Array Based Data Structures in Computational Genomics
 
Abstract:
This talk will describe how the BWT and FM-indexes are used in read mapping (e.g., with BWA), sequence assembly (e.g., with SGA and Fermi), and haplotype matching and storage (with the Positional BWT).

BWA: http://www.ncbi.nlm.nih.gov/pubmed/19451168 http://www.ncbi.nlm.nih.gov/pubmed/20080505 http://bio-bwa.sourceforge.net/
SGA: http://www.ncbi.nlm.nih.gov/pubmed/22156294 https://github.com/jts/sga
Fermi: http://bioinformatics.oxfordjournals.org/content/28/14/1838 https://github.com/lh3/fermi
PBWT: http://www.ncbi.nlm.nih.gov/pubmed/24413527 https://github.com/richarddurbin/pbwt

Bio:
Richard Durbin, FRS, is Acting Head of Computational Genomics at the Wellcome Trust Sanger Institute and leader of the Genome Informatics group. He studied mathematics at Cambridge and earned a PhD on the development and organization of the nervous system in C. elegans. He has developed numerous methods for computational sequence analysis, co-authored a textbook on this subject, and co-leads the international 1000 Genomes Project. He was a joint winner of the Mullard Award of the Royal Society in 1994 (for work on the confocal microscope), won the Lord Lloyd of Kilgerran Award of the Foundation for Science and Technology in 2004, and was elected a Fellow of the Royal Society in 2004 (for contributions to computational biology) and a member of the European Molecular Biology Organization (EMBO) in 2009.

==========

Title:
Compact and Succinct Data Structures -- From Theory to Practice

Abstract:
For decades, index structures were built on top of the data to enable users to carry out queries efficiently. For instance, suffix trees or suffix arrays were built on top of a text to answer pattern matching queries in time independent of the text length.

Unfortunately, these traditional pointer-based index structures often take significantly more space than the original data and therefore cannot be used in scenarios where the data itself is not much smaller than the available main memory. In the last 25 years, researchers have invented space-efficient counterparts for many index structures, which use not much more space than the original data and have the same query complexity in theory. In this talk we will review popular examples of compact and succinct structures -- ranging from bit vectors through wavelet trees to compressed suffix trees -- and show how they can be easily used in applications by employing the Succinct Data Structure Library (SDSL). We will further show how to use the library's facilities to analyse, measure, and monitor the time and space requirements of structures. Finally, we will learn how more complex structures can be composed and integrated into the existing framework.

SDSL: https://github.com/simongog/sdsl-lite

Bio:
Simon Gog obtained his PhD from Ulm University, where he worked on space-efficient index data structures with applications in bioinformatics. The Succinct Data Structure Library project was started in Ulm and continued at the University of Melbourne, where he worked as a postdoc on compressed external index structures. Currently, Simon is working at the Karlsruhe Institute of Technology on compact index structures for applications in information retrieval.