Sanger Institute's Debian cluster with 320 TB Lustre FS of its 1.5 PB storage for genome sequencing
contributed by andremachado, published on Mon Mar 10 15:03:38 2008 in success-stories
Wellcome Trust Sanger Institute, UK, uses a Debian cluster with 320 TB HP-SFS (Lustre) filesystem as part of its 1.5 PB storage for human genome sequencing
The Wellcome Trust Sanger Institute in Hinxton, South Cambridgeshire, UK, runs a Debian GNU/Linux cluster of more than 640 cores. Of its 1.5 Petabytes of storage, 320 Terabytes hold "live data" and act like a giant virtual memory swap partition.
Each of the 27 new-technology robotic genome sequencers generates 1 TB of image data every three days, at a rate of 2 MB/s during a three-day run.
All of this data must stay "live" during sequencing and initial analysis. Combined with the processing needs of the scientific software running on the 640+ core Debian GNU/Linux cluster, this means the "swap-like" storage has to provide 320 TB of space, which it does using the HP-SFS (Lustre) filesystem.
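To put these figures in perspective, the short Python sketch below works out the aggregate data volume they imply. It uses only the numbers quoted above (27 sequencers, 1 TB per three-day run, 320 TB of live storage). The function names and the use of decimal units (1 TB = 10^12 bytes) are assumptions made for this illustration, not part of the institute's tooling.

```python
# Rough arithmetic on the figures quoted in the article.
# Assumption: decimal units, 1 TB = 1,000,000 MB.

SEQUENCERS = 27          # sequencers quoted in the article
TB_PER_RUN = 1.0         # image data per sequencer per run
RUN_DAYS = 3             # length of one run in days
LIVE_STORAGE_TB = 320    # size of the "swap-like" Lustre storage

SECONDS_PER_DAY = 86_400


def daily_volume_tb(sequencers=SEQUENCERS, tb_per_run=TB_PER_RUN, run_days=RUN_DAYS):
    """Aggregate image data produced per day by all sequencers, in TB."""
    return sequencers * tb_per_run / run_days


def days_to_fill(capacity_tb=LIVE_STORAGE_TB):
    """Days of continuous sequencing needed to fill the live storage."""
    return capacity_tb / daily_volume_tb()


def implied_rate_mb_per_s(tb_per_run=TB_PER_RUN, run_days=RUN_DAYS):
    """Average per-sequencer rate implied by 1 TB over one run, in MB/s."""
    return tb_per_run * 1_000_000 / (run_days * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"Aggregate volume: {daily_volume_tb():.1f} TB/day")
    print(f"Days to fill {LIVE_STORAGE_TB} TB: {days_to_fill():.1f}")
    print(f"Implied per-sequencer rate: {implied_rate_mb_per_s():.2f} MB/s")
```

On these assumptions the sequencers together produce about 9 TB per day, so 320 TB corresponds to roughly five weeks of raw image output. The average rate implied by 1 TB over three days is close to 4 MB/s, somewhat above the 2 MB/s quoted, so both figures are best read as approximate.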
Antony Cox, PhD, Head of Sequencing Informatics, and Phil Butcher, Head of IT at the institute, gave an interview to The Guardian in which they presented the 1000 Genomes Project.
The project aims to accurately sequence one thousand individual human genomes, to map all of the differences that occur in 0.5% or more of the population sampled, and to identify the places involved in the interactions between multiple DNA bases that cause different conditions.
Given that human DNA has 3 billion bases, and that each sampled base must be sequenced between 11 and 30 times to factor out measurement errors, this is one of the biggest computational biology efforts under way today.
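The scale of that effort can be estimated directly from the numbers above. The following Python sketch multiplies the genome size by the coverage range and the number of genomes. The constant and function names are hypothetical, chosen for this illustration, and the result counts individual base reads, not bytes of storage.

```python
# Rough estimate of the total number of base reads in the project,
# using only the figures quoted in the article.

GENOME_BASES = 3_000_000_000   # bases in human DNA, as quoted
GENOMES = 1_000                # individual genomes to be sequenced
MIN_COVERAGE = 11              # fewest times each base is sequenced
MAX_COVERAGE = 30              # most times each base is sequenced


def total_base_reads(coverage, genomes=GENOMES, bases=GENOME_BASES):
    """Total base reads needed at a given coverage depth."""
    return coverage * genomes * bases


if __name__ == "__main__":
    low = total_base_reads(MIN_COVERAGE)
    high = total_base_reads(MAX_COVERAGE)
    print(f"Between {low:.1e} and {high:.1e} base reads")
```

That works out to between 33 and 90 trillion base reads across the thousand genomes.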
The project is unique not only because it deals with 1.5 PB of storage, but also because it keeps 320 TB of "swap-like storage" available for fast comparisons and calculations.
According to Butcher, genomics research is shifting its focus away from the laboratory of glass tubes and becoming more informatics focussed. The Sanger Institute started using Debian GNU/Linux when the world discovered how reliable and useful it can be. Now the institute has to compete with commercial organisations that use Linux for system administrators able to manage large clusters with large-scale distributed filesystems.
You may read the interview for more details.
About the Wellcome Trust Sanger Institute
The Wellcome Trust Sanger Institute is one of the world's largest centres for DNA sequencing and analysis. It made the largest single contribution to the sequence of the Human Genome Project, contributed approximately 25% of the mouse genome sequence, and is finishing the zebrafish genome sequence as well as contributing to other model organism sequences, such as yeasts and the nematode C. elegans. Institute researchers have also contributed to the sequences of more than 60 finished genomes of bacterial pathogens, such as Salmonella typhi, TB, MRSA and C. diff, as well as parasites such as those causing malaria, African trypanosomiasis and leishmaniasis.
Investment in new-technology sequencing will dramatically increase the breadth and depth of genome analysis in humans, model organisms and pathogens.
You can contact the Wellcome Trust Sanger Institute press team here.
About the Debian Project
Debian GNU / Linux is one of the free libre operating systems (GNU/Linux, GNU/Hurd, GNU/NetBSD, GNU/kFreeBSD), running 18733+ officially maintained packages on 15 hardware platforms, from cell phones and network devices to mainframes and supercomputers, developed by more than two thousand volunteers from all over the world who collaborate via the internet on the Debian Project.
Debian's dedication to Free Libre Open Source Software, its constitutional non-profit nature, its open and meritocratic development model, organization and social governance make it a first among free libre operating system distributions.
The Debian project's key strengths are its volunteer base, its dedication to the Debian Social Contract, and its commitment to provide the best operating systems attainable, following a strict quality policy, working with an established QA Team.
You can help the Debian Project without joining it, even if you are not a programmer: by becoming a development or service partner company or institution in the Debian Partner Program, or simply by making donations to the Debian Project.
Debian Project news, press releases and press coverage can be found on the official Debian wiki page. For PR contact, use the debian-publicity list.


