Leveraging open source platforms for increased scalability, greater computing power and reduced IT costs.
The Institute for Genomic Research (TIGR) turned to Unisys JBoss experts for the extensive training and knowledge transfer it needed to realize the full value of open source.
TIGR recognized the performance and cost advantages of open source, but also the need to leverage best practices and expert knowledge for a smooth migration.
After working with Unisys, TIGR had the skills and knowledge it needed to move forward with its migration to a JBoss-based platform, and realize the full benefits of open source platforms. Specifically, TIGR gained:
The Institute for Genomic Research (TIGR) is a non-profit organization whose primary mission is to sequence and analyze the genomes of pathogens and other organisms. Since its founding in 1992, TIGR has been at the forefront of the genomics revolution; in 1995, it was the first to publish the complete DNA sequence of a free-living organism (the bacterium Haemophilus influenzae). Along with its partners, TIGR has played a leading role in a number of major genomics landmarks, including sequencing the complete genomes of more than 50 organisms and microbial strains.
Though it must manage its IT investments carefully, TIGR is highly reliant on its infrastructure and its ability to process huge amounts of data. For instance, after TIGR sequences the DNA for a particular organism, it must begin a series of intensive secondary calculations for genome assembly and gene finding.
These calculations run on a compute grid composed of hundreds of 32-bit Intel processors and dozens of 64-bit Opteron processors, configured with up to 32 GB of RAM per node. TIGR’s IT department manages the grid with the open source SGE (Sun N1 Grid Engine). TIGR’s Bioinformatics department maintains scientific application software written in C++, Java, Perl, and Python. TIGR’s compute jobs require access to the networked file system, MySQL databases, and Sybase databases.
Much of TIGR’s code was structured as two-tier applications that did not scale with TIGR’s grid. For instance, some Perl programs used the DBI interface to connect directly to a Sybase database; when many copies ran on the grid in massively parallel fashion, they could fail because the database connections were saturated.
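The scaling problem can be illustrated with a small sketch. This is not TIGR's code: `FakeDatabase`, `MiddleTier`, and the connection limits are hypothetical stand-ins, and Python is used only for brevity. The sketch assumes a database that refuses connections beyond a fixed limit, and compares jobs that each connect directly with jobs that share a small pool held by a middle tier.

```python
import queue
import threading


class FakeDatabase:
    """Hypothetical stand-in for a database server with a hard connection limit."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.peak = 0            # highest number of simultaneously open connections
        self._open = 0
        self._lock = threading.Lock()

    def connect(self):
        with self._lock:
            if self._open >= self.max_connections:
                raise RuntimeError("too many connections")
            self._open += 1
            self.peak = max(self.peak, self._open)
        return self

    def close(self):
        with self._lock:
            self._open -= 1


def two_tier(db, n_jobs):
    """Each job opens and holds its own connection; returns the number of failed jobs."""
    held, failures = [], 0
    for _ in range(n_jobs):
        try:
            held.append(db.connect())
        except RuntimeError:
            failures += 1
    for conn in held:
        conn.close()
    return failures


class MiddleTier:
    """Opens a fixed number of connections once and shares them among all callers."""

    def __init__(self, db, pool_size):
        self._pool = queue.Queue()
        for _ in range(pool_size):
            self._pool.put(db.connect())

    def run_query(self, job_id):
        conn = self._pool.get()          # blocks until a pooled connection is free
        try:
            return f"result for job {job_id}"
        finally:
            self._pool.put(conn)         # hand the connection back for reuse


def run_jobs(n_jobs, work):
    """Run `work(i)` for each job on its own thread and collect the results."""
    results = [None] * n_jobs

    def target(i):
        results[i] = work(i)

    threads = [threading.Thread(target=target, args=(i,)) for i in range(n_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("two-tier failures:", two_tier(FakeDatabase(10), 50))
    db = FakeDatabase(10)
    results = run_jobs(50, MiddleTier(db, 5).run_query)
    print("three-tier completed:", len(results), "peak connections:", db.peak)
```

In the pooled version, jobs wait briefly for a free connection instead of failing, so the number of grid jobs can grow without raising the load on the database.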
In addition to restricting growth, the two-tier applications also limited modernization because they were heavily used and tightly coupled to particular databases. Martin Shumway, Director of Software Engineering at TIGR, said, “Changing the database schema had become so costly as to be nearly impossible. Migration to three-tier had to happen before we could bring our sequencing data model out of the 1990s.”
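The coupling problem can be sketched in the same spirit. The example below is illustrative only: the table names, the `get_bases` method, and the use of SQLite in place of Sybase or MySQL are all assumptions. It shows a client function that depends only on a middle-tier interface, so that the schema underneath can change without touching the client.

```python
import sqlite3


class ReadStoreV1:
    """Middle-tier service over a hypothetical original schema (one table)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE seq_read (id INTEGER PRIMARY KEY, bases TEXT)")

    def add_read(self, read_id, bases):
        self.conn.execute(
            "INSERT INTO seq_read (id, bases) VALUES (?, ?)", (read_id, bases)
        )

    def get_bases(self, read_id):
        row = self.conn.execute(
            "SELECT bases FROM seq_read WHERE id = ?", (read_id,)
        ).fetchone()
        return row[0] if row else None


class ReadStoreV2:
    """Same public methods over a hypothetical revised schema (two tables)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE reads (id INTEGER PRIMARY KEY)")
        conn.execute("CREATE TABLE read_bases (read_id INTEGER, bases TEXT)")

    def add_read(self, read_id, bases):
        self.conn.execute("INSERT INTO reads (id) VALUES (?)", (read_id,))
        self.conn.execute(
            "INSERT INTO read_bases (read_id, bases) VALUES (?, ?)", (read_id, bases)
        )

    def get_bases(self, read_id):
        row = self.conn.execute(
            "SELECT b.bases FROM reads r JOIN read_bases b ON b.read_id = r.id "
            "WHERE r.id = ?",
            (read_id,),
        ).fetchone()
        return row[0] if row else None


def gc_content(store, read_id):
    """Client code: knows only the service interface, never the SQL or the schema."""
    bases = store.get_bases(read_id)
    return (bases.count("G") + bases.count("C")) / len(bases)


if __name__ == "__main__":
    for store_cls in (ReadStoreV1, ReadStoreV2):
        store = store_cls(sqlite3.connect(":memory:"))
        store.add_read(1, "GGCCAATT")
        print(store_cls.__name__, gc_content(store, 1))
```

In a two-tier design, the SQL would live inside every client program, and each one would need editing whenever a table changed. Here only the service class changes.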
The two-tier applications were also too tightly coupled to the TIGR environment to be shared with external collaborators and the open source community, to which TIGR is strongly committed. “We had been contributing genomic data and applications to the open source community for years, and yet our internal sequencing and tracking systems had remained closed,” said Shumway.
TIGR’s Bioinformatics department undertook a rewrite of the two-tier applications to make them more modular and scalable, as well as standards-compliant and easier to maintain. The group decided to extend TIGR’s battery of open source infrastructure, which already included SUSE Linux, Apache Tomcat, and SGE. The group settled on JEE standards and the JBoss JEMS suite in particular.
“We knew the value of open source, but we also faced all the risks associated with introducing new technology,” said Jason Miller, Software Manager at TIGR. One of Miller’s critical concerns was developing the necessary skills and knowledge to operate and maintain the infrastructure at top performance levels. After initial struggles to “bootstrap” the migration, he brought in the JBoss experts at Unisys.
To generate the most value from open source, organizations must not only avoid common risks and pitfalls, but also leverage best practices from open source leaders. “The fact that Unisys was a JBoss premier partner gave us confidence that they had the experience and knowledge we needed,” said Miller.
Unisys designed a highly interactive and hands-on workshop tailored to TIGR’s needs. The curriculum addressed key issues, including JEE best practices, JBoss architecture and migration strategies for TIGR.
In addition to presentations, Unisys developers and architects worked side-by-side with TIGR developers to demonstrate how best practices translate into actual code. They even led hands-on migrations of TIGR applications into JBoss. The core migration team, plus about 20 other developers, attended the weeklong workshop.
Other work completed during the workshop included: