Center for Biomolecular Science & Engineering
Baskin School of Engineering

CBSE Computing Facilities


Computing Clusters

Two large parallel-processor clusters, called Swarm and PitaKluster, and a bank of web servers support the UCSC Genome Browser and its associated tools and databases. These facilities also support much of the computational genome research conducted within CBSE.

Swarm, the newer of the two clusters, consists of 256 quad-core Intel Xeon compute nodes, each with 8 gigabytes of memory. Swarm has a total of 1024 cores in 4 double-sided racks and a theoretical peak performance of over 10,000 gigaflops. It runs Rocks Linux, a Linux distribution optimized for clustering applications. The PitaKluster consists of 198 dual-processor AMD Opteron compute nodes, each with 4 gigabytes of memory, housed in three Rackable storage units. The PitaKluster's 396 processors, which also run Linux, can perform over a trillion instructions per second. Both systems were designed to provide an exceptional amount of inexpensive computing power in minimal space.

For memory-intensive jobs, CBSE employs a cluster of 8 machines, each with two dual-core AMD processors and 32 gigabytes of memory. The computational clusters are supported by a parallel file system, Hive, spread across 16 data storage servers and 4 metadata servers via GPFS. Hive has multiple redundant facilities and holds up to 160 terabytes of data. Several file servers also support the clusters, providing almost 40 terabytes of network storage. CBSE also employs a computer with 32 gigabytes of memory for software and database development. It will soon be replaced with a machine that has 8 quad-core AMD processors and 256 gigabytes of RAM.
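As a quick check of the cluster figures quoted above, the totals can be recomputed from the per-node numbers. The sketch below is illustrative only: the `Cluster` class and its field names are hypothetical, and the per-core figure is simply the quoted lower bound of 10,000 gigaflops divided by Swarm's core count, not a measured value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record of the per-node figures quoted in the text."""
    name: str
    nodes: int
    units_per_node: int    # cores per node (Swarm) or processors per node (PitaKluster)
    mem_gb_per_node: int

    @property
    def total_units(self) -> int:
        return self.nodes * self.units_per_node

    @property
    def total_mem_gb(self) -> int:
        return self.nodes * self.mem_gb_per_node


SWARM = Cluster("Swarm", nodes=256, units_per_node=4, mem_gb_per_node=8)
PITAKLUSTER = Cluster("PitaKluster", nodes=198, units_per_node=2, mem_gb_per_node=4)

# Lower bound quoted in the text ("over 10,000 gigaflops").
SWARM_PEAK_GFLOPS = 10_000


def gflops_per_core(peak_gflops: float, cluster: Cluster) -> float:
    """Theoretical peak divided evenly across all cores."""
    return peak_gflops / cluster.total_units


if __name__ == "__main__":
    for c in (SWARM, PITAKLUSTER):
        print(f"{c.name}: {c.total_units} cores/processors, {c.total_mem_gb} GB memory")
    print(f"Swarm: at least {gflops_per_core(SWARM_PEAK_GFLOPS, SWARM):.2f} gigaflops per core")
```

The totals agree with the text: 256 nodes of 4 cores give Swarm's 1024 cores, and 198 dual-processor nodes give the PitaKluster's 396 processors.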

The CBSE systems administration team keeps these computing resources, including the PitaKluster shown here, up and running 24/7. Photo by Branwyn Wagman

Web Servers

The web servers for the UCSC Genome Browser consist of 8 dual-processor AMD Opteron machines, each with 1.6 terabytes of internal storage and 8 gigabytes of memory. These machines have access to a central file server that provides an additional 5 terabytes of shared disk space. Fifteen additional servers provide web access to the BLAT (BLAST-like alignment tool) software. Most of these machines have 16 gigabytes of memory, and several have up to 64 gigabytes, to accommodate BLAT's memory-intensive calculations. Finally, a download server allows users to download our data; it serves over 600 gigabytes of data every day. Our web servers are hosted by the UCSC ITS Data Center, which is designed to operate 24 hours a day, 365 days a year.
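To put the download figure in perspective, the daily volume can be converted to an average sustained transfer rate. This is a rough sketch that assumes decimal gigabytes (10^9 bytes) and an even load across the day; real traffic is likely burstier, and the names below are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Lower bound quoted in the text ("over 600 gigabytes ... every day"),
# assuming decimal gigabytes (10**9 bytes).
DAILY_BYTES = 600 * 10**9


def average_rate_bytes_per_s(bytes_per_day: float) -> float:
    """Average transfer rate if the daily volume were spread evenly over 24 hours."""
    return bytes_per_day / SECONDS_PER_DAY


mb_per_s = average_rate_bytes_per_s(DAILY_BYTES) / 1e6   # megabytes per second
mbit_per_s = mb_per_s * 8                                # megabits per second

if __name__ == "__main__":
    print(f"average: {mb_per_s:.2f} MB/s, or {mbit_per_s:.1f} Mbit/s")
```

Under these assumptions the download server sustains an average of roughly 7 megabytes per second around the clock.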

Why Parallel Processors?

Computer clusters such as these are a cost-effective way to process large amounts of data. Because many bioinformatics problems are “embarrassingly parallel,” they can be split into independent pieces that do not require high-speed inter-process communication to perform calculations. This eliminates the need for high-priced networking equipment. By taking advantage of this fact and employing parallel but separate computation on many processors, we have pioneered the development of “super-computing on-the-cheap” for the specific needs of genome presentation, annotation, and analysis.
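The idea of an embarrassingly parallel job can be illustrated with a toy example: counting G and C bases in a sequence by splitting it into chunks that are processed independently, with results combined only at the end. This is a minimal sketch, not the software run on the CBSE clusters; the function names are hypothetical, and a thread pool stands in for the separate cluster nodes that would each handle a chunk.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import List


def gc_count(seq: str) -> int:
    """Count G and C bases in one chunk; needs no data from any other chunk."""
    return sum(1 for base in seq.upper() if base in "GC")


def chunked(seq: str, size: int) -> List[str]:
    """Split a sequence into consecutive chunks of at most `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def parallel_gc_count(seq: str, size: int, executor: Executor) -> int:
    """Process chunks independently, then combine with a single sum.

    Workers never communicate with each other; the only shared step is
    the final reduction, which is what makes the job embarrassingly parallel.
    """
    return sum(executor.map(gc_count, chunked(seq, size)))


if __name__ == "__main__":
    toy_genome = "ACGTGGCCAT" * 1000  # made-up sequence for illustration
    with ThreadPoolExecutor(max_workers=4) as pool:
        total = parallel_gc_count(toy_genome, size=1000, executor=pool)
    # The parallel result matches a single serial pass over the whole sequence.
    print(total, total == gc_count(toy_genome))
```

Because `parallel_gc_count` accepts any `Executor`, the same function works with a process pool or, in principle, a cluster job scheduler: the chunks never exchange data, so no fast interconnect between workers is needed.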

The Swarm cluster is the fourth-generation bioinformatics cluster at UCSC. It operates alongside the third-generation PitaKluster, which gradually took over from the second-generation system, the KiloKluster. The first generation was a cluster of 100 Pentium III processors built to assemble the first working draft of the human genome in June 2000, using GigAssembler, a 10,000-line program written by Jim Kent.

These computing systems are funded through the Howard Hughes Medical Institute, the National Human Genome Research Institute (NHGRI), the California Institute for Quantitative Biosciences (QB3), and the National Cancer Institute.

Center for Biomolecular Science & Engineering • 1156 High St, Mail Stop CBSE/ITI, Santa Cruz, CA 95064
Phone: 831-459-1477 • Fax: 831-459-1809

© 2009 CBSE. All rights reserved. • Last Modified On April 27, 2009 At 12:20 PM