Projects Using the CHTC
| CHTC Quick Facts | Jul'10-Jun'11 | Jul'11-Jun'12 | Jul'12-Jun'13 |
| Million Hours Served | 45 | 70 | 97 |
Many researchers are currently using the CHTC for computational tasks. Here are descriptions of some of the researchers and research groups at the University of Wisconsin that work closely with the CHTC.
Juan de Pablo
Juan de Pablo and the Molecular Thermodynamics and Statistical Mechanics Research Group use computational resources provided by the CHTC to predict the motions of macroscopic objects through simulations of what their microscopic particles are doing.
John L. Markley
John uses the CHTC for research in connection with the Biological Magnetic Resonance Data Bank (BMRB).
Professor Phil Townsend of Forestry and Wildlife Ecology says: "Our research (NASA & USDA Forest Service funded) strives to understand the outbreak dynamics of major forest insect pests in North America through simulation modeling. As part of this effort, we map forest species and their abundance using multi-temporal Landsat satellite data. My colleagues have written an automatic variable selection routine in MATLAB to preselect the most important image variables to model and map forest species abundance. However, depending on the number of records and the initial variables, this process can take weeks to run. Hence, we seek resources to speed up this process."
Natalia de Leon
Ethanol per acre is determined by the amount of biomass per acre and the quality of that biomass. Quality includes the concentration of fermentable sugars, the availability of those sugars for fermentation, and the concentration of inhibitors to the fermentation process. We are using maize (Zea mays L.) as a model grass to identify genes and pathways underlying these traits. Maize is an excellent model both because it is a potential source of biomass for the lignocellulosic ethanol industry and because it is closely related to other important dedicated-bioenergy species, including Miscanthus (Miscanthus giganteus) and switchgrass (Panicum virgatum). Our approach is to genetically dissect endogenous variation for biomass quantity and quality in maize, utilizing genetic mapping, association analysis and transcriptional profiling to identify genes and alleles that underlie phenotypic variation among maize genotypes. This forward genetic analysis provides an entry point into genes and pathways that could be further studied and manipulated to maximize ethanol production. To that end, we utilize cutting-edge genomic technologies, such as novel high throughput approaches to sequencing and genome-wide expression profiling technologies, as well as advanced computational procedures which utilize genomic information to understand the molecular basis of quantitative variation. This research is part of the Department of Energy-supported Great Lakes Bioenergy Research Center at the University of Wisconsin, Madison.
Barry Van Veen
The bio-signal processing laboratory develops statistical signal processing methods for biomedical problems. We use CHTC for causal network modeling of brain electrical activity. We develop methods for identifying network models from noninvasive measures of electric/magnetic fields at the scalp, or invasive measures of the electric fields at or in the cortex, such as electrocorticography. Model identification involves high throughput computing applied to large datasets consisting of hundreds of spatial channels, each containing thousands of time samples.
Steve Barnett uses the CHTC in conjunction with the IceCube South Pole Neutrino Observatory.
CMS LHC Compact Muon Solenoid
The UW team participating in the Compact Muon Solenoid (CMS) experiment analyzes petabytes of data from proton-proton collisions in the Large Hadron Collider (LHC). We use the unprecedented energies of the LHC to study Higgs Boson signatures, Electroweak Physics, and the possibility of exotic particles beyond the Standard Model of Particle Physics. Important calculations are also performed to better tune the experiment's trigger system, which is responsible for making nanosecond-scale decisions about which collisions in the LHC should be recorded for further analysis.
We are investigating the impact that astrophysical jets have on the intergalactic medium (IGM) in galaxy clusters. These jets, emanating from supermassive black holes at the centers of galaxies, can significantly affect the IGM, which in turn affects galaxy evolution. Computer simulations let us test various scenarios, such as how the IGM is heated.
Hazy Research Group
The Hazy Research Group of the Department of Computer Sciences is led by Christopher M. Re, with interests in large-scale and deep data analytics. A machine reading system is a large software system that extracts information and knowledge buried in raw data such as text, tables, figures, and scanned documents. For example, it can extract facts like "Barack Obama wins the 2012 election" from news articles, or "Barnett Formation contains 6% Carbon" from geology journal articles. To extract this kind of information, a machine reading system requires deep understanding and statistical analytics over large document corpora. In the Hazy Research Group, we are building a machine reading system that supports scientific applications like GeoDeepDive and many other projects. Please visit our YouTube Channel for video overviews of our projects. We leverage the resources of the CHTC and the national Open Science Grid to enable our machine reading system to quickly perform a whole host of computationally expensive tasks like statistical linguistic processing, speech-to-text transcription, and optical character recognition (OCR). For example, on a crawl of 500 million web pages, we estimated that our deep linguistic parsing would take more than 5 years on a single machine. With the help of CHTC, we were able to do it in just 1 week! Similarly, on a recent batch of 30,000 geology journal articles, we estimated that the OCR task would take 34 years on a single machine. With CHTC, it took about 2 weeks. Thanks, CHTC!
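As a rough back-of-the-envelope illustration of the figures quoted above (not a measurement reported by the group), the implied speedups can be worked out as follows. The 52-week year and the function name are assumptions made for this sketch, and the inputs are the approximate estimates from the text, so the results are order-of-magnitude only.

```python
# Rough speedup implied by the quoted estimates.
WEEKS_PER_YEAR = 52  # assumption: a year is treated as 52 weeks


def speedup(single_machine_years, chtc_weeks):
    """Ratio of estimated single-machine time to elapsed time on CHTC."""
    return single_machine_years * WEEKS_PER_YEAR / chtc_weeks


# "more than 5 years" vs "1 week" for parsing 500 million web pages
parsing_speedup = speedup(5, 1)
# "34 years" vs "about 2 weeks" for OCR of 30,000 journal articles
ocr_speedup = speedup(34, 2)

print(f"parsing: roughly {parsing_speedup:.0f}x")
print(f"OCR: roughly {ocr_speedup:.0f}x")
```

Because the parsing estimate is a lower bound ("more than 5 years"), the first ratio should be read as a minimum.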
David C. Schwartz
David uses the CHTC to comprehensively analyze human and cancer genomes. CHTC provides enough throughput for this project to map data representing the equivalent of one human genome in 90 minutes. See the LMCG (Laboratory for Molecular and Computational Genomics) web site for project details.
Natalia B. Perkins
The abstract of the first paper based on this research, "Critical Properties of the Kitaev-Heisenberg Model", Craig C. Price and Natalia B. Perkins, Phys. Rev. Lett. 109, 187201, reads: We study the critical properties of the Kitaev-Heisenberg model on the honeycomb lattice at finite temperatures that might describe the physics of the quasi-two-dimensional compounds, Na2IrO3 and Li2IrO3. The model undergoes two phase transitions as a function of temperature. At low temperature, thermal fluctuations induce magnetic long-range order by the order-by-disorder mechanism. This magnetically ordered state with a spontaneously broken Z6 symmetry persists up to a certain critical temperature. We find that there is an intermediate phase between the low-temperature, ordered phase and the high-temperature, disordered phase. Finite-sized scaling analysis suggests that the intermediate phase is a critical Kosterlitz-Thouless phase with continuously variable exponents. We argue that the intermediate phase has been observed above the low-temperature, magnetically ordered phase in Na2IrO3, and also, likely exists in Li2IrO3.
I use CHTC to do structural estimation of economic models. These problems boil down to finding a value in a parameter space that best satisfies some objective criterion that is derived from economic theory and is a function of the data. The difficulty is that the criterion is typically a complicated non-linear function of the parameter that requires some intensive computations for each observation in the data. For problems with lots of data, this can become unwieldy! Rather than work with an MPI model for distributed computing, I have found that exploring the landscape of the objective function with the high throughput CHTC approach allows me to much more quickly gain an understanding of the likely parameter values. This greatly facilitates the econometric estimation of problems that otherwise would not be feasible.
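The idea of exploring the landscape of an objective function with many independent evaluations can be sketched as below. This is a minimal illustration under stated assumptions, not the researcher's actual model or workflow: the linear model, the toy data, the grid, and all function names are hypothetical. Each grid point is evaluated independently of the others, which is what makes the problem a natural fit for one-job-per-point high throughput computing; here the evaluations simply run serially.

```python
import itertools

# Hypothetical toy data generated from y = 2.0 * x + 1.0 (illustration only).
DATA = [(x, 2.0 * x + 1.0) for x in range(10)]


def objective(params):
    """Sum of squared residuals for a hypothetical model y = a * x + b."""
    a, b = params
    return sum((y - (a * x + b)) ** 2 for x, y in DATA)


def make_grid(a_values, b_values):
    """All (a, b) combinations; each could be submitted as its own job."""
    return list(itertools.product(a_values, b_values))


def evaluate_point(params):
    """The work a single independent job would do: score one grid point."""
    return params, objective(params)


a_values = [0.5 * i for i in range(9)]  # 0.0, 0.5, ..., 4.0
b_values = [0.5 * i for i in range(5)]  # 0.0, 0.5, ..., 2.0
grid = make_grid(a_values, b_values)

# Run serially here; no evaluation depends on any other.
results = [evaluate_point(p) for p in grid]
best_params, best_value = min(results, key=lambda r: r[1])
print(best_params, best_value)
```

In a real estimation problem the objective would be far more expensive per evaluation, and the collected results would typically guide a finer grid or a local optimizer around the most promising region.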
We use CHTC machines to carry out molecular dynamics simulations to study chemical reactions in solution and biomolecules. Using semi-empirical QM/MM methods developed in our group, these calculations can run efficiently in a largely uncoupled fashion on the CHTC system. This type of computing environment is ideal for many studies that require running a large number of independent computations, such as constructing the free energy landscape of chemical reactions in the condensed phase.
In our research we're working on a novel computational tool for THz-frequency characterization of materials with high carrier densities, such as highly-doped semiconductors and metals. The numerical technique tracks carrier-field dynamics by combining the ensemble Monte Carlo simulator of carrier dynamics with the finite-difference time-domain technique for Maxwell's equations and the molecular dynamics technique for close-range Coulomb interactions. This technique is computationally intensive, and each test runs long enough (12-20 hours) that our group's cluster isn't enough. This is why we think CHTC can help: it would let us run more jobs than we're able to run now.