2. Antibiotic resistance mediated by multi-drug efflux transporters
3. Computational Flux Evaluation of Microbial System
5. Develop a web interface for statistical structural information
6. Evolutionary Analysis of Dengue Virus Serotype 2 Envelope Proteins
7. Expanding starch network in Arabidopsis for biofuel applications
9. Mechanism of evolution of fungal pathogen
10. Messengers and switches. How do they work at the atomic level?
11. Methods for analyzing short-read sequence data
12. Molecular Dynamics of RNA Aptamers
13. Motion conservation within a family of proteins
14. Small RNAs in plant regulation and development
17. Bioinformatic Analysis of the Genomic Sequence of the Buffalo (Bubalus bubalis)
18. Designing Better Zinc Finger Proteins for Gene Therapy
19. Developing Tools for cracking the Protein-RNA Recognition Code: RNABindR & PRIDB
20. Predicting structure and functional sites in the human telomerase RNP complex
21. Analysis of Illumina Sequences for Multiple Strains of Mycoplasma hyopneumoniae
1. 2010 Arabidopsis Metabolomics
Dr. Julie Dickerson (PI)
The 2010 Plant Metabolomics project uses metabolomics as a hypothesis-generating functional genomics tool for Arabidopsis genes of unknown function (GUFs). Metabolomics data were generated on eleven analytical platforms and combined across platforms to formulate initial hypotheses about GUFs. A public database (www.PlantMetabolomics.org) has been developed to give the scientific community access to the data, along with tools for interactive analysis. The project will involve improving the online analysis methods available on the website.
2. Antibiotic resistance mediated by multi-drug efflux transporters
Edward Yu (PI)
Our long-term goal is to determine the structures of these transporters by crystallography and to elucidate the fundamental mechanisms that give rise to multiple-drug recognition and extrusion.
3. Computational Flux Evaluation of Microbial System
Jackie Shanks (PI), Jong Moon Yoon & Ting Wei Tee
Metabolic flux analysis (MFA) is an important tool for quantifying intracellular fluxes in the central metabolism of microorganisms. The resulting flux distribution provides insight for selecting engineering targets that improve strain performance. Conventional MFA, based solely on the stoichiometry of a given biochemical reaction network, cannot resolve parallel, reversible, and bidirectional reaction fluxes. This limitation is overcome by performing 13C labeling experiments. Using intracellular labeling information from NMR or GC-MS and extracellular fluxes measured by HPLC, we are evaluating intracellular fluxes in S. cerevisiae under different levels of fatty acid, with the aim of developing strains with higher tolerance. The computational procedure, however, is complicated and mathematically involved, so our group developed the NMR2Flux program to evaluate a unique flux solution iteratively. Flux evaluation begins with an arbitrary set of fluxes, which is verified for stoichiometric feasibility. The feasible fluxes are converted to an isotopomer distribution using the Boolean function mapping method with isotopomer and cumomer balances. The chi-square error corresponding to the fluxes is calculated from the experimental and simulated isotopomer abundances. The best-fit intracellular flux set satisfies the reaction stoichiometry and has the lowest chi-square error. A statistical analysis using Monte Carlo simulation then converts the isotopomer measurement errors into confidence intervals for the fluxes. In addition, we are performing identifiability analysis, which leads to more reliable flux values by designing an optimal mixture of labeled substrates for the 13C labeling experiment.
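The fitting loop described above can be illustrated with a small Python sketch. It is not NMR2Flux: the two-pathway network, the labeling signatures, the measurement error, and all function names are hypothetical, a plain grid search stands in for the iterative search, and a linear mixture stands in for the isotopomer and cumomer balances. It only shows how a stoichiometric constraint, a chi-square error, and Monte Carlo confidence intervals fit together.

```python
import random

# Hypothetical toy network: one substrate (uptake flux fixed at 1.0) reaches
# the product through two parallel pathways carrying fluxes v1 and v2.
# Stoichiometry alone only says v1 + v2 = 1; labeling data resolve the split.
# Each list is the labeling signature (fractional abundance of three measured
# isotopomers) that a pathway would produce on its own. Values are made up.
PATHWAY_1 = [0.70, 0.20, 0.10]
PATHWAY_2 = [0.10, 0.30, 0.60]


def simulate(v1):
    """Isotopomer abundances predicted for a flux split v1 : (1 - v1)."""
    v2 = 1.0 - v1  # stoichiometric constraint
    return [v1 * a + v2 * b for a, b in zip(PATHWAY_1, PATHWAY_2)]


def chi_square(v1, measured, sigma):
    """Sum of squared, error-weighted differences between data and simulation."""
    return sum(((m - s) / sigma) ** 2 for m, s in zip(measured, simulate(v1)))


def fit(measured, sigma, steps=1000):
    """Grid search over feasible v1 in [0, 1]; returns the least chi-square fit."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda v1: chi_square(v1, measured, sigma))


def monte_carlo_interval(measured, sigma, n_trials=500, seed=0):
    """Refit noise-perturbed copies of the data; return a ~95% interval for v1."""
    rng = random.Random(seed)
    fits = sorted(
        fit([m + rng.gauss(0.0, sigma) for m in measured], sigma)
        for _ in range(n_trials)
    )
    return fits[int(0.025 * n_trials)], fits[int(0.975 * n_trials) - 1]


measured = simulate(0.35)  # pretend measurement, generated from a known split
sigma = 0.02               # assumed isotopomer measurement error
best_v1 = fit(measured, sigma)
low, high = monte_carlo_interval(measured, sigma)
print(best_v1, low, high)
```

In a real network the single unknown v1 becomes a vector of free fluxes, and the grid search would be replaced by an optimizer, but the chi-square objective and the resampling step keep the same shape.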
References:
Sriram, G., D. B. Fulton, et al. (2004). "Quantification of compartmented metabolic fluxes in developing soybean embryos by employing biosynthetically directed fractional C-13 labeling, C-13, H-1 two-dimensional nuclear magnetic resonance, and comprehensive isotopomer balancing." Plant Physiology 136(2): 3043-3057.
Wiechert, W., M. Mollney, et al. (2001). "A Universal Framework for 13C Metabolic Flux Analysis." Metabolic Engineering 3(3): 265-283.
Wiechert, W. (2001). "C-13 metabolic flux analysis." Metabolic Engineering 3(3): 195-206.
Sriram, G., D. B. Fulton, et al. (2007). "Flux quantification in central carbon metabolism of Catharanthus roseus hairy roots by C-13 labeling and comprehensive bondomer balancing." Phytochemistry 68(16-18): 2243-2257.
Sriram, G. and J. V. Shanks (2004). "Improvements in metabolic flux analysis using carbon bond labeling experiments: bondomer balancing and Boolean function mapping." Metabolic Engineering 6(2): 116-132.
4. Computational re-design of protein-protein interfaces: a tool to study immune cell signaling at the molecular level.
Scott Boyken, Mike Zimmermann & Amy Andreotti (PI)
You will use Rosetta and other computational tools to predict mutants that increase the binding affinity of Itk self-association. Itk is a kinase that is essential for proper T-cell signaling and the subsequent immune response; it is implicated in allergies and auto-immune diseases and is a potential drug target. Preliminary results in the Andreotti lab indicate that the self-association of Itk down-regulates its activity. Identifying increase-of-affinity mutants will help us better understand this mechanism and how immune cells signal at the molecular level. Your predictions will then be tested experimentally via NMR by other members of the lab, and you will have a chance to take part in the computational analysis of the NMR experiments.
5. Develop a web interface for statistical structural information
Zhijun Wu (PI)
The collected data include statistics for distances, bond angles, and torsion angles from a set of protein structures. The challenge is to design a web interface that is attractive and provides the best access to this dataset.
6. Evolutionary Analysis of Dengue Virus Serotype 2 Envelope Proteins
Ragothaman M. Yennamalli & Taner Z. Sen (PI)
Dengue virus serotype 2 is the most prevalent form seen in endemic regions of the world. The virus infects the mammalian cell with the help of the envelope (E) protein, making this the first step in infection. Although multiple E protein sequences from serotype 2 are available to date, it is still unclear how certain mutations affect the infectivity of the protein. The study will involve sequence analyses of E proteins from serotype 2. The outcome of the work will contribute to efforts to design suitable drug candidates for future tetravalent vaccine development studies.
7. Expanding starch network in Arabidopsis for biofuel applications
Ragothaman M. Yennamalli & Taner Z. Sen
The study involves predicting protein-protein interaction partners of proteins in the starch synthesis network using Arabidopsis thaliana as a model organism. Understanding and expanding this network from a systems biology perspective will provide insights into the development of more efficient biofuels in other plant systems, such as maize. During the course of the project, various methods will be used to build consensus predictions for an array of proteins related to the starch network.
8. Kinases: Interactive molecular dynamics (IMD) with constraints from Elastic Network Models (ENM)
Mike Zimmermann & Robert Jernigan (PI)
IMD is facilitated through the use of VMD and NAMD. One can also define custom forces to pull or direct the motion of molecules. We want to investigate utilizing these custom forces based on the directions from the ENM modes. The IMD simulation could then progress under the influence of the normal modes. This could be applied to coarse grained MD.
9. Mechanism of evolution of fungal pathogen
Madan Bhattacharyya & Xiaoqiu Huang
We have sequenced the soybean pathogen Fusarium virguliforme, which causes sudden death syndrome (SDS). We are currently investigating the genomic regions that carry high levels of SNPs to uncover the mechanisms of genome evolution in this fungal pathogen.
10. Messengers and switches. How do they work at the atomic level?
Scott Boyken, Mike Zimmermann & Amy Andreotti (PI)
Kinases serve as molecular messengers, carrying out a cellular game of telephone that connects the biochemical activity of different parts of the cell (for example, extracellular signals to gene expression). To carry out these complex signaling pathways, their catalytic activity must be tightly regulated, which often draws analogies to circuits (i.e. AND, OR, NOT logic gates). You will perform molecular dynamics simulations to shed light on the atomic-level details of how these kinases are turned on and off, specifically comparing the Src and Tec family kinases, which are of great interest for understanding certain cancers and immune cell signaling, respectively. These two families share very similar mechanisms that result in very different levels of activity. In the Andreotti lab, we have a plethora of biochemical data on these kinases that will be used to direct and compare these computational efforts, and we hope these simulations can reveal atomic details and motions that we cannot currently test experimentally.
11. Methods for analyzing short-read sequence data
Dr. Julie Dickerson (PI)
PLEXdb is a database of plant and pathogen expression data. We are investigating how to extend our online search and analysis tools to short-read sequence data. The project will involve downloading and processing existing data from the NCBI Short Read Archive and comparing these data with corresponding microarray expression data.
12. Molecular Dynamics of RNA Aptamers
Marit Nilsen-Hamilton (PI) & Monica Lamm
A model will be built from the NMR structure, and sequence variations will be computationally investigated to improve aptamer affinities for ligands.
13. Motion conservation within a family of proteins
Mike Zimmermann & Robert Jernigan (PI)
How conserved are motions within a related family of structures? If a motion is functionally important, then it should be conserved. An initial investigation of some Pfam families was promising.
14. Small RNAs in plant regulation and development
Steve Whitham (PI)
The generation of small RNAs is critical to defense against viruses and to protecting the genome against high transposon activity. In addition, some small RNA species, in particular microRNAs, play key roles in the regulation of growth and development. We have recently sequenced small RNA populations in cucumber, melon, and squash, both in healthy plants and in plants infected with a virus. We are interested in using this information to 1) define and catalog the small RNA species present in members of the cucurbit family, 2) compare the small RNA species present in the three different cucurbits, 3) map cucumber small RNA species to the genome, and 4) study the effects of virus infection on the profile of cucurbit small RNAs.
15. Structural, physicochemical, topological, and geometric characterization and prediction of macromolecular (protein-protein, protein-DNA, and protein-RNA) interactions and interfaces
Vasant Honavar (PI), Drena Dobbs, Robert Jernigan, Li Xue, Rafael Jordan, Fadi Towfic & Rasna Walia
We have developed machine learning approaches to the analysis and prediction of protein-protein, protein-DNA, and protein-RNA interfaces and epitopes from amino acid sequence and, whenever available, protein structure. We have also constructed comprehensive databases of protein-protein and protein-RNA interfaces. We have recently developed sequence homology-based approaches to the prediction of protein-protein interface residues. There are several opportunities for summer students to work on machine learning and other approaches to the reliable prediction of protein-protein and protein-nucleic acid interfaces. Additional information about the projects can be found at: http://www.cs.iastate.edu/~honavar/ailab/projects/proteins.html
16. TALE Binding Site Tools
Adam Bogdanove (PI) & Erin Doyle
Transcription activator-like effectors (TALEs) from the bacterial genus Xanthomonas bind directly to host DNA and activate transcription of genes necessary for bacterial survival. TALE binding site specificity is determined by a "code" in which specific amino acids in the protein's central repeat region correspond to nucleotides in the binding site (1, 2). Based on this TALE-DNA binding code, we have developed tools to identify binding sites for known TALEs in a DNA sequence, to locate potential TALE binding sites in a sequence and design TALEs to target these sites, and to search for potential off-target binding sites. Students will work to improve these existing tools, assist with the creation of new tools to optimize the design and assembly of engineered TALEs, and develop a web-based interface to make these tools publicly available.
References:
1. Boch, J., Scholze, H., Schornack, S., Landgraf, A., Hahn, S., Kay, S., Lahaye, T., Nickstadt, A., and Bonas, U. (2009). Breaking the code of DNA binding specificity of TAL-type III effectors. Science 326, 1509-1512.
2. Moscou, M.J., and Bogdanove, A.J. (2009). A simple cipher governs DNA recognition by TAL effectors. Science 326, 1501.
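As an illustration of how the binding code can drive a site search, here is a minimal Python sketch. It assumes only a small, simplified subset of the repeat-variable diresidue (RVD) associations reported in the two papers cited above (NI with A, HD with C, NG with T, NN with G or A) and demands an exact match at every position. The function name and the example sequence are hypothetical, and the project's actual tools are not shown here; they may score sites quite differently.

```python
# Simplified subset of the TALE code: each two-residue RVD in a repeat is
# associated with one or more nucleotides. Real tools typically weight
# matches rather than using a strict yes/no lookup like this one.
RVD_CODE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "GA",
}


def find_binding_sites(rvds, dna):
    """Return 0-based start positions where every RVD matches its nucleotide."""
    dna = dna.upper()
    n = len(rvds)
    hits = []
    for start in range(len(dna) - n + 1):
        window = dna[start:start + n]
        if all(base in RVD_CODE[rvd] for rvd, base in zip(rvds, window)):
            hits.append(start)
    return hits


# Hypothetical four-repeat effector scanned against a made-up sequence.
sites = find_binding_sites(["NI", "HD", "NG", "NN"], "TTACTGGACTA")
print(sites)
```

Searching for off-target sites uses the same scan; relaxing the exact-match test to allow a limited number of mismatches is one simple way to widen it.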
17. Bioinformatic Analysis of the Genomic Sequence of the Buffalo (Bubalus bubalis)
Jim Reecy (PI)
The analysis of the buffalo genome is being done in collaboration with the National Bureau of Animal Genetic Resources in India. Some preliminary analysis of the genome sequence has been completed, and more analysis will follow. The work will include the use of different bioinformatics software in both Linux and Windows environments, possible scripting in Java and Perl, and possible statistical comparison of the different techniques used in the analysis. Software that may be used includes BLAST, bowtie, Exonerate, Perl, Java, and SAS, among others.
18. Designing Better Zinc Finger Proteins for Gene Therapy
Drena Dobbs Group (PI)
Sangamo Biosciences recently initiated a human clinical trial to assess the efficacy of a Zinc Finger Nuclease (ZFN)-based gene therapy approach to treat AIDS. More general use of ZFNs and other Zinc Finger Proteins (ZFPs) will require protein engineering to improve the binding specificity and affinity of ZFPs in vivo. We are using a combination of computational and wet-lab experiments to design and functionally evaluate improved ZFPs that specifically recognize and bind unique sequences in genomic DNA. Our server, Zinc Finger Targeter (ZiFiT), facilitates the design of novel ZFPs as well as the discovery of rules that govern protein-DNA interactions. We have implemented ZiFDB (a database of experimentally evaluated ZFPs) and ZFNGenome (a GBrowse-based server that provides researchers with "targetable" zinc finger nuclease (ZFN) sites in the context of the entire genome of sequenced model organisms).
Dobbs lab projects involve collaborations with Miller (ComS, ISU), Voytas (U Minnesota) and Joung (MassGen/Harvard), and include:
1) design of ZFPs with improved binding specificity and affinity
2) development of algorithms for distinguishing ZFPs that bind DNA vs RNA vs protein
3) development of high-throughput DNA binding assays (e.g., FP, SPR, PBM-based) to evaluate affinity & specificity of designed ZFPs
Preferred skills:
Some computer programming ability & basic biology
Web Site:
ZiFiT: http://bindr.gdcb.iastate.edu/ZiFiT
ZiFDB: http://www.zincfingers.org/
ZFNGenome: http://bindr2.gdcb.iastate.edu:8888/cgi-bin/gbrowse/arabidopsis/
References:
ZiFiT (Zinc Finger Targeter): an updated zinc finger engineering tool.
Sander JD, Maeder ML, Reyon D, Voytas DF, Joung JK, Dobbs D. Nucleic Acids Res. 2010 Apr 30. [Epub ahead of print]
An affinity-based scoring scheme for predicting DNA-binding activities of modularly assembled zinc-finger proteins.
Sander JD, Zaback P, Joung JK, Voytas DF, Dobbs D. Nucleic Acids Res. 2009 Feb;37(2):506-15. Epub 2008 Dec 4.
Rapid "open-source" engineering of customized zinc-finger nucleases for highly efficient gene modification.
Maeder ML, Thibodeau-Beganny S, Osiak A, Wright DA, Anthony RM, Eichtinger M, Jiang T, Foley JE, Winfrey RJ, Townsend JA, Unger-Wallace E, Sander JD, Muller-Lerch F, Fu F, Pearlberg J, Gobel C, Dassie JP, Pruett-Miller SM, Porteus MH, Sgroi DC, Iafrate AJ, Dobbs D, McCray PB Jr, Cathomen T, Voytas DF, Joung JK. Mol Cell. 2008 Jul 25;31(2):294-301.
19. Developing Tools for cracking the Protein-RNA Recognition Code: RNABindR & PRIDB
Drena Dobbs Group (PI)
Protein-RNA interactions play critical roles in many essential biological processes, including regulatory roles in transcription and translation that have been discovered very recently. We are developing tools to predict which amino acids in proteins bind RNA, and which nucleotides in RNA bind protein. The long-term goal is to decipher the molecular recognition code that mediates protein-RNA interactions.
Dobbs lab projects involve collaborations with the Honavar (ComS) & Jernigan (BBMB) groups, and include:
1) design, implementation and evaluation of improved machine learning algorithms to predict RNA binding sites in proteins (& protein binding sites in RNAs); implementation of new capabilities in our web-based server, RNABindR
2) implementation of a web-based server for our Protein-RNA Interface Database (PRIDB), a comprehensive resource for analysis, characterization and visualization of structurally-characterized RNA-protein complexes
Preferred skills:
Some computer programming ability & basic biology
Web Resources:
RNABindR: http://bindr.gdcb.iastate.edu/RNABindR/
PRIDB: coming soon, with your help!
References:
Struct-NB: predicting protein-RNA binding sites using structural features. Towfic F, Caragea C, Gemperline DC, Dobbs D, Honavar V. Int J Data Min Bioinform. 2010;4(1):21-43.
RNABindR: a server for analyzing and predicting RNA-binding sites in proteins. Terribilini M, Sander JD, Lee JH, Zaback P, Jernigan RL, Honavar V, Dobbs D. Nucleic Acids Res. 2007 Jul;35(Web Server issue):W578-84. Epub 2007 May 5.
20. Predicting structure and functional sites in the human telomerase RNP complex
Drena Dobbs Group (PI)
The telomerase enzyme plays pivotal roles in cellular senescence and aging. Because it provides a telomere maintenance mechanism for ~90% of human cancers, it is a promising target for cancer therapy. Telomerase is a ribonucleoprotein (RNP) that adds telomeric DNA repeat sequences to the ends of linear chromosomes. Despite its importance, a high-resolution structure of the complete telomerase enzyme complex has been elusive.
Dobbs lab projects involve collaboration with the Jernigan/Kloczkowski and Honavar groups, and include:
1) using computational methods to generate structural models for the individual subunits (both protein and RNA) of telomerase, and using docking and threading methods to generate a model of the complete telomerase complex, both bound to telomeric DNA and unbound
2) using machine learning algorithms to predict which residues in the hTERT protein interact with DNA, RNA and other proteins, and which residues are sites for post-translational modifications
Preferred skills:
Some computer programming ability & basic biology
Web Resources:
http://telomerase.asu.edu/
http://www4.utsouthwestern.edu/cellbio/shay-wright/intro/sw_intro.html
References:
The telomere story or the triumph of an open-minded research. Gilson E, Segal-Bendirdjian E. Biochimie. 2010 Apr;92(4):321-6. Epub 2010 Jan 22.
Structures of telomerase subunits provide functional insights. Sekaran VG, Soares J, Jarstfer MB. Biochim Biophys Acta. 2010 May;1804(5):1190-201. Epub 2009 Aug 7. Review.
Striking similarities in diverse telomerase proteins revealed by combining structure prediction and machine learning approaches. Lee JH, Hamilton M, Gleeson C, Caragea C, Zaback P, Sander JD, Li X, Wu F, Terribilini M, Honavar V, Dobbs D. Pac Symp Biocomput. 2008:501-12.
21. Analysis of Illumina Sequences for Multiple Strains of Mycoplasma hyopneumoniae
Chris Minion (PI)
The genome is only 900 kb, but sequence information is available for 9-10 different strains. Sequences of additional strains will be obtained this summer. The challenge will be to mine this kind of DNA sequence information. The PI will be available to discuss the project after Wednesday morning.
22. Identifying genetic molecular markers from next generation sequencing (454 and Illumina) of the western terrestrial garter snake (Thamnophis elegans)
Tonia Schwartz, John Van Hemert & Anne Bronikowski (PI)
We have recently sequenced a garter snake transcriptome (mRNA) using 454 next generation sequencing, and are currently doing additional sequencing - Illumina RNA-seq - for gene expression analysis. With all this sequence data we have an opportunity to identify molecular markers that will be useful for future studies on this species.
This project would have two major goals.
1) Mitochondrial DNA.
a. Use current snake mitochondrial genomes available in GenBank to do a local BLAST against our genetic data (454 and Illumina) to pull out all the mitochondrial DNA sequences.
b. Use these sequences to assemble the mitochondrial genome (the coding portion) for Thamnophis elegans.
c. Align all the sequences for each mitochondrial gene and quantify the amount of mitochondrial sequence diversity (i.e. SNP diversity) for each mitochondrial coding gene.
2) Microsatellites. Microsatellites are repeat regions in the genome that are useful for population and parentage studies. We would like to identify all the microsatellite regions in our data that could potentially be used in future studies.
a. Write Perl scripts to pull out all the microsatellites (greater than 12 repeat units) from the 454 sequencing data and classify them by repeat motif class (e.g. AC, GC, CGT, CAGA, etc.), number of repeat units, and the length of the flanking regions (non-repeat regions) on either side of the repeat units.
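The microsatellite search in goal 2 can be sketched with a regular expression. The project calls for Perl; this illustration uses Python instead, and the same pattern carries over almost unchanged. The function names, the restriction to motifs of 2 to 4 bases, and the example read are assumptions made for the sketch, not part of the project description.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ACAC)."""
    for size in range(1, len(motif)):
        if len(motif) % size == 0 and motif[:size] * (len(motif) // size) == motif:
            return False
    return True


def find_microsatellites(seq, min_units=13, motif_sizes=(2, 3, 4)):
    """Report tandem repeats with more than 12 units, plus flank lengths."""
    seq = seq.upper()
    hits = []
    for size in motif_sizes:
        # One captured motif followed by at least (min_units - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_units - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # e.g. skip AA or ACAC, which are shorter repeats
            hits.append({
                "motif": motif,
                "units": (m.end() - m.start()) // size,
                "start": m.start(),
                "left_flank": m.start(),              # bases before the repeat
                "right_flank": len(seq) - m.end(),    # bases after the repeat
            })
    return hits


# Made-up read: 6 flanking bases, 14 copies of AC, 5 flanking bases.
read = "GGTCTG" + "AC" * 14 + "TTGCA"
hits = find_microsatellites(read)
print(hits)
```

Flank lengths are measured to the ends of the read here, which is the quantity that matters when deciding whether primers can be placed on either side of the repeat.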
23. Integration and visualization of genome-wide data (e.g., genome annotation, microarray, RNA-seq, ChIP-seq)
Pat Schnable (PI)
The emergence of high-throughput genome tools has led to the deposition of huge amounts of genome-wide data (e.g., genome annotation, microarray, RNA-seq, ChIP-seq). Integrating these data will enhance our understanding of the associations between them and biological phenomena. Visualization is an important approach for exploring these genomic data. A variety of genome browsers (e.g., IGB, genome browser, JBrowse) are available and need to be adapted for existing data. This project will convert data from different platforms, mainly microarray and next-generation sequencing data, into the formats required by the browsers. The formatted data will be loaded into browsers for visualization and exploration. The data will include chromatin immunoprecipitation sequencing (ChIP-seq) data, single nucleotide polymorphism (SNP) data, comparative genomic hybridization (CGH) data, and gene expression data from maize inbred lines.
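A typical conversion step of the kind described above might look like the following Python sketch, which turns SNP positions into BED lines. BED is used here only as an example of a tab-delimited format that many genome browsers accept; the project description does not name a target format, and the function name, input layout, and SNP names are hypothetical.

```python
def snps_to_bed(records):
    """Convert (chrom, 1-based position, name) tuples to BED-formatted lines."""
    lines = []
    for chrom, pos, name in records:
        start = pos - 1  # BED start coordinates are 0-based
        end = pos        # BED end is exclusive, so a 1-bp feature ends at pos
        lines.append(f"{chrom}\t{start}\t{end}\t{name}")
    return lines


# Made-up SNP calls on two chromosomes.
bed_lines = snps_to_bed([("chr1", 100, "snp_a"), ("chr2", 2500, "snp_b")])
print("\n".join(bed_lines))
```

The off-by-one shift is the detail most often missed: many SNP tables report 1-based positions, while BED intervals are 0-based and half-open.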