Project III - Protein Metal Ion Site Design Algorithms (Progress Update)

Overview:

Project III has shown a good deal of progress since the Program Project Proposal was submitted. We have released the first iteration of the interactive, web accessible browser for our Metalloprotein Database (http://metallo.scripps.edu/). Also, we have successfully started on an automated system that will allow the recognition of metal sites in the Protein Data Bank (PDB), and their incorporation into a fully searchable and viewable database. This automatically generated set has been instrumental in the redesign simulations undertaken on the GFP system, providing the basic data about ligand frequency and geometrical parameters needed to create successful templates in the DEZYMER program. Finally, all the basic data generated so far is freely available on the web, so other scientists working on similar problems can benefit from these preliminary studies.

Progress since October 1997:

We have split the creation of a comprehensive metalloprotein database into two complementary tracks: the originally-planned human-edited, classified, and annotated database of selected metal sites, and a new computer-generated "raw" database of every metal in the Protein Data Bank distribution. This strategy was implemented because we needed templates for the DEZYMER simulations. In addition, early users indicated that they wanted a wider coverage of metalloprotein sites via the same interface that the human-edited database provides.

The Metalloprotein Site Database (being created by Jesus M. Castagnetto) is proceeding as outlined in the proposal. It now contains 32 metal sites from 21 X-ray structures. Its objective remains to be a high quality resource, guaranteed by a human editor, offering hand-selected and verified metal environments, classified by (and searchable on) metal site geometry and protein function.
For our GFP redesign, we needed up-to-date statistics on ligand atom combinations, metal-ligand distances, and bond and torsion angles. Last November we compiled a comprehensive database, automatically generated from the entire PDB distribution. For NMR structures, which normally contain multiple models, only the first model was retained. We evaluated the database for the frequency of ligands, individually and in groups, and for the distribution of geometrical parameters; for example, we found 48 Zn Asp-Asp-Glu sites, but Asp-Asp never occurred without Glu, and more than one Glu was rare. As of February 1998 this "raw" database contains 6261 metal sites from 2246 X-ray and NMR structures. Currently, it can be searched by metal type and number of ligand atoms. Searches on other characteristics (geometry, metal-ligand distance) will be implemented in the future.

As a service to the scientific community, we have made available on the web the underlying ligand atom data files (as well as the derived site "mini-PDB" and site VRML files) used in the creation of this database. This allows other scientists to perform similar analyses and visualizations, using spreadsheets and PDB or VRML viewers.

The steps involved in the creation of the comprehensive ("raw") database are:
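The Python sketch below is not the pipeline used to build the MDB and does not stand in for its step list. It only illustrates, under stated assumptions, how ligand combinations such as Zn Asp-Asp-Glu might be tallied from PDB-format coordinates while keeping only the first model of a multi-model entry. The 3.0 Angstrom cutoff, the choice of N, O, and S as donor elements, the reliance on the element columns (77-78) of each record, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

METALS = {"CA", "CU", "FE", "MG", "MN", "ZN"}  # metals named in the report
DONOR_ELEMENTS = {"N", "O", "S"}               # assumed donor elements
CUTOFF = 3.0                                   # assumed cutoff, in Angstroms


def parse_first_model(pdb_text):
    """Return atom dicts from ATOM/HETATM records, stopping at the first ENDMDL."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # keep only the first model of a multi-model (NMR) entry
        if line.startswith(("ATOM  ", "HETATM")):
            atoms.append({
                "hetatm": line.startswith("HETATM"),
                "name": line[12:16].strip(),
                "resname": line[17:20].strip(),
                "chain": line[21],
                "resseq": line[22:26].strip(),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
                # Assumes the element columns are filled in; older files may lack them.
                "element": line[76:78].strip().upper(),
            })
    return atoms


def metal_sites(atoms, cutoff=CUTOFF):
    """Yield (metal_atom, ligand_atoms) for each metal HETATM in the structure."""
    for metal in atoms:
        if not (metal["hetatm"] and metal["element"] in METALS):
            continue
        ligands = [
            a for a in atoms
            if a is not metal
            and a["element"] in DONOR_ELEMENTS
            and math.dist(metal["xyz"], a["xyz"]) <= cutoff
        ]
        yield metal, ligands


def ligand_signature(ligands):
    """Sorted residue names of the distinct residues that donate ligand atoms."""
    residues = {(a["chain"], a["resseq"], a["resname"]) for a in ligands}
    return "-".join(sorted(name for _, _, name in residues))


def tally_sites(atoms):
    """Count sites per (metal element, ligand residue combination)."""
    counts = Counter()
    for metal, ligands in metal_sites(atoms):
        counts[(metal["element"], ligand_signature(ligands))] += 1
    return counts
```

Summing such counters over many entries would give per-metal frequencies of ligand groups; metal-ligand distances and angles could be collected in the same loop.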
There has been a substantial increase in the number of metalloproteins deposited in the PDB; from April to October 1997 the number of Ca, Cu, Fe, Mg, Mn, and Zn metal sites grew by 12% (up 540 from 4424 to 4964). We will keep the raw database synchronized with each semi-annual PDB update and incorporate information from the edited database through over-rides, annotations, homology recognition, and "hint files" that will direct this automated generation. In the near future we will add a classification of ligand atoms as side chain, main chain, solvent, ion, or co-factor, allowing searches on number(s) and type(s) of ligand(s). After this step has been completed, we will start using algorithms for automatic classification of geometry, with a rotational alignment of the corresponding ligands.

We have progressed in understanding and using DEZYMER at TSRI. Last October we began making our own site templates, and in November created a custom rotamer library to accommodate the unusual geometry of the GFP Y66H mutant, in which a His residue replaces the chromophore Tyr. We probed our "raw" database and discovered that none of the 93 PDB His-His-Cys Cu sites involved His Nepsilon atoms; instead, only His Ndelta atoms act as ligands, which provided key information for our DEZYMER runs. In cooperation with Project V, we performed searches for His-His-Cys and His-Cys-Asp/Glu/His sites incorporating the chromophore His, which we are now evaluating for construction (Figure III-A). Because of faster computers and improved analysis tools, a full-rotamer search and refinement, followed by selection of the best residual energy values, is now our standard procedure, as sketched in the proposal.
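The planned classification of ligand atoms as side chain, main chain, solvent, ion, or co-factor could be approached along the lines of the following Python sketch. It is only an illustration: the function name, the residue-name sets for water and simple ions, and the rule that any remaining group counts as a co-factor are assumptions, not the scheme adopted for the database.

```python
# Hypothetical lookup tables; a real classifier would need fuller lists.
AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
BACKBONE_ATOMS = {"N", "CA", "C", "O", "OXT"}
WATER_NAMES = {"HOH", "WAT", "DOD"}          # assumed solvent residue names
ION_NAMES = {"CL", "BR", "SO4", "PO4"}       # assumed simple-ion residue names


def classify_ligand_atom(resname, atom_name, is_hetatm):
    """Assign one of the five ligand classes named in the report."""
    resname = resname.strip().upper()
    atom_name = atom_name.strip().upper()
    if resname in WATER_NAMES:
        return "solvent"
    if resname in AMINO_ACIDS and not is_hetatm:
        # Protein atoms: backbone names are main chain, everything else side chain.
        return "main chain" if atom_name in BACKBONE_ATOMS else "side chain"
    if resname in ION_NAMES:
        return "ion"
    # Assumption: any other group bound to the metal is treated as a co-factor.
    return "co-factor"


if __name__ == "__main__":
    examples = [
        ("HIS", "ND1", False),
        ("GLY", "O", False),
        ("HOH", "O", True),
        ("CL", "CL", True),
        ("HEM", "NA", True),
    ]
    for resname, atom_name, is_hetatm in examples:
        print(resname, atom_name, classify_ligand_atom(resname, atom_name, is_hetatm))
```

With a class attached to each ligand atom, a search on the number and type of ligands reduces to counting classes per site.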
Our cooperation with the Hellinga lab continues, and Michael Pique will visit Professor Hellinga on March 4-5 to apply Dr. Hellinga's PROSE protein simulated evolution program to our GFP sites. The automated design of a novel sequence for an entire protein by Bassil Dahiyat and Stephen Mayo (Science 1997, 278, 82-87) has confirmed the feasibility and timeliness of our plan to optimize side chain packing design by using combinatorial search and simulated evolution.