TSRI Workshop: Oct 15, 1999

URL: http://metallo.scripps.edu/workshop/workshop.html

Viewing metal-binding sites

By now you have already picked your favorite protein and are ready to see what it contains. A list in the home directory of your class account gives the PDB codes and a brief description of the protein each one contains (MetalloList), and a second list gives the names of the files containing the metal-binding sites in each protein (SitesList). HTML versions of these files are also available: MetalloList, SitesList.

The metal-binding sites are saved in PDB format files, and were generated by running the automated indexing tool from the Metalloprotein Database (MDB). Each site file has the same id-code as the protein it comes from, with a sequential site number appended to it. For example, if you are using 2sod, the first site will be saved in 2sod_s1.pdb. Each site has an associated script intended for use with RasMol v2.6; in the example above the script will be named 2sod_s1.spt. Not all the sites we are going to study have been entered into the current version of the MDB on the web.
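The naming convention above can be written down as a small helper. This is only a sketch: the function name `site_files` is hypothetical, the `mdb` directory is taken from the RasMol session later in this page, and lowercasing the id-code is an assumption.

```python
def site_files(pdb_code, site_number, directory="mdb"):
    """Return the (PDB site file, RasMol script) paths for one metal site.

    Hypothetical helper: follows the <id-code>_s<N>.pdb / .spt convention
    described in the text. Lowercasing the id-code is an assumption.
    """
    if site_number < 1:
        raise ValueError("site numbers start at 1")
    stem = f"{directory}/{pdb_code.lower()}_s{site_number}"
    return f"{stem}.pdb", f"{stem}.spt"


# First site of 2sod, as in the example in the text.
print(site_files("2sod", 1))
```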

Using Rasmol

Note: These instructions assume that you have already logged in to your class account on one of the workstations.

Let's assume that we are going to observe 1hca (Human Carbonic Anhydrase). The automated indexing program from the MDB found 2 metal sites: 1hca_s1.pdb and 1hca_s2.pdb. To see the first one, we will do the following (from the home directory "/home/class"):


% run_rasmol
[ rasmol will start showing a graphic display and a prompt ]
RasMol> load mdb/1hca_s1.pdb
[ the first binding site containing Zn will show ]
RasMol> script mdb/1hca_s1.spt
[ the script will highlight the second shell as van der Waals surface dots, the metal
as a CPK sphere, and the first shell as tubes ]
The image you see may look like this:

Use "dots off" if you want to make the dot surface disappear.

You can use the left button to rotate, shift + left button to zoom, and the right button to translate. See the "RasMol v2.6 Quick Reference Card" for details.

If you click on an atom, RasMol will print information about it in the commands window. For example, if you click on the Zn atom:

Atom: ZN  2041   Hetero: ZN 1
If you want to obtain the distance between atoms instead, use "set picking distance" and click on the two atoms. For bond angles and torsion angles, use "set picking angle" and "set picking torsion" respectively, and click on the corresponding atoms (three for an angle, four for a torsion). The results will be displayed in the commands window. You can use "set picking monitor" to display the atom-atom distance in the structure window.
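If you want to check a picked value by hand, the same quantities can be computed from atomic coordinates. The sketch below is not part of RasMol; the function names are hypothetical and the coordinates are made up for illustration, not taken from 1hca.

```python
import math


def distance(a, b):
    """Distance between two points given as (x, y, z) in Angstroms."""
    return math.dist(a, b)


def angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Made-up coordinates: a metal at the origin and two ligand atoms.
metal = (0.0, 0.0, 0.0)
ligand_1 = (2.0, 0.0, 0.0)
ligand_2 = (0.0, 2.1, 0.0)

print(round(distance(metal, ligand_1), 2))         # 2.0
print(round(angle(ligand_1, metal, ligand_2), 1))  # 90.0
```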

Selecting parts of the structure

Usually you can select atoms, residues, chains, etc. that are predefined in RasMol (see the quick reference card). If you have run the script for the metal-site file, you can also select 4 more sets that it defines for you, including metal (the metal atom), first (the first shell), and second (the second shell).

So you can use, for example (omitting the RasMol prompt):

select metal
color blue
select first
color yellow

This colors the metal blue and the first shell yellow. You can also pick any display option from the viewer menu (wireframe, sticks, etc.).
To select all atoms, use "select all", or just "select".

For more info, you can check the Rasmol related links at the end of this document.

Viewing all the sites in a protein

There are 3 proteins that have associated scripts that have been manually assembled. If you load one of these proteins and then run its script, you will see all the sites in the protein. A note with more information will also appear in the commands window.

The list of proteins and scripts is (all in /home/class/mdb/)

What to do now?

Observe the metal sites in the protein(s) you like best. Note the coordination geometry and the metal(s) involved, take geometrical measurements (distances, angles, torsions), look at which residues are coordinated to the metal, and see if you can find patterns of ligation. Also check for variations in coordination (if any) among different sites in the protein with the same metal center. After you have explored for some time, see if you can find similar metal sites in the MDB (see "Using the search forms" below), and how the metal-ligand distances you have found compare with others in the MDB (see "Getting statistics on M-L distances"; you can also check for patterns of ligation on that page).

If you have experience with Rasmol

And you want to see what one of the scripts looks like, here is the one in 1hca_s1.spt:


define t1 HIS94,HIS96,HIS119,HOH279;
define first t1;
define t1 PHE66,ASN67,GLN92,PHE93,GLY104,GLU106,ALA116,GLU117,LEU118,VAL121;
define t2 VAL143,LEU144,GLY145,THR199,TRP209,ASN244,HOH320,HOH324,HOH331,HOH333;
define second t1,t2;
define t1 1;
define metalres t1;
define t1  atomno=2041
define metal t1
define site metal,first,second

select second
color cyan
dots

select first,metalres
color cpk
cpk 70
wireframe 40

select metal
cpk

The first part holds the definitions, and the rest sets the selection and display options. It is basically a sequence of commands that you could also type manually in the RasMol commands window. The scripts for showing all the metal sites in a protein are no more complex than this, only longer and with more definitions.
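To make the two-part structure concrete, here is a sketch of how a script of this shape could be assembled from a list of residues. It is not the MDB's own generator: the function name `build_site_script` is hypothetical, the output is simplified (it omits the temporary t1/t2 sets and the metalres set), and it puts each residue list on a single line, whereas the script above splits the long second-shell list in two.

```python
def build_site_script(metal_atomno, first_shell, second_shell):
    """Return RasMol script text: definitions first, then display commands.

    Hypothetical, simplified layout modeled on 1hca_s1.spt.
    """
    lines = [
        # Definitions
        "define first " + ",".join(first_shell),
        "define second " + ",".join(second_shell),
        f"define metal atomno={metal_atomno}",
        "define site metal,first,second",
        "",
        # Selection and display options
        "select second",
        "color cyan",
        "dots",
        "",
        "select first",
        "color cpk",
        "wireframe 40",
        "",
        "select metal",
        "cpk",
    ]
    return "\n".join(lines) + "\n"


# Residue names and atom number taken from the 1hca_s1.spt listing;
# the second shell is shortened here for brevity.
script_text = build_site_script(
    2041,
    ["HIS94", "HIS96", "HIS119", "HOH279"],
    ["PHE66", "ASN67", "GLN92"],
)
print(script_text)
```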


Querying the MDB

There are several query interfaces to the MDB: straightforward HTML forms, interactive interfaces with a Java macromolecular viewer (e.g. search on the "Raw" database), remote call search interfaces, etc. In this workshop we will use 2 of the search interfaces: one to search for other sites with structural properties similar to those of the site you are interested in, and another to perform some simple statistical analysis of metal-ligand distances and of liganding patterns in metal sites.

Using the search forms

(URL: http://metallo.scripps.edu/advanced/)

On this page you will find two search forms that allow you to query for metal-binding sites that have characteristics similar to the one you were studying.

Let us assume that you were looking at a binding site with a 4-coordinated zinc ion, and that the 4 liganding atoms belong to 3 histidine residues and one water molecule (the zinc-binding site in 1hca is an example). You can then use the "Simplified Search Form" to do an exploratory search for similar sites. Searching for sites in which metal = zn and number of ligands = 4 will give you over 500 hits. You can then refine the search and look for sites in high resolution proteins: with resolution <= 1.5 Angstroms, you will find only 5 hits.

You can try different upper limits, combine it with the r-value, maybe restrict to a particular author, etc. You may also try checking for other metals with the same number of ligands (let's say copper, or cadmium), or the same metal with different number of ligands, etc. For more info on the type of input each field supports, read the footnotes (click the footnote link next to each field).

If you want to do something more complex and narrow your search, you can try the "Advanced Search Form". It has fields similar to the ones in the previous form, but also lets you add things like ranges of years, limit the ligand type to search on, require that at least one ligand be a particular amino acid, and even put a range on the metal-ligand distance. You can then make searches that correspond to asking: "Search all 4-coordinated zinc sites in which 3 of the ligands need to be amino acids and one water, and at least one of the protein ligands needs to be histidine. Limit the metal-ligand distance to the 2.0-2.3 Angstroms range, and use only metalloproteins with a resolution of 1.2-2.0 Angstroms". If you do that search you will find about 36 hits.
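The conditions in that example query can be spelled out as a filter over site records. This sketch says nothing about how the MDB itself stores or searches its data: the `Site` record, its field names, and the two sample sites are all hypothetical, and applying the distance range to every ligand (rather than to any one of them) is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Standard three-letter amino acid residue names.
AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


@dataclass
class Site:
    """Hypothetical record for one metal-binding site."""
    metal: str
    ligand_residues: List[str]  # residue names, "HOH" for water
    distances: List[float]      # metal-ligand distances in Angstroms
    resolution: float           # resolution of the structure in Angstroms


def matches_query(site):
    """True if the site satisfies the example query from the text."""
    n_water = site.ligand_residues.count("HOH")
    n_amino = sum(1 for r in site.ligand_residues if r in AMINO_ACIDS)
    return (
        site.metal == "ZN"
        and len(site.ligand_residues) == 4
        and n_amino == 3
        and n_water == 1
        and "HIS" in site.ligand_residues
        and all(2.0 <= d <= 2.3 for d in site.distances)
        and 1.2 <= site.resolution <= 2.0
    )


# Made-up sample sites, not real MDB entries.
good = Site("ZN", ["HIS", "HIS", "HIS", "HOH"], [2.1, 2.1, 2.0, 2.2], 1.6)
too_coarse = Site("ZN", ["HIS", "HIS", "HIS", "HOH"], [2.1, 2.1, 2.0, 2.2], 2.5)
print(matches_query(good), matches_query(too_coarse))  # True False
```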

Getting statistics on M-L distances

(URL: http://metallo.scripps.edu/beta/hist/)

Now, what about some statistics on the metal sites indexed in the MDB? We can get some information, such as the distribution of metal-ligand distances or the liganding patterns for a particular metal and coordination number.

For example, if we were interested in 4-coordinated sites containing copper ions, and in particular in the Cu-His distances, we would use the "Create M-L distance plot" form. We would find that there are 380 Cu-His entries matching the conditions, and that the average distance is 2.07 Angstroms, with a range of 1.72-2.69 Angstroms and a standard deviation of 0.13 Angstroms.
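The summary values reported by the form (count, average, range, standard deviation) are easy to reproduce for your own measurements. The sketch below uses a handful of made-up distances, not MDB data, and uses the sample standard deviation; which definition the MDB uses is not stated here, so that choice is an assumption.

```python
import statistics


def summarize(distances):
    """Return count, mean, range and sample standard deviation (Angstroms)."""
    return {
        "count": len(distances),
        "mean": statistics.mean(distances),
        "min": min(distances),
        "max": max(distances),
        "stdev": statistics.stdev(distances),
    }


# Made-up metal-ligand distances, e.g. values read off with "set picking distance".
my_distances = [1.98, 2.05, 2.10, 2.02, 2.15]
summary = summarize(my_distances)
for key, value in summary.items():
    print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```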

Similarly, if we wanted to compare the metal-ligand pattern observed in the studied structures with others in the MDB, we would use the "Create Metal-ligand pattern plot" form to search for copper-binding sites with a coordination number of 4, and find that the four most prevalent patterns for the CuL4 binding sites are: Cys His His Met, His His His His, Cys Gly His His, and His His His H2O. This type of information is particularly useful when designing new metal sites in protein scaffolds.
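Counting ligation patterns for the sites you have looked at works the same way in miniature. In this sketch the function name `ligand_pattern` is hypothetical and the site list is made up; sorting the ligand names so that the order of the ligands does not matter is an assumption about how patterns should be grouped.

```python
from collections import Counter


def ligand_pattern(ligands):
    """Order-independent pattern label, e.g. 'CYS HIS HIS MET'."""
    return " ".join(sorted(ligands))


# Made-up 4-coordinated sites, listed by the residue names of their ligands.
sites = [
    ["HIS", "CYS", "HIS", "MET"],
    ["MET", "HIS", "HIS", "CYS"],
    ["HIS", "HIS", "HIS", "HIS"],
]

pattern_counts = Counter(ligand_pattern(s) for s in sites)
for pattern, n in pattern_counts.most_common():
    print(pattern, n)
```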


References


Jesus M. Castagnetto (jesusmc@scripps.edu)