Theory
ANM (anisotropic network model) is a simple normal mode analysis (NMA) tool for analyzing vibrational motions in
molecular systems, introduced in 2000 (1,2).
It uses the elastic network (EN) methodology and represents the system at the residue level. The
macromolecule is thus represented as a network, or graph. Each node is the
Cα atom of a residue, and the overall potential is simply the sum of harmonic potentials
between interacting nodes. The network includes all
interactions within a cutoff distance, which is the only predetermined parameter in
the model. Information about the
orientation of each interaction with respect to the global coordinate
system is contained in the force constant matrix and allows
prediction of anisotropic motions. The model has been
successfully applied to explore the relation between
function and dynamics for many proteins (e.g. 3,4,5). The force constants
of the system can be described by a Hessian matrix:

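$$
H = \begin{bmatrix}
H_{1,1} & H_{1,2} & \cdots & H_{1,N} \\
H_{2,1} & H_{2,2} & \cdots & H_{2,N} \\
\vdots & \vdots & \ddots & \vdots \\
H_{N,1} & H_{N,2} & \cdots & H_{N,N}
\end{bmatrix} \tag{1}
$$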
Each element Hi,j is a 3×3 matrix that holds the anisotropic information regarding
the orientation of nodes i and j.
Each such submatrix (or "super element" of the Hessian) is defined as:

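$$
H_{i,j} = \begin{bmatrix}
\dfrac{\partial^2 V}{\partial x_i \partial x_j} & \dfrac{\partial^2 V}{\partial x_i \partial y_j} & \dfrac{\partial^2 V}{\partial x_i \partial z_j} \\
\dfrac{\partial^2 V}{\partial y_i \partial x_j} & \dfrac{\partial^2 V}{\partial y_i \partial y_j} & \dfrac{\partial^2 V}{\partial y_i \partial z_j} \\
\dfrac{\partial^2 V}{\partial z_i \partial x_j} & \dfrac{\partial^2 V}{\partial z_i \partial y_j} & \dfrac{\partial^2 V}{\partial z_i \partial z_j}
\end{bmatrix} \tag{2}
$$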
namely the second partial derivatives of the potential
V.
These partial derivatives comprise a simple matrix of cosines. In
practice the off-diagonal super elements are calculated by:

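$$
H_{i,j} = -\frac{\gamma}{s_{i,j}^{2}} \begin{bmatrix}
(x_j - x_i)(x_j - x_i) & (x_j - x_i)(y_j - y_i) & (x_j - x_i)(z_j - z_i) \\
(y_j - y_i)(x_j - x_i) & (y_j - y_i)(y_j - y_i) & (y_j - y_i)(z_j - z_i) \\
(z_j - z_i)(x_j - x_i) & (z_j - z_i)(y_j - y_i) & (z_j - z_i)(z_j - z_i)
\end{bmatrix} \tag{3}
$$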
where γ is an interaction constant, which is taken as a constant in the original ANM, and
si,j is the instantaneous distance between nodes i and j.
The diagonal super elements are calculated by:

$$
H_{i,i} = -\sum_{j \neq i} H_{i,j} \tag{4}
$$
Note that the force constant matrix H
holds information regarding the orientation of the nodes,
but not regarding the type of the interaction (such as whether the
interaction is covalent or non-covalent, hydrophobic or
non-hydrophobic, etc.). In addition, the distance between the
interacting nodes is not considered directly. To account for
the distance between the interacting nodes, we can weight each interaction
between nodes i, j by the distance:
$$
\gamma_{i,j} = \gamma \, s_{i,j}^{-p} \tag{5}
$$
where
p is an empirical parameter. The new off-diagonal elements of the
Hessian matrix take the form of:

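$$
H_{i,j} = -\frac{\gamma}{s_{i,j}^{p+2}} \begin{bmatrix}
(x_j - x_i)(x_j - x_i) & (x_j - x_i)(y_j - y_i) & (x_j - x_i)(z_j - z_i) \\
(y_j - y_i)(x_j - x_i) & (y_j - y_i)(y_j - y_i) & (y_j - y_i)(z_j - z_i) \\
(z_j - z_i)(x_j - x_i) & (z_j - z_i)(y_j - y_i) & (z_j - z_i)(z_j - z_i)
\end{bmatrix} \tag{6}
$$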
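As a rough illustration of Eqs. 3, 4 and 6, the following NumPy sketch assembles the Hessian from a set of Cα coordinates. It assumes the usual ANM form, in which each off-diagonal super element is the outer product of the inter-node vector with itself, scaled by -γ/s^(p+2), so that p=0 gives the unweighted model. The function name `build_hessian` and its arguments are illustrative only and are not taken from the server's own C code.

```python
import numpy as np

def build_hessian(coords, cutoff=15.0, gamma=1.0, p=0.0):
    """Assemble a 3N x 3N ANM Hessian from an (N, 3) array of C-alpha coordinates."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            s2 = float(d @ d)                     # squared distance s_ij^2
            if s2 > cutoff ** 2:
                continue                          # pairs beyond the cutoff do not interact
            # Off-diagonal super element: -gamma / s^(p+2) * (d d^T)
            block = -gamma / s2 ** (1.0 + p / 2.0) * np.outer(d, d)
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            # Diagonal super elements: minus the sum of the off-diagonal ones
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian
```

Calling `build_hessian(coords, cutoff=15.0, p=2.5)` would give a distance-weighted variant; the double loop is written for readability rather than speed.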
The inverse of the Hessian
matrix is the covariance matrix of a 3N-dimensional
multivariate Gaussian distribution, which holds the desired information
about the fluctuations. The Hessian,
however, is not
invertible, as its rank is 3N-6 (six degrees of freedom correspond to rigid-body motion). To
obtain a pseudo-inverse, a solution to the eigenvalue problem
is obtained:

$$
H = U \Lambda U^{T} \tag{7}
$$
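The pseudo-inverse is composed of the 3N-6 eigenvectors that have non-zero eigenvalues and
their respective eigenvalues according to:

$$
H^{-1} = \sum_{k=1}^{3N-6} \frac{1}{\lambda_k} U_k U_k^{T} \tag{8}
$$
where λk are the non-zero eigenvalues of H sorted by their size from small to large and
Uk are the corresponding eigenvectors. The eigenvectors (the columns of the matrix U)
describe the vibrational direction and the relative amplitude in the different modes. The mean
square fluctuations of individual residues can be obtained by summing
the fluctuations in the individual modes as follows:
$$
\langle \Delta R_i^{2} \rangle \propto \mathrm{tr}\,[H^{-1}]_{i,i} = \sum_{k=1}^{3N-6} \frac{1}{\lambda_k}\, \mathrm{tr}\,[U_k U_k^{T}]_{i,i} \tag{9}
$$
where the subscript i,i denotes the 3×3 diagonal block that belongs to residue i.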
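The cross-correlation between different residues can also be determined
from the inverse Hessian matrix:
$$
C_{i,j} = \frac{\mathrm{tr}\,[H^{-1}]_{i,j}}{\sqrt{\mathrm{tr}\,[H^{-1}]_{i,i}\; \mathrm{tr}\,[H^{-1}]_{j,j}}} \tag{10}
$$
where the subscript i,j denotes the 3×3 block of the inverse Hessian that belongs to residues i and j.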
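The pseudo-inverse, the mean square fluctuations and the cross-correlations described above can be sketched numerically as follows. This is only an illustration under stated assumptions: it uses NumPy's dense `numpy.linalg.eigh` rather than the server's eigensolver, treats the six smallest eigenvalues as the rigid-body modes, normalizes the cross-correlations to the range -1 to 1, and reports fluctuations in arbitrary units. The function names are hypothetical.

```python
import numpy as np

def pseudo_inverse(hessian, n_rigid=6):
    """Pseudo-inverse built from the eigenpairs left after dropping the rigid-body modes."""
    eigvals, eigvecs = np.linalg.eigh(hessian)    # eigenvalues come back in ascending order
    lam = eigvals[n_rigid:]
    u = eigvecs[:, n_rigid:]
    return (u / lam) @ u.T                        # sum_k u_k u_k^T / lambda_k

def fluctuations_and_correlations(hessian):
    """Mean square fluctuation per residue and normalized cross-correlation matrix."""
    n = hessian.shape[0] // 3
    cov = pseudo_inverse(hessian)
    blocks = cov.reshape(n, 3, n, 3)              # blocks[i, a, j, b] = cov[3i+a, 3j+b]
    dots = np.einsum("iaja->ij", blocks)          # trace of every 3x3 block
    msf = np.diag(dots).copy()
    corr = dots / np.sqrt(np.outer(msf, msf))
    return msf, corr
```

Restricting the sum in `pseudo_inverse` to a subset of the eigenpairs would give the fluctuations or correlations of an individual mode or of a range of modes.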
For more advanced reading about the theory behind the model and its applications
see references 11,12.
Description of the web site
Applicability
The web site is currently built for use with proteins. We intend to
add options for the analysis of nucleic acids in the near future. For
full functionality, the protein should contain no more than 1500
residues. For such proteins the calculation of the 20 dominant
modes is performed within seconds, and the calculation of the theoretical
B-factors based on all the modes is done within 2 minutes. The analysis
pages may appear with some delay, depending on the communication speed and
the memory of the client computer (large result files are downloaded to
the user's machine), but for such moderate-size proteins they should appear
within a minute or two in any case. For larger proteins some problems
with the graphics are expected, and the calculation of the B-factors might
take longer. The dominant modes calculation, however, will still be performed, and text files
with the results will be available within a few minutes. The site
was tested with proteins of up to 3500 residues. For proteins larger
than 3500 residues, expect severe functional problems. Note that
the Jmol applet on the main interface page can be slow for large
proteins, as the presented file becomes huge and memory demanding.
However, the graphic pages that appear under the create
PDB files option should behave much better, because only
information about an individual mode is held in memory at any time.
The site was successfully tested on Windows XP using the following
browsers:
- Mozilla Firefox 1.5.0.1
- Internet Explorer 6.0, 7.0
- Opera 8.53
- Netscape 8.1
On SUSE Linux, using the following browsers:
- Mozilla Firefox
- Konqueror
On MacOSX, using the following browsers:
The site is not fully functional using IE and Safari on Macintosh.
BLZPACK - the eigensolver
This site uses BLZPACK
(6) for extracting a subset of eigenmodes. Currently we calculate and report the slowest 20 modes.
BLZPACK is a Fortran 77 implementation of the block Lanczos
algorithm for the solution of eigenvalue problems.
It was originally intended to solve large, sparse, generalized problems,
which makes it very appropriate for use here. The program
calculates a subset of the eigenpairs rather than all of them. This is
sufficient, since we are usually interested in a small set of eigenpairs
corresponding to the smallest eigenvalues ("slow modes"). It is an
extremely efficient algorithm (6). For molecules with 300 residues (900×900
ANM Hessian matrix) the solution
is obtained within 2 seconds, and for proteins with thousands of
residues it is obtained within a few minutes.
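BLZPACK itself is a Fortran library and is not reproduced here. As a loose stand-in, the sketch below extracts a small set of slow modes from a sparse symmetric matrix with SciPy's `scipy.sparse.linalg.eigsh`, which wraps ARPACK (an implicitly restarted Lanczos solver, not the block Lanczos algorithm of BLZPACK). The function name, the small negative shift and the assumption of six rigid-body modes are choices made for this illustration only.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import eigsh

def slowest_modes(hessian, n_modes=20, n_rigid=6, shift=-1e-3):
    """Return the n_modes smallest non-zero eigenpairs of a symmetric Hessian.

    Requires n_modes + n_rigid to be smaller than the matrix dimension.
    """
    k = n_modes + n_rigid          # the rigid-body modes are computed too, then discarded
    # Shift-invert about a slightly negative value keeps (H - shift*I) non-singular
    # even though H itself has zero eigenvalues.
    vals, vecs = eigsh(csc_matrix(hessian), k=k, sigma=shift, which="LM")
    order = np.argsort(vals)       # sort explicitly, smallest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    return vals[n_rigid:], vecs[:, n_rigid:]
```

Only the requested eigenpairs are computed, which is the property that makes Lanczos-type solvers attractive for large, sparse Hessians.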
Jmol - the molecular graphics tool
Jmol (8) is a free, open-source, Java-based molecular graphics tool.
It was chosen to provide the graphics for this site because the code is actively and rapidly
developed, and because it was originally designed to
support visualization of vibrations. This support
includes the option to animate all vibrational modes in one
session, with only a button click required to switch between the modes.
Directional vectors can be presented for easy visualization of the
direction of motion in each mode. Detailed documentation and a description
of Jmol's capabilities are available on the Jmol site. In general, many
options can be controlled with the mouse from the pop-up menu (right mouse
click on the applet). Other
options are available through a command-line scripting language that is
highly compatible with the RasMol scripting language (open the
"console" option from the pop-up menu to execute commands). For several options that are
directly related to the vibrations, and for some important visualization
features, we built a control panel for the user, as described below.
We gratefully acknowledge the work done
by the Jmol team, which allows us (and others) to present sophisticated
data in a convenient and high-quality graphical environment.
Other software applied
Matlab (9) is used for matrix manipulations and for fast calculation of the
theoretical B-factors based on all the modes using the PowerB method (6). Perl
modules are applied for web programming and for the preparation of graphs and figures.
C code is used for parsing structures and calculating the Hessian
matrix. The Matlab, Perl and C codes are available upon request.
Input parameters
The required input is very simple. The user need only specify
a coordinate file in PDB format. There are two ways to specify
the file:
- Specifying a four-character PDB ID. Using a checkbox, the user can
choose whether to use the original PDB coordinates (default) or the putative
biological unit supplied by the PDB. The latter should better reflect the biological
multimeric state of the protein.
- Uploading a coordinate file from the user's computer to
the server.
The other input parameters are usually not required for a successful run.
Still, they may be relevant and can
influence the quality of the results:
The user should decide whether to include all polypeptide
chains in the calculation (default) or only an individual chain. In the
latter case, a chain ID (usually a Latin letter: A, B, etc.) should be
specified in the appropriate text field on the input page. Entering a
chain symbol such as "*", " ", "-", "_" or "0" will cause
all polypeptide chains to be included in the calculation.
In addition, the user can define which models will be included in the
calculation. Models are often (but not always) used to include
alternative conformations in the file. The two most frequent situations in which the
coordinate file includes different models are NMR files and
files with the biological unit information. In the first case the user should
choose which of the models in the file (usually marked by integer
numbers) should serve as the initial state for the calculation. In the second
case the user should probably specify "all" in the text field for the model,
as all the coordinates in the file (even those found under different
models) are part of one biological unit.
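The chain and model selection described above can be pictured with a small parser for the fixed-column PDB format. This is a minimal sketch, not the server's C parser: the function name `read_ca_atoms` is hypothetical, files without MODEL records are assumed to count as model 1, and alternate locations and insertion codes are ignored.

```python
# Chain symbols that select every chain, as listed for the input page
ALL_CHAINS = {"*", " ", "-", "_", "0", ""}

def read_ca_atoms(pdb_lines, chain="*", model="all"):
    """Collect (chain ID, residue number, (x, y, z)) for the C-alpha atoms.

    chain: a chain ID, or one of the wildcard symbols to take all chains.
    model: a model number, or "all" to take every model in the file.
    """
    atoms = []
    current_model = 1                      # assumption: no MODEL record means model 1
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "MODEL":
            current_model = int(line[6:].split()[0])
        elif record == "ATOM" and line[12:16].strip() == "CA":
            if model != "all" and current_model != int(model):
                continue
            if chain not in ALL_CHAINS and line[21] != chain:
                continue
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[21], int(line[22:26]), xyz))
    return atoms
```

For an NMR file one would typically pass a single model number, whereas for a biological-unit file `model="all"` keeps the coordinates stored under every model.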
The user has control over two parameters in the model. The first is the cutoff, which
defines the maximum interaction range between Cα atoms. Fig. I
shows the performance of the model (in terms of
correlation with the experimental fluctuations) for a range of cutoffs.
It can be clearly seen that, on average, for globular proteins the best
results are obtained for cutoffs of 15-21Å. For individual
proteins, however, for example extended proteins or membrane proteins,
smaller cutoffs may be preferred. It should also be noted that
calculations with larger cutoffs are slower (the Hessian matrix is less
sparse).

Fig I. The agreement between the theoretical fluctuations in ANM and the experimental values
The second parameter is the distance weighting factor p (Eqs.
5,6). Large-scale statistical analysis indicates that the best value for
this parameter is 2.5. Using this value, the correlation between
theoretical and experimental B-factors improves by about 4 percentage points compared with
the original ANM, from about 55% to about 59%. When running with p=0 (default), the original non-weighted ANM is executed.
Output text files
The user can download a series of text files including the
outcome of the calculation plus additional analysis:
- Slow eigenvectors (oanm.slwevs): Lists the 20 eigenvectors
that correspond to the 20 smallest eigenvalues.
- Slow eigenvalues (oanm.eigvals): Lists the 36 smallest eigenvalues. The 6 smallest eigenvalues should
be roughly zero. Starting from the
seventh, the eigenvalues are related to the vibrational frequencies
(proportional to the square root of the eigenvalue) and to the contribution to the magnitude of fluctuations (proportional to 1/eigenvalue).
- X components of slow eigenvectors (oanm.slwX): Lists the X-axis components of the 20 eigenvectors (one column
per vector) that correspond to the 20 smallest non-zero eigenvalues.
- Y components of slow eigenvectors (oanm.slwY): Lists the Y-axis
components of the 20 eigenvectors that correspond to the 20 smallest
non-zero eigenvalues.
- Z components of slow eigenvectors (oanm.slwZ): Lists the Z-axis components of the 20 eigenvectors that
correspond to the 20 smallest non-zero eigenvalues.
- Normalized slow modes. Fluctuations of residues in each individual mode
(oanm.slwmodes): The normalized self-fluctuations (the norm of each vector is 1) of each
residue in each mode (one column per mode). The first column is the
residue index.
- Hessian matrix in coordinate (sparse) format (oanm.hes): The Hessian
calculated according to Eq. 3 or 6. The Hessian is given in the format i, j, value. In the
original matrix H we have H[i][j]=value,
H[j][i]=value. Zero entries in the Hessian are not
listed in this file.
- Theoretical and experimental B-factors (oanm.bfactors): Includes the theoretical and experimental B-factors (temperature factors). The first column is the residue index, the second is the
theoretical B-factor calculated by ANM and the third is the X-ray crystallographic B-factor taken from the PDB file.
- Correlation between theoretical and experimental B-factors (oanm.corr): Pearson's correlation coefficient between the theoretical and experimental B-factors.
- Estimated value of spring constant (oanm.gamma): Based on the difference between the experimental and calculated
B-factors, it is feasible to calculate the structure-specific spring
constant (gamma).
- Gamess file (protein.out): See the description in the section below.
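As an example of working with these files, the sketch below recomputes Pearson's correlation coefficient from the three columns described for oanm.bfactors (residue index, theoretical B-factor, experimental B-factor). It assumes whitespace-separated numeric rows without a header line, which is a guess about the exact file layout; the function names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def bfactor_correlation(lines):
    """Correlate column 2 (theoretical) with column 3 (experimental) of each row."""
    theoretical, experimental = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue                      # skip blank or incomplete rows
        theoretical.append(float(fields[1]))
        experimental.append(float(fields[2]))
    return pearson(theoretical, experimental)
```

With a downloaded file this could be called as `bfactor_correlation(open("oanm.bfactors"))`, and the result compared with the value reported in oanm.corr.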
Output coordinate files
- Coordinate files in PDB format for individual mode fluctuations: PDB files for individual modes can be constructed. Each file includes
a set of models that describe the fluctuations in the mode as a linear
sequence. These files can be downloaded and read by any
application capable of showing or animating PDB structures. The
user controls the amplitude of the fluctuations and the number
of frames in the animation (which determines the smoothness of the
motion). Note that the animation should be played
in a palindromic fashion and not in a loop (the first frame and
the last frame are the two extremes of the fluctuation).
- Coordinate files in PDB format with anisotropic temperature factors:
The model also predicts the magnitude of the fluctuations along each individual axis
and the covariance between the fluctuations in the different
directions. These are directly related to the anisotropic temperature
factors sometimes reported in the Protein Data Bank. The values of the
anisotropic factors can be based purely on the theoretical results of
ANM, or they can be scaled according to the experimental B-factors.
This scaling significantly improves the prediction of the mean square
fluctuations in the X, Y and Z directions and slightly improves the correlation
between the motions in the different directions.
- Gamess file: This file includes the input coordinates and the 20 most
significant eigenvectors. The format of these sections follows the Gamess (10) file
format. This file is also readable by Jmol.
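The linear sequence of models and the palindromic playback mentioned for the individual-mode PDB files can be sketched as follows. The function names are hypothetical, and the eigenvector is assumed to be ordered as x1, y1, z1, x2, y2, z2, and so on; the actual layout of the server's files may differ.

```python
import numpy as np

def mode_frames(coords, mode, amplitude=1.0, n_frames=11):
    """Structures displaced linearly from -amplitude to +amplitude along one mode."""
    coords = np.asarray(coords, dtype=float)               # (N, 3) equilibrium coordinates
    disp = np.asarray(mode, dtype=float).reshape(-1, 3)    # 3N eigenvector as (N, 3)
    steps = np.linspace(-amplitude, amplitude, n_frames)
    return [coords + step * disp for step in steps]

def palindromic_order(n_frames):
    """Frame indices for back-and-forth playback: 0..n-1, then n-2..1."""
    return list(range(n_frames)) + list(range(n_frames - 2, 0, -1))
```

Playing the frames in the order returned by `palindromic_order` avoids the jump from one extreme of the fluctuation to the other that a plain loop would produce.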
Using the main interface
- Modes: Visualization of one of the 20 calculated modes. The fluctuations
of the selected mode are animated and the molecule is colored according
to the self-fluctuations. Red colors correspond to large fluctuations
and blue colors to small fluctuations. By default, the vibrations in
the first mode are animated and the residues are colored accordingly.
- Frequencies of vibrations: In its default state the analysis page presents the vibration frequencies of the
different modes in the same proportions as the real frequencies. The
user, however, can change the frequency of each mode using the pulldown
menu for the purpose of visualization or analysis. In this case the
proportions between the vibrational frequencies of the different modes
are no longer realistic (the change applies only to the mode for which it
was made). The visualization frequencies (in Hertz) are, of
course, the frequencies as they appear on the screen and not the
real vibrational frequencies of the molecule.
- Amplitude of vibrations: The amplitude of the vibrations can be changed by the user to
emphasize the motion or to observe minor fluctuations that cannot be distinguished
otherwise. It is important to note that the model does not always accurately represent the real size of the
fluctuations, but only the relative amplitudes of the residues.
- Vectors: Vectors that represent the direction of motion of each residue in each vibrational mode can
be shown or hidden using the vectors check box. The user can
change the color of the vectors, their width and their length.
- Display: The size of individual atoms (space fill) and the width of
the bonds (wire frame) can be controlled by the pulldown menus in the graphical
user interface, in addition to the built-in options available in
Jmol. If for some reason the desired action is not performed, try
clicking on another value in the same pulldown menu and then selecting
the initial one again. Another possibility is that not all atoms are
selected. From the Jmol menu (right click on the applet area) choose
"select all" and then try the action again.
- Labels: Residues can be labeled by their type and their number in the
coordinates part of the PDB input file. Individual atoms can be labeled,
as well as all atoms. The clear option in the menu removes all the
labels. Note that the regular labeling options of Jmol will not be
functional with this file format.
- Colors: The protein can be colored by the polypeptide chains, by the
experimental temperature (B) factors and by the fluctuations in each
individual mode. The coloring scale is red-white-blue, where red
indicates larger fluctuations. The coloring mode chosen
from this pulldown menu is independent of the display method and of the
vibrational mode, so the residues can vibrate
according to one mode while being colored according to a different mode.
- ANM model: Shows the network of interactions for a 10.0Å cutoff.
- Chain connectivity: Colors each
polypeptide chain in a different color and shows only the trace of the
chains, without the non-bonded interactions between sequentially
distal residues. Because of the poor annotation in this file format,
the Jmol built-in menu does not support equivalent options. The chain
connectivity option may take some time to execute for
large proteins.
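The statement that the on-screen frequencies keep the proportions of the real ones can be illustrated with the eigenvalue relations given for the oanm.eigvals file: frequency proportional to the square root of the eigenvalue, and fluctuation contribution proportional to 1/eigenvalue. The function names and the `slowest_hz` display parameter are assumptions of this sketch and are not taken from the site.

```python
import math

def display_frequencies(eigenvalues, slowest_hz=1.0):
    """On-screen frequencies that keep the sqrt(eigenvalue) proportions between modes.

    eigenvalues: non-zero eigenvalues sorted from small to large.
    slowest_hz: hypothetical display frequency assigned to the slowest mode.
    """
    base = math.sqrt(eigenvalues[0])
    return [slowest_hz * math.sqrt(lam) / base for lam in eigenvalues]

def relative_contributions(eigenvalues):
    """Share of the fluctuations carried by each listed mode (weights of 1/eigenvalue)."""
    inverse = [1.0 / lam for lam in eigenvalues]
    total = sum(inverse)
    return [w / total for w in inverse]
```

For example, eigenvalues in the ratio 1:4:9 would be displayed at frequencies in the ratio 1:2:3.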
Additional visual analysis options
- Self fluctuations
and b factors as a function of residue index : The graph presents the
fluctuations of
individual residues according to experimental b-factors (or mean square
fluctuations in NMR structures), according to the normalized values (in
units of standard deviations from the mean), according to the
calculated mean squared fluctuations or according to the fluctuations in
each individual mode. For each polypeptide chain a separate
graph is given. A single parameter or any combination of the above
parameters can be chosen for the display. The graphs (in .png format)
can be downloaded. The residue numbers on the X-axis are taken from the coordinates part of the PDB file.
- Correlation analysis by two-dimensional matrices: The web site offers elaborate options
to explore the correlations in fluctuations between residues. The
user can choose to explore the correlations between residue pairs in
each individual mode. In addition, a range
of modes (within the first 20)
can be given, and then the correlation according to the sum of all the
modes in this range is obtained. A correlation based on all modes can also
be obtained.
Initially, when linking to the covariance page from the main interface
page, a correlation matrix based on the first mode is shown. Using the
options in the lower-left frame the user can choose to analyze
a different correlation matrix as described above. The
structure in the upper-left Jmol frame vibrates in the same mode as that of the
correlation matrix. If a range of modes is given, the structure
vibrates in the lowest mode in this range.
- The color scheme of the correlation matrix in the right frame is
such that correlated residues are colored red and anti-correlated
residues are colored blue. Weak correlations appear in light colors.
Each cell in this matrix is a hot link to a 10×10 matrix
that zooms in on this region of the matrix. For big proteins this allows an
analysis of each pair, which is not feasible at the resolution
of the large matrix. When the mouse is moved over the cells of the small
10×10 matrix, boxes appear that give information
about the correlation value as well as the distance between the residues (the
Cα atoms). This
matrix is also "hot". Clicking on a cell highlights the corresponding residues
on the structure in the Jmol frame. The colors of the residues in
the Jmol frame are the same as
those in the correlation matrix and indicate the degree of correlation
between the residues. The correlation matrices are standard
figures in .png format and can be downloaded from this page.
- Inter-residue distance fluctuations and deformation energy: The web site also offers options to explore
the changes in distance between residues. The
user can explore the square-distance fluctuations between residue pairs in
each individual mode.
Initially, when linking to the "distance fluctuations and deformation energy page" from the main interface
page, a square-distance fluctuation matrix and a deformation energy graph based on the first mode are shown. Using the
options in the lower-left frame the user can choose to analyze
a different mode. The structure in
the upper-left Jmol frame vibrates in the same mode as that of the matrix. The user can choose to analyze the vectorial difference,
which also considers the orientation of the distance vector. The deformation energy is proportional to the sum of the square fluctuations
with all interacting residues.
- The color scheme of the matrix is such that large inter-residue fluctuations appear blue and small fluctuations red.
Each cell in this matrix is a hot link to a 10×10 matrix that zooms in on a sub-region of the matrix. For big proteins this allows an
analysis of each pair, which is not feasible at the resolution
of the large matrix. When the mouse is moved over the cells of the small
10×10 matrix, boxes appear that give information
about the fluctuation value as well as the equilibrium distances (between
Cα atoms). This
matrix is also "hot". Clicking on a cell highlights the corresponding residues
on the structure in the Jmol frame. The colors of the residues in
the Jmol frame are the same as
those in the square-distance fluctuation matrix. The matrices are
figures in .png format and can be downloaded from this page.
A graph of deformation energy for each residue shows the potential energy of the residue considering
all the interactions ("springs") in which it takes part. For each residue, all the entries in its row/column of the matrix for which there is an interaction (based on the
user's cutoff) are summed up. The number below the graph shows the overall energy of the system. According to the equipartition theorem, the overall internal potential
energy of the system in each independent mode is constant and equal to kBT/2 (around 0.3 kcal/mol at 300 K).
- Eigenvalues: A figure with the distribution of the 20 dominant eigenvalues is
shown. This allows a quick visual inspection of the relative
contribution of each mode and of possible degeneracy between the modes.
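Two of the quantities mentioned in this section lend themselves to a short numerical sketch: the equipartition value of kBT/2 per mode, and the per-residue deformation energy obtained by summing a residue's square-distance fluctuations over its interacting partners. The function names are hypothetical and the deformation energy is returned in arbitrary units, since the proportionality constant is not stated here.

```python
import math

K_B_KCAL = 0.0019872041   # Boltzmann constant in kcal/(mol K)

def mode_potential_energy(temperature_k=300.0):
    """Equipartition value k_B*T/2 per independent mode, in kcal/mol."""
    return 0.5 * K_B_KCAL * temperature_k

def deformation_energy(sq_dist_fluct, coords, cutoff=15.0):
    """Per-residue sum of square-distance fluctuations over partners within the cutoff."""
    n = len(coords)
    energies = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= cutoff:
                total += sq_dist_fluct[i][j]
        energies.append(total)
    return energies
```

At 300 K `mode_potential_energy()` evaluates to roughly 0.3 kcal/mol, in line with the value quoted above.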
Known problems and possible solutions
- A problem with the functionality of the site on Mac
computers still exists. Some of the options that involve interaction
with the Jmol applet through the graphical user interface do not
work on most browsers. Other options are still functional, and the text files
holding the computational results are available in any case.
- Delays in the calculation of B-factors and the
correlation matrix based on all modes. Since these options require
heavier computations, the results might appear with a delay compared with the other data.
- Options in the main graphical user interface sometimes do not
respond well. If an action chosen from a pulldown menu is not
performed, try clicking on another value in the same pulldown menu and then selecting
the initial one again. It is also possible that not all atoms are
selected. From the Jmol menu (right click on the applet area) choose
"select all" and then try the action again.
- If the motion is very slow and the applet does not respond
well, most likely the molecule is very large. You may try to use the site on a computer with more memory.
Selected reading and references
(1) Dynamics of proteins predicted by molecular dynamics simulations and analytical approaches: application to alpha-amylase inhibitor. Doruker, P, Atilgan, AR & Bahar, I. Proteins 40, 512-524 (2000).
(2) Anisotropy of fluctuation dynamics of proteins with an elastic network model. Atilgan, AR, Durell, SR, Jernigan, RL, Demirel, MC, Keskin, O & Bahar, I. Biophys. J. 80, 505-515 (2001).
(3) Computational prediction of allosteric structural changes by a simple mechanical model: application to hemoglobin T to R transition. Xu, C, Tobi, D & Bahar, I. J. Mol. Biol. 333, 153-168 (2003).
(4) Global ribosome motions revealed with elastic network model. Wang, Y, Rader, AJ, Bahar, I & Jernigan, RL. J. Struct. Biol. 147, 302-314 (2004).
(5) Relating molecular flexibility to function: A case study of tubulin. Keskin, O, Durell, SR, Bahar, I, Jernigan, RL & Covell, DG. Biophys. J. 83, 663-680 (2002).
(6) http://crd.lbl.gov/~osni/marques.html#BLZPACK
(7) oGNM: Online Computation of Structural Dynamics Using the Gaussian Network Model. Yang, LW, Rader, AJ, Liu, X, Jursa, CJ, Ching, SC, Karimi, H & Bahar, I. Nucleic Acids Res. In press (2006).
(8) http://jmol.sourceforge.net/
(9) The MathWorks, Inc
(10) General Atomic and Molecular Electronic Structure System. Schmidt, MW, Baldridge, KK, Boatz, JA, Elbert, ST, Gordon, MS, Jensen, JH, Koseki, S, Matsunaga, N, Nguyen, KA, Su, SJ, Windus, TL, Dupuis, M & Montgomery, JA. J. Comput. Chem. 14, 1347-1363 (1993).
(11) Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems (2006) Mathematical Biology Series, CRC Press.
(12) Elastic network models for understanding biomolecular machinery: from enzymes to supramolecular assemblies. Chennubhotla, C, Rader, AJ, Yang, LW & Bahar, I. Phys. Biol. 2, S173-S180 (2005).