Data analysis for trp-cage folding MD trajectoryΒΆ

You can try this tutorial online:

- click binder icon
- Choose "File" then "New Notebook"
- copy and paste each command and click Cell-->Run (or Ctrl-Enter)

Analysis_Folding_TrpCage_Peptide_MD

Trajectory analysis for MD Simulation from Folding Trp-Cage Peptide

Reproduce some analysis in http://ambermd.org/tutorials/basic/tutorial3/section6.htm

Download data

url: http://ambermd.org/tutorials/basic/tutorial3/section6.htm

Require

  • pytraj/cpptraj
  • nglview (https://github.com/arose/nglview)

    # Linux users
            conda install -c ambermd nglview
    
            # for Mac user, please see the instruction from above website
    
  • pandas

    conda install pandas
    
In [1]:
# uncomment below to download, untar files.

# !wget http://ambermd.org/tutorials/basic/tutorial3/files/production.tar.gz
# !wget http://ambermd.org/tutorials/basic/tutorial3/files/TC5b.prmtop
# !tar -xf ./production.tar.gz
# !mv TC5b.prmtop production/
# !wget http://ambermd.org/tutorials/basic/tutorial3/files/lowest_energy_struct.pdb.2455
# !mv lowest_energy_struct.pdb.2455 production/
# !ls production/

Load trajectory files by pytraj

In [2]:
import pytraj as pt
traj = pt.load('production/equil*.mdcrd.gz', top='production/TC5b.prmtop')
traj
Out[2]:
pytraj.Trajectory, 50000 frames: 
Size: 0.339746 (GB)
<Topology: 304 atoms, 20 residues, 1 mols, non-PBC>
           

Visualization with nglview

In [3]:
# note: you need to run the notebook
import nglview as nv

view = nv.show_pytraj(traj)
view
In [4]:
# reset representation
view.representations = []
view.parameters = {'theme': 'dark'}
view.add_representation('licorice', selection='not hydrogen')
view.add_representation('cartoon')
In [5]:
# jump to specific frame
view.frame = 2000
In [6]:
# if you're using notebook, you should see somethinge similiar to below image
from IPython.display import Image
Image('nglview_trpcage_amber_tutorial_3.png', width=600)
Out[6]:

Plot RMSD vs time to lowest structure (taken from tutorial)

In [7]:
# load reference (we don't need to use a topology file since this is pdb)
lowest_energy_pdb = pt.load('production/lowest_energy_struct.pdb.2455')
lowest_energy_pdb
Out[7]:
pytraj.Trajectory, 1 frames: 
Size: 0.000007 (GB)
<Topology: 304 atoms, 20 residues, 1 mols, non-PBC>
           
In [8]:
rmsd_to_lowest_pdb = pt.rmsd(traj, ref=lowest_energy_pdb, mask='@N,CA,C')
rmsd_to_lowest_pdb
Out[8]:
array([ 5.27339494,  5.14608589,  4.98715608, ...,  3.73567383,
        3.52249127,  3.46483843])
In [9]:
%matplotlib inline

from matplotlib import pyplot as plt

plt.plot(rmsd_to_lowest_pdb)
plt.xlabel('Time (ps)')
plt.ylabel('RMSD (angstrom)')
Out[9]:
<matplotlib.text.Text at 0x2aaae7322278>

Compute dihedral angle for TRP6

In [10]:
# store data in pandas's DataFrame
# 
df_trp6_dihedrals = pt.multidihedral(traj, resrange='6', dtype='dataframe')
df_trp6_dihedrals.head(6)
Out[10]:
phi_6 psi_6 chip_6 omega_6
0 -76.611207 151.569302 -171.449215 -178.380826
1 -73.083864 169.196403 -163.917153 -175.234276
2 -141.113262 160.778650 -155.444528 -177.581732
3 -141.529617 150.066684 176.778163 -167.869044
4 -131.870487 134.767010 -159.404503 168.943205
5 -166.370848 156.078155 153.671127 -171.181556
In [11]:
plt.plot(df_trp6_dihedrals['phi_6'], 'bo', markersize=1.)
Out[11]:
[<matplotlib.lines.Line2D at 0x2aaae73af8d0>]

Hbond

In [12]:
hbond_data = pt.hbond(traj)
hbond_data
Out[12]:
<pytraj.hbonds.DatasetHBond
donor_aceptor pairs : 486>
In [13]:
## Get donor and acceptor mask, only show first 10
hbond_data.donor_aceptor[:10]
Out[13]:
['ASN1_OD1-LEU2_N-H',
 'LYS8_O-GLY11_N-H',
 'ARG16_O-ARG16_NE-HE',
 'ARG16_O-ARG16_NH2-HH21',
 'PRO12_O-GLY15_N-H',
 'GLY10_O-ARG16_NH2-HH21',
 'ASP9_OD1-GLY10_N-H',
 'PRO12_O-SER14_N-H',
 'PRO17_O-ARG16_NE-HE',
 'PRO17_O-ARG16_NH2-HH21']

plot total hbond number vs rmsd

In [14]:
# plot total hbond number vs rmsd
# color by frame number
n_hbonds_per_frame = hbond_data.total_solute_hbonds()
fig = plt.scatter(n_hbonds_per_frame, rmsd_to_lowest_pdb, 
                  marker='o', c=range(traj.n_frames), alpha=0.8, cmap='Spectral')
plt.colorbar()
Out[14]:
<matplotlib.colorbar.Colorbar at 0x2aaaf805f160>
In [15]:
# get distance and angle mask for each hbond
h_dist_mask, h_angle_mask = hbond_data.get_amber_mask()
print(h_dist_mask)
print(h_angle_mask)
[':1@OD1 :2@H' ':8@O :11@H' ':16@O :16@HE' ..., ':16@O :14@H' ':1@O :20@H'
 ':5@NE2 :1@HD22']
[':1@OD1 :2@H :2@N' ':8@O :11@H :11@N' ':16@O :16@HE :16@NE' ...,
 ':16@O :14@H :14@N' ':1@O :20@H :20@N' ':5@NE2 :1@HD22 :1@ND2']
In [16]:
# compute hbond distance
# use dtype='dataframe' to dump distance data to pandas' DataFrame
h_dist = pt.distance(traj, h_dist_mask, dtype='dataframe')

# need to update mask
h_dist.columns = h_dist_mask
In [17]:
# just want to show some data
h_dist[[':1@OD1 :2@H', ':8@O :11@H', ':16@O :16@HE']].head(5)
Out[17]:
:1@OD1 :2@H :8@O :11@H :16@O :16@HE
0 2.031473 1.959615 3.494677
1 3.711942 2.547225 1.944752
2 2.531966 2.664229 2.016934
3 3.249249 2.672545 5.467710
4 2.917321 3.800039 4.817248
In [18]:
# stats
h_dist[[':1@OD1 :2@H', ':8@O :11@H', ':16@O :16@HE']].describe()
Out[18]:
:1@OD1 :2@H :8@O :11@H :16@O :16@HE
count 50000.000000 50000.000000 50000.000000
mean 3.497438 5.384411 5.260892
std 0.983849 1.442871 1.389453
min 1.644560 1.811608 1.648349
25% 2.724144 4.646580 4.074415
50% 3.305591 5.598265 5.260824
75% 4.291924 6.154997 6.261976
max 6.212335 9.442201 8.120241
In [19]:
# plot
h_dist[':1@OD1 :2@H'].hist()
Out[19]:
<matplotlib.axes._subplots.AxesSubplot at 0x2aaaf8006470>
In [20]:
h_dist[':1@OD1 :2@H'].describe()
Out[20]:
count    50000.000000
mean         3.497438
std          0.983849
min          1.644560
25%          2.724144
50%          3.305591
75%          4.291924
max          6.212335
Name: :1@OD1 :2@H, dtype: float64