Trajectory warm up

Register to load from disk

In [1]: import pytraj as pt

In [2]: traj = pt.iterload('tz2.nc', 'tz2.parm7')

In [3]: traj
Out[3]: 
pytraj.TrajectoryIterator, 101 frames: 
Size: 0.000503 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>

Register to load from disk with frame stride

# specify start=2, stop=50, step=2
In [4]: pt.iterload('tz2.nc', 'tz2.parm7', frame_slice=(2, 50, 2))
Out[4]: 
pytraj.TrajectoryIterator, 24 frames: 
Size: 0.000120 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>
           

# skip every 2 frames with stride option
In [5]: pt.iterload('tz2.nc', 'tz2.parm7', stride=2)
Out[5]: 
pytraj.TrajectoryIterator, 51 frames: 
Size: 0.000254 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>

Register to load several files from disk

In [6]: pt.iterload(['tz2.0.nc', 'tz2.1.nc'], 'tz2.parm7')
Out[6]: 
pytraj.TrajectoryIterator, 202 frames: 
Size: 0.001007 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>

Register to load several files from disk with frame stride

In [7]: pt.iterload(['tz2.0.nc', 'tz2.1.nc'], 'tz2.parm7', frame_slice=[(0, 8, 2), (0, 5, 1)])
Out[7]: 
pytraj.TrajectoryIterator, 9 frames: 
Size: 0.000045 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>
           

# use stride to simplify loading if do not need to specif start and stop frames
In [8]: pt.iterload(['tz2.0.nc', 'tz2.1.nc'], 'tz2.parm7', stride=3)
Out[8]: 
pytraj.TrajectoryIterator, 68 frames: 
Size: 0.000339 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>

Write to disk

In [9]: pt.write_traj('my_traj.nc', traj, overwrite=True)

Extract coordinates from PDB structure

In [10]: pdb = pt.fetch_pdb('1ES7')

# get coordinates for CA atoms
In [11]: pdb['@CA'].xyz
Out[11]: 
array([[[-13.162,  61.638,  32.312],
        [-14.246,  65.16 ,  33.26 ],
        [-17.468,  66.762,  34.475],
        ..., 
        [-27.367,  43.164,   2.811],
        [-24.433,  45.091,   1.312],
        [-22.127,  43.142,  -1.024]]])

# another way with more memory efficient
In [12]: pt.get_coordinates(pdb, mask='@CA')
Out[12]: 
array([[[-13.162,  61.638,  32.312],
        [-14.246,  65.16 ,  33.26 ],
        [-17.468,  66.762,  34.475],
        ..., 
        [-27.367,  43.164,   2.811],
        [-24.433,  45.091,   1.312],
        [-22.127,  43.142,  -1.024]]])

Write PDB file to netcdf format (require Netcdf lib)

In [13]: pdb = pt.fetch_pdb('1ES7')

In [14]: pt.write_traj('mypdb.nc', traj=pdb, overwrite=True)

# can also use the shortcut
In [15]: pdb.save('mypdb.nc', overwrite=True)

Fancy indexing

In [16]: traj[0]
Out[16]: <Frame with 223 atoms>

In [17]: traj[:2]
Out[17]: 
pytraj.Trajectory, 2 frames: 
Size: 0.000010 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>
           

In [18]: traj[:2, '@CA']
Out[18]: 
pytraj.Trajectory, 2 frames: 
Size: 0.000001 (GB)
<Topology: 12 atoms, 12 residues, 12 mols, non-PBC>

Iterate whole trajectory

In [19]: for frame in traj:
   ....:     pass
   ....: 

In [20]: frame
Out[20]: <Frame with 223 atoms>

Iterate a part of trajectory

  • with stop value
In [21]: for frame in pt.iterframe(traj, stop=5):
   ....:     print(frame)
   ....: 
<Frame with 223 atoms>
<Frame with 223 atoms>
<Frame with 223 atoms>
<Frame with 223 atoms>
<Frame with 223 atoms>
  • with given frame indices
In [22]: for frame in pt.iterframe(traj, frame_indices=[0, 5, 20, 50]):
   ....:     print(frame)
   ....: 
<Frame with 223 atoms>
<Frame with 223 atoms>
<Frame with 223 atoms>
<Frame with 223 atoms>
  • with given mask
In [23]: for frame in pt.iterframe(traj, frame_indices=[0, 5, 20, 50], mask='@CA'):
   ....:     print(frame)
   ....: 
<Frame with 12 atoms>
<Frame with 12 atoms>
<Frame with 12 atoms>
<Frame with 12 atoms>

Perform calculation

# rmsd to first frame with mask='@CA'
# python starts counting from 0
In [24]: pt.rmsd(traj, ref=0, mask='@CA')
Out[24]: array([ 0.    ,  2.546 ,  4.2233, ...,  4.9719,  5.5395,  4.832 ])

Convert data to pandas DataFrame

In [25]: df = pt.multidihedral(traj, resrange='3-7', dtype='dataframe')

In [26]: type(df)
Out[26]: pandas.core.frame.DataFrame

In [27]: df.head(1)
Out[27]: 
       phi_3       psi_3     omega_3       phi_4      psi_4   chip_4  \
0 -79.867366  117.355933 -179.707558 -108.912042 -55.981169 -45.0272   

      omega_4       phi_5       psi_5     chip_5     omega_5     phi_6  \
0  168.029851 -136.352598  129.444453 -158.59654 -179.635141 -122.4413   

       psi_6     chip_6     omega_6      phi_7       psi_7     omega_7  
0  144.69102 -83.364416  175.649333 -63.342516 -128.126881  178.856418  

In [28]: df.tail(1)
Out[28]: 
         phi_3       psi_3     omega_3      phi_4      psi_4     chip_4  \
100 -64.649757  103.607363  162.122206 -96.395583 -34.251063 -74.762777   

        omega_4       phi_5       psi_5      chip_5     omega_5     phi_6  \
100  171.968921 -119.367165  119.273812 -170.889973  165.773805 -96.48103   

          psi_6      chip_6     omega_6       phi_7      psi_7     omega_7  
100  107.297804  179.500233 -168.223579  156.863923 -51.867563  160.892212  

Convert to different file format

# to DCD format
In [29]: pt.write_traj('traj.dcd', traj, overwrite=True)

Combine with cpptraj command style

In [30]: pt.compute(['rms', 'radgyr @CA nomax', 'distance :3 :7'], traj)
Out[30]: 
OrderedDict([('RMSD_00000',
              array([ 0.    ,  4.0162,  6.4142, ...,  8.275 ,  8.1941,  7.7792])),
             ('RoG_00001',
              array([ 8.1092,  7.7643,  8.0969, ...,  6.3806,  6.2414,  6.4899])),
             ('Dis_00002',
              array([ 11.1745,   8.5743,   7.4076, ...,   9.0644,   6.5968,   8.6387]))])

Convert out-of-core TrajectoryIterator to in-memory Trajectory

In [31]: traj2 = traj[:]

# apply any transformations
# superpose to first frame
In [32]: pt.superpose(traj2)
Out[32]: 
pytraj.Trajectory, 101 frames: 
Size: 0.000503 (GB)
<Topology: 223 atoms, 13 residues, 1 mols, non-PBC>
           

# use the same syntax to perform calculation
In [33]: pt.rmsd(traj2, ref=0)
Out[33]: array([ 0.    ,  4.0162,  6.4142, ...,  8.275 ,  8.1941,  7.7792])

Parallel computing with TrajectoryIterator

# serial: pt.rmsd(traj)
# parallel
In [34]: pt.pmap(pt.radgyr, traj, n_cores=3)
Out[34]: 
OrderedDict([('RoG_00000',
              array([  9.4451,   9.5746,  10.1923, ...,   7.4834,   7.4324,   7.6204]))])

# chain a series of cpptraj's commands
In [35]: pt.pmap(['radgyr nomax', 'molsurf @CA', 'multidihedral resrange 3-4 psi phi'], traj, n_cores=4)
Out[35]: 
OrderedDict([('RoG_00000',
              array([  9.4451,   9.5746,  10.1923, ...,   7.4834,   7.4324,   7.6204])),
             ('MSURF_00001',
              array([ 430.269 ,  431.9791,  429.7778, ...,  429.8838,  435.3642,
                      430.1734])),
             ('phi:3',
              array([ -79.8674, -120.1549,  -52.19  , ...,  -53.6994,  -62.3338,
                      -64.6498])),
             ('psi:3',
              array([ 117.3559,  108.9836,  132.5747, ...,  122.6888,  117.9584,
                      103.6074])),
             ('phi:4',
              array([-108.912 ,  -33.8487,  -99.5304, ...,  -95.8418,  -74.1416,
                      -96.3956])),
             ('psi:4',
              array([-55.9812, -53.1521, -23.1955, ..., -56.0168, -57.3813, -34.2511]))])