Nucleic Acid Analysis

pytraj.nastruct(traj=None, ref=0, resrange=None, resmap=None, hbcut=3.5, frame_indices=None, pucker_method='altona', dtype='nupars', baseref=None, groove_3dna=True, top=None)

compute nucleic acid parameters. (adapted from cpptraj doc)

Parameters:

traj : Trajectory-like

ref : {Frame, int}, default 0 (first frame)

resrange : None, str or array-like of integers

resmap : residue map, example: ‘AF2:A’

hbcut : float, default=3.5 Angstrong

Distance cutoff for determining basepair hbond

pucker_method : str, {‘altona’, ‘cremer’}, default ‘altona’

‘altona’ : Use method of Altona & Sundaralingam to calculate sugar pucker ‘cremer’ : Use method of Cremer and Pople to calculate sugar pucker’

frame_indices : array-like, default None (all frames)

baseref : {None, str}, default None

if given filename, use it for base reference (versionadded: 1.0.6)

groove_3dna : bool, default True

if True, major and minor groove will match 3DNA’s output.

dtype : str, {‘nupars’, ‘cpptraj_dataset’}, default ‘nupars’

Returns:

out : nupars object. One can assess different values (major groove width, xdips values

...) by accessing its attribute. See example below.

Examples

>>> import pytraj as pt
>>> import numpy as np
>>> traj = pt.datafiles.load_rna()
>>> data = pt.nastruct(traj, groove_3dna=False)
>>> data.keys()[:5] 
['buckle', 'minor', 'major', 'xdisp', 'stagger']
>>> # get minor groove width values for each pairs for each snapshot
>>> # data.minor is a tuple, first value is a list of basepairs, seconda value is
>>> # numpy array, shape=(n_frames, n_pairs)
>>> data.minor 
(['1G16C', '2G15C', '3G14C', '4C13G', '5G12C', '6C11G', '7C10G', '8C9G'],
 array([[ 13.32927036,  13.403409  ,  13.57159901, ...,  13.26655865,
         13.43054485,  13.4557209 ],
       [ 13.32002068,  13.45918751,  13.63253593, ...,  13.27066231,
         13.42743683,  13.53450871],
       [ 13.34087658,  13.53778553,  13.57062435, ...,  13.29017353,
         13.38542843,  13.46101475]]))
>>> data.twist 
(['1G16C-2G15C', '2G15C-3G14C', '3G14C-4C13G', '4C13G-5G12C', '5G12C-6C11G', '6C11G-7C10G', '7C10G-8C9G'],
array([[ 34.77773666,  33.98158646,  30.18647003, ...,  35.14608765,
         33.9628334 ,  33.13056946],
       [ 33.39176178,  32.68476105,  28.36385536, ...,  36.59774399,
         30.20827484,  26.48732948],
       [ 36.20665359,  32.58955002,  27.47707367, ...,  33.42843246,
         30.90047073,  33.73724365]]))
>>> # change dtype
>>> data = pt.nastruct(traj, dtype='cpptraj_dataset')

Longer Examples

In [1]: import pytraj as pt

In [2]: traj = pt.iterload('data/adh026.3.pdb')

In [3]: traj
Out[3]: 
pytraj.TrajectoryIterator, 3 frames: 
Size: 0.000035 (GB)
<Topology: 518 atoms, 16 residues, 2 mols, non-PBC>
           

# check unique residue names
In [4]: traj.top.residue_names
Out[4]: {'C', 'C3', 'G', 'G5'}

In [5]: na = pt.nastruct(traj)

In [6]: na
Out[6]: <nupars, keys = ['slide', 'shear', 'shift', 'stretch', 'buckle', 'hb', 'hrise', 'tip', 'ydisp', 'tilt', 'bp', 'roll', 'rise', 'twist', 'minor', 'open', 'incl', 'zp', 'htwist', 'pucker', 'prop', 'major', 'xdisp', 'stagger']>

# get values for major groove width
In [7]: na.major
Out[7]: 
(['3G14C-4C13G', '4C13G-5G12C', '5G12C-6C11G'],
 array([[ 19.9015,  20.751 ,  19.9822],
        [ 19.5528,  20.6965,  19.6335],
        [ 19.2871,  21.201 ,  19.9055]]))

# get values for several parameters
In [8]: for key in ['minor', 'twist', 'incl']:
   ...:     print(na[key])
   ...: 
(['3G14C-4C13G', '4C13G-5G12C', '5G12C-6C11G'], array([[ 15.8085,  15.818 ,  15.4856],
       [ 15.9407,  16.2324,  16.3369],
       [ 15.9514,  16.7085,  16.6283]]))
(['1G16C-2G15C', '2G15C-3G14C', '3G14C-4C13G', '4C13G-5G12C', '5G12C-6C11G', '6C11G-7C10G', '7C10G-8C9G'], array([[ 34.7777,  33.9816,  30.1865, ...,  35.1461,  33.9628,  33.1306],
       [ 33.3918,  32.6848,  28.3639, ...,  36.5977,  30.2083,  26.4873],
       [ 36.2067,  32.5896,  27.4771, ...,  33.4284,  30.9005,  33.7372]]))
(['1G16C-2G15C', '2G15C-3G14C', '3G14C-4C13G', '4C13G-5G12C', '5G12C-6C11G', '6C11G-7C10G', '7C10G-8C9G'], array([[ 15.1276,   5.5395,  10.4794, ...,   1.663 ,  -8.9116,  12.9009],
       [  0.8921,  18.6342,   3.6122, ...,  -8.3858,  19.0858,   5.6582],
       [  3.5335,  14.6935,  11.2296, ...,  10.8699,  -3.0495,  21.6089]]))

# get some statistics (syntax might be changed in future)
In [9]: import numpy as np

# get mean
In [10]: na._summary(np.mean)
Out[10]: 
{'bp': array([ 1.,  1.,  1.]),
 'buckle': array([ 3.9745,  4.4635,  3.8478]),
 'hb': array([ 3.,  3.,  3.]),
 'hrise': array([ 2.9544,  2.8557,  2.8099]),
 'htwist': array([ 32.8285,  31.2766,  32.1462]),
 'incl': array([  7.2931,   7.5347,  10.0523]),
 'major': array([ 20.2115,  19.9609,  20.1312]),
 'minor': array([ 15.704 ,  16.17  ,  16.4294]),
 'open': array([ 1.608 , -0.2505, -1.071 ]),
 'prop': array([-8.377 , -7.3107, -5.6306]),
 'pucker': array([  76.5285,   98.5826,  101.2006]),
 'rise': array([ 3.2847,  3.2277,  3.2468]),
 'roll': array([ 3.8703,  3.8303,  5.5281]),
 'shear': array([ 0.0847, -0.2052,  0.0952]),
 'shift': array([-0.3018,  0.1676, -0.1152]),
 'slide': array([-1.9909, -2.0282, -2.2355]),
 'stagger': array([ 0.1386,  0.1052,  0.0765]),
 'stretch': array([-0.0427, -0.1148, -0.0769]),
 'tilt': array([-2.3159, -0.8674, -1.2469]),
 'tip': array([ 5.1296,  1.7297,  2.8354]),
 'twist': array([ 31.8104,  30.5342,  31.0631]),
 'xdisp': array([-4.3154, -4.6259, -5.0389]),
 'ydisp': array([-0.0041, -0.5068, -0.1015]),
 'zp': array([ 2.4829,  2.6598,  2.6448])}

# get std
In [11]: na._summary(np.std)
Out[11]: 
{'bp': array([ 0.,  0.,  0.]),
 'buckle': array([ 6.7023,  5.5069,  5.8921]),
 'hb': array([ 0.,  0.,  0.]),
 'hrise': array([ 0.5058,  0.391 ,  0.4398]),
 'htwist': array([ 3.5496,  3.7633,  3.9124]),
 'incl': array([ 8.007 ,  9.3024,  7.303 ]),
 'major': array([ 0.3828,  0.5212,  0.7975]),
 'minor': array([ 0.1545,  0.1677,  0.3396]),
 'open': array([ 3.1167,  2.5142,  2.3933]),
 'prop': array([ 5.4986,  6.2782,  6.6499]),
 'pucker': array([ 131.9281,  146.2912,  146.136 ]),
 'rise': array([ 0.1861,  0.2306,  0.219 ]),
 'roll': array([ 4.4903,  5.2958,  4.3097]),
 'shear': array([ 0.4224,  0.2623,  0.308 ]),
 'shift': array([ 0.334 ,  0.5723,  0.443 ]),
 'slide': array([ 0.4734,  0.5112,  0.2721]),
 'stagger': array([ 0.2288,  0.2353,  0.2485]),
 'stretch': array([ 0.1633,  0.0791,  0.1337]),
 'tilt': array([ 4.3807,  2.1819,  4.2355]),
 'tip': array([ 10.0093,   3.8594,   8.9471]),
 'twist': array([ 4.4756,  3.6107,  4.1022]),
 'xdisp': array([ 1.7488,  1.8645,  1.3394]),
 'ydisp': array([ 1.1157,  1.2389,  0.579 ]),
 'zp': array([ 0.3308,  0.375 ,  0.2435])}

# only interested in some parameters?
In [12]: na._summary(np.mean, keys=['major', 'minor', 'twist'])
Out[12]: 
{'major': array([ 20.2115,  19.9609,  20.1312]),
 'minor': array([ 15.704 ,  16.17  ,  16.4294]),
 'twist': array([ 31.8104,  30.5342,  31.0631])}

# explain keywords
In [13]: print(na._explain())

        [shear] Base pair shear.
        [stretch] Base pair stretch.
        [stagger] Base pair stagger.
        [buckle] Base pair buckle.
        [prop] Base pair propeller.
        [open] Base pair opening.
        [hb] Number of hydrogen bonds between bases in base pair.
        [pucker] Base sugar pucker.
        [major] Rough estimate of major groove width, calculated between P atoms of each
        base.
        [minor] Rough estimate of minor groove width, calculated between O4 atoms of
        each base.
        [shift] Base pair step shift.
        [slide] Base pair step slide.
        [rise] Base pair step rise.
        [title] Base pair step tilt.
        [roll] Base pair step roll.
        [twist] Base pair step twist.
        [xdisp] Helical X displacement.
        [ydisp] Helical Y displacement.
        [hrise] Helical rise.
        [incl] Helical inclination.
        [tip] Helical tip.
        [htwist] Helical twist.
        

# if we have long analysis, we can temporarily save ``na`` to disk by `pytraj.to_pickle`` and load back later.
In [14]: pt.to_pickle(na, 'na_.pk')

# load back pickle object
In [15]: na2 = pt.read_pickle('na_.pk')

In [16]: na2
Out[16]: <nupars, keys = ['slide', 'shear', 'shift', 'open', 'buckle', 'hb', 'hrise', 'tip', 'ydisp', 'tilt', 'bp', 'roll', 'rise', 'twist', 'minor', 'stretch', 'incl', 'zp', 'htwist', 'pucker', 'prop', 'major', 'xdisp', 'stagger']>

In [17]: na2._summary(np.mean, ['major', 'twist'])
Out[17]: 
{'major': array([ 20.2115,  19.9609,  20.1312]),
 'twist': array([ 31.8104,  30.5342,  31.0631])}