Splitting a multi-model PDB file into separate files for each model - (Aug/19/2015)

hi

How can I split an NMR-type multi-model PDB file into individual models? I want to display the structure of a certain protein in RasMol, but when I open the PDB file I see multiple superimposed models. Can I manually delete the extra models from the PDB file?

(I got the PDB file from www.rcsb.org.)

thanks

-marzieh-

Note: I've never done this with NMR files, only crystallography:

 

You should be able to split them. When you open the PDB file in a text editor or something similar (Word etc., as long as you save it back as plain text), you should find a line that starts with "TER", sometimes followed by a serial number and a residue. This tells you (and RasMol) that one of the molecules ends here and the next one starts. It should be fine to delete any portions, so that the file starts with your desired protein and ends in "TER".

-bob1-

thank you

But how can I select between the 8 models? (The models don't seem to overlay exactly with each other.)

-marzieh-

You will have to do this manually. Each copy of the protein (each model) is represented separately in the file, with coordinates for every atom of each amino acid and its side chain, so the file should look something like this (taken from the solution NMR entry 2MT7 on the PDB, if you want to look at the full file):

REMARK 999 SEQUENCE                                                             
REMARK 999 A SEQUENCE DATABASE REFERENCE FOR THIS PROTEIN DOES NOT CURRENTLY    
REMARK 999 EXIST.                                                               
DBREF  2MT7 A    1    35  PDB    2MT7     2MT7             1     35             
SEQRES   1 A   35  GLY ASN ASP CYS LEU GLY PHE TRP SER ALA CYS ASN PRO          
SEQRES   2 A   35  LYS ASN ASP LYS CYS CYS ALA ASN LEU VAL CYS SER SER          
SEQRES   3 A   35  LYS HIS LYS TRP CYS LYS GLY LYS LEU                          
SHEET    1   A 2 VAL A  23  CYS A  24  0                                        
SHEET    2   A 2 CYS A  31  LYS A  32 -1  O  LYS A  32   N  VAL A  23           
SSBOND   1 CYS A    4    CYS A   19                          1555   1555  2.10  
SSBOND   2 CYS A   11    CYS A   24                          1555   1555  2.00  
SSBOND   3 CYS A   18    CYS A   31                          1555   1555  2.02  
CRYST1    1.000    1.000    1.000  90.00  90.00  90.00 P 1           1          
ORIGX1      1.000000  0.000000  0.000000        0.00000                         
SCALE1      1.000000  0.000000  0.000000        0.00000                                   
MODEL        1                                                                  
ATOM      1  N   GLY A   1       0.462  -0.262   1.412  1.00 44.13           N  
ATOM      2  CA  GLY A   1       0.944  -0.465   0.058  1.00 43.21           C  
ATOM      3  C   GLY A   1       1.013  -1.931  -0.319  1.00 61.33           C  
:
:
:
ATOM    517 HD22 LEU A  35      15.057 -28.131  -8.503  1.00  1.23           H  
ATOM    518 HD23 LEU A  35      16.430 -28.917  -7.722  1.00 31.13           H  
TER     519      LEU A  35                                                      
ENDMDL                                                                          
MODEL        2                                                                  
ATOM      1  N   GLY A   1       2.117   0.429   0.262  1.00 72.24           N  
ATOM      2  CA  GLY A   1       3.517   0.077   0.112  1.00 52.30           C    

As you can see, the first model starts at "MODEL 1" and ends with "TER" and "ENDMDL", and then the file goes on to MODEL 2, and so on. You can extract each model separately by looking for those records and using them to make a PDB file of your own: copy the model you want into a text editor such as WinEdit (on Windows) or TextEdit (on Mac, in plain-text mode), save it, change the file suffix from ".txt" to ".pdb", and then open it in RasMol.
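If copying by hand gets tedious, the same split on the MODEL and ENDMDL records can be scripted. Here is a rough Python sketch rather than a polished tool. The function names split_models and write_models and the output file names are made up for illustration, and it assumes the file uses MODEL/ENDMDL records the way the excerpt above does. It copies everything above the first MODEL line into each output unchanged, drops whatever follows the last ENDMDL, and ends each output with a plain END record.

```python
def split_models(pdb_text):
    """Return {model_number: text of a single-model PDB file}.

    Assumes standard MODEL / ENDMDL records (hypothetical helper).
    """
    header = []      # records that appear before the first MODEL line
    models = {}      # model number -> list of coordinate lines
    current = None   # number of the model we are inside, or None

    for line in pdb_text.splitlines():
        record = line[:6].strip()       # record name lives in columns 1-6
        if record == "MODEL":
            current = int(line[6:].split()[0])
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif current is not None:
            models[current].append(line)   # ATOM, HETATM, TER, ...
        elif not models:
            header.append(line)            # still above the first MODEL
        # anything after the last ENDMDL (CONECT, MASTER, END) is skipped

    return {
        number: "\n".join(header + body + ["END"]) + "\n"
        for number, body in models.items()
    }


def write_models(path, prefix="model"):
    """Write model_1.pdb, model_2.pdb, ... next to where you run this."""
    with open(path) as handle:
        parts = split_models(handle.read())
    for number, text in parts.items():
        with open(f"{prefix}_{number}.pdb", "w") as out:
            out.write(text)
```

For example, write_models("2mt7.pdb") (the file name is only a placeholder for whatever you downloaded) should leave one .pdb file per model in the current folder, each of which can be opened in RasMol on its own.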

-bob1-

thanks for your replies

Are the records at the top of the PDB file (such as name, title, etc.), which are not repeated for each model, not needed by viewer software or by servers such as PROCHECK?

My second question: which of the 8 models should I select to show the structure of the protein (given that the other 7 models will be ignored)?

-marzieh-

That's up to you. Just make sure that the proteins you are visualizing are not part of a complex (i.e. protein X interacting with protein Y); if they are, you have to make sure that you are choosing the right one to act as a model.
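One quick way to check for a complex is to list the chain IDs found in each model: more than one chain suggests more than one molecule. This is only a sketch under some assumptions. The name chains_per_model is made up, it looks only at ATOM records (so ligands and waters in HETATM records are ignored), and it relies on the chain ID sitting in column 22 of the fixed-width PDB format.

```python
def chains_per_model(pdb_text):
    """Return {model_number: set of chain IDs seen in ATOM records}.

    Hypothetical helper; files without MODEL records count as model 1.
    """
    chains = {}
    model = 1
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            model = int(line[6:].split()[0])
        elif record == "ATOM" and len(line) > 21:
            # column 22 (index 21) holds the chain identifier
            chains.setdefault(model, set()).add(line[21])
    return chains
```

If every model reports a single chain (just "A" in the excerpt above), there is only one protein in the file and any of the models can be used for display.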

-bob1-