Splitting a multi-model PDB file into separate files for each model - (Aug/19/2015 )
Hi,
How can I split an NMR-type multi-model PDB file into individual models? I want to display the structure of a certain protein using RasMol, but I see multiple superimposed models when I open the PDB file in the software. Can I manually delete the extra models from the PDB file?
(I got the PDB file from www.rcsb.org)
Thanks
Note: I've never done this with NMR files, only crystallography:
You should be able to split them. When you open the PDB file in a text editor (or something similar such as Word, as long as you save it back as plain text), you should find a line that starts with "TER". This tells you (and RasMol) that one of the molecules ends here and that the next one starts after it. It should be fine to delete any portions, so that the file starts with your desired protein and ends in "TER".
Thank you,
but how can I select among the 8 models? (The models do not seem to overlay exactly with each other.)
You will have to do this manually. Each model of the protein is represented separately in the file, with coordinates for each amino acid and side chain, so the file should look something like this (from this solution NMR file on PDB, if you want to look at the full file):
REMARK 999 SEQUENCE
REMARK 999 A SEQUENCE DATABASE REFERENCE FOR THIS PROTEIN DOES NOT CURRENTLY
REMARK 999 EXIST.
DBREF 2MT7 A 1 35 PDB 2MT7 2MT7 1 35
SEQRES 1 A 35 GLY ASN ASP CYS LEU GLY PHE TRP SER ALA CYS ASN PRO
SEQRES 2 A 35 LYS ASN ASP LYS CYS CYS ALA ASN LEU VAL CYS SER SER
SEQRES 3 A 35 LYS HIS LYS TRP CYS LYS GLY LYS LEU
SHEET 1 A 2 VAL A 23 CYS A 24 0
SHEET 2 A 2 CYS A 31 LYS A 32 -1 O LYS A 32 N VAL A 23
SSBOND 1 CYS A 4 CYS A 19 1555 1555 2.10
SSBOND 2 CYS A 11 CYS A 24 1555 1555 2.00
SSBOND 3 CYS A 18 CYS A 31 1555 1555 2.02
CRYST1 1.000 1.000 1.000 90.00 90.00 90.00 P 1 1
ORIGX1 1.000000 0.000000 0.000000 0.00000
SCALE1 1.000000 0.000000 0.000000 0.00000
MODEL 1
ATOM 1 N GLY A 1 0.462 -0.262 1.412 1.00 44.13 N
ATOM 2 CA GLY A 1 0.944 -0.465 0.058 1.00 43.21 C
ATOM 3 C GLY A 1 1.013 -1.931 -0.319 1.00 61.33 C
: : :
ATOM 517 HD22 LEU A 35 15.057 -28.131 -8.503 1.00 1.23 H
ATOM 518 HD23 LEU A 35 16.430 -28.917 -7.722 1.00 31.13 H
TER 519 LEU A 35
ENDMDL
MODEL 2
ATOM 1 N GLY A 1 2.117 0.429 0.262 1.00 72.24 N
ATOM 2 CA GLY A 1 3.517 0.077 0.112 1.00 52.30 C
As you can see, the first model starts with "MODEL 1" and ends with "TER" and "ENDMDL", and then the file goes on to MODEL 2, and so on. You can extract each model by looking for those records and copying the lines between them into a PDB file of your own: use a plain-text editor such as WinEdit (on Windows) or TextEdit (on Mac), change the file suffix from ".txt" to ".pdb", then open the file in RasMol.
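If copying and pasting by hand gets tedious, the same MODEL/ENDMDL search can be scripted. Below is a minimal Python sketch, not a tested tool: the function names split_models and write_models are made up for illustration, and it assumes the file has the usual layout of header records, then MODEL ... ENDMDL blocks. It copies the header records (everything before the first MODEL line) into each output, in case a viewer or server wants them, and drops anything after the last ENDMDL.

```python
from pathlib import Path


def split_models(pdb_text):
    """Return {model_number: text of a single-model PDB file}."""
    header = []     # records before the first MODEL line (TITLE, SEQRES, ...)
    models = {}     # model serial number -> list of lines for that model
    current = None  # serial of the model being read, None when outside a model

    for line in pdb_text.splitlines():
        record = line[:6].strip()  # PDB record name lives in the first 6 columns
        if record == "MODEL":
            current = int(line[6:].strip())
            models[current] = [line]
        elif record == "ENDMDL":
            if current is not None:
                models[current].append(line)
            current = None
        elif current is not None:
            models[current].append(line)  # ATOM, HETATM, TER, ... inside a model
        elif not models:
            header.append(line)           # still above the first MODEL record
        # anything after the last ENDMDL (MASTER, END, ...) is dropped

    return {
        number: "\n".join(header + body + ["END"]) + "\n"
        for number, body in models.items()
    }


def write_models(pdb_path):
    """Write one file per model next to the input, e.g. name_model1.pdb."""
    pdb_path = Path(pdb_path)
    for number, text in split_models(pdb_path.read_text()).items():
        out = pdb_path.with_name(f"{pdb_path.stem}_model{number}.pdb")
        out.write_text(text)
```

Assuming the downloaded file was saved as 2mt7.pdb (the file name is only an example), calling write_models("2mt7.pdb") should leave 2mt7_model1.pdb, 2mt7_model2.pdb and so on in the same folder, and each of those can be opened in RasMol on its own.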
Thanks for your replies.
Are the records at the top of the PDB file (such as name, title, etc.), which are not repeated for each model, not needed by viewer software or by servers such as PROCHECK?
My second question: which of the 8 models should be selected to show the structure of the protein (given that the other 7 models will be ignored)?
That's up to you. Just make sure that the proteins you are visualizing are not the result of interactions (i.e. protein x in complex with protein y), in which case you have to make sure that you are choosing the right one to act as a model.