Protocol Sheet: ACETK.PRMCHARMM.v1

Assigning CHARMM-CGenFF parameters to a ligand

Author: Toni Giorgino ( acemd@googlegroups.com )
Copyrights 2011, University Pompeu Fabra. All rights reserved.

Software requirements: VMD. The latest CHARMM forcefield files, including CGenFF.
Knowledge requirements: Use of CHARMM topology and parameter files, CHARMM system building.
Input requirements: See below.

Disclaimer: This protocol is based on a best practice. In no way we guarantee that it is optimal for your system. You use it at your own risk. Note that the CGenFF and the ParamChem web sites are under development. Results may vary when new releases are out. Procedure tested on 27/feb/2011 with VMD 1.8.7 and CHARMM parameter set 36a.

Example files

Download them here.

Basic procedure

Small organic molecules that are not standard biological molecules (amino acids, lipids, etc.) often do not have pre-built files to match the force field you are going to use, which means you will have to make them on your own. The procedure outlined in this page generates CHARMM-style topology (PSF) and parameter files (INP) of a small organic molecule; the molecule can then be embedded in a larger biomolecular system, typically built with the CHARMM all-atom forcefield.

This procedure relies on the ParamChem web service, which performs fragment-based parameter assignments by analogy; the parameters themselves belong to the general CHARMM forcefield (CGenFF) [4]. Further QM-based optimization of parameters, e.g. dihedrals (i.e. "parametrization" in the strict sense), is beyond the scope of this tutorial [1].

In some cases, you may be able to parametrize molecules and/or build the system using CHARMM-GUI ("PDB reader" function).

Step 1: Prerequisites for the input

For the following procedure to work, you will need the ligand's structure with the following characteristics:

  • molecular structure in Tripos MOL2 file formats, with coordinate information (in this example: ligand.mol2)

  • the same molecule in PDB format, for system preparation
  • explicit hydrogens should be present
  • protonation states should be appropriate for the simulation conditions
  • total charge should be known
  • all of the atoms of the parametrized ligand should be present when you generate a PSF structure.

Please see the "solution to common problems" section if some of the conditions are not met.

Step 2: Upload molecule to the ParamChem server

  1. Register with the ParamChem web site (if not already done) and note the password.

  2. Select "upload molecule" and upload the MOL2 file.

  3. If the process is successful, you will be able to download a stream file. Save and rename it with the toppar_ prefix and the .str extension. For example: toppar_ligand_cgenff.str.

  4. Inspect the file to verify the quality of guessed parameter assignments. The server inserts notes concerning the 'penalty' of charges and bonded parameters. If penalties are low, you can proceed.

Step 3: Split the stream file into topology and parameters

Two new files, the topology and parameter components, must now be generated from the stream file.

  1. In top_ligand.rtf (topology file): copy the lines between read rtf card append and the next END (both excluded). Manually modify the line starting with RESI in order to set your chosen residue name. This will be the name of the residue in the PDB file.

  2. In par_ligand.inp (parameter file): copy the lines between read param card flex append and the next END (both excluded).

Note: The parameter file may or may not contain values, depending on the molecule.

Step 4: Generate PSF

The system can now be built. Likely you will use either autopsf or psfgen. In either case, you will add the newly-generated residue topology file top_ligand.rtf and the CGenFF topology file top_all36_cgenff.rtf to the list of topology files considered by the building program.

  • If using autopsf, this is achieved clicking "Add" in the "topology files" section.

  • In psfgen, it is done with the "topology top_ligand.rtf" directive.

Structure building is based on manipulation or merging of PDB files and uses initial coordinate information stored therein. Please refer to the appropriate tutorials [3].

Step 5: Merge parameter files

If any of the sections of the par_ligand.inp file is not empty, the corresponding lines should be merged in the parameter file chosen for the simulation. For example, the widely-used CHARMM22 All-Hydrogen Parameter File for Proteins and Lipids is found in a file named par_all27_prot_lipid_na.inp. Make a local copy of the latter and insert the sections of par_ligand.inp in the corresponding places. Do the same with the CGenFF parameter file par_all36_cgenff.prm. A pre-merged version of the two is here.

NAMD users can skip step 5 using .str files directly.

Solution to common problems

  • Molecule not in MOL2 format. Software like MarvinSketch and openbabel can perform conversions, while computing bonds and bond orders from the three-dimensional structure. Inspect and fix them manually if necessary.

  • Molecule not in PDB format. As above, openbabel or other molecular manipulation software (VMD, AmberTools's antechamber, MarvinSketch, etc.) can perform the conversion. Note that the conversion is lossy.

  • Error messages from ParamChem may be due to wrong bond order: correct connectivity and bond order information in the MOL2 file is crucial. In particular, make sure that carbon atoms have the correct valence. Some software (notably, VMD) outputs incorrect bond orders in MOL2 files. To fix them, you have two options:

    • (recommended) use a chemically-aware software (see above) to produce the correct ones; or
    • enable the guess bond orders from connectivity check-box in ParamChem. Note that this will disable some consistency checks; in particular, it may fail to produce reasonable parameters if the input connectivity is wrong.

  • Missing explicit hydrogen atoms, protonation states, total charge can be computed within MarvinSketch.

  • Stereoisomerism. If the specie being parametrized is compound by two or more conformations that do not interconvert within your sampling time, make sure that the initial coordinates used for the simulation reflect the configuration of interest. However, stereoisomers have the same parameter set.

  • Missing starting coordinates, or 2D coordinates only available. You may obtain an initial guess of the molecular geometry running a short simulation or minimization on the ligand. See e.g. how it is done in basic energy minimization in CHARMM.

  • Guess coordinates works poorly. Although generally unnecessary, the "guess coordinates" output can be improved adding an IC table to the RTF. A workaround to get the IC table is to submit the molecule (in the appropriate geometry) to the SwissParam website. Open the resulting topology file, and copy only the lines beginning by IC into the topology created by ParamChem.

Example files

Download them here.

References

[1] QM-based fitting of parameters, http://dogmans.umaryland.edu/~kenno/tutorial/#equil.

[2] Parametrization by analogy, http://www.ks.uiuc.edu/Training/Tutorials/science/forcefield-tutorial/forcefield-html/node6.html.

[3] NAMD tutorial, Generating a Protein Structure File (PSF)

[4] Vanommeslaeghe K, Hatcher E, Acharya C, Kundu S, Zhong S, Shim J, et al. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 2010;31(4):671–90.

Copyright 2008-2011. All rights reserved.