Protein-DNA Interface Alignment Software

The protein-DNA interface alignment algorithm is introduced and explained in Siggers et al. (see References). Here we provide both the Linux executables and all source code for the programs needed to perform an interface alignment between any two protein-DNA complexes based on their PDB files.

Download

Description

The protein-DNA alignment software aligns the interfacial amino acids from two protein-DNA complexes based on the geometric relationship of each amino acid to its local DNA. The algorithm is described in detail in the paper Siggers et al. Structural Alignment of Protein-DNA Interfaces: Insights into the Determinants of Binding Specificity. JMB 345(5):1027-45 (2005). The programs described here take two PDB files that each contain a protein-DNA complex and perform an interface alignment of the two complexes. They output the aligned residues and their corresponding residue-residue similarity scores, s(i,j). The alignment program also reports the interface alignment score (IAS), which measures the similarity of the docking geometry of the two proteins onto their DNA substrates.

Documentation

Quick tutorial: How to align two protein-DNA interfaces from PDB files

Here we demonstrate how to perform an alignment between the two homoeodomain-DNA complexes 1hdd.pdb (Engrailed) and 9ant.pdb (Antp).

1. Determine the complex you want to align. If your PDB files contain multiple chains, you may want to split them into individual PDB files. For example, if the file FOO.pdb contains two protein chains, A and B, and two DNA chains, C and D, you may want to split it into FOO_A.pdb and FOO_B.pdb, which contain protein chains A and B, respectively, each together with both DNA chains. The reason is that when both PDB files contain two protein chains, the algorithm keeps the chains in the order in which they appear in each file and treats them as one long chain. Therefore, if FOO_1.pdb contains protein chains A and B, in that order, and FOO_2.pdb contains protein chains F and G, in that order, the algorithm cannot align A to G and B to F simultaneously. For this example, however, we leave the multi-chain PDB files of 1hdd and 9ant unaltered.
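The chain split described in step 1 can be sketched with awk. This is a minimal sketch and not part of the distributed software: the function name split_chain is hypothetical, and it assumes standard fixed-column PDB records in which the chain identifier sits in column 22. Only ATOM and HETATM records are kept; every other record type is dropped.

```shell
#!/bin/sh
# split_chain FILE PROTEIN_CHAIN DNA_CHAINS   (hypothetical helper)
# Writes to stdout the ATOM/HETATM records of one protein chain plus all
# listed DNA chains. Assumes the chain ID is in column 22 of each record.
#
# Example:
#   split_chain FOO.pdb A CD > FOO_A.pdb
#   split_chain FOO.pdb B CD > FOO_B.pdb
split_chain() {
    awk -v prot="$2" -v dna="$3" '
        /^(ATOM|HETATM)/ {
            chain = substr($0, 22, 1)
            # keep the record if it belongs to the protein chain or a DNA chain
            if (chain != "" && (chain == prot || index(dna, chain) > 0)) print
        }
    ' "$1"
}
```

The original file order is preserved, so the DNA chains appear in the output wherever they appeared in FOO.pdb.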

2. Change the base (nucleotide) names from the one-letter to the three-letter version, e.g. T->THY. To do this, simply run

 > perl    change_nucleotide_names.pl   1hdd.pdb

-- this will rewrite 1hdd.pdb in place with the three-letter names. Do the same for 9ant.pdb if it still uses one-letter names.
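Because the file is rewritten, it can help to keep a copy of the original and to confirm afterwards that no one-letter names remain. The check below is a minimal sketch, not part of the distributed software: the function name count_one_letter_bases is hypothetical, and it assumes that the residue name occupies columns 18-20 of ATOM/HETATM records, as in the standard PDB layout.

```shell
#!/bin/sh
# count_one_letter_bases FILE   (hypothetical helper)
# Prints the number of ATOM/HETATM records whose residue-name field
# (columns 18-20) is a single letter A, C, G, T or U.
#
# Example:
#   cp 1hdd.pdb 1hdd.pdb.orig            # keep the untouched original
#   perl change_nucleotide_names.pl 1hdd.pdb
#   count_one_letter_bases 1hdd.pdb      # a count of 0 means nothing was missed
count_one_letter_bases() {
    awk '
        /^(ATOM|HETATM)/ {
            name = substr($0, 18, 3)
            gsub(/ /, "", name)          # drop padding around the name
            if (name ~ /^[ACGTU]$/) n++
        }
        END { print n + 0 }
    ' "$1"
}
```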

3. Format the two PDB files:

 > interface_format.exe    -i 1hdd.pdb   >   1hdd.db
 > interface_format.exe    -i 9ant.pdb   >   9ant.db

4. Align 1hdd.db and 9ant.db:

 > interface_align.exe    -i 1hdd.db    -j 9ant.db   > output

Output should look like:

1hdd.pdb      9ant.pdb        S(i,j)
LYS D  57     LYS B  57   =    5.9
LYS D  55     LYS B  55   =    3.5
    :
    :
ARG C   5     ARG A   5   =    2.7
57 aligned residues IAS = 318.0
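For scripting, the summary line can be pulled out of the output file. This is a minimal sketch: the function name extract_ias is hypothetical, and it assumes the summary line has exactly the form shown above ("N aligned residues IAS = X").

```shell
#!/bin/sh
# extract_ias FILE   (hypothetical helper)
# Prints the number of aligned residues and the IAS value, separated by a
# space, taken from the summary line of an interface_align.exe output file.
#
# Example:
#   extract_ias output
extract_ias() {
    awk '/aligned residues IAS =/ { print $1, $NF }' "$1"
}
```

For the example output above, this prints the two values 57 and 318.0 on one line.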

Compile Protein-DNA Interface Alignment Software

Required executables

interface_format.exe : This converts the PDB file into a new format that describes the interface in terms of the individual amino acid-nucleotide pairs and their geometric relationship. We refer to these files as *.db files.

interface_align.exe : This reads the two formatted PDB files (*.db files) and performs the interface alignment.

change_nucleotide_names.pl : This changes all the DNA base names to the three-letter version (e.g. T->THY). This is required by the interface_format.exe program.
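The three programs are used in the order rename, format, align. A minimal wrapper sketch follows; the function name run_alignment is hypothetical. It assumes that both executables are on the PATH, that change_nucleotide_names.pl is in the current directory, that input file names end in .pdb, and that running the renaming script on a file that already has three-letter names is harmless (not verified here).

```shell
#!/bin/sh
# run_alignment FIRST.pdb SECOND.pdb   (hypothetical wrapper)
# Renames nucleotides, writes FIRST.db and SECOND.db, then prints the
# interface alignment to stdout.
#
# Example:
#   run_alignment 1hdd.pdb 9ant.pdb > output
run_alignment() {
    for f in "$1" "$2"; do
        # note: the renaming script rewrites its input file in place
        perl change_nucleotide_names.pl "$f" || return 1
        interface_format.exe -i "$f" > "${f%.pdb}.db" || return 1
    done
    interface_align.exe -i "${1%.pdb}.db" -j "${2%.pdb}.db"
}
```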

Compiling executables

To compile the interface_align.exe program, go to the directory that contains all the source code, i.e. the INTERFACE_ALIGN directory from the tar file that you downloaded. Once in this directory simply type

  > gcc -o interface_align.exe *.c -lm

The same can be done for the interface_format.exe executable using the source files in the INTERFACE_FORMAT directory.
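Both builds can be run from the top of the unpacked tar file with a small helper. This is a minimal sketch: the function name build_tool and the optional CC override are hypothetical conveniences, while the gcc command line itself is the one given above.

```shell
#!/bin/sh
# build_tool SOURCE_DIR EXECUTABLE   (hypothetical helper)
# Compiles every .c file in SOURCE_DIR into EXECUTABLE inside that
# directory. The compiler defaults to gcc; set CC to override it.
#
# Example:
#   build_tool INTERFACE_ALIGN  interface_align.exe
#   build_tool INTERFACE_FORMAT interface_format.exe
build_tool() {
    ( cd "$1" && ${CC:-gcc} -o "$2" *.c -lm )
}
```

The subshell keeps the directory change local, so the caller's working directory is unchanged after each build.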

References

Siggers TW, Silkov A, Honig B. Structural alignment of protein–DNA interfaces: insights into the determinants of binding specificity. J Mol Biol. 2005 Feb 4;345(5):1027-45.

Acknowledgments

Protein-DNA Interface Alignment Software is supported by funding from the National Science Foundation, Grant # DBI-9904841.

Developed in the Honig Lab.

Questions

Address questions related to Protein-DNA Interface Alignment Software to honig_software@c2b2.columbia.edu.