Target Explorer

Target Explorer automates the entire process, from creating a customized library of binding sites for known transcription factors to predicting and annotating putative target genes regulated by these factors.

Start webserver

Click here to launch Target Explorer.

Description

Target Explorer automates the prediction of complex regulatory elements for a specified set of transcription factors in the Drosophila melanogaster genome.

The sequencing of several eukaryotic genomes during the last decade has opened tremendous prospects for our understanding of gene function and regulation. While significant progress has been achieved in gene prediction and functional annotation, our ability to identify the regulatory elements required for the correct expression of genes remains limited. Gene regulatory elements consist of short conserved binding sites for specific transcription factors (TFs) that control the level of gene expression. The target genes of a particular TF may need to be expressed at different levels, which can be accomplished by binding sites with similar but not identical sequences and different affinities for the protein.

Because of this intrinsic sequence variability, binding sites must be represented by a model that summarizes the information in their alignment. The simplest description of a binding site is the IUPAC consensus sequence, which indicates the predominant nucleotide or nucleotide combination at each position in a set of training sequences. While it is easy to write a consensus sequence, it is not so straightforward to find one that is optimal for predicting the occurrence of new sites. An alternative to consensus sequences is a weight matrix representation of the sites. Positional weight matrices store a value derived from the frequency of each nucleotide at each position of the motif. The score for any particular site is calculated as the sum of the matrix values for that site's sequence. Any sequence that differs from the consensus will have a lower score, but the size of the decrease depends on which positions differ and which nucleotides are substituted. This is a convenient way to account for the fact that some positions are more conserved than others and are presumably more important for the activity of the site.
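
The scoring idea described above can be illustrated with a minimal sketch. It is not Target Explorer's implementation: the function names, the log-odds weighting, the pseudocount, the uniform background frequency, and the training sequences are all assumptions made for illustration. The consensus shown is a simple majority consensus, not a full IUPAC consensus with ambiguity codes.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Count each nucleotide at each position of aligned, equal-length sites.

    Only A, C, G and T are accepted; any other character raises KeyError.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("training sites must be aligned and of equal length")
    counts = [{base: 0 for base in BASES} for _ in range(length)]
    for site in sites:
        for pos, base in enumerate(site.upper()):
            counts[pos][base] += 1
    return counts

def weight_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log-odds weights (assumed scheme, uniform background)."""
    weights = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(BASES)
        weights.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in BASES
        })
    return weights

def score_site(weights, site):
    """Score a candidate site as the sum of matrix values for its sequence."""
    if len(site) != len(weights):
        raise ValueError("site length must match the matrix length")
    return sum(column[base] for column, base in zip(weights, site.upper()))

def majority_consensus(counts):
    """Most frequent base per position (ties resolved by order in BASES)."""
    return "".join(max(BASES, key=lambda b: column[b]) for column in counts)

# Invented training sequences, used only to exercise the functions.
training_sites = ["TGATTGAT", "TGATTTAT", "TGATTGAT", "AGATTGAT"]
counts = count_matrix(training_sites)
weights = weight_matrix(counts)
for candidate in (majority_consensus(counts), "TGATTTAT", "CCCCCCCC"):
    print(candidate, round(score_site(weights, candidate), 2))
```

In this sketch a mismatch at a fully conserved position costs more than a mismatch at a variable position, which is the property that distinguishes a weight matrix from a plain consensus match.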

Binding sites are often organized in functional groups called modules, where TFs bind to promoter regions and regulate transcription as synergistic (cooperative) or antagonistic complexes. The short length and degenerate nature of binding site sequences lead to a large number of false-positive, biologically non-functional predictions when a single TF is considered. Using information about the close positioning of known TF binding sites leads to more accurate prediction of novel regulatory regions. Predictions can be further strengthened if a TF is known to function as a homo- or hetero-oligomer and several clustered binding sites are found.
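
As a rough illustration of how clustering can filter isolated hits, the sketch below groups predicted sites that fall close together. The function name, the window size, the minimum number of sites, the TF labels, and the positions are hypothetical, and the sliding-window approach is one simple possibility, not necessarily the one Target Explorer uses.

```python
def find_clusters(hits, window=300, min_sites=3):
    """Return (start, end) regions where at least `min_sites` hits lie
    within `window` bp of each other.

    hits: iterable of (position, tf_name) tuples for predicted sites.
    Overlapping qualifying windows are merged, so a returned region
    may be longer than `window`.
    """
    ordered = sorted(hits)
    clusters = []
    left = 0
    for right, (pos, _tf) in enumerate(ordered):
        # Advance the left edge until the window spans at most `window` bp.
        while pos - ordered[left][0] > window:
            left += 1
        if right - left + 1 >= min_sites:
            start = ordered[left][0]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous region: extend it.
                clusters[-1] = (clusters[-1][0], pos)
            else:
                clusters.append((start, pos))
    return clusters

# Invented hit positions for two hypothetical factors.
hits = [
    (100, "TF_A"), (150, "TF_B"), (220, "TF_A"),
    (5000, "TF_A"),
    (9000, "TF_B"), (9100, "TF_A"), (9150, "TF_B"), (9400, "TF_A"),
]
print(find_clusters(hits))
```

The isolated hit at position 5000 is discarded, while the two groups of nearby sites are reported as candidate regulatory regions.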

The next step is the development of a tool that combines binding site prediction with context information taken from the sequence annotation.

Target Explorer is an integrated tool with a user-friendly, self-explanatory Web interface that allows the user to:

create a customized library of TF binding site matrices based on user-defined sets of training sequences
search for new clusters of binding sites for a specified set of TFs
extract annotation for potential target genes
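
The last step in the list, linking a predicted cluster to nearby annotated genes, might look like the following sketch. The Gene record, the flank distance, the gene names, and the coordinates are invented for illustration; the actual annotation data and selection rules used by Target Explorer are not described here.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    """Hypothetical minimal annotation record."""
    name: str
    start: int
    end: int

def genes_near(cluster, genes, flank=5000):
    """Return names of genes within `flank` bp of a cluster, nearest first.

    cluster: (start, end) of a predicted binding site cluster.
    The distance is 0 when the gene and the cluster overlap.
    """
    c_start, c_end = cluster
    nearby = []
    for gene in genes:
        distance = max(gene.start - c_end, c_start - gene.end, 0)
        if distance <= flank:
            nearby.append((distance, gene.name))
    return [name for _distance, name in sorted(nearby)]

# Invented annotation and cluster coordinates.
annotation = [
    Gene("geneA", 1000, 3000),
    Gene("geneB", 12000, 15000),
    Gene("geneC", 40000, 42000),
]
print(genes_near((9000, 9400), annotation))
print(genes_near((9000, 9400), annotation, flank=10000))
```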

Target Explorer was specifically designed for the well-annotated Drosophila melanogaster genome, but some options can be used for any sequence of interest.

References

Sosinsky A, Bonin CP, Mann RS, Honig B. Target Explorer: an automated tool for the identification of new target genes for a specified set of transcription factors. Nucleic Acids Res. 2003 Jul 1;31(13):3589-92.

Acknowledgments

Target Explorer is supported by funding from National Science Foundation Grant # DBI-9904841.

Developed in the Honig Lab.

Questions

For questions related to Target Explorer, contact as1689@columbia.edu.