p53FamTaG


Introduction

The p53 gene family consists of three genes, p53, p63 and p73, which have multifaceted, non-overlapping functions in pivotal cellular processes such as DNA synthesis and repair, growth arrest, apoptosis, genome stability, angiogenesis, development and differentiation. These genes encode sequence-specific nuclear transcription factors that recognise the same responsive element (RE) in their target genes. Their inactivation or aberrant expression may lead to tumour progression or developmental disease. The discovery of several protein isoforms with antagonistic roles, produced through the use of different promoters and alternative splicing, has added further complexity to the transcriptional network of the p53 family members. Identifying the genes transactivated by p53 family members is therefore crucial to understanding the specific role of each gene in cell cycle regulation.

We have combined a genome-wide computational search for p53 family REs with microarray analysis to identify new direct target genes. The large amount of biological data produced has created a critical need for bioinformatic tools able to manage and integrate such data and to facilitate their retrieval and analysis. We have developed the p53FamTaG database (p53 FAMily TArget Genes), a modular relational database that contains p53 family direct target genes, selected in the human genome by searching for the presence of REs, together with the expression profiles of these target genes obtained from microarray experiments. The p53FamTaG database also contains annotations from publicly available databases and links to other experimental data.


Database Content

Genes containing REs: 18110
REs in promoters: 13649
REs in 5'UTRs: 563
REs in introns: 49678
Microarray data of genes containing the REs: 4874
Experimentally demonstrated target genes: 183
ChIP-PET high-confidence binding loci: 341

The p53FamTaG database annotates putative p53 family direct target genes, selected in the human genome by searching for the presence of p53 responsive elements (REs). The in silico analysis was performed with the DNAfan tool (Gisel et al., Bioinformatics 2004, 20:3676-9), which uses the PatSearch algorithm (Grillo et al., Nucleic Acids Res 2003, 31:3608-12), with syntax criteria that we defined from the structure of 109 REs in 132 experimentally demonstrated human target genes. The search for p53 gene family REs was limited to a region ranging from -3500 (in the promoter region) to +20000 (in intron regions) with respect to the gene TSS (Transcription Start Site). The 5'UTRs were included in this search and the annotated coding exons were excluded. The consensus syntax was modified to include three decamers separated by spacers of 0 to 8 bases, allowing up to 3 mismatches in total across two adjacent decamers and up to 3 mismatches in the third decamer. A PatSearch post-processing phase, matching the inferred PWM, was also carried out.

The data have been integrated with the microarray results produced in our laboratory from the overexpression of p53, the p53 mutated form R175Hp53, TAp63α, ΔNp63α, TAp73α and TAp73β at 6 h and 12 h after induction. To examine the consequences of the overexpression of different members of the p53 family on their target genes under comparable conditions, we created human embryonic kidney Flp-In T-Rex-293 stable isogenic cell lines expressing the different genes under the control of a tetracycline-inducible promoter.

p53FamTaG also includes the annotation of 132 well-known direct target genes, a global map of p53 binding sites obtained by ChIP-PET analysis (Wei et al., Cell. 2006 124:207-219), and links to other primary bioinformatic resources.
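The search window and the mismatch rule described above can be illustrated with a minimal Python sketch. It is not the DNAfan or PatSearch implementation. The function names are hypothetical, the consensus used is the canonical p53 half-site RRRCWWGYYY rather than the syntax criteria defined for the database, and the mismatch rule is read as "either adjacent pair of decamers may share the 3-mismatch budget", which is an assumption.

```python
# Sketch only: names, consensus and the reading of the mismatch rule are assumptions.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "W": "AT", "Y": "CT"}
CONSENSUS = "RRRCWWGYYY"  # canonical p53 half-site, assumed here for illustration


def search_window(tss, strand, upstream=3500, downstream=20000):
    """Return (start, end) of the region from -3500 to +20000 around the TSS.

    Strand is 1 or -1; on the minus strand the promoter lies at higher coordinates.
    """
    if strand == 1:
        return tss - upstream, tss + downstream
    if strand == -1:
        return tss - downstream, tss + upstream
    raise ValueError("strand must be 1 or -1")


def count_mismatches(decamer, consensus=CONSENSUS):
    """Count positions where the decamer is not allowed by the IUPAC consensus."""
    if len(decamer) != len(consensus):
        raise ValueError("decamer and consensus must have the same length")
    return sum(base not in IUPAC[symbol]
               for base, symbol in zip(decamer.upper(), consensus))


def passes_mismatch_rule(decamers, pair_limit=3, third_limit=3):
    """Accept three decamers if two adjacent ones have at most pair_limit
    mismatches in total and the remaining one has at most third_limit."""
    m1, m2, m3 = (count_mismatches(d) for d in decamers)
    left_pair_ok = m1 + m2 <= pair_limit and m3 <= third_limit
    right_pair_ok = m2 + m3 <= pair_limit and m1 <= third_limit
    return left_pair_ok or right_pair_ok
```

Spacer handling (0 to 8 bases between decamers) and the exclusion of coding exons are left out of this sketch; they would be applied when candidate decamers are extracted from the window.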
For each gene containing the RE, the database provides the gene name (HUGO), the alias name, the EnsEmbl stable gene ID and RefSeq ID, the chromosome, the RE structure (decamers, spacers, length, sequence), the RE chromosomal position and gene region localization (promoter, 5'UTR, intron) and the microarray results. Moreover, the database provides a hyperlink to PubMed for experimentally demonstrated target genes and to UCSC for high-confidence binding loci defined by ChIP-PET analysis. The current version of the p53FamTaG database is based on EnsEmbl release 34 (October 2005).


Search options and export of data

The user can query the database to find out whether a gene of interest contains a p53 gene family RE and how this gene is expressed when the three members of the p53 gene family are overexpressed in human 293 T-Rex cells. A particularly useful feature of the database is the ability to export the sequences of the REs, with the exact indication of their structures and full annotation, in FASTA format.

Database query form

This query form allows users to search the database for p53 gene family target genes. The search criteria are EnsEmbl and RefSeq identifiers and HUGO or alias gene names. All search fields accept lists of items separated by commas and execute the search in OR mode. All three types of identifier can be combined in the same search.
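The OR-mode behaviour of the query form can be pictured with the following Python sketch. It only illustrates the matching logic and is not the database's actual query code: the record layout, the function names and every identifier other than the MDM2 EnsEmbl ID are placeholders, and exact, case-sensitive matching is an assumption.

```python
def parse_terms(field_value):
    """Split a comma-separated search field into a set of non-empty terms."""
    return {term.strip() for term in field_value.split(",") if term.strip()}


def search(records, ensembl="", refseq="", gene_name=""):
    """Return records matching ANY supplied term (OR mode), ordered by EnsEmbl ID."""
    ensembl_terms = parse_terms(ensembl)
    refseq_terms = parse_terms(refseq)
    name_terms = parse_terms(gene_name)
    hits = []
    for rec in records:
        names = {rec["hugo"], *rec["aliases"]}  # HUGO name or any alias may match
        if (rec["ensembl"] in ensembl_terms
                or rec["refseq"] in refseq_terms
                or names & name_terms):
            hits.append(rec)
    return sorted(hits, key=lambda rec: rec["ensembl"])


# Hypothetical records; only the MDM2 EnsEmbl ID is taken from this page.
RECORDS = [
    {"ensembl": "ENSG_PLACEHOLDER_B", "refseq": "NM_PLACEHOLDER_B",
     "hugo": "GENE_B", "aliases": ["ALIAS_B"]},
    {"ensembl": "ENSG00000135679", "refseq": "NM_PLACEHOLDER_A",
     "hugo": "MDM2", "aliases": ["ALIAS_A"]},
]

if __name__ == "__main__":
    for rec in search(RECORDS, refseq="NM_PLACEHOLDER_B", gene_name="MDM2, ALIAS_X"):
        print(rec["ensembl"], rec["hugo"])
```

A record is returned as soon as one field matches, so mixing identifier types in one query widens the result set rather than narrowing it.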

Query result page

This query report lists the matching database records ordered by ENSG and the information on the retrieved entries:

the Gene Name column displays the HUGO and alias gene names. The HUGO name is a hyperlink to the HGNC database. The book-button opens the PubMed reference for experimentally demonstrated p53 family target genes

the EnsEmbl column displays the ENSG stable gene ID which is a link to the EnsEmbl database

the RefSeq column displays the RefSeq identifier, associated to the EnsEmbl gene ID, linked to the RefSeq database

the Localization column displays the genomic regions the RE is found in (intron, promoter, and 5'UTR)

the Chr column indicates the chromosome number where the RE is found

the Strand column indicates the orientation of the gene with an arrow

the REs column displays the number of REs found in the gene. Clicking the number opens a new page containing more information (see below Details of the REs)

in the Array column, a magnifying glass-button provides a link to the microarray data in case the gene has significant results (see below Details of the Microarray results)

in the UCSC column, the UCSC-button links those genes identified as high-confidence binding loci by ChIP-PET analysis (Wei et al., Cell. 2006 124:207-219) to the UCSC database. In the UCSC database the annotation (PET cluster sequences) can be found in the ENCODE Chromatin Immunoprecipitation tracks under the p53 ChIP-PET analysis (GIS p53 5FU HCT116 Track Settings)

Details of the REs of one ENSG

This page is reached by clicking the number of REs shown in the query report. At the top it displays information on the selected gene; below, it gives the number of REs and the details of each of them:

Export Selected button: allows the export of the sequences of the selected REs

Select/Deselect page buttons: allow users to select/deselect all the REs for export

Select: this field allows the selection of a specific RE sequence for export

Start: genomic coordinate of the RE

Size: length of the RE

Localization: gene region where the RE is found

Pattern: graphical representation of the three decamers and the number of spacer bases between the decamers

RE Sequence export form

>ENSG00000135679_2|MDM2 CHR:12 STRAND:1 CHR_START:67488947
gactcagctt ttcctctt gagctggtca agttc agacacgttc
>ENSG00000135679_3|MDM2 CHR:12 STRAND:1 CHR_START:67490147
tgaggagttc a agactagcct ggc caacatggtg
>ENSG00000135679_4|MDM2 CHR:12 STRAND:1 CHR_START:67492696
agtgtggccc aggctggtct tgaa cacctagcct
>ENSG00000135679_5|MDM2 CHR:12 STRAND:1 CHR_START:67499663
tgagtagctg ggattac aggcatgcgc caccatgccc
>ENSG00000135679_6|MDM2 CHR:12 STRAND:1 CHR_START:67501281
gagggggttt cagcatgttg gcc aggctggtct

This output shows the sequences of the selected REs in FASTA format, with their structure (decamers and spacer bases) and their exact genomic coordinates.
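For downstream analysis, the exported records can be read back with a short script. The Python sketch below is one possible parser, not a tool supplied with the database. The class and function names are hypothetical, and it assumes that every space-separated block of exactly 10 bases is a decamer and every other block is a spacer, which follows from spacers being limited to 0 to 8 bases. It accepts the sequence either on the header line or on the following line.

```python
from dataclasses import dataclass, field


@dataclass
class RERecord:
    record_id: str
    gene: str
    chrom: str
    strand: int
    chr_start: int
    segments: list = field(default_factory=list)  # (kind, sequence) in export order

    @property
    def decamers(self):
        return [seq for kind, seq in self.segments if kind == "decamer"]

    @property
    def size(self):
        """Total RE length: decamers plus spacer bases."""
        return sum(len(seq) for _, seq in self.segments)


def parse_re_export(text):
    """Parse exported RE records into RERecord objects."""
    records = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith(">"):
            record_id, gene = tokens[0][1:].split("|", 1)
            # KEY:VALUE fields such as CHR:12, STRAND:1, CHR_START:67488947
            meta = dict(t.split(":", 1) for t in tokens[1:] if ":" in t)
            records.append(RERecord(record_id, gene, meta["CHR"],
                                    int(meta["STRAND"]), int(meta["CHR_START"])))
            tokens = [t for t in tokens[1:] if ":" not in t]
        if not records:
            continue  # ignore sequence text that precedes the first header
        for tok in tokens:
            kind = "decamer" if len(tok) == 10 else "spacer"
            records[-1].segments.append((kind, tok))
    return records


SAMPLE = """>ENSG00000135679_2|MDM2 CHR:12 STRAND:1 CHR_START:67488947
gactcagctt ttcctctt gagctggtca agttc agacacgttc
>ENSG00000135679_4|MDM2 CHR:12 STRAND:1 CHR_START:67492696
agtgtggccc aggctggtct tgaa cacctagcct
"""

if __name__ == "__main__":
    for rec in parse_re_export(SAMPLE):
        print(rec.record_id, rec.gene, rec.chr_start, rec.size, rec.decamers)
```

Keeping the segments in export order preserves where each spacer falls, so a spacer of 0 bases simply appears as two consecutive decamers.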

Details of Microarray results

Genes with microarray results are linked to this page through the magnifying-glass button in the Array column of the query report. The page shows the expression profile of the gene in all the stable isogenic cell lines, with graphical icons for each Celera probe ID representing the gene on the array, and lists:

For each probe ID detecting the gene, the results of the gene expression of Samples (S) compared to the Control (C) are indicated as


Note

Current Curators: D. Catalano, F. Licciulli, G. Grillo, A. Turi, A. Gisel
Use this email to ask questions, report problems, or suggest improvements: bigstaff(at)ba.itb.cnr.it


Acknowledgments

This work was supported by grants from MIUR: Cluster C03 Prog. 2 L.488/92; PON - Avviso n. 68 del 23.01.02 Progetto B.I.G; Contributo Straordinario D.M. n. 1105 del 09/10/2002 (Progetto n. 187); PNR 2001-2003 (FIRB art.8) D.M.199, Strategic Program: Post-genome, grant 31-063933; FIRB 2003 art. 8 D.D. 2187 del 12-12-2003 LIBI. We thank Dr. D. D'Elia for critical discussion.