Here [1], we present genome-scale computational analyses indicating that ~76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1 binding sites (cf. the GC box). We further identify two motifs, M3 (SCGGAAGY) and M22 (TGCGCANK), that occur preferentially in human TATA-less core promoters. About 24% of human genes have a TATA-like element, and their promoters are generally AT-rich; however, only ~10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, ~46% of human core promoters contain the consensus INR (YYANWYY), and ~30% are INR-containing TATA-less genes. Significantly, ~46% of human promoters lack both TATA-like and consensus INR elements. Surprisingly, mammalian-type INR sequences are present, and tend to cluster, in the transcription start site (TSS) region of ~40% of yeast core promoters, and the frequency of specific core promoter types appears to be conserved between the yeast and human genomes. Gene Ontology analyses reveal that TATA-less genes in humans, as in yeast, are frequently involved in basic “housekeeping” processes, while TATA-containing genes are more often highly regulated, for example in response to biotic or stress stimuli.
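
The consensus motifs quoted above are written in IUPAC nucleotide codes (W = A/T, R = A/G, Y = C/T, S = C/G, K = G/T, N = any base). As a minimal illustrative sketch, and not the pipeline used in the study, the Python snippet below expands such a consensus into a regular expression and scans a sequence for it. The function names (`iupac_to_regex`, `find_motif`, `gc_content`, `classify_promoter`) are hypothetical, the scan covers only the given strand with no positional window relative to the TSS, and the classifier tests only the canonical TATA box rather than the broader "TATA-like" definition, all of which are simplifying assumptions.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Consensus sequences quoted in the text.
MOTIFS = {
    "TATA": "TATAWAWR",  # canonical TATA box
    "INR": "YYANWYY",    # consensus initiator
    "M3": "SCGGAAGY",
    "M22": "TGCGCANK",
}


def iupac_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that finds overlapping matches."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # A zero-width lookahead lets finditer report overlapping occurrences.
    return re.compile("(?=" + body + ")")


def find_motif(consensus, sequence):
    """Return 0-based start positions of the motif on the given strand."""
    pattern = iupac_to_regex(consensus)
    return [m.start() for m in pattern.finditer(sequence.upper())]


def gc_content(sequence):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def classify_promoter(sequence):
    """Return (has_canonical_tata, has_consensus_inr) for one sequence.

    Assumption: presence anywhere in the sequence counts; a real analysis
    would restrict each motif to a window around the TSS.
    """
    has_tata = bool(find_motif(MOTIFS["TATA"], sequence))
    has_inr = bool(find_motif(MOTIFS["INR"], sequence))
    return has_tata, has_inr
```

For example, `find_motif(MOTIFS["TATA"], "GGGCTATAAAAGGCGC")` reports a single match starting at index 4, and `classify_promoter` on the same toy sequence returns `(True, False)`. Counting such tuples over a promoter set would give the kind of class frequencies summarized above, although the real analysis depends on motif definitions and search windows that this sketch does not reproduce.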
