There were 378,247 Ts assigned a CLS. High CLS values indicate high crosslinking efficiency and strong RBP-RNA interactions; low CLS values indicate low crosslinking efficiency or weak, transient RBP-RNA interactions. Consistent with the distribution of gPAR-CLIP reads, CLS values were highest in 3' UTRs, followed by 5' UTRs and CDSs. These observations support 3' UTRs as the primary sites for RBP-RNA interactions for non-translating mRNAs. To determine whether the enrichment of gPAR-CLIP reads on UTRs was biased by the U-richness of UTRs, we compared the proportion of Us in each crosslinking site to its coverage in gPAR-CLIP and observed only a weak positive correlation, which by itself cannot account for the fourfold enrichment of gPAR-CLIP reads on UTRs.
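The U-content control described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used here: the text does not state which correlation coefficient was computed, so Pearson's r is an assumption, and the function names and the toy site list are hypothetical.

```python
import math


def u_fraction(site_seq):
    """Fraction of U (or T) bases in one crosslinking-site sequence."""
    seq = site_seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "UT" for base in seq) / len(seq)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with >= 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def u_content_vs_coverage(sites):
    """Correlate U fraction with read coverage.

    `sites` is an iterable of (sequence, read_coverage) pairs, one per
    crosslinking site (hypothetical input layout).
    """
    sites = list(sites)
    fractions = [u_fraction(seq) for seq, _ in sites]
    coverages = [float(cov) for _, cov in sites]
    return pearson_r(fractions, coverages)


# Toy values for illustration only; these are not data from the study.
toy_sites = [("UUUACG", 12), ("ACGGCA", 9), ("UUUUUA", 14), ("GCUACG", 30)]
r = u_content_vs_coverage(toy_sites)
```

A coefficient close to zero would mean that U content explains little of the variation in coverage, which is the kind of result the comparison above relies on.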
A previous comparative analysis of seven Saccharomyces genomes revealed that approximately 14% of evolutionarily constrained bases lie outside protein-coding regions, often in UTRs. These conserved regions could represent functional elements interacting with cis-acting factors. We found direct evidence of RBPs crosslinking to 35% of conserved sequence blocks in UTRs, as defined by phastCons, a score representing the likelihood that a base falls in a conserved element: 405 of 1,549 5' UTR blocks and 1,036 of 2,536 3' UTR blocks completely overlap with at least one RBP crosslinking site, which is significantly higher than for randomly defined control blocks. At the gene level, ATG8, a key autophagy gene, contains two major crosslinking sites that overlap with conserved sequence blocks in its 3' UTR.
Similarly, TOM40, which encodes a translocase that mediates import of mitochondria-localized proteins into the mitochondria, contains two major 3' UTR crosslinking sites in regions with high local conservation. To further elucidate the connection between RBP binding and conservation, we binned Ts by CLS values and observed that Ts in all 3' and 5' UTR bins, as well as the majority of CDS bins, were more conserved than randomly binned Ts, suggesting that RBP crosslinking sites are under purifying selection. Unexpectedly, 3' and 5' UTR nucleotides in the lowest CLS bins exhibited extremely high conservation. Since a low CLS can indicate inefficient RNA capture, and gPAR-CLIP inefficiently captures highly structured, double-stranded RNA, we hypothesized that low-CLS, high-conservation bins represent conserved secondary structure motifs recognized by RBPs.
For example, She2p binds a distinct stem-loop structure in several bud-localized mRNAs, including ASH1, for which the She2p 3' UTR recognition element is weakly represented in our gPAR-CLIP dataset. To determine whether Ts with low CLS values are located in RNA regions with a high degree of secondary structure, we computed the probability of each T being unpaired using RNAplfold, a local thermodynamic folding algorithm.
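As a rough illustration of the last step, the sketch below reads per-base unpaired probabilities from RNAplfold's `_lunp` text output and keeps the values at T positions. It assumes RNAplfold was run with `-u 1` so that the first data column holds the single-base unpaired probability. The window and span settings used for the analysis are not given in the text above, and the function names and the toy input are hypothetical.

```python
def parse_lunp(lunp_text):
    """Map 1-based position to P(unpaired) using the l=1 column.

    Header lines start with '#'. Entries that RNAplfold reports as 'NA'
    are skipped.
    """
    probs = {}
    for line in lunp_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2 or fields[1] == "NA":
            continue
        probs[int(fields[0])] = float(fields[1])
    return probs


def unpaired_prob_at_ts(sequence, probs):
    """Return (position, P(unpaired)) for every T/U that has a value."""
    return [
        (pos, probs[pos])
        for pos, base in enumerate(sequence.upper(), start=1)
        if base in "TU" and pos in probs
    ]


# Toy input in the layout of a _lunp file; the numbers are made up.
EXAMPLE_LUNP = """#unpaired probabilities
 #i$\tl=1\t
1\t0.91
2\t0.15
3\t0.40
4\t0.78
"""

t_probs = unpaired_prob_at_ts("ATGT", parse_lunp(EXAMPLE_LUNP))
```

The resulting per-T probabilities could then be grouped by CLS bin to check whether low-CLS Ts tend to have low unpaired probability, that is, whether they fall in structured regions.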