N6-methyladenosine (m6A) is the most prevalent RNA modification on mRNAs and lncRNAs. Growing evidence demonstrates its importance in essential molecular mechanisms and in various diseases. With recent advances in sequencing techniques, tens of thousands of m6A sites are identified in a typical high-throughput experiment, which poses a key challenge: distinguishing the functional m6A sites from the remaining ‘passenger’ (or ‘silent’) sites. Given that functionally important m6A sites increase organismal fitness and are therefore more likely to be conserved during evolution, we performed a comparative conservation analysis of the human and mouse m6A epitranscriptomes. Specifically, we devised a novel scoring framework, ConsRM, to quantitatively measure the degree of conservation of individual m6A RNA methylation sites. ConsRM integrates positional mapping, tissue-specific mapping, support from multiple studies, sequence similarity, genome conservation and a positive-unlabeled learning model, which integrates 63 genomic features as well as sequence features, to trace conservation at the epitranscriptome layer. Through a series of validation experiments in mouse, fly and zebrafish, we showed that the scoring framework can effectively distinguish conserved from unconserved m6A sites. Further analysis revealed that m6A sites with a higher ConsRM score are more functionally important. Germline mutations were less likely to affect highly conserved m6A sites than less conserved sites, whereas the opposite trend was observed for somatic mutations found in cancer cells. This suggests that dysregulation of conserved m6A sites is more likely to be associated with disease pathogenesis, and that this functionally relevant portion of m6A sites is captured by the ConsRM scoring framework. In addition, conserved m6A sites were more likely to fall within the binding regions of various RNA-binding proteins, especially m6A readers.
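To make the positive-unlabeled (PU) learning component more concrete, the following is a minimal sketch of one common PU approach, the Elkan and Noto calibration, and not necessarily the model used in ConsRM. It assumes that sites known to be conserved are the labeled positives and that all other sites are unlabeled. The two synthetic features, the function names (`fit_logistic`, `pu_scores`) and all parameter values are hypothetical; the 63 genomic features mentioned above are not reproduced here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, s, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, s):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    # Probability that x is *labeled* (not yet the probability that it is positive).
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def pu_scores(X_labeled, X_unlabeled, holdout_fraction=0.25, seed=0):
    """Elkan-Noto style PU scoring (illustrative assumption, not the ConsRM model).

    1. Train a classifier g(x) to separate labeled from unlabeled examples.
    2. Estimate c = P(labeled | positive) as the mean of g over held-out positives.
    3. Score each unlabeled example as g(x) / c, capped at 1.
    """
    if len(X_labeled) < 2:
        raise ValueError("need at least two labeled positives")
    rng = random.Random(seed)
    labeled = list(X_labeled)
    rng.shuffle(labeled)
    n_hold = min(len(labeled) - 1, max(1, int(len(labeled) * holdout_fraction)))
    holdout, train_pos = labeled[:n_hold], labeled[n_hold:]
    X = train_pos + list(X_unlabeled)
    s = [1] * len(train_pos) + [0] * len(X_unlabeled)
    model = fit_logistic(X, s)
    c = sum(predict(model, x) for x in holdout) / len(holdout)
    return [min(1.0, predict(model, x) / c) for x in X_unlabeled], c


# Synthetic demo: two made-up features per site (hypothetical, for illustration only).
_rng = random.Random(1)
known_conserved = [[_rng.gauss(2, 1), _rng.gauss(2, 1)] for _ in range(40)]
hidden_conserved = [[_rng.gauss(2, 1), _rng.gauss(2, 1)] for _ in range(30)]
not_conserved = [[_rng.gauss(-2, 1), _rng.gauss(-2, 1)] for _ in range(60)]

scores, c_hat = pu_scores(known_conserved, hidden_conserved + not_conserved)
print("estimated label frequency c:", round(c_hat, 3))
print("mean score, hidden conserved:", round(sum(scores[:30]) / 30, 3))
print("mean score, not conserved:", round(sum(scores[30:]) / 60, 3))
```

In this sketch the unlabeled sites that resemble the known conserved sites receive higher scores than the rest, which is the behavior a PU model is meant to provide when reliable negative examples are unavailable. A real application would replace the synthetic features with genomic and sequence-derived features and would likely use an established machine-learning library rather than hand-written gradient descent.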
© Copyrights The Meng Lab. All Rights Reserved