Introduction to PSI-Mouse


Pseudouridine (Ψ) is the first discovered and most abundant RNA modification, and it has been widely studied over the past decades. Ψ plays important roles in various kinds of RNA, such as rRNA, tRNA and mRNA. Experimental approaches for identifying Ψ sites in RNA sequences are laborious and costly, and they have limitations such as limited coverage. Computational methods for Ψ site prediction provide a more cost-effective alternative. Only two sequence-based models were available for mouse Ψ site prediction, so a high-accuracy model that uses additional genomic features is highly desirable. In our study, we established PSI-Mouse, a high-accuracy prediction framework for Ψ sites in the mouse transcriptome. Besides the conventional sequence-derived features, PSI-Mouse is the first to introduce 38 additional genomic features into the prediction model, and it achieved a satisfactory improvement in prediction accuracy. Furthermore, PSI-Mouse provides gene annotations and various post-transcriptional analyses, and it includes 3282 experimentally validated mouse Ψ sites with a query function, making it a useful tool for exploring the regulatory mechanisms underlying computationally predicted Ψ sites.

Ψ-WHISTLE accepts two kinds of input file formats.

(1): The tab-delimited txt format, which contains genomic position information (genome assembly: mm10) with four columns: chromosome, start position, end position, and strand.

Example:
  • 13 12451426 12451626 -
  • 17 25753789 25754789 -
  • 16 86430925 86430925 +
  • 1 6248095 6248095 +

(2): The standard FASTA format.

Example:
  • >test1
    AUGGGGGUGGAACUCAUGAUGGAAUUGGAGCCUUUACAAGGGAAUGAAGA GACAAGAGCUCUCUUUAUGCCACGUGAGGAUACAGCAAGGCCCCAAUCUG CAAGCCAGGAAGAGUCGUCACGAGAACCAGACCAUGCAGGAACUCUGAUC
    GUGGA
  • >test2
    UACUAAUUUUCAAAGGCGGGGUUCUGCCAGGCAUAGUCUUUUUUUCUGGC GGCCCUUGUGUAAACCUGUCUUUCAGACCUUGUUGGACAUCCCGUACAAU CAAGAUGUUCCUGUAUGUUGUUUGCAGUCUGGCGGUUUGCUUUCGAGGAC
    UAUUUAUU

(3): Users can either use the example file we provided or upload their own file.
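
Before uploading, users may want to check their own file locally. The following is a minimal Python sketch for reading the two input formats described above. It is not part of the web server: the names Site, parse_positions and parse_fasta are hypothetical, and the checks (exactly four tab-separated columns, strand given as + or -, start not greater than end, spaces inside sequence lines ignored) are assumptions based only on the format description and examples on this page.

```python
from typing import Dict, List, NamedTuple


class Site(NamedTuple):
    """One row of the tab-delimited position file (hypothetical helper type)."""
    chrom: str
    start: int
    end: int
    strand: str


def parse_positions(text: str) -> List[Site]:
    """Parse lines of: chromosome <TAB> start <TAB> end <TAB> strand."""
    sites = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) != 4:
            raise ValueError(
                f"line {lineno}: expected 4 tab-separated columns, got {len(fields)}"
            )
        chrom, start, end, strand = fields
        if strand not in ("+", "-"):
            raise ValueError(f"line {lineno}: strand must be '+' or '-'")
        start_i, end_i = int(start), int(end)  # raises ValueError if not integers
        if start_i > end_i:
            raise ValueError(f"line {lineno}: start is greater than end")
        sites.append(Site(chrom, start_i, end_i, strand))
    return sites


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record name: sequence}, joining wrapped lines."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Remove any spaces inside the line (assumed to be display spacing).
            records[name].append("".join(line.split()).upper())
    return {key: "".join(chunks) for key, chunks in records.items()}
```

For example, parse_positions("13\t12451426\t12451626\t-") returns a single Site with chromosome "13" on the minus strand, and parse_fasta joins the wrapped lines of each record into one sequence string.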



(1): Choose the feature and annotation types to start the analysis.

* Providing an email address is strongly recommended, as the analysis may take some time and users will receive a notification email when their job is finished.


(2): When users submit their job, a unique job ID will be created. Users can wait on this page for their job to finish, or use the job ID to retrieve their job later.


If users leave the job page or close it by accident, the job ID can be used to retrieve their job.



For users who upload genomic position information and use both genomic and sequence-derived features.

(1): For users who upload genomic position information and use both genomic and sequence-derived features, an overall summary table is provided. Users can view the basic information of each putative Ψ site with various annotations, such as gene symbol, likelihood ratio (LR), confidence level, and the number of related post-transcriptional analyses.


(2): By clicking the Ψ_ID, users can view the detailed information of each putative Ψ site.


(3): A statistics table and diagram are provided so that users can view their results clearly.


For users who upload sequence information and use sequence-derived features.

(1): First, a map of all putative Ψ sites in the given sequence is provided, so users can clearly see the position of each Ψ modification. The detailed information of each computationally predicted Ψ site is then listed, including the 41 bp sequence, sample name, likelihood ratio, and confidence level.
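
To illustrate what a 41 bp sequence around a candidate site can look like, here is a minimal Python sketch that lists a 41-nt window for every uridine in an input sequence. It is an illustration only, not the server's implementation: the function name candidate_windows is hypothetical, and it assumes that the window is centered on the candidate U with 20 nt on each side and that windows running past the sequence ends are padded with "N". How PSI-Mouse actually handles sites near the ends is not stated on this page.

```python
from typing import List, Tuple


def candidate_windows(seq: str, flank: int = 20, pad: str = "N") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every U in seq.

    Each window has length 2 * flank + 1 (41 by default) and is centered on
    the U. Padding with `pad` at the sequence ends is an assumption.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, base in enumerate(seq):
        if base == "U":
            # Position i in seq is position i + flank in padded, so the
            # window centered on it starts at index i in padded.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows
```

Each returned window has the candidate uridine at its middle (21st) position, so the windows can be compared directly with the sequences shown in the result list.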