Injecting structure-aware insights for the learning of RNA sequence representations to identify m6A modification sites
N6-methyladenosine (m6A) is one of the most prevalent methylation modifications in eukaryotes, and accurately identifying its modification sites on RNA sequences is crucial. Traditional machine-learning approaches to m6A modification site identification focus primarily on RNA sequence data, but they often incorporate additional biological domain knowledge and rely on manually crafted features. These methods typically overlook the structural insights inherent in RNA sequences. To address this limitation, we propose M6A-SAI, an advanced predictor for RNA m6A modifications.
M6A-SAI leverages a Transformer-based deep learning framework to integrate structure-aware insights into sequence representation learning, thereby enhancing the precision of m6A modification site identification. The core innovation of M6A-SAI lies in incorporating structural information through a multi-step process. First, the model uses a Transformer encoder to learn RNA sequence representations. It then constructs a similarity graph based on Manhattan distance to capture correlations between sequences. To address the limitations of this smooth similarity graph, M6A-SAI integrates a structure-aware optimization block, which refines the graph by defining anchor sets and generating an awareness graph through PageRank.
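To make the graph-construction steps more concrete, here is a minimal sketch in plain Python. It is not the M6A-SAI implementation: the function names (`similarity_graph`, `personalized_pagerank`, `awareness_graph`), the k-nearest-neighbour rule, the damping factor, and the way anchor-wise PageRank scores are turned into a second graph are all assumptions made for illustration. The toy vectors stand in for the representations a Transformer encoder would produce.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def similarity_graph(embeddings, k=2):
    """Symmetric k-nearest-neighbour graph under Manhattan distance.

    Assumption: the source only says the graph is based on Manhattan
    distance; the kNN rule used here is one common way to do that.
    """
    n = len(embeddings)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (manhattan(embeddings[i], embeddings[j]), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def personalized_pagerank(adj, anchors, damping=0.85, iters=100):
    """Power-iteration PageRank that restarts at the anchor nodes."""
    n = len(adj)
    restart = [1.0 / len(anchors) if i in anchors else 0.0 for i in range(n)]
    rank = restart[:]
    degree = [sum(row) for row in adj]
    for _ in range(iters):
        new = [(1.0 - damping) * restart[i] for i in range(n)]
        for j in range(n):
            for i in range(n):
                if degree[j] == 0:
                    # Isolated node: return its mass to the anchors.
                    new[i] += damping * rank[j] * restart[i]
                elif adj[j][i]:
                    new[i] += damping * rank[j] * adj[j][i] / degree[j]
        rank = new
    return rank


def awareness_graph(adj, anchors, k=2):
    """Hypothetical awareness graph: describe each node by its PageRank
    score with respect to every anchor, then link nodes whose score
    profiles are close."""
    scores = [personalized_pagerank(adj, {a}) for a in anchors]
    profiles = [[s[i] for s in scores] for i in range(len(adj))]
    return similarity_graph(profiles, k)


if __name__ == "__main__":
    # Toy stand-ins for learned sequence representations (two clusters).
    toy = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
    sim = similarity_graph(toy, k=2)
    aware = awareness_graph(sim, anchors=[0, 3], k=2)
    print(sim[0], aware[0])
```

In this sketch, PageRank scores depend on how nodes are connected rather than only on pairwise distance, which is one way a second graph could carry structural information that the similarity graph alone smooths over.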
Following this, M6A-SAI employs a self-correlation fusion graph convolution framework to merge information from both the similarity and awareness graphs, producing enriched sequence representations. Finally, a support vector machine classifies these representations. Experimental results show that incorporating structure-aware insights substantially improves the recognition of m6A modification sites, demonstrating that M6A-SAI is a robust method for identifying RNA m6A modification sites.
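The fusion step can be illustrated with a single, weight-free graph-convolution pass over each graph followed by a weighted average. This is only a sketch under stated assumptions: the helper names, the symmetric normalization with self-loops, and the mixing weight `alpha` are hypothetical, and the self-loops are merely a stand-in for whatever the "self-correlation" component does in the actual model, which presumably also uses learned weights and nonlinearities.

```python
import math


def normalize_with_self_loops(adj):
    """Return D^(-1/2) (A + I) D^(-1/2), the usual GCN normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j])
             for j in range(n)] for i in range(n)]


def propagate(norm_adj, features):
    """One propagation step: each node aggregates its neighbours' features."""
    n, dim = len(features), len(features[0])
    return [[sum(norm_adj[i][k] * features[k][c] for k in range(n))
             for c in range(dim)] for i in range(n)]


def fused_graph_convolution(sim_adj, aware_adj, features, alpha=0.5):
    """Propagate over both graphs and mix the results.

    alpha (hypothetical) weights the similarity graph; 1 - alpha weights
    the awareness graph.
    """
    h_sim = propagate(normalize_with_self_loops(sim_adj), features)
    h_aware = propagate(normalize_with_self_loops(aware_adj), features)
    return [[alpha * s + (1.0 - alpha) * a for s, a in zip(row_s, row_a)]
            for row_s, row_a in zip(h_sim, h_aware)]


if __name__ == "__main__":
    # Toy graphs: a fully connected triangle and a graph with no edges.
    triangle = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
    no_edges = [[0.0] * 3 for _ in range(3)]
    feats = [[3.0], [0.0], [0.0]]
    print(fused_graph_convolution(triangle, no_edges, feats))
```

The fused rows would then be handed to a classifier. With scikit-learn, for example, that could be `sklearn.svm.SVC` fitted on the fused representations and their site labels; it is left out here to keep the sketch dependency-free.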