CRISPR-FMC: a dual-branch hybrid network for predicting CRISPR-Cas9 on-target activity
Chuxuan Li, Jian Li, Quan Zou, Hailin Feng
TLDR
CRISPR-FMC is a dual-branch hybrid neural network that integrates one-hot encoding with contextual embeddings from a pre-trained RNA-FM model. Across nine public CRISPR-Cas9 datasets, it consistently outperforms existing baselines in both Spearman and Pearson correlation.
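Because the comparison rests on Spearman and Pearson correlation between predicted and measured sgRNA activity, a minimal sketch of both metrics may help. This is a standard-library illustration, not the authors' evaluation code; the function names and the example values are assumptions made for this sketch.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: linear agreement between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up activity values, for illustration only.
measured = [0.10, 0.40, 0.35, 0.80]
predicted = [0.20, 0.50, 0.30, 0.90]
print("Pearson:", pearson(measured, predicted))
print("Spearman:", spearman(measured, predicted))  # identical rank order gives 1.0
```

Spearman rewards getting the ranking of guides right, while Pearson also reflects how linearly the predicted values track the measured ones, which is why reporting both is informative.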
Abstract
Introduction: Accurately predicting the on-target activity of sgRNAs remains a challenge in CRISPR-Cas9 applications, due to the limited generalization of existing models across datasets, small-sample settings, and complex sequence contexts. Current methods often rely on shallow architectures or unimodal encodings, limiting their ability to capture the intricate dependencies underlying Cas9-mediated cleavage.
Methods: We present CRISPR-FMC, a dual-branch hybrid neural network that integrates one-hot encoding with contextual embeddings from a pre-trained RNA-FM model. Multi-scale convolution (MSC), BiGRU, and Transformer blocks are employed to extract hierarchical sequence features, while a bidirectional cross-attention mechanism with a residual feedforward network enhances multimodal fusion and generalization.
Results: Across nine public CRISPR-Cas9 datasets, CRISPR-FMC consistently outperforms existing baselines in both Spearman and Pearson correlation metrics, showing particularly strong performance under low-resource and cross-dataset conditions. Ablation experiments confirm the contribution of each module, and base substitution analysis reveals a pronounced sensitivity to the PAM-proximal region.
Discussion: The PAM-proximal sensitivity aligns with established biological evidence, indicating the model’s capacity to capture biologically relevant sequence determinants. These results demonstrate that CRISPR-FMC offers a robust and interpretable framework for sgRNA activity prediction across heterogeneous genomic contexts.
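The one-hot branch and the base substitution analysis can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: `toy_score` is a hypothetical stand-in for a trained predictor, the example sequence is made up, and the per-position summary (mean absolute score change over the three alternative bases) is one plausible definition that the abstract does not specify.

```python
from typing import Callable, List

BASES = "ACGT"  # channel order is an assumption made for this sketch


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA sequence as one 4-element indicator row per base."""
    rows = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


def substitution_sensitivity(seq: str, score: Callable[[str], float]) -> List[float]:
    """Per position, the mean absolute score change over the 3 alternative bases."""
    seq = seq.upper()
    baseline = score(seq)
    sensitivities = []
    for i, original in enumerate(seq):
        deltas = []
        for base in BASES:
            if base == original:
                continue
            mutated = seq[:i] + base + seq[i + 1:]
            deltas.append(abs(score(mutated) - baseline))
        sensitivities.append(sum(deltas) / len(deltas))
    return sensitivities


def toy_score(seq: str) -> float:
    """Hypothetical placeholder model, NOT CRISPR-FMC.

    It weights G/C bases more heavily toward the 3' end purely so that the
    analysis has a visible positional pattern; it carries no biological claim.
    """
    return sum(i + 1 for i, b in enumerate(seq) if b in "GC") / len(seq)


example = "GACGTTACCGGATCAGTCAA"  # made-up 20-nt sequence, for illustration only
print(one_hot(example[:4]))
profile = substitution_sensitivity(example, toy_score)
for position, value in enumerate(profile, start=1):
    print(position, round(value, 3))
```

With a real predictor in place of `toy_score`, positions whose substitutions change the score most are the ones the model treats as most important, which is how a PAM-proximal pattern like the one reported would become visible.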
