Fine-Tuning RNA Language Models to Predict Branch Points
This repository contains several RNA language models fine-tuned to predict branch points within intronic sequences. The models are fine-tuned using the MultiMolecule library and evaluated on an experimental dataset.
The following RNA language models were fine-tuned:
- SpliceBERT
- RNABERT
- RNA-FM
- RNA-MSM
- ERNIE-RNA
- UTR-LM
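For reference, the sketch below shows roughly how one of these backbones can be loaded and queried for per-nucleotide scores through MultiMolecule. The checkpoint name and the `SpliceBertForTokenPrediction` head follow MultiMolecule's HuggingFace-style naming conventions but are assumptions here, not a verbatim excerpt of this repository's code.

```python
# Minimal sketch: per-token branch-point logits from a MultiMolecule backbone.
# Checkpoint and head-class names are assumptions based on MultiMolecule's
# naming conventions; the actual fine-tuned checkpoints may differ.
import torch
from multimolecule import RnaTokenizer, SpliceBertForTokenPrediction

tokenizer = RnaTokenizer.from_pretrained("multimolecule/splicebert")
model = SpliceBertForTokenPrediction.from_pretrained("multimolecule/splicebert")

sequence = "CUGUACUAACGUCAG"  # toy intronic fragment
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One score per token; after fine-tuning, higher scores should mark
# likelier branch-point positions.
print(outputs.logits.squeeze(-1))
```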
The dataset contains 177,980 samples and is an experimental-data-only subset of the dataset used to train BPHunter.
It was split approximately 80/10/10 into train/validation/test sets by chromosome:
- Train: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
- Validation: chr9, chr10
- Test: chr8, chr11
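For illustration, a chromosome-level split like the one above takes only a few lines of pandas; the file name and the `chrom` column are hypothetical stand-ins for the actual dataset schema.

```python
import pandas as pd

# Hypothetical schema: one row per sample, with a `chrom` column (e.g. "chr8").
df = pd.read_csv("bphunter_experimental.csv")

VALID_CHROMS = {"chr9", "chr10"}
TEST_CHROMS = {"chr8", "chr11"}

valid = df[df["chrom"].isin(VALID_CHROMS)]
test = df[df["chrom"].isin(TEST_CHROMS)]
train = df[~df["chrom"].isin(VALID_CHROMS | TEST_CHROMS)]

print(len(train), len(valid), len(test))  # roughly 80/10/10
```

Splitting by chromosome rather than at random helps keep homologous or overlapping intronic sequences from leaking between the train and test sets.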
Training Details
Each model was trained on the full dataset for 3 epochs with a batch size of 16, except for RNA-FM, which required a reduced batch size of 12 due to VRAM limitations. The following hyperparameters were used for RNABERT, RNA-FM, RNA-MSM, and UTR-LM:
- Optimizer: AdamW
- Learning rate: 3e-4
- Weight decay: 0.001
However, SpliceBERT and ERNIE-RNA failed to converge with these settings. To address this, we adjusted the hyperparameters to:
- Learning rate: 2e-5
- Weight decay: 0.01
These adjustments were based on empirical observations during early training. Ideally, comprehensive hyperparameter tuning would be performed for each model to optimize performance, but this was not feasible within the scope of the project due to the high computational cost and training time required.
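Concretely, the two regimes above can be captured in a small per-model configuration. This is an illustrative sketch of how the settings fit together, not the project's actual training script; names like `HPARAMS` and `make_optimizer` are hypothetical.

```python
import torch

EPOCHS = 3  # all models were trained for 3 epochs

# Default regime (RNABERT, RNA-FM, RNA-MSM, UTR-LM) versus the adjusted
# regime that SpliceBERT and ERNIE-RNA needed in order to converge.
HPARAMS = {
    "default":    {"lr": 3e-4, "weight_decay": 0.001, "batch_size": 16},
    "splicebert": {"lr": 2e-5, "weight_decay": 0.01,  "batch_size": 16},
    "ernie-rna":  {"lr": 2e-5, "weight_decay": 0.01,  "batch_size": 16},
    "rna-fm":     {"lr": 3e-4, "weight_decay": 0.001, "batch_size": 12},  # VRAM limit
}


def make_optimizer(model_name: str, model: torch.nn.Module) -> torch.optim.AdamW:
    """Build the AdamW optimizer for a given model under the regimes above."""
    hp = HPARAMS.get(model_name, HPARAMS["default"])
    return torch.optim.AdamW(
        model.parameters(), lr=hp["lr"], weight_decay=hp["weight_decay"]
    )
```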
GitHub
All code used to train and evaluate these models can be found at this link.