Aptamers are oligonucleotides or peptides with unique binding properties for specific target molecules,
and they have shown great potential in diagnostics, therapeutics, and bio-sensing. However, the current
in vitro SELEX-based method for discovering new target-selective aptamers is challenging, time-consuming,
and often unsuccessful in finding high-affinity aptamers. Recently, in silico methods have gained immense
attention. However, because collecting labeled interaction-pair data is expensive and requires highly trained
specialists, available data is sparse. Further, because positive-class samples are even harder to
acquire, available datasets exhibit high class imbalance. This makes designing deep learning models
difficult, as they require a sufficiently large training set and are biased towards the dominant
class. Additionally, current models cannot be updated in real time, and end-to-end re-training is necessary
whenever a new aptamer-target interaction pair is discovered. The present work is the first to address both these
challenges. We present cAPTured, a novel fuzzy continual learning method for predicting aptamer-target protein
interaction pairs in a continual learning environment. cAPTured continually updates its learned feature space
on a non-stationary interaction-pair data stream. We performed extensive evaluation studies and experiments
to establish the effectiveness of the proposed approach. cAPTured outperforms existing methods on the
benchmark dataset by a significant margin.
Aptamers, short sequences of RNA, DNA, or peptides, have emerged as powerful tools with
unique binding properties for specific target molecules. They hold immense potential, especially
in diagnostics, therapeutics, and biosensing. However, traditional lab-based methods for
discovering aptamers are labor-intensive, costly, and often fail to yield high-affinity aptamers.
The current benchmark is SELEX, which can take up to 2 years to find a single high-affinity
aptamer. The goal of this project was not only to develop a faster alternative to SELEX that
accelerates aptamer discovery, but also to outperform it in
terms of precision.
We developed “cAPTured”, a fuzzy continual machine learning model for predicting
aptamer-target protein interactions. cAPTured uses four distinct feature encodings: k-mer and
revck-mer (reverse-complement k-mer) encodings of the aptamer sequence, and amino acid
composition (AAC) and pseudo amino acid composition (PseAAC) encodings of the target
protein sequence. These encodings extract essential
information from aptamer and protein sequences. cAPTured then fuses and
embeds the encoded features into a low-dimensional latent space, preserving statistically
significant features. The model is designed to be agnostic to distribution shifts,
so that it can adapt and update its learned feature space in real time on a non-stationary
interaction-pair data stream.
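
As a rough illustration of three of the four encodings named above, the following minimal sketch computes k-mer, reverse-complement k-mer, and AAC feature vectors in plain Python. It is not the cAPTured implementation: the function names, the choice of k = 2, the restriction to an uppercase DNA alphabet (an RNA aptamer would first need U mapped to T), and the simple concatenation in `encode_pair` are all assumptions made for this example. PseAAC is omitted because it additionally needs physicochemical property tables.

```python
from collections import Counter
from itertools import product

DNA = "ACGT"                          # assumed aptamer alphabet (uppercase DNA)
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _windows(seq, k):
    """All overlapping length-k substrings of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_freq(seq, k=2):
    """Normalized frequency of every DNA k-mer, in lexicographic order."""
    kmers = ["".join(p) for p in product(DNA, repeat=k)]
    windows = _windows(seq, k)
    counts = Counter(windows)
    total = max(len(windows), 1)  # avoid division by zero on short input
    return [counts[m] / total for m in kmers]


def _canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    rev_comp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rev_comp)


def revc_kmer_freq(seq, k=2):
    """Like kmer_freq, but a k-mer and its reverse complement share a bin."""
    bins = sorted({_canonical("".join(p)) for p in product(DNA, repeat=k)})
    windows = _windows(seq, k)
    counts = Counter(_canonical(w) for w in windows)
    total = max(len(windows), 1)
    return [counts[b] / total for b in bins]


def aac(protein):
    """Amino acid composition: fraction of each standard residue."""
    counts = Counter(protein)
    n = max(len(protein), 1)
    return [counts[a] / n for a in AMINO_ACIDS]


def encode_pair(aptamer, protein, k=2):
    """Hypothetical fusion step: plain concatenation of the three vectors."""
    return kmer_freq(aptamer, k) + revc_kmer_freq(aptamer, k) + aac(protein)
```

With k = 2 this gives 16 k-mer features, 10 reverse-complement k-mer features, and 20 AAC features per aptamer-protein pair. In the model described here, vectors of this kind would then be embedded into a low-dimensional latent space rather than used directly.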
Our results show that cAPTured cuts the time required for aptamer discovery from the current
benchmark of around 2-3 years (required by SELEX) to just a few minutes. cAPTured
outperforms existing benchmarks with a 4% increase in precision when tested on
benchmark lab datasets. Notably, owing to its feature engineering and fuzzy neural
network-based design, cAPTured handles limited-data scenarios well (with just 400, 300, or
even 200 training samples) and stays relevant over time (across 5 pairs of distribution shifts),
so the model does not become outdated as new aptamer-target interaction pairs are
discovered. This was additionally verified with a t-SNE plot. These results underscore
the potential of cAPTured as a valuable tool in the field of bioinformatics.
In conclusion, using cAPTured, bioinformatics engineers can reduce the experiment
duration needed to find a high-affinity aptamer by 700x relative to the current benchmark, while
increasing precision by 4%. cAPTured’s ability to adapt to distribution shifts and its robust
performance in limited-data scenarios position it as a versatile and enduring tool.
@inproceedings{chharia2023captured,
title={cAPTured: Neural Reflex Arc-Inspired Fuzzy Continual Learning for Capturing in Silico Aptamer-Target Protein Interactions},
author={Chharia, Aviral and Saran, Runjhun and Narayan, Apurva},
booktitle={2023 International Joint Conference on Neural Networks (IJCNN)},
pages={1--9},
year={2023},
organization={IEEE}
}