Shi, H's method

Webserver
Antihypertensive peptides CNN GRU Deep learning

A predictor for antihypertensive peptides that combines a Convolutional Neural Network (CNN) with a Gated Recurrent Unit (GRU) and integrates Kmer, DDE, EBGW, EGAAC, and novel DBPF features, reporting accuracies of 96.23% and 99.10% in tenfold cross-validation.
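As a rough illustration of the simplest feature named in this entry, the sketch below computes a Kmer (k-mer frequency) vector for a peptide. It is a generic example, not the authors' code: the function name `kmer_frequencies`, the 20-letter alphabet constant, and the default `k=2` are assumptions made here, and the other descriptors (DDE, EBGW, EGAAC, DBPF) are not reproduced.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=2):
    """Return normalized k-mer frequencies as a fixed-length list.

    The vector has 20**k entries, ordered by itertools.product over
    AMINO_ACIDS. Windows containing non-standard letters are counted in
    the total but have no slot in the output vector.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Slide a window of width k across the sequence.
    windows = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    return [counts[km] / total if total else 0.0 for km in kmers]


if __name__ == "__main__":
    vector = kmer_frequencies("ACAC", k=2)
    print(len(vector))  # 400 dipeptide slots for k=2
```

A fixed-length vector like this is what lets peptides of different lengths be fed to the same downstream classifier.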

Shoombuatong W, et al.'s work

Tool & Review
Anticancer peptides Machine learning QSAR Therapeutic peptides Bioactivity

This review focuses on cancer as a global health burden and presents anticancer peptides (ACPs) as emerging therapeutic options owing to their high selectivity, low toxicity, and advantages in production cost, while analyzing the inherent challenges of therapeutic peptide design. It highlights applications of machine learning to ACP bioactivity data, covering sequence feature extraction, model construction, and QSAR analysis, and outlines future research directions. The data and R code used in the analysis are available on GitHub.

SiameseCPP

Tool
Cell-penetrating peptides Siamese network Contrastive learning

SiameseCPP is a sequence-based Siamese network predictor that leverages contrastive learning and a pretrained model with Transformer and gated recurrent units for discriminative representation of cell-penetrating peptides.
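To make the idea of contrastive learning on a Siamese network more concrete, here is a minimal sketch of a pairwise contrastive loss computed on two embedding vectors. This is an assumption for illustration only: the entry does not state which loss SiameseCPP uses, and the function names, the Euclidean distance, and the `margin` parameter are choices made here rather than details of the tool.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise contrastive loss (illustrative form, hypothetical names).

    Pairs from the same class are pulled together (loss = d^2), while
    pairs from different classes are pushed apart until they are at
    least `margin` away (loss = max(0, margin - d)^2).
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


if __name__ == "__main__":
    # Two hypothetical 2-D embeddings produced by the shared encoder.
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True))
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False))
```

In a Siamese setup both embeddings come from the same encoder with shared weights, so minimizing a loss of this kind shapes a single representation space.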

SkipCPP-Pred

Webserver
Cell-penetrating peptides Random forest Machine learning

SkipCPP-Pred is a sequence-based predictor for cell-penetrating peptides that uses an adaptive k-skip-n-gram feature representation and a random forest classifier; it improves prediction by training on a high-quality dataset with reduced redundancy.
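The k-skip-n-gram idea can be sketched for the n = 2 case as follows. This uses the common definition in which a pair of residues may be separated by at most k skipped positions; it is an assumption that this matches the tool's definition, the adaptive choice of k is not reproduced, and both function names are hypothetical.

```python
from collections import Counter


def k_skip_bigrams(sequence, k):
    """List all residue pairs separated by 0..k skipped positions."""
    pairs = []
    for i in range(len(sequence)):
        for gap in range(k + 1):
            j = i + gap + 1  # index of the second residue in the pair
            if j < len(sequence):
                pairs.append(sequence[i] + sequence[j])
    return pairs


def k_skip_bigram_profile(sequence, k):
    """Relative frequency of each k-skip-bigram found in the sequence."""
    pairs = k_skip_bigrams(sequence, k)
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: count / total for pair, count in counts.items()}


if __name__ == "__main__":
    print(k_skip_bigrams("RKKR", 1))  # ['RK', 'RK', 'KK', 'KR', 'KR']
    print(k_skip_bigram_profile("RKKR", 1))
```

With k = 0 this reduces to ordinary adjacent dipeptides; larger k captures residue pairs that are close in the sequence without being neighbours.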

SPdb

Database
Signal peptides Database Protein targeting Experimental validation Sequence retrieval

SPdb is a specialized database for signal peptides that integrates resources from the UniProt (formerly Swiss-Prot) and EMBL databases. Semi-automated updates combined with manual verification ensure data accuracy. It currently contains 18,146 entries: 2,584 experimentally validated sequences and 15,562 computationally predicted or unverified entries. It facilitates understanding of the role of signal peptides in protein targeting, with real-time updates synchronized to the primary databases.