Peptides Bioactive Functional Annotation

AVPIden

Webserver

Antiviral peptides, Two-stage classification, Imbalanced learning, Explainable AI, Virus-target prediction

AVPIden is a two-stage classification tool that predicts antiviral peptides (AVPs) and identifies the viruses they target. The first stage discriminates AVPs from a broad set of other peptides, including non-antimicrobial peptides (non-AMPs) and antimicrobial peptides without antiviral activity (non-AVP AMPs). The second stage characterizes the targeted virus at two levels: six virus families (such as Coronaviridae and Retroviridae) and eight viruses (including FIV, HIV, and SARS-CoV). The models use imbalanced learning and multi-dimensional descriptors (amino acid composition, physicochemical properties, structural features), and an explainable machine learning approach based on Shapley values quantifies how much each feature contributes to antiviral activity.
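The two-stage flow and the amino acid composition descriptor mentioned above can be illustrated with a minimal sketch. This is not AVPIden's code: the function names are hypothetical, and the two classifiers are passed in as plain callables standing in for trained models.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide):
    """Return the fraction of each standard residue, in AMINO_ACIDS order."""
    counts = Counter(peptide.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def two_stage_predict(peptide, is_avp, predict_targets):
    """Stage 1 decides AVP vs. non-AVP; stage 2 runs only for predicted AVPs.

    `is_avp` and `predict_targets` are hypothetical stand-ins for trained
    classifiers that take a feature vector as input.
    """
    features = amino_acid_composition(peptide)
    if not is_avp(features):
        return {"avp": False, "targets": []}
    return {"avp": True, "targets": predict_targets(features)}
```

With toy stand-ins such as `is_avp=lambda f: f[8] > 0.2` (lysine fraction) and `predict_targets=lambda f: ["Retroviridae"]`, the second stage is only consulted when the first stage accepts the peptide.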

AVPpred

Webserver

Antiviral peptides, Support Vector Machine, Physicochemical properties, Virus therapy, Sequence analysis

AVPpred is the first dedicated webserver for predicting antiviral peptides (AVPs), designed to accelerate antiviral drug discovery. It is built on 1245 experimentally validated AVPs against human viruses (influenza, HIV, HCV, SARS, etc.), of which 951 non-redundant peptides are used for training and 105 for validation. Support Vector Machine (SVM) models are trained on several feature types (motifs, sequence alignment, amino acid composition, physicochemical properties) using 5-fold cross-validation.
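To make the 5-fold cross-validation step concrete, the sketch below partitions 951 training examples into five folds. It only shows the splitting logic. The SVM training is omitted, and the function name, shuffling, and seed are assumptions rather than details of AVPpred.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each peptide is held out exactly once across the five rounds.
for train, test in k_fold_indices(951, k=5):
    pass  # fit a model on `train`, score it on `test`
```

Averaging the five test scores gives the cross-validated performance estimate, while the separate 105-peptide validation set remains untouched during training.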

BACTIBASE

Database

Bacteriocins, Antimicrobial peptides, Database, Food preservation, Drug development

BACTIBASE is an integrated open-access database for characterizing bacterial antimicrobial peptides (bacteriocins). The second release contains 44% more entries, drawn from published literature and high-throughput datasets, with manually curated bacteriocin sequence annotations. New features include homology search, multiple sequence alignment, Hidden Markov Model (HMM) analysis, molecular modeling tools, and retrieval through a taxonomy browser. The database supports food preservation, food safety research, and new drug development.

BactPepDB

Database

Prokaryotes, Small peptides, Genomic annotation, Sequence classification, Structure prediction

BactPepDB is a genomic annotation database for prokaryotic small peptides. It re-annotates all complete prokaryotic genomes in RefSeq (chromosomal and plasmid DNA), focusing on coding sequences of 10 to 80 amino acids. Identified peptides are classified into three categories: (1) previously annotated in RefSeq; (2) intragenic (overlapping) or intergenic peptides; (3) potential pseudogenes (intergenic sequences overlapping with larger annotated genes). For each peptide, the database provides taxonomic homologs, predicted signal sequences, transmembrane segments, disulfide bonds, secondary structures, and PDB 3D structures. Users can search for candidate peptides and similar sequences by genomic localization or predicted attributes, making the database a systematic annotation resource for discovering the functions of prokaryotic small peptides.
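The 10 to 80 amino acid length window can be illustrated with a deliberately simplified ORF scan. This is only a sketch under stated assumptions, not BactPepDB's pipeline: it reads the forward strand only, accepts ATG as the sole start codon, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def small_orfs(dna, min_aa=10, max_aa=80):
    """Yield (start, end) of forward-strand ORFs encoding min_aa..max_aa residues.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    """
    dna = dna.upper()
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (pos - start) // 3  # residues, excluding the stop codon
                if min_aa <= n_aa <= max_aa:
                    yield start, pos + 3
                start = None  # look for the next ORF in this frame
```

A real re-annotation would also scan the reverse strand and consider alternative prokaryotic start codons such as GTG and TTG.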

BAGEL3

Database

Bacteriocins, RiPPs, Genome mining, Posttranslational modification, Bioinformatics tool

BAGEL3 is a genome mining tool for identifying bacteriocins and ribosomally synthesized and posttranslationally modified peptides (RiPPs), and the successor to BAGEL2. Its main advantage is that it mines prokaryotic genomes independently of ORF prediction and covers an expanded set of modified peptide classes (lanthipeptides, sactipeptides, glycocins, etc.). It combines two strategies, direct mining for the peptide gene and indirect mining via context genes, using genetic context information (e.g., modification enzyme clusters) to find highly modified peptides more efficiently. BAGEL3 also updates the bacteriocin and context protein datasets, accepts user submissions of novel peptides, and optimizes output for large metagenomic datasets with full annotation of each candidate gene's genetic context.
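The idea behind indirect, context-based mining can be sketched as a simple interval filter: keep only those candidate small ORFs that lie close to a detected context gene, such as a modification enzyme. The function name and the 10 kb window are illustrative assumptions and do not reflect BAGEL3's actual parameters or implementation.

```python
def orfs_near_context(candidate_orfs, context_genes, window=10_000):
    """Return candidate ORFs lying within `window` bp of any context gene.

    Both inputs are lists of (start, end) genome coordinates.
    """
    selected = []
    for orf_start, orf_end in candidate_orfs:
        for gene_start, gene_end in context_genes:
            # Distance between the two intervals; 0 when they overlap.
            gap = max(gene_start - orf_end, orf_start - gene_end, 0)
            if gap <= window:
                selected.append((orf_start, orf_end))
                break  # one nearby context gene is enough
    return selected


candidates = [(100, 250), (50_000, 50_150)]
enzymes = [(5_000, 8_000)]
nearby = orfs_near_context(candidates, enzymes)
```

Here only the first candidate is retained, because the second lies 42 kb away from the nearest context gene.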