Nucleotide Sequence Databases 核酸序列数据库
- National Center for Biotechnology
- European Bioinformatics Institute
- DNA Data Bank of Japan
Protein Sequence Databases 蛋白质序列数据库
SWISS-PROT & TrEMBL
- Protein sequence
database and computer annotated supplement
Protein Resource) is the world's most comprehensive catalog of information on proteins. It is a central repository of protein sequence and function created by
joining the information contained in Swiss-Prot, TrEMBL, and PIR.
- Protein Information Resource
- HUman Proteome Organization
Database Searching by Sequence Similarity 序列相似性搜索
Sequence Alignment 序列比对
- multiple sequence alignment
Clustal Omega @ EBI
- multiple sequence
- pretty printing and shading
of multiple alignments
- Splign is a utility for
computing cDNA-to-Genomic, or spliced
sequence alignments. At the heart of the
program is a global alignment algorithm that
specifically accounts for introns and splice
- an mRNA-to-genomic alignment
- a program to align cDNA
and genomic DNA
- computes alignments of
similar regions in two (long) DNA sequences
- align + detect conserved regions
in long genomic sequences
Human Genome Databases 人类基因组数据库
Next Generation Sequencing 下一代测序
Sequencing Journal for Next Generation
- Integrated solutions 集成解决方案
CLCbio Genomics Workbench
- de novo
and reference assembly of Sanger, Roche FLX,
Illumina, Helicos, and SOLiD data.
Commercial next-gen-seq software that
extends the CLCbio Main Workbench software.
Includes SNP detection, ChIP-seq, browser
and other features. Commercial. Windows, Mac
OS X and Linux.
- Galaxy = interactive and
reproducible genomics. A job web portal.
- Integrated Solutions for
Next Generation Sequencing data analysis.
- Next gen visualization
and statistics tool from SAS. They are working with
NCGR to refine this tool and produce others.
- de novo and reference
assembly of Illumina, SOLiD and Roche FLX
data. Uses a novel Condensation Assembly
Tool approach where reads are joined via
"anchors" into mini-contigs before assembly.
Includes SNP detection, ChIP-seq, browser
and other features. Commercial. Win or MacOS.
SeqMan Genome Analyser
- Software for
Next Generation sequence assembly of
Illumina, Roche FLX and Sanger data
integrating with Lasergene Sequence Analysis
software for additional analysis and
visualization capabilities. Can use a hybrid
templated/de novo approach. Commercial. Win
or Mac OS X.
- SHORE, for Short Read, is a
mapping and analysis pipeline for short DNA
sequences produced on an Illumina Genome
Analyzer. A suite created by the 1001
Genomes project. Source for POSIX.
- Fledgling commercial
- Align/Assemble to a reference 参考序列比对/组装
- Blat-like Fast Accurate Search
Tool. Written by Nils Homer, Stanley F.
Nelson and Barry Merriman at UCLA.
- Ultrafast, memory-efficient
short read aligner. It aligns short DNA
sequences (reads) to the human genome at a
rate of 25 million reads per hour on a
typical workstation with 2 gigabytes of
memory. Uses a Burrows-Wheeler-Transformed (BWT) index.
Written
by Ben Langmead and Cole Trapnell. Linux,
Windows, and Mac OS X.
Exonerate - Various forms of pairwise
alignment (including Smith-Waterman-Gotoh)
of DNA/protein against a reference. Authors
are Guy St C Slater and Ewan Birney from
EMBL. C for POSIX.
- GenomeMapper is a short
read mapping tool designed for accurate read
alignments. It quickly aligns millions of
reads either with ungapped or gapped
alignments. A tool created by the 1001
Genomes project. Source for POSIX.
- GMAP (Genomic Mapping and
Alignment Program) for mRNA and EST
Sequences. Developed by Thomas Wu and Colin
Watanabe at Genentech. C/Perl for Unix.
- The Genomic Next-generation
Universal MAPper (gnumap) is a program
designed to accurately map sequence data
obtained from next-generation sequencing
machines (specifically that of Solexa/Illumina)
back to a genome of any size. It seeks to
align reads from nonunique repeats using
statistics. From authors at Brigham Young
University. C source/Unix.
MAQ - Mapping and Assembly with
Qualities (renamed from MAPASS2).
Particularly designed for Illumina with
preliminary functions to handle ABI SOLiD
data. Written by Heng Li from the Sanger
Centre. Features extensive supporting tools
for DIP/SNP detection, etc. C++ source
MOSAIK - MOSAIK produces gapped
alignments using the Smith-Waterman
algorithm. Features a number of support
tools. Support for Roche FLX, Illumina,
SOLiD, and Helicos. Written by Michael
Strömberg at Boston College. Win/Linux/MacOSX
- Tools for reference
alignment of paired-end and single-end
Illumina reads. Uses a Needleman-Wunsch
algorithm. Can support Bis-Seq. Commercial.
Available free for evaluation, educational
use and for use on open not-for-profit
projects. Requires Linux or Mac OS X.
- It supports Illumina, SOLiD and
Roche-FLX data formats and allows the user
to modulate very finely the sensitivity of
the alignments. Spaced seed initial filter,
then NW dynamic algorithm to a SW(like)
local alignment. Authors are from CRIBI in
- Assembles 20 - 64 bp Illumina
reads to a FASTA reference genome. By Andrew
D. Smith and Zhenyu Xuan at CSHL. (published
in BMC Bioinformatics). POSIX OS required.
- Assembles to a reference
sequence. Developed with Applied Biosystems'
colourspace genomic representation in mind.
Authors are Michael Brudno and Stephen
Rumble at the University of Toronto. POSIX.
- An application for the
Illumina Sequence Analyzer output that uses
the probability files instead of the
sequence files as an input for alignment to
a reference sequence or a set of reference
sequences. Authors are from BCGSC. Paper is
Alignment Program). A program for efficient
gapped and ungapped alignment of short
oligonucleotides onto reference sequences.
The updated version uses a BWT. Can call
SNPs and INDELs. Author is Ruiqiang Li at
the Beijing Genomics Institute. C++, POSIX.
- (Sequence Search and
Alignment by Hashing Algorithm) is a tool
for rapidly finding near exact matches in
DNA or protein databases using a hash table.
Developed at the Sanger Centre by Zemin
Ning, Anthony Cox and James Mullikin. C++
- Aligns SOLiD data. SOCS is built
on an iterative variation of the Rabin-Karp
string search algorithm, which uses hashing
to reduce the set of possible matches,
drastically increasing search speed. Authors
are Ondov B, Varadarajan A, Passalacqua KD
and Bergman NH.
- The SWIFT suite is a software
collection for fast index-based sequence
comparison. It contains: SWIFT — fast local
alignment search, guaranteeing to find
epsilon-matches between two sequences. SWIFT
BALSAM — a very fast program to find
semiglobal non-gapped alignments based on
k-mer seeds. Authors are Kim Rasmussen
(SWIFT) and Wolfgang Gerlach (SWIFT BALSAM)
- A versatile software tool for
efficiently solving large scale sequence
matching tasks. Vmatch subsumes the software
tool REPuter, but is much more general, with
a very flexible user interface, and improved
space and time requirements. Essentially a
large string matching toolbox.
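The SOCS entry above describes the Rabin-Karp idea of using hashing to reduce the set of possible matches. As a rough illustration only, the following is a plain single-pattern Rabin-Karp search. It is not SOCS's iterative variation and does not handle SOLiD colour space. The function name `rabin_karp` and the `base` and `mod` parameters are assumptions made for this sketch.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the start index of every exact occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character inside a window of length m.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    hits = []
    for i in range(n - m + 1):
        # Equal hashes only nominate a candidate; a direct comparison
        # rules out hash collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            hits.append(i)
        # Roll the window: drop text[i], append text[i + m].
        if i < n - m:
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return hits


if __name__ == "__main__":
    print(rabin_karp("ACGTACGTGACG", "ACG"))
```

The rolling update is what keeps each window comparison cheap: only windows whose hash equals the pattern hash are compared character by character.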
- De novo Align/Assemble 从头比对/组装
- Assembly By Short Sequences.
ABySS is a de novo sequence assembler that
is designed for very short reads. The
single-processor version is useful for
assembling genomes up to 40-50 Mbases in
size. The parallel version is implemented
using MPI and is capable of assembling
larger genomes. By Simpson JT and others at
Canada's Michael Smith Genome Sciences
Centre. C++ as source.
- ALLPATHS: De novo assembly of
whole-genome shotgun microreads. ALLPATHS is
a whole genome shotgun assembler that can
generate high quality assemblies from short
reads. Assemblies are presented in a graph
form that retains ambiguities, such as those
arising from polymorphism, thereby providing
information that has been absent from
previous genome assemblies. Broad Institute.
- Edena (Exact DE Novo Assembler)
is an assembler dedicated to processing the
millions of very short reads produced by the
Illumina Genome Analyzer. Edena is based on
the traditional overlap layout paradigm. By
D. Hernandez, P. François, L. Farinelli, M.
Osteras, and J. Schrenzel. Linux/Win.
- A Consistency-based Consensus
Algorithm for De Novo and Reference-guided
Sequence Assembly of Short Reads. By Tobias
Rausch and others. C++, Linux/Win.
- De novo assembly of short
reads. Authors are Dohm JC, Lottaz C,
Borodina T and Himmelbauer H. from the
Max-Planck-Institute for Molecular Genetics.
SSAKE - The Short Sequence Assembly by
K-mer search and 3' read Extension (SSAKE)
is a genomics application for aggressively
assembling millions of short nucleotide
sequences by progressively searching for
perfect 3'-most k-mers using a DNA prefix
tree. Authors are René Warren, Granger
Sutton, Steven Jones and Robert Holt from
Canada's Michael Smith Genome Sciences
- Part of the SOAP suite. See
- De novo assembly of short reads
with robust error correction. An improvement
on early versions of SSAKE.
- a de novo genomic
assembler specially designed for short read
sequencing technologies, such as Solexa or
454. Needs about 20-25X coverage and paired
reads. Developed by Daniel Zerbino and Ewan
Birney at the European Bioinformatics
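The SSAKE entry above describes assembly by progressively searching for reads that match the 3'-most k-mer of a growing sequence. The following is a much simplified sketch of that greedy 3' extension idea, assuming a fixed overlap length `k` and a plain dictionary in place of SSAKE's prefix tree. The name `greedy_extend` and its parameters are hypothetical, and nothing here reflects SSAKE's actual code or options.

```python
from collections import defaultdict


def greedy_extend(seed, reads, k=3):
    """Extend seed in the 3' direction with reads whose first k bases
    equal the last k bases of the growing contig."""
    # Index every distinct read by its k-base prefix.
    index = defaultdict(list)
    for read in set(reads):
        if len(read) > k:
            index[read[:k]].append(read)

    contig = seed
    used = {seed}
    while True:
        candidates = [r for r in index.get(contig[-k:], []) if r not in used]
        # Stop at a dead end, or when more than one read fits (ambiguity).
        if len(candidates) != 1:
            break
        read = candidates[0]
        used.add(read)
        contig += read[k:]  # append only the part beyond the overlap
    return contig


if __name__ == "__main__":
    print(greedy_extend("ACGTAC", ["ACGTAC", "TACGGA", "GGATTC"]))
```

Stopping on ambiguity rather than guessing is a deliberate choice in this sketch; real assemblers use coverage and pairing information to resolve such cases.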
- SNP/Indel Discovery 单核苷酸多态性位点/插入缺失位点寻找
- ssahaSNP is a polymorphism
detection tool. It detects homozygous SNPs
and indels by aligning shotgun reads to the
finished genome sequence. Highly repetitive
elements are filtered out by ignoring those
kmer words with high occurrence numbers.
More tuned for ABI Sanger reads. Developers
are Adam Spargo and Zemin Ning from the
Sanger Centre. Compaq Alpha, Linux-64,
Linux-32, Solaris and Mac
- PyroBayes is a novel base
caller for pyrosequences from the 454 Life
Sciences sequencing machines. It was
designed to assign more accurate base
quality estimates to the 454 pyrosequences.
Developers at Boston College.
- Genome Annotation/Genome
Browser/Alignment Viewer/Assembly Database 基因组注释/基因组浏览器/比对浏览器/装配数据库
- An information-rich genome
assembler viewer. EagleView can display a
dozen different types of information
including base quality and flowgram signal.
Developers at Boston College.
- LookSeq is a web-based
application for alignment visualization,
browsing and analysis of genome sequence
data. LookSeq supports multiple sequencing technologies, alignment sources, and viewing
modes; low or high-depth read pileups; and
easy visualization of putative single
nucleotide and structural variation. From
the Sanger Centre.
MapView - visualization of
short-read alignments on a desktop computer.
From the Evolutionary Genomics Lab at
Sun Yat-sen University, China. Linux.
SAM - Sequence Assembly Manager. Whole
Genome Assembly (WGA) Management and
Visualization Tool. It provides a generic
platform for manipulating, analyzing and
viewing WGA data, regardless of input type.
Developers are Rene Warren, Yaron
Butterfield, Asim Siddiqui and Steven Jones
at Canada's Michael Smith Genome Sciences
Centre. MySQL backend and Perl-CGI web-based
- Includes GAP4. GAP5 once
completed will handle next-gen sequencing
data. A partially implemented test version
- A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada's Michael
Smith Genome Sciences Centre. Python/Win or
- Counting 计数
- The source code and data for
the "Shotgun Bisulphite Sequencing of the
Arabidopsis Genome Reveals DNA Methylation
Patterning" Nature paper by
Cokus et al. (Steve Jacobsen's lab at
- Program used by Johnson et al.
(2007) in their Science publication
- CNV-seq, a new method to
detect copy number variation using
high-throughput sequencing. Chao Xie and
Martti T Tammi at the National University of
- perform analysis of ChIP-Seq
experiments. It uses a naive algorithm for
identifying regions of high coverage, which
represent Chromatin Immunoprecipitation
enrichment of sequence fragments, indicating
the location of a bound protein of interest.
Original algorithm by Matthew Bainbridge, in
collaboration with Gordon Robertson. Current
code and implementation by Anthony Fejes.
Authors are from Canada's Michael Smith
Genome Sciences Centre. JAVA/OS independent.
Latest versions available as part of the
Vancouver Short Read Analysis Package
- Model-based Analysis for
ChIP-Seq. MACS empirically models the length
of the sequenced ChIP fragments, which tends
to be shorter than sonication or library
construction size estimates, and uses it to
improve the spatial resolution of predicted
binding sites. MACS also uses a dynamic
Poisson distribution to effectively capture
local biases in the genome sequence,
allowing for more sensitive and robust
prediction. Written by Yong Zhang and Tao
Liu from Xiaole Shirley Liu's Lab.
- PeakSeq: Systematic Scoring of
ChIP-Seq Experiments Relative to Controls. A
two-pass approach for scoring ChIP-Seq data
relative to controls. The first pass
identifies putative binding sites and
compensates for variation in the mappability
of sequences across the genome. The second
pass filters out sites that are not
significantly enriched compared to the
normalized input DNA and computes a precise
enrichment and significance. By Rozowsky J
et al. C/Perl.
QuEST - Quantitative Enrichment of
Sequence Tags. Sidow and Myers Labs at
Stanford. From the 2008 publication
Genome-wide analysis of transcription factor
binding sites based on ChIP-Seq data.
- Site Identification from Short
Sequence Reads. BED file input. Raja Jothi @
this thread for ChIP-Seq.
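Several of the ChIP-Seq entries above rest on the same basic step: finding regions where many aligned reads pile up. The following is a minimal sketch of a naive coverage-threshold scan of that kind. It is not the code of FindPeaks, MACS, PeakSeq or any other tool listed; the function name `coverage_peaks`, the half-open `(start, end)` read intervals and the `min_height` threshold are assumptions made for illustration. Read ends are assumed not to exceed `genome_length`.

```python
def coverage_peaks(reads, genome_length, min_height):
    """Return (start, end, max_depth) for each maximal region whose read
    depth is at least min_height. Reads are half-open (start, end) pairs."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1

    peaks = []
    depth = 0
    region_start = None
    region_max = 0
    for pos in range(genome_length):
        depth += diff[pos]  # running sum gives the depth at this position
        if depth >= min_height:
            if region_start is None:
                region_start, region_max = pos, depth
            else:
                region_max = max(region_max, depth)
        elif region_start is not None:
            peaks.append((region_start, pos, region_max))
            region_start = None
    if region_start is not None:
        peaks.append((region_start, genome_length, region_max))
    return peaks


if __name__ == "__main__":
    reads = [(0, 5), (2, 7), (3, 8), (20, 25)]
    print(coverage_peaks(reads, 30, 2))
```

A fixed threshold ignores local background, which is the gap that model-based and control-based approaches such as those described above aim to close.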
- Alternate Base Calling 可替换碱基识别
- R-based framework for base
calling of Solexa data. Project
- "a novel Illumina
Genome-Analyzer (Solexa) base caller"
- Transcriptomics 转录组学
- Mapping and Quantifying
Mammalian Transcriptomes by RNA-Seq.
Supports Bowtie, BLAT and ELAND. From the
- G-Mo.R-Se is a method aimed
at using RNA-Seq short reads to build de
novo gene models. First, candidate exons are
built directly from the positions of the
reads mapped on the genome (without any ab
initio assembly of the reads), and all the
possible splice junctions between those
exons are tested against unmapped reads.
From CNS in France.
- A software tool for
spliced and unspliced alignments and SNP
detection of short sequence reads. From the
Evolutionary Genomics Lab at Sun Yat-sen
- Optimal Spliced Alignments of
Short Sequence Reads. Authors are Fabio De
Bona, Stephan Ossowski, Korbinian
Schneeberger, and Gunnar Rätsch. A paper is
- a fast splice
junction mapper for RNA-Seq reads. It aligns
RNA-Seq reads to mammalian-sized genomes
using the ultra high-throughput short read
aligner Bowtie, and then analyzes the
mapping results to identify splice junctions
between exons. TopHat is a collaborative
effort between the University of Maryland
and the University of California, Berkeley
Rice databases 水稻数据库
- Beijing Genomics Institute Rice Information System
- Rice Genome Research Program
- is currently the most comprehensive regulatory database on Oryza sativa based on genome annotation. It was displayed in three levels: GEM, PPIs and GRNs to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
- MIPS Oryza sativa database
- Rice genetics and genomics
Oryza Tag Line database
- T-DNA insertion mutants of rice
- Oryza sativa protein-protein interactions network
Plant Promoter and Regulatory Element Resources 植物启动子和调控元件资源
- Currently contains two databases, AtcisDB (Arabidopsis thaliana cis-regulatory database) and AtTFDB (Arabidopsis thaliana transcription factor database).
- A genome-wide map of putative transcription factor binding sites in Arabidopsis thaliana.
- The Arabidopsis thaliana promoter binding element database, an aid to find binding elements and check data against the primary literature.
- Database of Arabidopsis Transcription Factors (DATF) contains known and predicted Arabidopsis transcription factors with sequences and many other features including 3D structure templates, EST expression information, transcription factor binding sites and Nuclear Location Signals.
- Databases of Orthologous Promoters, a database containing orthologous clusters of promoters from Homo sapiens, Arabidopsis thaliana and other organisms.
- A public web resource composed of a collection of databases, computational and experimental resources that relate to the control of gene expression in the grasses, and their relationship with agronomic traits. GRASSIUS currently contains regulatory information on maize, rice, sorghum and sugarcane.
- Database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.
- Database with an annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start sites (TSS) from various plant species.
- Plant Transcription Factor Database, an integrative plant transcription factor database that provides a web interface to access large (close to complete) sets of transcription factors of several plant species, currently encompassing Arabidopsis thaliana (thale cress), Populus trichocarpa (poplar), Oryza sativa (rice), Chlamydomonas reinhardtii and Ostreococcus tauri.
- (Plant Promoter DB) Database that provides transcription start sites (TSS) and other structural information for Arabidopsis and rice promoters.
- Database on eukaryotic transcription factors, their genomic binding sites and DNA-binding profiles. Commercial site.
Databases of other Organisms 其他物种数据库
Genome-wide Analysis 基因组分析
- comparative analysis of
completely sequenced microbial genomes
- phylogenetic classification of
orthologous proteins from complete genomes
- detect whether a given query
gene occurs repeatedly with certain other
genes in potential operons
- automatic whole genome
- various whole genome
Protein Properties 蛋白质理化参数
Protein Domains 蛋白质结构域
- alignments and hidden Markov
models covering many common protein domains
- analysis of domains in proteins
- protein domain database
- integration of Pfam, PRINTS,
PROSITE, SWISS-PROT + TrEMBL
- database of protein families
- Protein 3D structure models
- NCBI protein structure database
- PRINTS Database
- groups of conserved
motifs used to characterise protein families
- multiply aligned ungapped
segments corresponding to the most highly
conserved regions of proteins
- yet more protein families
based on Hidden Markov Models
Motif and Pattern Search in Sequences 序列模体和样式搜索
Gibbs Motif Sampler
- identification of
conserved motifs in DNA or protein sequences
- gene regulatory
- motif discovery and search in
protein and DNA sequences
- tools for creating and using
Hidden Markov Models
- discover patterns in unaligned
- a web facility for
exploring small hydrogen-bonded motifs
Protein 3D Structure 蛋白质三维结构
- protein 3D structure database
RasMol / Protein Explorer
- molecule 3D
- UCL BSM CATH classification
- fold classification based on
structure-structure alignment of proteins
- homology modeling server
Structure Prediction Meta-server
- protein structure alignment
- 3D structure alignment server
- defines secondary structure and
solvent exposure from 3D coordinates
Secondary Structure of Proteins
PredictProtein & PHD
- predict secondary structure, solvent accessibility, transmembrane helices, and other features
- protein secondary structure
PSIpred (& MEMSAT & GenTHREADER)
protein secondary structure prediction (&
transmembrane helix prediction & tertiary
structure prediction by threading)
- Structural analysis 结构分析
- A collection of protein families
- Simple Modular Architecture Research Tool
- Structural Classification of Proteins
- Motif-based sequence analysis tools
- A Secondary Structure Prediction Server
- A highly accurate method for protein secondary structure prediction
- Cysteine state and Disulfide Bond partner prediction
- Full-chain protein structure prediction server
- Molecular Graphics 分子图像
- Open Source viewer that includes features for morphing proteins and visualization of lipophilic and electrostatic potentials.
- A protein visualization and modeling program
Create beautiful publication quality images and movies. Users can superpose and analyse structures as well. The program runs 'out of the box' on Linux, MacOSX and Windows platforms.
Interactive molecular modeling system, free to academic/non-profit; displays multiple sequence alignments and associated structures, atom-type and H-bond identification, molecular dynamics trajectories (AMBER format), and offers ligand-screening interface (DOCK), filter by number/position of H-bonds, and extensibility to create custom modules - for Windows, Linux, Mac OS X, IRIX, and Tru64 Unix
Simultaneously displays structure, sequence, and alignment, with annotation and alignment editing features, for use with 3-D structures from NCBI's Entrez; available for Windows, Macintosh, and Unix
A program for building, displaying and manipulating all kinds of crystal and molecular structures.
Embedded Python Molecular Viewer (ePMV) is an open-source plug-in that runs molecular modeling software directly inside of professional 3D animation applications
Foldit is a crowdsourcing computer game based on protein modeling.
Open GL graphics program displays small, large, and multiple molecules; measures distances and angles, superimposes structures, calculates RMSD between atom coordinates, structurally aligns chains, and displays dynamics trajectories. For Mac OS X incl. 10.2
Jmol is a free, open source molecule viewer for students, educators, and researchers in chemistry and biochemistry. It is cross-platform, running on Windows, Mac OS X, and Linux/Unix systems.
Mage and Kinemages
Interactive molecular display for research and educational uses. Free, open source for Windows and Mac (OSX or PPC), Unix, and Linux. A Java version does 3-D Web display without plug-ins.
Marvin is a collection of tools for drawing, displaying and characterizing chemical structures, queries, macromolecules and reactions for all operating systems, web pages and custom applications.
Interactively generate heterogeneous PDB-based membranes with varying lipid compositions and semi-automatic protein placement. Supports membrane patches and vesicles, microdomains as well as stacking of monolayer and/or bilayer membranes.
an iPad application for viewing and manipulating 3D chemical and molecular structures. Structures can be downloaded and displayed from the PubChem, PDB, and NCBI structure databases together with the sequences for proteins and nucleic acids. Structures can be drawn as tubes, ball and stick, or space filling modes. Coloring options include residue, charge, hydrophobicity, rainbow, and molecule. Parts of structures can be hidden or displayed with mixed coloring and drawing modes.
Molecule World for iPhone
- Molecule World for iPhone can be used on the iPhone or iPod touch to display and manipulate 3D chemical and molecular structures from the PubChem, PDB, or NCBI structure databases. Drawing options include ball and stick and space fill modes. Coloring options include rainbow, residue, charge, hydrophobicity, and molecule. Proteins, nucleic acids, and heterogens can be displayed in different modes.
Molecule World DNA Binding Lab
- A classroom-ready iPad application for exploring the ways chemicals and proteins bind to DNA. The DNA Binding Lab uses Molecule World's rendering engine and display features to highlight different molecules and understand how they interact. The DNA Binding Lab includes instructions, three examples, and 40 unknowns that can be assigned to students. Photo sharing capabilities allow students to share their work with teachers to aid with assessment.
An iPhone application for PDB structures
- A program for displaying structures in both detailed and schematic formats and writing images in various formats for Unix
MolviZ.org (http://www.molviz.org/)
Free, interactive visualization tutorials
- Molecular Visualization Program and GUI of ZMM. MVM is a free molecular viewer that can be used to display proteins, nucleic acids, oligosaccharides, small molecules and macromolecules. It has an intuitive interface. In addition to being a molecular viewer, it is the user interface of a very powerful molecular mechanics engine (ZMM).
PMV (Python Molecular Viewer)
An interactive molecular visualization and modeling environment for manipulation and viewing of multiple molecules.
Program to view and manipulate PDB files on a PocketPC
Protein structure annotation using sequence profiles
Versatile annotation and high quality visualization of macromolecular structures
Analysis and visualization of macromolecular motions
Free viewer to display and manipulate PDB files and create animations and slides of proteins for Windows. Online ordering of protein 3D prints in several color schemes.
Mapping protein sequence annotations onto a protein structure and visualizing them simultaneously with the structure.
A free and open-source molecular graphics system for visualization, animation, editing, and publication-quality imagery. PyMOL is scriptable and can be extended using the Python language. Supports Windows, Mac OSX, Unix, and Linux
An open source (GPL), interactive, high quality molecular visualization system. QuteMol exploits current GPU capabilities through OpenGL shaders to offer an array of innovative visual effects.
- A free viewing system for PDB coordinate files that runs on Mac (PPC), Windows, Unix, and Linux systems. Open source versions are also available.
- A set of tools for generating high quality raster images of proteins or other molecules. Freeware for Mac OSX, Windows, Unix, and Linux
RasTop (v. 2.0)
A free user-friendly graphical interface to RasMol molecular visualization software (v. 18.104.22.168), available for Windows and Linux
- A program for molecular illustration and error analysis, for Mac OSX, Windows, Unix, and Linux
RCSB MBT Viewers
The MBT toolkit is a framework for creating various viewers. It is used for 4 different viewers on the RCSB PDB web site.
A Tcl/Tk script that redirects PDB files or RasMol scripts to multiple RasMol sessions; can be used as a Web browser helper application or as a standalone program for Mac (OSX or PPC), Windows, or Unix
Schrödinger Product Suites
Schrödinger's full product offerings range from general molecular modeling programs to a comprehensive suite of drug design software, as well as a state-of-the-art suite for materials research. All products are run with Maestro, a unified interface for all Schrödinger software, which is available for Mac, Windows, and Linux.
The Structural Proteomics Application Development Environment (SPADE) provides community tools for development and deployment of essential structure and sequence equipment. Includes a chemical probing suite to support experimental verification of predicted structural models. Written in Python with scripting tools available. Runs on Windows, Linux and Mac.
Align proteins by sequence and 3D structure.
Swiss PDB viewer
A 3D graphics and molecular modeling program for the simultaneous analysis of multiple models and for model-building into electron density maps. The software is available for Mac (OSX or PPC), Windows, Linux, or SGI
A free and open-source tool with PDB format visualization support written in fast memory efficient C++ code. Supports Windows, Mac OSX, Unix, and Linux.
- VMD (Visual Molecular Dynamics) runs on many platforms including MacOS X, and several versions of Unix and Windows. VMD provides visualization, analysis, and Tcl/Python scripting features, and has recently added sequence browsing and volumetric rendering features. VMD is distributed free of charge.
- A complete molecular graphics and modeling program, including interactive molecular dynamics simulations, structure determination, analysis and prediction, docking, movies and eLearning for Windows, Linux and MacOSX.
- A molecular visualization tool that supports PDB, MOL, MOL2/SYBYL and XYZ file formats. The rendering engine can output high quality molecular graphics. Zeus provides a sequence search that can highlight within the molecular structure. Ramachandran plots of internal dihedral angles can be generated and exported. PDB files can be automatically downloaded from the RCSB PDB.
Phylogeny & Taxonomy 进化与分类
The Tree of Life
- index of the world's
- a database of phylogenetic
- Molecular Evolutionary Genetics Analysis
- Phylogenetic Analysis by Maximum Likelihood
- a phylogeny software based on the maximum-likelihood principle
- package of programs for
- user friendly tree displaying
for Macs & Windows
Gene Prediction 基因预测
Gene Structure Analysis 基因结构分析
Gene Expression Databases 基因表达数据库
Gene Regulation 基因调控
- For identifying
conserved and shared cis regulatory elements
between a pair of genes.
- For identifying
conserved and shared cis regulatory elements
between a set of co-expressed genes.
- eukaryotic promoter database
- DataBase of Transcriptional
Start Sites (human)
- Saccharomyces cerevisiae promoter
- Drosophila Core Promoter Database
- a database on
transcriptional regulation in E. coli
- protein binding sites on E.
- prediction of
promoter regions in mammalian genomic
- search for transcription
factor binding sites
- cis-element cluster finder
Gene regulatory Tools
Small RNA/MicroRNA 小分子RNA/微RNA
microRNA Targets & Expression
- Plant microRNA Knowledge Base
- MicroRNA Target Prediction
— miRNA target prediction for human, drosophila and zebrafish genomes
— a comprehensive repository for miRNAs and their predicted targets
— an online database for miRNA target prediction and functional annotations in animals
— genomic maps of microRNA genes and their target genes in mammalian genomes
— a database providing a comprehensive resource of miRNA deregulation in various human diseases
— a comprehensive database of experimentally supported animal microRNA targets
— microRNA targets for vertebrates, fly and nematodes
— a search for the presence of conserved sites that match the seed of each miRNA
Target Gene Prediction at EMBL
— miRNA-Target predictions for Drosophila miRNAs
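One entry above mentions searching for sites that match the seed of each miRNA. As a hedged illustration of that idea only, the sketch below takes the seed to be nucleotides 2-8 of the miRNA and reports positions in a target sequence that are perfectly complementary to it. The function name `seed_match_sites` is hypothetical, only A, C, G and U/T are handled, and none of the listed tools is implemented this simply: they add conservation, site context and scoring on top.

```python
def seed_match_sites(mirna, utr):
    """Return 0-based positions in utr that pair perfectly with the
    miRNA seed (nucleotides 2-8, written 5' to 3')."""
    pair = {"A": "U", "U": "A", "G": "C", "C": "G"}

    # Work in the RNA alphabet so DNA-style input is accepted too.
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")

    seed = mirna[1:8]  # nucleotides 2-8
    # The target site is the reverse complement of the seed.
    site = "".join(pair[base] for base in reversed(seed))

    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"
    print(seed_match_sites(let7a, "AAACUACCUCAGGG"))
```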
- microRNA Expression Databases 微RNA表达数据库
— predicted microRNA targets & target downregulation scores. Experimentally observed expression patterns
— Human MicroRNA Disease Database (HMDD) is a database that contains experimentally supported miRNA-disease association data, manually curated from publications. The dysfunction evidence of miRNAs
and the literature PubMed IDs are also given
— a web query-driven database integrating the experimentally supported transcription factor and miRNA regulatory relations
- RNA Secondary Structure Prediction RNA二级结构预测
— a prediction of miRNA-mRNA interaction
— tools for predicting the secondary structure of RNA and DNA, mainly by using thermodynamic methods
— a web tool for detection of miRNA binding sites in an RNA sequence
— miRNA End Energy calculator, which takes a miRNA duplex and calculates the free energy for 5 base pairs at one end plus a dangling nucleotide
— a method for detecting miRNA foldbacks based on hidden Markov model (HMM)
— a multiple alignment tool for RNA sequences using progressive alignment based on pairwise structural alignment algorithm of SCARNA. Good for large scale analyses.
— a tool for finding the minimum free energy hybridisation of a long and a short RNA
- MicroRNA Homologous Prediction 微RNA同源预测
— a web-based tool used for homologous miRNA gene search in several species
— a global view of homologous miRNA genes in many species
— prediction of guide strand of microRNAs
— Sequence evaluation of microRNA properties
- MicroRNA Deep Sequencing 微RNA深度测序
— A microRNA detection and analysis tool for next-generation sequencing experiments
— A software pipeline for the analysis of microRNA Deep Sequencing data
— Discovering known and novel miRNAs from deep sequencing data
Metabolic, Gene Regulatory & Signal Transduction Network Databases 代谢、基因调控和信号转导网络数据库
Systems Biology 系统生物学
Synthetic Biology 合成生物学
BBOCUS (BackTranslation Based On Codon Usage Strategy) by Ferro and Purrello lab, a re-implementation of the algorithm in Graziano Pesole's BACKTR. It is based on cluster analysis (Complete Linkage algorithm), which requires a similarity matrix D containing the distance between each pair of mRNA sequences.
DNA Tools DNA工具
- by Benchling, Inc. Free online tools for vector editing, restriction analysis, primer search, multi-sequence alignment, and more.
- by Schepartz lab. Calculate extinction coefficients, Tm's, and base composition for your DNA or RNA; calculate amino acid composition and extinction coefficient for your protein.
Clipboard by Austin Che. Web tool for getting complement, reverse complement, translation and restriction enzyme analysis of a DNA sequence.
- by Molecula Maxima. An integrated development environment and a compiler for a high-level bio-programming language for Synthetic Biology. Based on iGEM conventions.
- by Hoover and Lubkowski. A web tool for optimizing melting temperature during gene synthesis.
File format converter
- by Austin Che. Web tool for converting between sequence file formats.
Geneious by Biomatters. Comprehensive suite of tools for molecular biology.
- is billed as the industry's most user-friendly genetic engineering design tool. It allows you to manipulate genetic information, from genes to plasmids to whole genomes. You can rapidly access extensive libraries of genetic parts, and easily order your final design from a variety of providers.
- by Boeke lab. Collection of online (and some command line) tools for codon optimization and shuffling, restriction site editing, and so on.
- by DNA2.0. Combine genetic building blocks by drag-and-drop; also supports codon optimization, restriction site editing, sequence oligo design, etc.
- is a design tool that uses collections or libraries of genetic parts and explicit design rules describing how these parts should be combined to engineer genetic constructs.
- by New England Biolabs, Inc. Tool for finding restriction sites, et cetera.
Synthetic Gene Designer
- by Gang Wu. A web platform that allows codon optimization to varying extents. Compatible with non-standard genetic codes.
- by Informax, Inc. Free-to-academics tool for sequence analysis and data management.
j5, DeviceEditor, and VectorEditor online tools
- j5: DNA assembly design automation for (combinatorial) flanking homology (e.g., SLIC/Gibson/CPEC/SLiCE/yeast) and type IIs-mediated (e.g., Golden Gate/FX cloning) assembly methods.
- DeviceEditor: a visual DNA design canvas that serves as front-end for j5.
- VectorEditor: a visual DNA editing and annotation tool.
- (Molecular Cloning Designer Simulator) by Zhenyu Shi, aims to set up a standard way for designing, describing and simulating molecular cloning and genetic engineering.
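Several of the DNA tools listed above report simple sequence properties such as the complement, reverse complement, and base composition. The sketch below shows one way to compute these in plain Python. The function names are hypothetical and do not correspond to the interface of any tool in this list, and only the four unambiguous bases A, C, G, and T are handled.

```python
from collections import Counter

# Translation table mapping each base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Return the complement of a DNA sequence, position by position."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return complement(seq)[::-1]


def base_composition(seq):
    """Count A, C, G, and T in a DNA sequence (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_fraction(seq):
    """Fraction of counted bases that are G or C (0.0 for an empty sequence)."""
    comp = base_composition(seq)
    total = sum(comp.values())
    return (comp["G"] + comp["C"]) / total if total else 0.0


seq = "ATGCGTTA"  # made-up example sequence
print(reverse_complement(seq))
print(base_composition(seq))
print(gc_fraction(seq))
```

Melting temperature and extinction coefficient calculations depend on the model and parameters each tool uses, so they are left out of this sketch.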
RNA Tools RNA工具
- by Ambion, Inc. Website with many useful nucleic acid parameters.
- by Michael Zuker. For predicting RNA and DNA folds and calculating Tm's and free energies.
Protein Tools 蛋白质工具
- by NCBI. A helper application for your web browser that allows you to view 3-dimensional structures from NCBI's Entrez retrieval service. It doesn't read PDB files but can be more straightforward to use than DeepView.
DeepView by GlaxoSmithKline & Swiss Institute of Bioinformatics. Awesome program for viewing and studying protein structure.
ExPASy Proteomics server
- by the Swiss Institute of Bioinformatics. Collection of links to many pages to calculate parameters of your favorite proteins.
- by Sali Lab. For homology or comparative modeling of protein three-dimensional structures.
Zinc Finger Tools by Barbas Lab. Design Zinc Finger DNA binding proteins.
CAD Tools CAD工具
- by Deepak Chandran. Construct computational models using biological parts, cells, and modules.
- by Kent McClymont and Orkun Soyer. Construct thermodynamically feasible metabolic paths among user-defined compounds.
General Tools 一般工具
- by Institut Pasteur. E. coli genome site; get sequences, see the position of your gene in the chromosome, see the function of your gene, and other fun stuff. You can also search for protein sequences/motifs within the E. coli genome.
Registry of Standard Biological Parts
- by MIT. Open repository of BioBricks; the place for all your standard biological parts.
- A site where you can explore the various features of the JBEI Registry software, and even get some work done! A DNA part, plasmid, microbial strain, and Arabidopsis Seed online repository with physical sample tracking capabilities.
PaR-PaR Laboratory Automation Platform allows researchers to use liquid-handling robots effectively, enabling experiments that would not have been considered previously. After minimal training, a biologist can independently write complicated protocols for a robot within an hour.
Other Databases (Annotations, Ontologies, Consortia, etc.)
Miscellaneous Tools 其他工具
Computational Resources 计算资源
Human Diseases and Cancer Databases 人类疾病与癌症数据库
- a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily.
- an integrated database of human maladies and their annotations, modeled on the architecture and richness of the popular GeneCards database of human genes.
- (The Cancer Genome Atlas) a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer.
- (Cancer Cell Line Encyclopedia) a collaboration between the Broad Institute and the Novartis Institutes for BioMedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns, and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis, and visualization for about 1000 cell lines.
- Cancer Genome Anatomy Project
- The Cancer Genome Atlas
- Cancer Genetics With an Edge
- Cancer Genomes and Their Implications for Curing Cancer by Bert Vogelstein
- Cancer Risk Prediction Models and Assessment
- Database of 100,000 Chest X-Ray images, associated data, and diagnoses
- (Digital Imaging and Communications in Medicine) the international standard to transmit, store, retrieve, print, process, and display medical imaging information.
- The Cancer Imaging Archive (TCIA)
- Cross-sectional MRI Data in Young, Middle Aged, Nondemented and Demented Older Adults; Longitudinal MRI Data in Nondemented and Demented Older Adults
- Alzheimer’s Disease Neuroimaging Initiative (ADNI) unites researchers with study data as they work to define the progression of Alzheimer’s disease. ADNI researchers collect, validate and utilize data such as MRI and PET images, genetics, cognitive tests, CSF and blood biomarkers as predictors for the disease.
- The Federal Interagency Traumatic Brain Injury Research (FITBIR) informatics system: MRI, PET, Contrast, and other data on a range of TBI conditions
- STructured Analysis of the Retina: This research concerns a system to automatically diagnose diseases of the human eye.
- Cancer Digital Slide Archive
- Whole-slide images from The Cancer Genome Atlas's (TCGA) glioblastoma multiforme (GBM) samples
- The Cancer Imaging Archive
- The image data in The Cancer Imaging Archive (TCIA) is organized into purpose-built collections of subjects. The subjects typically have a cancer type and/or anatomical site (lung, brain, etc.) in common.
- Johns Hopkins Medical Institute
- DTI Atlases: adults, children, ...
- Duke Center for In Vivo Microscopy
- Small animal MRI, CT, ...
National Alliance for Medical Image Computing (NAMIC): Lupus white matter lesions, Brain MRI: 2-4 years old, Prostate
NLM: Imaging Methods Assessment and Reporting: Liver tumors with segmentations
100 Healthy Brain MRI: 18-90 years old
- UCI Machine Learning Repository
- The father of internet data archives for all forms of machine learning.
- Computer Vision Online Image Archive
- Large listing of multiple databases in computer vision and biomedical imaging
- Cornell Visualization and Image Analysis (VIA) group
- Provides a list of available databases, many of which are also listed here.
- UT Health Science Center Image Collections:
- List of medical images, atlases, and databases available on the web.
- OmniMedicalSearch.com: Medical Image Databases & Libraries
- Digital Database for Screening Mammography (DDSM)
- Large collection with normal and abnormal findings and ground truth.
- Digital Retinal Images for Vessel Extraction (DRIVE)
- Digital images and expert segmentations of retinal vessels.
- Japanese Society of Radiological Technology (JSRT) Database
- Digital Chest X-ray images with lung nodule locations, ground truth, and controls.
- Segmentation in Chest Radiographs (SCR) database
- Digital Chest X-ray images with segmentations of lung fields, heart, and clavicles.
- Public Lung Database to Address Drug Response
- Well documented chest CT images.
- Mammographic Image Analysis Society (mini-MIAS) Database
- Mammographic images and markup.
- Standard Diabetic Retinopathy Database (DIARETDB1)
- Digital retinal images for detecting and quantifying diabetic retinopathy.
- SpineWeb is an online collaborative platform for everyone interested in research on spinal imaging and image analysis.
- MR data of hips, knees, and other sites affected by OA
- MR pediatric repository
- MR brain
- 3D photography
- NIH-funded datasets.
For example, from UNC
Bioinformatics on-line course materials and tutorials 生物信息学在线课程教材
Background Information & News 背景信息与新闻
Bioinformatics Conferences 国际会议
Other Collections 其他资源
Nucleic Acids Research (NAR) database category list
Nucleic Acids Research (NAR) web server category list
- Online Bioinformatics Resources Collection
Bioinformatics, Databases and Software for
- Covers recent literature, tutorials, links, bioinformatics databases, jobs, and news; updated daily
- critical mass of content exceeds 50,000 papers from more than sixty authoritative journals, all full-text searchable and linked to external bibliographic databases.
- new developments in genome bioinformatics and computational biology
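- Awesome Bioinformatics: a fairly comprehensive list of commonly used bioinformatics tools and packages, organized by purpose, including Python, R, and other software packages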
ExPASy Proteomics tools
- Computational Tools (Nature Reviews Genetics)
- BioThink: Bioinformatics Services, Scientific Editing, Synthetic Biology
- Sequence Manipulation Suite
Bioinformatics Journals 相关期刊