LON-CAPA Nucleic Acid Databases

Welcome to the GenomeWeb
Nucleic Acid Databases

This is a collection of nucleic acid database sites.

Search Major Sequence Databases

Search Databases at EBI (EMBL)

Search Databases at NCBI (GenBank)

Search Databases at GSDB

Search Databases at DDBJ

BLAST search of databases at NCBI

BLAST search of databases at NCGR

BLASTula, the server of Blast servers

BLAST2 search of databases at EMBL

INCA with BLAST / Entrez

dbEST Directory (GenBank)

dbGSS - Genome Survey Sequence

SRS-FASTA: Similarity Search of GenBank Subsets

Sequence Retrieval System (SRS)

WWW-Query - sequence data and multivariate analysis

Miscellaneous Nucleic Acid Databases

REBASE The Restriction Enzyme Database

Multi-Cut - A Data Base of Restriction Endonuclease Buffers

Sequence Tag Alignment and Consensus Knowledgebase (STACK)

Vector Sequence Database

Codon Usage Database

ImMunoGeneTics Database (IMGT)

EPD Eukaryotic Promoter Database

The Tumor Gene Database

The 100 Kb Club

Nucleic Acid Database (NDB) Project

DNA Patents Database

Molecular Probe Data Base (MPDB)

Detailed information on the above options

Search Databases at EBI (EMBL)
The EBI provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.

Search Databases at NCBI (GenBank)
The NCBI provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.

Search Databases at GSDB
The GSDB provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.

Search Databases at DDBJ
The DDBJ provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.

BLAST search of databases at NCBI
BLAST is a program that allows you to search for similarity between your query sequence and the gene sequences held at the NCBI.

BLAST search of databases at NCGR
BLAST is a program that allows you to search for similarity between your query sequence and the gene sequences held at the NCGR.

BLASTula, the server of Blast servers
BLASTula is a group of pages offering a single point of access to more than 40 different BLAST servers world-wide, each operating on its own set of sequences. The pages cover:

BLAST documentation
"classical" BLAST analyses
BLAST searches operating on specialised databases
enhanced BLAST analyses (Wu-Blast, Beauty-Blast, Prodom-Blast, ...)

BLAST2 search of databases at EMBL
BLAST2 is a program that allows you to search for similarity between your query sequence and the gene sequences held at EMBL. It is similar to the original BLAST program, but it allows gaps in the alignments.

INCA with BLAST / Entrez
Iterative Neighborhood Cluster Analysis

INCA is a Java applet that runs BLAST

INCA is a Java 1.02 applet. Give INCA a starter sequence and it finds related sequences. INCA runs BLAST on the starter sequence and then runs BLAST on the matching sequences. INCA keeps track of all the results. INCA originally accessed the Entrez predefined sequence neighbors. Now INCA uses the BLAST server to find sequence neighbors dynamically. Using BLAST instead of Entrez to find neighbors permits one to adjust search parameters as needed, and can improve search results.
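The iterative search described above can be pictured as a breadth-first expansion from the starter sequence. The following is a minimal sketch in Python, not INCA's own code: `find_neighbors` is a hypothetical stand-in for a BLAST call, `max_rounds` is an assumed stopping rule, and the neighbor table is invented for illustration.

```python
from collections import deque

def iterative_neighborhood(starter, find_neighbors, max_rounds=2):
    """Expand outward from a starter sequence ID, round by round.

    find_neighbors is a hypothetical callable standing in for a BLAST
    search: it takes a sequence ID and returns the IDs of its matches.
    Returns a dict mapping each sequence found to the round it was found in.
    """
    found = {starter: 0}
    queue = deque([starter])
    while queue:
        current = queue.popleft()
        depth = found[current]
        if depth >= max_rounds:
            continue  # do not search from sequences found in the last round
        for hit in find_neighbors(current):
            if hit not in found:  # keep track of results and skip repeats
                found[hit] = depth + 1
                queue.append(hit)
    return found

# Toy neighbor table (invented data, not real BLAST output).
TOY_HITS = {
    "seqA": ["seqB", "seqC"],
    "seqB": ["seqA", "seqD"],
    "seqC": [],
    "seqD": ["seqE"],
}

clusters = iterative_neighborhood("seqA", lambda s: TOY_HITS.get(s, []))
print(clusters)
```

With two rounds, the toy run reaches seqB and seqC in round one and seqD in round two, and stops before searching from seqD.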

dbEST Directory (GenBank)
Expressed Sequence Tags (ESTs) are 'single pass' partial DNA sequences derived from clones randomly selected from cDNA libraries. The EST data are maintained by NCBI and included in the GenBank database. Because these data differ from traditional GenBank entries and thus require special processing and annotation, NCBI also makes them available in a separate database, dbEST. The full reports contain information on the availability of physical cDNA clones and mapping data in collaboration with the Genome Data Base at Johns Hopkins University.

dbGSS - Genome Survey Sequence
Contains contact information for the contributors, experimental conditions and genetic map locations for the Genome Survey Sequence division of GenBank/EMBL.

SRS-FASTA: Similarity Search of GenBank Subsets
This service searches your query sequence against subsets of the nucleic acid and protein databanks. You choose the subsets by keyword selection on the sequence documentation.

There may be times when you will get better information by eliminating unwanted sections of the databanks before performing a sequence search. Given the large size and constant updates to the biosequence databanks, it is difficult to produce subsets of these data directly for similarity searching. By coupling similarity search software (FastA) with keyword selection software (SRS), one can provide such searches fairly efficiently.
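The coupling of keyword selection with similarity search can be sketched in two steps: filter first, then score only what is left. This is a rough illustration and not the SRS or FastA implementation: the shared 3-mer count is a deliberately crude stand-in for a FastA score, and the records, field names and function names are all invented.

```python
def select_subset(records, keyword):
    """Keep only records whose documentation keywords include `keyword`."""
    wanted = keyword.lower()
    return [r for r in records if wanted in {k.lower() for k in r["keywords"]}]

def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def rank_by_similarity(query, records, k=3):
    """Rank records by shared k-mer count (a crude stand-in for FastA)."""
    q = kmers(query, k)
    scored = [(len(q & kmers(r["sequence"], k)), r["id"]) for r in records]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

# Invented toy databank.
RECORDS = [
    {"id": "X1", "keywords": ["kinase", "human"], "sequence": "ATGGCGTACG"},
    {"id": "X2", "keywords": ["kinase", "mouse"], "sequence": "TTTTTTTTTT"},
    {"id": "X3", "keywords": ["tRNA"], "sequence": "ATGGCGTACG"},
]

subset = select_subset(RECORDS, "kinase")       # keyword step
ranking = rank_by_similarity("ATGGCGT", subset)  # similarity step
print(ranking)
```

Record X3 would score as well as X1, but it never reaches the similarity step because the keyword filter removed it, which is the point of restricting the search space first.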

Sequence Retrieval System (SRS)
A powerful search tool with links between more than 20 molecular biology databases (EMBL, SwissProt, PIR, PDB, Prosite ...), allowing complex searches.

WWW-Query - sequence data and multivariate analysis
This is a World-Wide Web server for accessing sequence collections indexed with ACNUC and for performing multivariate analyses on sequences. General collections like GenBank or EMBL can be accessed, as well as specialized data banks like Hovergen or NRSub.

Indexing with ACNUC makes it possible to build queries that use many criteria to retrieve sequences. Criteria are based on mnemonics, accession numbers, keywords, taxonomic data, bibliographic references, dates of insertion in the bank, the nature of the genome from which a sequence has been obtained, etc. In addition, the notion of subsequence introduced in ACNUC makes it possible to retrieve independently genomic fragments of biological interest such as CDS, tRNAs, rRNAs, snRNAs, etc.

The result of each query is a list of sequences, which is stored temporarily on the server. In this way, a previous list can be re-used to build more complex queries or to run treatments on a set of sequences. So far, these treatments consist mainly of programs for performing multivariate analyses on the CDS or the proteins. The methods are Principal Component Analysis (PCA), Correspondence Analysis (COA), and Multiple Correspondence Analysis (MCA).
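The re-use of stored result lists can be illustrated with ordinary set operations. This is only a sketch of the idea, not the ACNUC query language: the `run_query` helper, the record fields and the sample records are all invented for the example.

```python
def run_query(saved, name, records, predicate):
    """Store the IDs of matching records under `name` and return them."""
    saved[name] = {rec["id"] for rec in records if predicate(rec)}
    return saved[name]

# Invented toy records.
RECORDS = [
    {"id": "S1", "organism": "Homo sapiens", "type": "CDS"},
    {"id": "S2", "organism": "Homo sapiens", "type": "tRNA"},
    {"id": "S3", "organism": "Mus musculus", "type": "CDS"},
    {"id": "S4", "organism": "Homo sapiens", "type": "CDS"},
]

saved = {}  # plays the role of the lists kept on the server
run_query(saved, "cds", RECORDS, lambda r: r["type"] == "CDS")
run_query(saved, "human", RECORDS, lambda r: r["organism"] == "Homo sapiens")

# Earlier lists are combined into more complex queries without re-searching.
saved["human_cds"] = saved["cds"] & saved["human"]
saved["human_non_cds"] = saved["human"] - saved["cds"]
print(sorted(saved["human_cds"]))
```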

REBASE The Restriction Enzyme Database
REBASE is a collection of information about restriction enzymes, methylases, the microorganisms from which they have been isolated, recognition sequences, cleavage sites, methylation specificity, the commercial availability of the enzymes, and references to both published and unpublished observations.

Multi-Cut - A Data Base of Restriction Endonuclease Buffers
Multi-Cut is a database of restriction endonuclease buffers. It finds compatible buffers for a list of enzymes that you want to use in a multiple restriction endonuclease digest. Multi-Cut searches through activity data from the catalogs of several major restriction endonuclease manufacturers and finds buffers that will work with all of the endonucleases in the reaction.
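Finding a buffer that works for every enzyme in a digest amounts to intersecting the sets of acceptable buffers for each enzyme. The sketch below assumes a simple activity threshold; it is not Multi-Cut's own method, and the buffer names, activity percentages and the 75% cut-off are invented for illustration rather than taken from any catalog.

```python
def compatible_buffers(activity, enzymes, min_activity=75):
    """Return buffers in which every listed enzyme meets min_activity (%)."""
    shared = None
    for enzyme in enzymes:
        ok = {buf for buf, pct in activity[enzyme].items() if pct >= min_activity}
        shared = ok if shared is None else shared & ok
    return sorted(shared or [])

# Invented activity table: enzyme -> buffer -> percent activity.
ACTIVITY = {
    "EcoRI":   {"Buffer A": 100, "Buffer B": 50,  "Buffer C": 100},
    "BamHI":   {"Buffer A": 75,  "Buffer B": 100, "Buffer C": 100},
    "HindIII": {"Buffer A": 25,  "Buffer B": 100, "Buffer C": 75},
}

double_digest = compatible_buffers(ACTIVITY, ["EcoRI", "BamHI"])
triple_digest = compatible_buffers(ACTIVITY, ["EcoRI", "BamHI", "HindIII"])
print(double_digest, triple_digest)
```

Adding a third enzyme can only shrink the set of shared buffers, which is why the triple digest in this toy table is left with a single option.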

Sequence Tag Alignment and Consensus Knowledgebase (STACK)
Aims to make the most comprehensive representation of the sequence of each of the expressed genes in the human genome.

Vector Sequence Database
The Vector Sequence Database contains the sequences of many vectors, all in GenBank format.

Phage Vectors
Plasmid Vectors
Phagemid Vectors
Phasmid Vectors
Cosmid Vectors
Virus Vectors
YAC Vectors
Organism subsets
Search Vectordb
Advertise your vectors
Other vector resources

Codon Usage Database
A query box is provided for searching the codon usage table of an organism. Searches can be made by Latin name or by common name.

Alphabetical lists of all organisms, and lists of the organisms with 100 or more CDSs in GenBank, are also available.
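For readers unfamiliar with what a codon usage table holds, the sketch below counts codons in a single coding sequence and reports each as a frequency per thousand codons. The per-thousand scaling is an assumption chosen for the example, the function name is hypothetical, and the sequence is invented; it is not how the database itself is built.

```python
from collections import Counter

def codon_usage(cds):
    """Return codon frequencies per thousand codons for one coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: round(1000 * n / total, 1) for codon, n in counts.items()}

# Invented toy sequence: ATG GCT GCT GCC TAA
usage = codon_usage("ATGGCTGCTGCCTAA")
print(usage)
```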

ImMunoGeneTics Database (IMGT)
The ImMunoGeneTics database, IMGT, is an integrated specialised database containing nucleotide sequence information on genes important in the function of the immune system. It collects and annotates sequences belonging to the immunoglobulin superfamily which are involved in immune recognition. These are the B cell antigen receptor (Immunoglobulin or Ig) and the T cell antigen receptor (TCR), held in the LIGM-database, and the class I and class II molecules of the Human Leucocyte Antigens (HLA) system, held in the HLA-database.

EPD Eukaryotic Promoter Database
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of experimentally characterised eukaryotic POL II promoters.

The Tumor Gene Database
A database of genes associated with tumorigenesis and cellular transformation. This database includes oncogenes, proto-oncogenes, tumor suppressor genes/anti-oncogenes, regulators and substrates of the above, regions believed to contain such genes (such as tumor-associated chromosomal break points and viral integration sites), and other genes and chromosomal regions that seem relevant.

The 100 Kb Club
This is a collection of known sequences over 100 Kb in size prepared by Keith Robison.

Nucleic Acid Database (NDB) Project
The goal of the Nucleic Acid Database Project is to assemble and distribute structural information about nucleic acids.

Structures may be selected by making choices based on a large variety of structural and experimental characteristics.

The user can then view the structure's coordinates in either NDB or PDB format, view the structure's full NDB entry, view the structure using either a local viewer or the remote viewer (RasMol), or display the structure's atlas entry.

DNA Patents Database
The DNA Patents Database, compiled by the National Academy of Sciences (USA), contains the full text of patents. It is set up to provide the key biological information about each patent: which genes are included, the techniques used in their discovery and the precise extent of the claims made in each patent.

Molecular Probe Data Base (MPDB)
Contains information on ca. 4000 synthetic oligonucleotides with a sequence of up to 100 nucleotides.

Any Comments, Questions? Support@hgmp.mrc.ac.uk