Topic 2: Public data resources

Sequence databases, searching, downloading. Genome sites and other "added value" resources. Tools for data access and manipulation.


Where can I obtain sequence data?

Selected resources will be demonstrated and used in the practical tasks.

The NCBI portal

- more than just sequences:

Searching GenBank:
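
A GenBank search can also be written as a query string with field tags and sent to the NCBI E-utilities `esearch` endpoint. The sketch below only builds the request URL and does not contact the server. The helper names `entrez_term` and `esearch_url` are hypothetical, and the field tags chosen here (`[Organism]`, `[Title]`) are just one plausible way to phrase such a query.

```python
from urllib.parse import urlencode

# Public E-utilities search endpoint at NCBI.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_term(organism, keyword, field="Title"):
    """Combine an organism restriction with a keyword in a given field.

    Hypothetical helper; the choice of the [Title] field is an assumption,
    other fields (e.g. [All Fields]) may return more hits.
    """
    return f"{organism}[Organism] AND {keyword}[{field}]"


def esearch_url(db, term, retmax=20):
    """Build (but do not send) an esearch request URL."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


term = entrez_term("Populus", "extensin")
url = esearch_url("protein", term)
print(term)
print(url)
```

Pasting the printed term into the search box of the NCBI protein database should give the same hits as the URL. Broadening the field from `[Title]` to `[All Fields]` trades precision for recall.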


EMBL/EBI database portal


Genome sites - the Arabidopsis example:

A frequently used general-purpose browser for displaying and organizing genome information: JBrowse


Transcriptome sites (demo):


Ontology:

A controlled vocabulary for designating biological processes, structures, conditions, etc.


Last but not least, finding literature:


Tasks

2.1

Obtain and inspect the sequence of the pGWB4 cloning vector. Examine the various sequence formats available and the annotations. Using the knowledge from the previous lesson, construct a map of pGWB4.
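
To compare sequence formats side by side, the same record can be requested in several formats from the NCBI E-utilities `efetch` endpoint. The following is a minimal sketch that only constructs the URLs. The helper name `efetch_url` is hypothetical, and `YOUR_ACCESSION` is a placeholder that must be replaced with the accession number found in the database search.

```python
from urllib.parse import urlencode

# Public E-utilities retrieval endpoint at NCBI.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Two common return types for the nucleotide database:
# "fasta" gives the bare sequence, "gb" the annotated GenBank flat file.
FORMATS = {"FASTA": "fasta", "GenBank": "gb"}


def efetch_url(accession, rettype, db="nuccore"):
    """Build (but do not send) an efetch request URL for one record."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH}?{urlencode(params)}"


ACCESSION = "YOUR_ACCESSION"  # placeholder, not a real accession number

for label, rettype in FORMATS.items():
    print(label, efetch_url(ACCESSION, rettype))
```

The GenBank flat file carries the feature annotations needed for drawing a map, while the FASTA record contains only the header line and the sequence.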

2.2

Search the protein section of the GenBank/EMBL database for poplar (Populus sp.) extensins.

2.3

Search the protein section of the GenBank/EMBL database for Rab GDP dissociation inhibitors of Arabidopsis thaliana.

2.4

Use the plant BioMart tool to find all genes annotated as actin binding (using the corresponding gene ontology term) in the Arabidopsis thaliana (TAIR10) database. In the final gene list, determine the number of genes and identify at least 3 gene families whose members bind actin in Arabidopsis thaliana.