ProtVista

A BioJS viewer for protein sequence features

This version of ProtVista is now deprecated. Head over here to discover the new version!

Interaction and customization

ProtVista presents sequence feature tracks under a ruler that represents the length of the protein sequence. The track names are listed on the left-hand side and the sequence features are shown in the horizontal tracks under the ruler.

You can expand a feature track to view all the sub-category titles by clicking on the blue area with the track name. In the example below, ‘Domains & Sites’ expands into ‘Domain’, ‘Binding site’, ‘Active site’ and ‘Nucleotide binding site’.

Expanding a feature track

Zoom

You can zoom into a feature track at various levels by double-clicking on the feature you are interested in, using your mouse zoom function, or dragging the edges of the ruler into the area that you wish to zoom into. You can then slide the selected area on the ruler to focus on your area of interest.

Zoom

You can directly zoom in to the level of the sequence (maximum possible zoom resolution) by clicking on the zoom icon.

Using the zoom icon

Shapes and colors of features

Sequence features that generally span multiple amino acids (like domains) are represented by rectangles. Other shapes represent sequence features that generally span only one amino acid. For example, in the screenshot below, the purple rectangle represents a domain and the pink circle represents an active site. One track can have several sub-categories of features grouped inside it. Each feature sub-category has a dedicated color (for example, ‘active site’ has a different color from ‘binding site’).

Shapes and colours

Getting more information on track names and features

To learn more about a feature track’s or sub-category’s name and definition, hover over the title with your mouse.

Tracks title

To get more information about a sequence feature, click on it and you will see an info-box as shown below. You will also see a yellow highlight across the area occupied by the sequence feature so that you can easily see which other features overlap with it. Any selections can be reset by clicking on the white space in the tracks or by clicking on the reset icon.

Feature selection
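The overlap check behind the yellow highlight can be pictured as a simple interval comparison. The sketch below is only an illustration, not ProtVista's own code: the `Feature` record, its field names and the sample coordinates are all hypothetical, and positions are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature record: type plus inclusive begin/end positions."""
    type: str
    begin: int
    end: int


def overlapping(selected, features):
    """Return the features whose range shares at least one position with `selected`."""
    return [
        feature
        for feature in features
        if feature is not selected
        and feature.begin <= selected.end
        and selected.begin <= feature.end
    ]


# Made-up sample data for illustration only.
features = [
    Feature("domain", 10, 120),
    Feature("act_site", 57, 57),
    Feature("helix", 100, 140),
    Feature("transmem", 200, 220),
]

selected = features[0]
overlap_types = [feature.type for feature in overlapping(selected, features)]
print(overlap_types)
```

With this sample data, selecting the domain reports the active site and the helix, while the transmembrane region lies outside the highlighted range.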

Customize

You can customize the tracks that you see by clicking on the settings icon. You will see the list of all tracks with checkboxes to the left. You can deselect a checkbox in order to remove a track from view and select it to add the track back.

Customise tracks

Natural variants track

The natural variants track shows natural variants from UniProt annotations as well as Large Scale Studies (COSMIC, 1000 Genomes, Exome Sequencing Project, ExAC). The default track view shows a summary chart indicating the number of variants at each sequence position. For example, the peaks show areas of high variation.

Variants overview
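To make the summary chart concrete, the following sketch counts variants per sequence position and finds the highest peak. It is an assumption-laden illustration rather than the viewer's implementation: the tuple layout of the variant records, the function name `variants_per_position` and the sample data are hypothetical.

```python
from collections import Counter

# Hypothetical variant records: (sequence position, wild-type residue, variant residue).
variants = [
    (12, "A", "T"),
    (12, "A", "V"),
    (12, "A", "G"),
    (40, "R", "H"),
    (87, "G", "D"),
    (87, "G", "S"),
]


def variants_per_position(records, sequence_length):
    """Return a list where index i holds the number of variants at position i + 1."""
    counts = Counter(position for position, _wild_type, _variant in records)
    return [counts.get(position, 0) for position in range(1, sequence_length + 1)]


sequence_length = 100  # assumed protein length for this example
summary = variants_per_position(variants, sequence_length)

# The position with the most variants corresponds to the tallest peak.
peak_position = max(range(1, sequence_length + 1), key=lambda p: summary[p - 1])
print(peak_position, summary[peak_position - 1])
```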

To expand the variants track and see all individual variants, click on the ‘Natural variants’ title area. You will now see a chart where the x-axis shows the sequence positions and the y-axis shows all possible amino acids (shown below). All natural variants found for your protein are plotted on this chart, represented by colored circles. The variants are color coded by deleteriousness and source, with the exact legend shown on the left-hand side, in the track title area.

Variants expanded track
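The layout of the expanded chart can be thought of as a mapping from each variant to a pair of coordinates. The sketch below assumes an alphabetical row order of the twenty standard one-letter amino acid codes; the actual row order and any extra rows used by the viewer are not stated here, so both `AMINO_ACIDS` and `chart_coordinates` are hypothetical.

```python
# Assumed row order for the y-axis: the twenty standard amino acids, alphabetically.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def chart_coordinates(position, variant_residue):
    """Map a variant to (x, y): x is the sequence position, y is the amino acid row."""
    # str.index raises ValueError for residues outside the assumed alphabet.
    return position, AMINO_ACIDS.index(variant_residue)


x, y = chart_coordinates(12, "T")
print(x, y)
```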

You can filter the variants to see those curated by UniProt (UniProt Reviewed) or those from Large Scale Studies. You can also filter by consequence: disease associated, predicted consequence (ranging from predicted deleterious to predicted benign), non-disease associated, and variants affecting the initiator codon, stop loss or stop gain.
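As a rough sketch of how such filters combine, the example below keeps a variant only when it matches both the selected sources and the selected consequences. The `Variant` record, the string keys such as `uniprot_reviewed` and `init_stop`, and the assumption that the two filter groups are combined with a logical AND are all hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record used only for this illustration."""
    position: int
    change: str
    source: str
    consequence: str


# Made-up sample data.
variants = [
    Variant(12, "A>T", "uniprot_reviewed", "disease"),
    Variant(12, "A>V", "large_scale_study", "predicted"),
    Variant(40, "R>H", "large_scale_study", "non_disease"),
    Variant(1, "M>V", "uniprot_reviewed", "init_stop"),
]


def filter_variants(records, sources=None, consequences=None):
    """Keep records matching the chosen sources and consequences; None means no restriction."""
    return [
        variant
        for variant in records
        if (sources is None or variant.source in sources)
        and (consequences is None or variant.consequence in consequences)
    ]


reviewed = filter_variants(variants, sources={"uniprot_reviewed"})
reviewed_disease = filter_variants(
    variants, sources={"uniprot_reviewed"}, consequences={"disease"}
)
print([variant.change for variant in reviewed])
print([variant.change for variant in reviewed_disease])
```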

Click on a variant to view more details about it, including the exact sequence change, the evidence, the source and identifiers.

Variants tool tip

Feature categories and types

Feature types are grouped in categories. Here we present the list of the currently supported feature categories and types.

Category name: DOMAIN_AND_SITES. Label: Domains & sites

| Type | Label | Description | Shape |
|---|---|---|---|
| domain | Domain | Position and type of each modular protein domain | rectangle |
| repeat | Repeat | Positions of repeated sequence motifs or repeated domains | rectangle |
| ca_bind | Calcium binding | Position(s) of calcium binding region(s) within the protein | rectangle |
| zn_fing | Zinc finger | Position(s) and type(s) of zinc fingers within the protein | rectangle |
| dna_bind | DNA binding | Position and type of a DNA-binding domain | rectangle |
| np_bind | Nucleotide binding | Nucleotide phosphate binding region | rectangle |
| region | Region | Region of interest in the sequence | rectangle |
| coiled | Coiled coil | Positions of regions of coiled coil within the protein | rectangle |
| motif | Motif | Short (up to 20 amino acids) sequence motif of biological interest | rectangle |
| act_site | Active site | Amino acid(s) directly involved in the activity of an enzyme | circle |
| metal | Metal binding | Binding site for a metal ion | diamond |
| binding | Binding site | Binding site for any chemical group (co-enzyme, prosthetic group, etc.) | cat-face-like polygon |
| site | Site | Any interesting single amino acid site on the sequence | chevron pointing down |

Category name: MOLECULE_PROCESSING. Label: Molecule processing

| Type | Label | Description | Shape |
|---|---|---|---|
| init_met | Initiator methionine | Cleavage of the initiator methionine | arrow pointing down |
| signal | Signal | Sequence targeting proteins to the secretory pathway or periplasmic space | rectangle |
| transit | Transit peptide | Extent of a transit peptide for organelle targeting | rectangle |
| propep | Propeptide | Part of a protein that is cleaved during maturation or activation | rectangle |
| chain | Chain | Extent of a polypeptide chain in the mature protein | rectangle |
| peptide | Peptide | Extent of an active peptide in the mature protein | rectangle |

Category name: PTM. Label: Post translational modifications

| Type | Label | Description | Shape |
|---|---|---|---|
| mod_res | Modified residue | Modified residues excluding lipids, glycans and protein cross-links | triangle pointing down |
| lipid | Lipidation | Covalently attached lipid group(s) | wave |
| carbohyd | Glycosylation | Covalently attached glycan group(s) | hexagon |
| disulfid | Disulfide bond | Cysteine residues participating in disulfide bonds | either bridge or antenna |
| crosslnk | Cross-link | Residues participating in covalent linkage(s) between proteins | either bridge or antenna |

Category name: SEQUENCE_INFORMATION. Label: Sequence information

| Type | Label | Description | Shape |
|---|---|---|---|
| compbias | Compositional biased | Region of compositional bias in the protein | rectangle |
| non_std | Non-standard residue | Occurrence of non-standard amino acids (selenocysteine and pyrrolysine) in the protein sequence | pentagon pointing right |
| unsure | Sequence uncertainty | Regions of uncertainty in the sequence | rectangle |
| conflict | Sequence conflict | Description of sequence discrepancies of unknown origin | rectangle |
| non_cons | Non-adjacent residues | Indicates that two residues in a sequence are not consecutive | slash symbol |
| non_ter | Non-terminal residue | The sequence is incomplete. Indicates that a residue is not the terminal residue of the complete protein | slash symbol |

Category name: STRUCTURAL. Label: Structural features

| Type | Label | Description | Shape |
|---|---|---|---|
| helix | Helix | Helical regions within the experimentally determined protein structure | rectangle |
| turn | Turn | Turns within the experimentally determined protein structure | rectangle |
| strand | Beta strand | Beta strand regions within the experimentally determined protein structure | rectangle |

Category name: TOPOLOGY. Label: Topology

| Type | Label | Description | Shape |
|---|---|---|---|
| topo_dom | Topological domain | Location of non-membrane regions of membrane-spanning proteins | rectangle |
| transmem | Transmembrane | Extent of a membrane-spanning region | rectangle |
| intramem | Intramembrane | Extent of a region located in a membrane without crossing it | rectangle |

Category name: MUTAGENESIS. Label: Mutagenesis

| Type | Label | Description | Shape |
|---|---|---|---|
| mutagen | Mutagenesis | Site which has been experimentally altered by mutagenesis | rectangle |

Category name: PROTEOMICS. Label: Proteomics

| Type | Label | Description | Shape |
|---|---|---|---|
| unique | Unique peptides | Unique peptides based on peptide evidence mapped from mass-spectrometry proteomics services (PeptideAtlas, EPD and MaxQB) to UniProtKB sequences | rectangle |
| non-unique | Non-unique peptides | Non-unique peptides based on peptide evidence mapped from mass-spectrometry proteomics services (PeptideAtlas, EPD and MaxQB) to UniProtKB sequences | rectangle |

Category name: VARIATION. Label: Variants

| Type | Label | Description | Shape |
|---|---|---|---|
| variant | Natural variant | Description of a natural variant of the protein | circle |
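The category and shape information listed above lends itself to a simple lookup table. The sketch below copies a handful of rows from the lists into a Python dictionary; the dictionary name `FEATURE_TYPES` and the helper functions are hypothetical and are not part of the ProtVista API.

```python
# A subset of the feature types listed above: type -> (category name, shape).
FEATURE_TYPES = {
    "domain": ("DOMAIN_AND_SITES", "rectangle"),
    "act_site": ("DOMAIN_AND_SITES", "circle"),
    "metal": ("DOMAIN_AND_SITES", "diamond"),
    "init_met": ("MOLECULE_PROCESSING", "arrow pointing down"),
    "carbohyd": ("PTM", "hexagon"),
    "helix": ("STRUCTURAL", "rectangle"),
    "variant": ("VARIATION", "circle"),
}


def shape_for(feature_type):
    """Return the shape used for a feature type, or None if the type is not in the subset."""
    entry = FEATURE_TYPES.get(feature_type)
    return entry[1] if entry else None


def types_in_category(category):
    """Return the feature types of a category, sorted alphabetically."""
    return sorted(
        feature_type
        for feature_type, (name, _shape) in FEATURE_TYPES.items()
        if name == category
    )


print(shape_for("act_site"))
print(types_in_category("DOMAIN_AND_SITES"))
```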
