Domain Networks
In the interaction networks we have examined so far, the smallest unit
of the network is a protein. In reality, proteins are assembled
from domains: structurally compact units, each with a distinct
function. A more recent development in interaction network
analysis is the domain interaction network, which describes
protein interactions more specifically, in terms of the
domains involved in each interaction. This has two main applications:
- Predicted domain interactions increase your confidence in
experimentally-observed interactions. The technology for observing
interactions experimentally has a high false positive rate. But,
where observed interactions are supported by another form of
evidence, such as domain interactions, you can have greater
confidence in the observed interaction.
- Where a protein has multiple isoforms, due to either
alternative splicing or genetic variation, domain interactions can
help determine which interactions may still occur. With alternative
splicing, the protein produced may be a shorter form that lacks some
domains, and thus cannot participate in some interactions. With
genetic variation, one or more protein domains may fail to fold
properly or may assume a slightly different configuration, thus
altering the binding propensity of that domain. In
both cases, domain interaction networks help identify the portions
of the network most directly affected by the protein variation, and
help determine which interactions might and might not occur.
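The two applications above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how any plugin works internally: the function name, the bare domain names, and the two shortened isoforms are assumptions made for the example.

```python
from itertools import product


def supporting_domain_pairs(domains_a, domains_b, domain_interactions):
    """Return domain pairs (one per protein) listed as interacting."""
    return sorted(
        (da, db)
        for da, db in product(domains_a, domains_b)
        if frozenset((da, db)) in domain_interactions
    )


# Toy domain-domain interactions, based on the IL6 example used later on.
domain_interactions = {frozenset(("IL6", "ig")), frozenset(("IL6", "fn3"))}

ligand = ["IL6"]
receptor_full = ["ig", "fn3"]
receptor_short = ["ig"]      # hypothetical isoform lacking the fn3 domain
receptor_no_domains = []     # hypothetical isoform lacking both domains

full = supporting_domain_pairs(ligand, receptor_full, domain_interactions)
short = supporting_domain_pairs(ligand, receptor_short, domain_interactions)
lost = supporting_domain_pairs(ligand, receptor_no_domains, domain_interactions)

print(full)   # domain pairs supporting the observed protein interaction
print(short)  # pairs that remain for the shorter isoform
print(lost)   # empty list: no domain-level support remains
```

An observed protein interaction with at least one supporting domain pair earns extra confidence, while an isoform whose list comes back empty is a candidate for a lost interaction.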
This module will introduce you to domain networks under Cytoscape, and
point you to a few resources for protein domain analysis.
This tutorial features the following plugins:
- DomainNetworkBuilder
- DomainWebLinks
and the following data file:
- il6.sp.sif, a small dataset of interactions relating to
IL6 (Interleukin 6). IL6 is a cytokine involved in B-cell
and nerve cell differentiation.
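If you would like to inspect a SIF file outside Cytoscape, a minimal parser might look like the sketch below. It assumes the usual SIF layout of one source node, one interaction type, and one or more target nodes per line. The contents of il6.sp.sif are not reproduced here: the sample lines, the "pp" interaction type, and the placeholder node names are assumptions for illustration.

```python
def parse_sif(text):
    """Parse SIF text into (nodes, edges); edges are (source, type, target)."""
    nodes, edges = set(), []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # SIF uses tabs as the delimiter if any are present, else whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Illustrative lines only; "pp" and the nodeA/nodeB/nodeC names are assumed.
sample = "P40189 pp Q06124\nnodeA pp nodeB nodeC\nisolated_node\n"
nodes, edges = parse_sif(sample)
print(sorted(nodes))
print(edges)
```

A line holding a single name is treated as an isolated node, and a line with several targets expands to one edge per target.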
Download and install the DomainNetworkBuilder and DomainWebLinks
plugins:
- Go to the plugin page at
http://med.bioinf.mpi-sb.mpg.de/domainnet/index.html
(or reach it via the Cytoscape plugins page,
http://www.cytoscape.org/plugins2.php).
- Scroll down to the Download section, and follow the link to sign the
license agreement form.
- Follow the instructions to download the DomainNetworkBuilder and
DomainWebLinks jar files.
- Copy the two jar files into your Cytoscape plugins directory.
- If you are currently running Cytoscape, exit and restart.
Basic operation of the DomainNetworkBuilder plugin
- Load the network il6.sp.sif into Cytoscape. After applying the
yFiles Organic layout, you should see a network like the one shown
below:
- Under the Plugins menu, select Domain Network,
and then Create Domain Interaction Network for Current Network.
- A Cytoscape Message window will appear, asking you to select the
species corresponding to the network, as shown below. Select Homo
sapiens, and click on Connect to Database.
- After a brief pause, you should see a new network, such as the one
shown below (shown after applying the yFiles Organic layout).
- To explain the graph, let us focus on the lower portion of the
network, shown below:
- The yellow nodes represent proteins, while the red nodes represent
protein domains.
- Red arrows connect domains from the same protein, forming a list. Green
arrows connect proteins to their respective domain lists. For
example, we see that the protein P05231 contains the domain
P05231_IL6, and protein P08887 contains domains
P08887_ig, P08887_fn3, and P08887_Pfam-B_34367.
- Black lines connect interacting domains. For example, the domain
P05231_IL6 interacts with domains P08887_ig and
P08887_fn3.
- You can reduce the complexity of this network as follows:
- Under the Plugins menu, select Domain Network, and then
Set Parameters.
- You should see a menu like the one shown below. Check the box next to
Hide domain nodes without visible domain-domain interaction
edges, and click OK.
- Your network should now appear as shown below. Notice that for protein
P29597 at the left, there is no longer a long string of domains
that are not involved in any interaction.
- To get a cleaner picture, reapply a layout algorithm or rearrange nodes
manually. After rearranging nodes, the resulting network is shown
below:
- What does this network tell us? For example,
- Protein Q06124 has three interaction-related domains:
two SH2 domains and
one Y_phosphatase domain. Protein P40189 has four
interaction-related domains: three fn3 domains and one
lep_receptor_Ig domain. These domains are connected, indicating
that both fn3 and lep_receptor_Ig domains tend to
interact with SH2
and Y_phosphatase domains. This adds a level of confidence
to the observed protein-protein interaction between P40189
and Q06124.
- The protein P13725 contains one interaction-related domain: a
LIF_OSM
domain. Any mutation or other aberration that affects this domain
is likely to affect the interactions of this protein. Mutations in
other parts of the protein might or might not change its interaction
pattern.
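To make the three edge types and the hiding option concrete, here is a small Python sketch of the same structure, using the P05231 and P08887 example from above. It is an assumption-laden model, not the plugin's actual code: the function names and edge labels are invented, the green arrow is assumed to point at the first domain in each list, and P08887_Pfam-B_34367 is treated as having no domain-domain edge purely for illustration.

```python
def build_domain_network(protein_domains, domain_interactions):
    """Return typed edges as (source, target, kind) tuples."""
    edges = []
    for protein, domains in protein_domains.items():
        if domains:
            # Green arrow: protein to the head of its domain list (assumed).
            edges.append((protein, domains[0], "protein-to-domain-list"))
        # Red arrows: chain consecutive domains of the same protein.
        for left, right in zip(domains, domains[1:]):
            edges.append((left, right, "same-protein"))
    # Black lines: interactions between domains.
    for a, b in domain_interactions:
        edges.append((a, b, "domain-domain"))
    return edges


def visible_domains(protein_domains, domain_interactions):
    """Mimic the hide option: keep only domains with a domain-domain edge."""
    interacting = {d for pair in domain_interactions for d in pair}
    return {
        protein: [d for d in domains if d in interacting]
        for protein, domains in protein_domains.items()
    }


protein_domains = {
    "P05231": ["P05231_IL6"],
    "P08887": ["P08887_ig", "P08887_fn3", "P08887_Pfam-B_34367"],
}
domain_interactions = [
    ("P05231_IL6", "P08887_ig"),
    ("P05231_IL6", "P08887_fn3"),
]

edges = build_domain_network(protein_domains, domain_interactions)
trimmed = visible_domains(protein_domains, domain_interactions)
print(len(edges))
print(trimmed["P08887"])
```

After filtering, each protein keeps only the domains that take part in at least one domain-domain interaction, which is what shortens the long strings of unconnected domains.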
Further Exploration on domain significance
Domain names are not always immediately intuitive. For any domain, you can
get further information as follows:
- Click on a domain node, such as P08887_fn3.
- Right-click on this node to pull up a menu for further information. If you
follow the link to More Web Info, you should see a menu as shown
below:
- These are all links to additional sources of information.
The links labeled domains only are for the square domain nodes, while
the links labeled proteins only are for the round protein
nodes.
- Pfam is a useful resource for learning about the biological
significance of a type of domain. Select Pfam; this should
bring you to the Pfam entry for the fn3 domain,
Fibronectin type III.
- Where do these domain interactions come from? Return to the menu
shown above, follow the link to InterDom, and you should
arrive at the InterDom entry for domain fn3. InterDom predicts
domain interactions that are statistically likely according to a variety
of factors, such as whether protein-protein interaction databases
frequently report interactions between proteins with two specific
types of domains. See the InterDom web page for further
information.
- How reliable are these predictions? Return to the menu shown above
and select 3DID to arrive at the 3DID web site for
domain fn3.
3DID contains records of domains that are shown interacting in
structural data, when two or more proteins are co-crystallized,
and thus represents a very high level of evidence. However, be
aware that if 3DID does not report a given domain-domain
interaction, that does not mean that the interaction is not real;
it could simply mean that no proteins with those domains
have been co-crystallized, for a variety of technical reasons.
- Notice that the 3DID entry for fn3 reports interactions with IL6
domains. Thus, the domain interaction shown between fn3
and IL6 is very reliable.
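The difference between predicted and structural evidence can be expressed as a simple lookup. The sketch below is hypothetical throughout: the two sets are tiny stand-ins rather than real InterDom or 3DID contents, apart from the fn3 and IL6 pair discussed above, and domX and domY are placeholder names.

```python
def evidence_level(pair, predicted, structural):
    """Classify a domain pair by the strongest evidence recorded for it."""
    key = frozenset(pair)
    if key in structural:
        return "structural"
    if key in predicted:
        return "predicted only"
    # Absence of a record is not evidence that the interaction is unreal.
    return "no evidence recorded"


# Stand-in sets; domX/domY are placeholders, not real domain names.
predicted = {frozenset(("fn3", "IL6")), frozenset(("domX", "domY"))}
structural = {frozenset(("fn3", "IL6"))}

for pair in [("IL6", "fn3"), ("domX", "domY"), ("domX", "fn3")]:
    print(pair, "->", evidence_level(pair, predicted, structural))
```

A "predicted only" result is still useful supporting evidence; it just has not been confirmed by a co-crystallized structure.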
Final note
For the sample network used in this tutorial, the nodes are
labeled with Uniprot protein IDs. The typical interaction
database, however, labels nodes by gene symbols instead. In organisms such as
Homo sapiens, one gene can give rise to multiple different proteins due to
processes such as alternative splicing. Thus, you can map protein
IDs to gene names uniquely, but cannot always map gene names to
protein IDs uniquely. So if you have interaction data in which the
nodes are labeled by gene names, what can you do?
- There are many ID mapping services that will give you a list of protein
IDs corresponding to a gene symbol. For example, see the list at
the bottom of the DomainNetworkBuilder web page at
http://med.bioinf.mpi-inf.mpg.de/domainnet/index.php
In cases where there are many protein IDs listed, you will still
need to choose one. One reasonable option is to select the longest
protein.
- You can use the DomainNetworkBuilder plugin with a network that uses
gene symbols. In cases where multiple protein IDs are available,
the plugin will choose the Uniprot “consensus” sequence, which
is generally the longest protein and the one with the fewest (or no)
unusual mutations. If there is more than one protein for a given
gene, the plugin will give you a warning.
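If you do the mapping yourself, the "select the longest protein" heuristic from the first option is easy to script. The sketch below assumes you have already obtained, from a mapping service of your choice, the candidate protein IDs and their sequence lengths for each gene symbol. The gene symbols, protein IDs, and lengths are placeholders, and the function is not part of the plugin.

```python
import warnings


def choose_protein(gene, candidates):
    """Pick one protein ID for a gene symbol, preferring the longest sequence.

    candidates maps protein ID -> sequence length (hypothetical input shape).
    """
    if not candidates:
        raise KeyError(f"no protein IDs found for gene {gene!r}")
    if len(candidates) > 1:
        warnings.warn(
            f"{gene}: {len(candidates)} candidate proteins, choosing the longest"
        )
    # Iterate IDs in sorted order so ties in length resolve deterministically.
    return max(sorted(candidates), key=candidates.get)


# Placeholder gene symbols, protein IDs, and lengths; not real entries.
mapping = {
    "GENE_A": {"PROT_A1": 212, "PROT_A2": 180},
    "GENE_B": {"PROT_B1": 95},
}
chosen = {gene: choose_protein(gene, ids) for gene, ids in mapping.items()}
print(chosen)
```

The warning mirrors the plugin's behavior of alerting you when a gene maps to more than one protein, so you can review those cases by hand.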
Congratulations!
You have now performed some very hard-core bioinformatics analysis!
For comments, suggestions, or shouts of pure joy, please contact Melissa Cline
at cline (at) pasteur.fr, or post to the
cytoscape-discuss list.