Cytoscape Online Tutorial

Domain Networks

In the interaction networks we have examined so far, the smallest unit of the network is a protein. In reality, proteins are assembled from domains: structurally compact units, each with a distinct functional role. A recent development in interaction network analysis is the domain interaction network. These networks describe protein interactions more specifically, in terms of the domains involved in the interaction. This has two main applications:

  1. Predicted domain interactions increase your confidence in experimentally-observed interactions. The technology for observing interactions experimentally has a high false positive rate. But, where observed interactions are supported by another form of evidence, such as domain interactions, you can have greater confidence in the observed interaction.
  2. Where there are multiple protein isoforms, due to either alternative splicing or genetic variation, domain interactions can help determine which interactions may still occur. With alternative splicing, the protein produced may be a shorter form that lacks some domains, and thus cannot participate in some interactions. With genetic variation, one or more protein domains may fail to fold properly or may assume a slightly different configuration, thus altering that domain's binding propensity. In both cases, domain interaction networks help identify the portions of the network most directly affected by the protein variation, and help determine which interactions might and might not occur.
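
The second application can be illustrated with a short Python sketch. It is not part of Cytoscape or the plugins: the isoform names, partner names, domain pairs, and the function possible_interactions are all hypothetical, and the sketch assumes that an interaction remains possible only if the isoform keeps at least one domain that pairs with a domain of the partner.

```python
# Hypothetical domain-domain interactions (unordered pairs of domain names).
DOMAIN_PAIRS = {
    frozenset(["SH2", "fn3"]),
    frozenset(["Kinase", "SH3"]),
}

# Hypothetical isoforms of one gene and the domains each one retains.
ISOFORM_DOMAINS = {
    "full_length": {"SH2", "Kinase"},
    "short_form": {"Kinase"},  # alternative splicing removed the SH2 domain
}

# Hypothetical interaction partners and their domains.
PARTNER_DOMAINS = {
    "partner_A": {"fn3"},
    "partner_B": {"SH3"},
}


def possible_interactions(isoform_domains, partner_domains, domain_pairs):
    """Return the partners that share at least one interacting domain pair."""
    supported = set()
    for partner, domains in partner_domains.items():
        for own in isoform_domains:
            for other in domains:
                if frozenset([own, other]) in domain_pairs:
                    supported.add(partner)
    return supported


for name, domains in ISOFORM_DOMAINS.items():
    partners = possible_interactions(domains, PARTNER_DOMAINS, DOMAIN_PAIRS)
    print(name, sorted(partners))
```

In this made-up example the short isoform loses its only link to partner_A, while its interaction with partner_B remains possible.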

This module will introduce you to domain networks under Cytoscape, and point you to a few resources for protein domain analysis.

This tutorial features the following plugins:

    • DomainNetworkBuilder
    • DomainWebLinks

and the following data file:

    • il6.sp.sif

Download and install the DomainNetworkBuilder and DomainWebLinks plugins:

  1. Go to the plugin page at http://med.bioinf.mpi-sb.mpg.de/domainnet/index.html (or via the Cytoscape plugins page, http://www.cytoscape.org/plugins2.php).
  2. Scroll down to the Download section, and follow the link to sign the license agreement form.
  3. Follow the instructions to download the DomainNetworkBuilder and DomainWebLinks jar files.
  4. Copy the two jar files into your Cytoscape plugins directory.
  5. If you are currently running Cytoscape, exit and restart.

Basic operation of the DomainNetworkBuilder plugin

  1. Load the network il6.sp.sif into Cytoscape. After applying a yFiles Organic layout, you should see a network like the one shown below:

  2. Under the Plugins menu, select Domain Network, and Create Domain Interaction Network for Current Network.
  3. A Cytoscape Message window will appear, asking you to select the species corresponding to the network, as shown below. Select Homo sapiens, and click on Connect to Database.

  4. After a brief pause, you should see a new network, such as the one shown below (shown after applying the yFiles Organic layout):

  5. To explain the graph, let us focus on the lower portion of the network, shown below:

    • The yellow nodes represent proteins, while the red nodes represent protein domains.
    • Red arrows connect domains from the same protein, forming a list. Green arrows connect proteins to their respective domain lists. For example, we see that the protein P05231 contains the domain P05231_IL6, and protein P08887 contains domains P08887_ig, P08887_fn3, and P08887_Pfam-B_34367.
    • Black lines connect interacting domains. For example, the domain P05231_IL6 interacts with domains P08887_ig and P08887_fn3.
  6. You can reduce the complexity of this network as follows:
    • Under the Plugins menu, select Domain Network, and Set Parameters.
    • You should see a menu like the one shown below. Check the box next to Hide domain nodes without visible domain-domain interaction edges, and click OK.
    • Your network should now appear as shown below. Notice that for protein P29597 at the left, there is no longer a long string of domains that are not involved in any domain-domain interaction.

  7. To get a cleaner picture, reapply a layout algorithm or rearrange nodes manually. After rearranging nodes, the resulting network is shown below:

  8. What does this network tell us? For example,
    • Protein Q06124 has three interaction-related domains: two SH2 domains and one Y_phosphatase domain. Protein P40189 has four interaction-related domains: three fn3 domains and one lep_receptor_Ig domain. These domains are connected, indicating that both fn3 and lep_receptor_Ig domains tend to interact with SH2 and Y_phosphatase domains. This adds a level of confidence to the observed protein-protein interaction between P40189 and Q06124.
    • The protein P13725 contains one interaction-related domain: a LIF_OSM domain. Any mutation or other aberration that affects this domain is likely to affect the interactions of this protein. Mutations in other parts of the protein might or might not change its interaction pattern.
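
The structure described in the steps above can be modeled in a few lines of Python. This is a minimal sketch for illustration only, not the plugin's implementation: the protein and domain identifiers are the ones named in this tutorial for P05231 and P08887, while the function names visible_domains and supporting_edges are hypothetical. The first function mimics the effect of hiding domain nodes that have no domain-domain interaction edges; the second lists the domain-domain edges that support an observed protein-protein interaction.

```python
# Protein -> domains, as described in the tutorial for P05231 and P08887.
PROTEIN_DOMAINS = {
    "P05231": ["P05231_IL6"],
    "P08887": ["P08887_ig", "P08887_fn3", "P08887_Pfam-B_34367"],
}

# Domain-domain interaction edges (the black lines in the network).
DOMAIN_EDGES = [
    ("P05231_IL6", "P08887_ig"),
    ("P05231_IL6", "P08887_fn3"),
]


def visible_domains(protein_domains, domain_edges):
    """Keep only domains that take part in at least one domain-domain edge."""
    connected = {domain for edge in domain_edges for domain in edge}
    return {
        protein: [d for d in domains if d in connected]
        for protein, domains in protein_domains.items()
    }


def supporting_edges(protein_a, protein_b, protein_domains, domain_edges):
    """List domain-domain edges linking one protein's domains to the other's."""
    a = set(protein_domains[protein_a])
    b = set(protein_domains[protein_b])
    return [
        (x, y) for x, y in domain_edges
        if (x in a and y in b) or (x in b and y in a)
    ]


filtered = visible_domains(PROTEIN_DOMAINS, DOMAIN_EDGES)
print(filtered["P08887"])
print(supporting_edges("P05231", "P08887", PROTEIN_DOMAINS, DOMAIN_EDGES))
```

Here P08887_Pfam-B_34367 is dropped because it takes part in no domain-domain edge, and the two remaining edges are the domain-level support for the P05231 and P08887 interaction.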


    Further Exploration on domain significance

    Domain names are not always immediately intuitive. For any domain, you can get further information as follows:

    1. Click on a domain node, such as P08887_fn3.
    2. Right-click on this node to pull up a menu for further information. If you follow the link to More Web Info, you should see a menu as shown below:

    3. These are all links to additional sources of information. The links labeled domains only are for the square domain nodes, while the links labeled proteins only are for the round protein nodes.
      1. Pfam is a useful resource for learning about the biological significance of a type of domain: select Pfam. This should bring you to a page on the Pfam entry for the fn3 domain, Fibronectin type III.
      2. Where do these domain interactions come from? Return to the menu shown above, follow the link to InterDom, and you should arrive at the InterDom entry for domain fn3. InterDom predicts domain interactions that are statistically likely according to a variety of factors, such as whether protein-protein interaction databases frequently report interactions between proteins with two specific types of domains. See the InterDom web page for further information.
      3. How reliable are these predictions? Return to the menu shown above and select 3DID to arrive at the 3DID web site for domain fn3. 3DID contains records of domains that are shown interacting in structural data, when two or more proteins are co-crystallized, and thus represents a very high level of evidence. However, be aware that if 3DID does not report a given domain-domain interaction, that does not mean that the interaction is not real; it could simply mean that no proteins with those domains have been co-crystallized, for a variety of technical reasons.
      4. Notice that the 3DID entry for fn3 reports interactions with IL6 domains. Thus, the domain interaction shown between fn3 and IL6 is very reliable.
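
The reasoning in the steps above, where structural evidence outranks statistical prediction and a missing record is not evidence of absence, can be written as a small lookup. This is only a sketch under stated assumptions: it does not query InterDom or 3DID, the two sets are hand-written stand-ins for those resources, the pair domA/domB is made up, and the function name evidence_level is hypothetical.

```python
# Domain pairs seen in co-crystallized structures (3DID-style evidence).
# The fn3/IL6 pair is the one this tutorial reports; nothing else is assumed.
STRUCTURAL = {frozenset(["fn3", "IL6"])}

# Statistically predicted domain pairs (InterDom-style evidence).
# Listing fn3/IL6 here is an assumption; "domA"/"domB" is a made-up pair
# used only to show the "predicted only" category.
PREDICTED = {frozenset(["fn3", "IL6"]), frozenset(["domA", "domB"])}


def evidence_level(domain_1, domain_2):
    """Classify a domain pair by the strongest evidence recorded for it."""
    pair = frozenset([domain_1, domain_2])
    if pair in STRUCTURAL:
        return "structural"
    if pair in PREDICTED:
        return "predicted only"
    # Absence of a record does not mean the interaction is not real.
    return "no record"


print(evidence_level("IL6", "fn3"))
print(evidence_level("domA", "domB"))
print(evidence_level("domA", "fn3"))
```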


    Final note

    For the sample network used in this tutorial, the nodes are labeled with UniProt protein IDs. The typical interaction database, however, labels nodes by gene symbols instead. In organisms such as Homo sapiens, one gene can give rise to multiple different proteins through processes such as alternative splicing. Thus, you can map protein IDs to gene names uniquely, but cannot always map gene names to protein IDs uniquely. So if you have interaction data in which the nodes are labeled by gene names, what can you do?
    1. There are many ID mapping services that will give you a list of protein IDs corresponding to a gene symbol. For example, see the list at the bottom of the DomainNetworkBuilder web page at http://med.bioinf.mpi-inf.mpg.de/domainnet/index.php. In cases where there are many protein IDs listed, you will still need to choose one. One reasonable option is to select the longest protein.
    2. You can use the DomainNetworkBuilder plugin with a network that uses gene symbols. In cases where multiple protein IDs are available, the plugin will choose the UniProt “consensus” sequence, which is generally the longest protein and the one with the fewest (or no) unusual mutations. If there is more than one protein for a given gene, the plugin will give you a warning.
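
The first option, choosing the longest protein when a gene symbol maps to several protein IDs, can be sketched as follows. This is an illustration only and not the plugin's own selection logic: the gene symbols, protein IDs, and sequence lengths are hypothetical placeholders, as is the function name pick_longest. In practice the mapping would come from an ID mapping service.

```python
# Hypothetical mapping: gene symbol -> list of (protein ID, sequence length).
GENE_TO_PROTEINS = {
    "GENE1": [("PROT_A", 212), ("PROT_B", 184)],
    "GENE2": [("PROT_C", 468)],
}


def pick_longest(gene_symbol, gene_to_proteins):
    """Return (protein ID, ambiguous) for the longest protein of a gene."""
    candidates = gene_to_proteins[gene_symbol]
    protein_id, _length = max(candidates, key=lambda item: item[1])
    # Flag genes with more than one candidate so the choice can be reviewed.
    return protein_id, len(candidates) > 1


for gene in GENE_TO_PROTEINS:
    protein_id, ambiguous = pick_longest(gene, GENE_TO_PROTEINS)
    note = " (warning: several proteins map to this gene)" if ambiguous else ""
    print(f"{gene} -> {protein_id}{note}")
```

The ambiguity flag mirrors the warning described above, so that genes with several candidate proteins can be checked by hand.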

    Congratulations! You have now performed some very hard-core bioinformatics analysis!

    For comments, suggestions, or shouts of pure joy, please contact Melissa Cline at cline (at) pasteur.fr, or post to the cytoscape-discuss list.