Table of Contents
This document is licensed under the Creative Commons license, 2006
Authors: The Cytoscape Collaboration
The Cytoscape project is an ongoing collaboration between:
Table 1.
University of California at San Diego | |
Institute for Systems Biology | |
Memorial Sloan-Kettering Cancer Center | |
Institut Pasteur | |
Agilent Technologies | |
University of California at San Francisco | |
Funding for Cytoscape is provided by a federal grant from the U.S. National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (NIH) under award number GM070743-01. Corporate funding is provided through a contract from Unilever PLC.
Cytoscape is a project dedicated to building open-source network visualization and analysis software. A software "Core" provides basic functionality to layout and query the network and to visually integrate the network with state data. The Core is extensible through a plug-in architecture, allowing rapid development of additional computational analyses and features.
Cytoscape's roots are in Systems Biology, where it is used for integrating biomolecular interaction networks with high-throughput expression data and other molecular state information. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape allows the visual integration of the network with expression profiles, phenotypes, and other molecular state information, and links the network to databases of functional annotations.
The central organizing metaphor of Cytoscape is a network (graph), with genes, proteins, and molecules represented as nodes and interactions represented as links, i.e. edges, between nodes.
Cytoscape is a collaborative project between the Institute for Systems Biology (Leroy Hood lab), the University of California San Diego (Trey Ideker lab), Memorial Sloan-Kettering Cancer Center (Chris Sander lab), the Institut Pasteur (Benno Schwikowski lab), Agilent Technologies (Annette Adler lab) and the University of California, San Francisco (Bruce Conklin lab).
Visit http://www.cytoscape.org for more information.
Cytoscape is protected under the GNU LGPL (Lesser General Public License). The License is included as an appendix to this manual, but can also be found online: http://www.gnu.org/copyleft/lesser.txt. Cytoscape also includes a number of other open source libraries, which are detailed in theCytoscape User Manual/Acknowledgements below.
Cytoscape version 2.6 contains several new features, plus improvements to the performance and usability of the software. These include:
Web Service client plugins for downloading networks from PathwayCommons, IntAct, and NCBI Entrez Gene.
Annotation import web service plugin for BioMart. This is mainly for ID translation/synonym mapping.
Cytoscape is a Java application verified to run on Linux, Windows, and Mac OS X. Although not officially supported, other UNIX platforms such as Solaris or FreeBSD may run Cytoscape if Java version 5 or later is available for the platform.
The system requirements for Cytoscape depend on the size of the networks the user wants to load, view and manipulate.
Table 2.
| Small Network Visualization | Large Network Analysis/Visualization |
Processor | 1GHz | As fast as possible |
Memory | 512MB | 2GB+ |
Graphics Card | On board Video | Highend Graphics Card |
Monitor | XGA (1024X768) | Wide or Dual Monitor |
If not already installed on your computer, download and install Java SE 5 or 6. Cytoscape 2.5 will no longer run with Java version 1.4.x or lower. You must install Java SE 5 or 6!!!
These can be found at:
In general, Java SE 6 is faster than 5. If your machine is compatible with the 6 series, please try version 6.
There are a number of options for downloading and installing Cytoscape. All options can be downloaded from the http://cytoscape.org website.
You can check out the latest and greatest software from our Subversion repository.
Cytoscape installations (regardless of platform) containing the following files and directories:
Table 3.
File | Description |
cytoscape.jar | Main Cytoscape application (Java archive) |
cytoscape.sh | Script to run Cytoscape from command line (Linux, Mac OS X) |
cytoscape.bat | Script to run Cytoscape (Windows) |
LICENSE.txt/html | Cytoscape GNU LGPL License |
lib/ | library jar files needed to run Cytoscape. |
docs/ | Manuals in different formats. What you are reading now. |
licenses/ | Licence files for the various libraries distributed with Cytoscape. |
plugins/ | Directory containing cytoscape plugins, in .jar format. |
sampleData/ | |
| galFiltered.gml -- Sample molecular interaction network file * |
| galFiltered.sif -- Identical network in Simple Interaction Format * |
| galExpData.pvals -- Sample gene expression matrix file * |
| galFiltered.nodeAttrTable.xls -- Sample node attribute file in Microsoft Excel format |
| galFiltered.cys -- Sample session file created from datasets above plus annotations from several databases * |
| BINDyeast.sif -- Network of all yeast protein-protein interactions in the BIND database as of Dec, 2006 ** |
| BINDhuman.sif -- Network of all human protein-protein interactions in the BIND database as of Dec, 2006 ** |
| yeastHighQuality.sif -- Sample molecular interaction network file *** |
| interactome_merged.networkTable.gz -- Human interactome network file in tab-delimited format **** |
| sampleStyles.props -- Additional sample Visual Styles |
* From Ideker et al., Science 292:929 (2001)
** Obtained from data hosted at http://www.blueprint.org/bind/bind_downloads.html
** From von Mering et al., Nature, 417:399 (2002) and Lee et al, Science 298:799 (2002)
**** Created from Cytoscape tutorial web page. Original data sets are available at: http://www.cytoscape.orghttp://cytoscape.org/cgi-bin/moin.cgi/Data_Sets/ from "A merged human interactome" by Andrew Garrow, Yeyejide Adeleye and Guy Warner (Unilever, Safety and Environmental Assurance Center).
Double-click on the icon created by the installer or by running cytoscape.sh from the command line (Linux or Mac OS X) or double-clicking cytoscape.bat (Windows). Alternatively, you can pass the .jar file to Java directly using the command java -Xmx512M -jar cytoscape.jar -p plugins. The -Xmx512M flag tells java to allocate more memory for Cytoscape and the -p plugins option tells cytoscape to load all of the plugins in the plugins directory. Loading the plugins is important because many key features like layouts, filters and the attribute browser are included with Cytoscape as plugins in the plugins directory. See the Command Line chapter for more detail. In Windows, it is also possible to directly double-click the .jar file to launch it. However, this does not allow specification of command-line arguments (such as the location of the plugin directory).
When you succeed in launching Cytoscape, a window will appear that looks like this (captured on Mac OS 10.4):
For users interested in loading large networks, the amount of memory needed by Cytoscape will increase. Memory usage depends on both number of network objects (nodes+edges) and the number of attributes. Here are some rough suggestions for memory allocation:
Suggested Memory Size Without View
Table 4.
Number of Objects (nodes + edges) | Suggested Memory Size |
0 - 70,000 | 512M (default) |
70,000 - 150,000 | 800M |
Suggested Memory Size With View
Table 5.
Number of Objects (nodes + edges) | Suggested Memory Size |
0 - 20,000 | 512M (default) |
20,000 - 70,000 | 800M |
70,000 - 150,000 | 1G |
To increase the maximum memory size for Cytoscape, you can specify it from command line. For example, if you want to assign 1GB of memory, type:
java -Xmx1GB -jar cytoscape.jar -p plugins
from the command line.
There is one more option related to memory allocation. Some of the functions in Cytoscape use larger stack space (a temporary memory for some operations, such as Layout). Since this value is set independently from the Xmx value above, sometimes Layout algorithms fails due to the out of memory error. To avoid this, you can set larger heap size by -Xss option. If layout fails for large networks, please try the following:
java -Xmx1GB -Xss10M -jar cytoscape.jar -p plugins
The option -Xss10M means set the heap size to 10MB. In many cases, this solves out of memory error for Layouts.
Randomly generated scale-free network with 500K nodes and 500k edges: If memory parameters are set properly, you can visualize huge networks. In this example, about 5GB of memory is used by Cytoscape. Stack size is set to 10MB. To use large memory space (4GB+) you need 64-bit version operating system AND 64-bit version Java SE 5 or 6.
Note: Some of the web service clients are multi-thread programs and each thread uses the memory size specified by -Xss option. If web service clients fails due to the out of memory error, please reduce the stack size and try again.
For more details, see How to increase memory for Cytoscape.
For the application to work properly, all files should be left in the directory in which they were unpacked. The core Cytoscape application assumes this directory structure when looking for the various libraries needed to run the application. If you are adventurous, you can get creative with the $CLASSPATH and/or the cytoscape.jar manifest file and run Cytoscape from any location you want.
When a network is loaded, Cytoscape will look something like the image below:
The main window here has several components:
The network management and attribute browser panels are dockable tabbed panels known as CytoPanels. You can undock any of these panels by clicking on the Float Window control
in the upper-right corner of the CytoPanel.
If you select this control, e.g. on the attribute browser panel, you will now have two Cytoscape windows, the main window, and a new window labeled CytoPanel 2, similar to the one shown below. Popup will be displayed when you put the mouse pointer on a cell.
Note that CytoPanel 2 now has a Dock Window control. If you select this control, the window will dock onto the main window.
Cytoscape also has an editor that enables you to build and modify networks interactively by dragging and dropping nodes and edges from a palette onto the main network view window. The Node shapes and Edge arrows on the palette are defined by the currently used Visual Style. To edit a network, just select the Editor tab on CytoPanel 1. An example of an editor, with the palette contained in CytoPanel 1 and defined by the BioMoleculeEditor Visual Style, is shown below.
The File menu contains most basic file functionality: File → Open for opening a Cytoscape session file; File → New for creating a new network, either blank for editing, or from an existing network; File → Save for saving a session file; File → Import for importing data such as networks and attributes; and File → Export for exporting data and images. Also, File → Print allows printing, while File → Quit closes all windows of Cytoscape and exits the program.
The Edit menu contains Undo and Redo functions which undo and redo edits made in the Attribute Browser, the Network Editor and to layout.
There are also options for creating and destroying views (graphical representations of a network) and networks (the raw network data – not yet visualized), as well as an option for deleting selected nodes and edges from the current network. All deleted nodes and edges can be restored to the network via Edit → Undo. Editing preferences for properties and plugins is found under Edit → Preferences → Properties... .
The View menu allows you to display or hide the network management panel (CytoPanel 1), the attribute browser (CytoPanel 2), the Network Overview (in CytoPanel 1), and the VizMapper.
The Select menu contains different options for selecting nodes and edges. It also contains the Select → Use Filters option, which allows filters to be created for automatic selection of portions of a network whose node or edge attributes meet a filtering criterion.
The Layout menu has an array of features for visually organizing the network. The features in the top portion of the network (Rotate, Scale, Align and Distribute) are tools for manipulating the network visualization. The bottom section of the menu lists a variety of layout algorithms which automatically lay a network out.
The Plugins menu contains options for managing (install/update/delete) your plugins and may have options added by plugins that have been installed, such as the Agilent Literature Search or Merge Networks. Depending on which plugins are loaded, the plugins that you see may be different than what appear here.
Table 6.
Note: A list of available Cytoscape plugins with descriptions is available online at: http://cytoscape.org/plugins2.php |
Cytoscape 2.3 and newer versions allow multiple networks to be loaded at a time, either with or without a view. A network stores all the nodes and edges that are loaded by the user and a view displays them. You can have many views of the same network. Networks (and their optionally associated views) can be organized hierarchically.
An example where a number of networks have been loaded and arranged hierarchically is shown below:
The network manager (top-right tree view in CytoPanel 1) shows the networks that are loaded. Clicking on a network here will make that view active in the main window, if the view exists (green highlighted networks only). Each network has a name and size (number of nodes and edges), which are shown in the network manager. If a network is loaded from a file, the network name is the name of the file.
Some networks are very large (thousands of nodes and edges) and can take a long time to display. For this reason, a network in Cytoscape may not contain a ‘view’. Networks that have a view are highlighted in green and networks that don’t have a view are highlighted in red. You can create or destroy a view for a network by right-clicking the network name in the network manager or by choosing the appropriate option in the Edit menu. You can also destroy previously loaded networks this way. In the picture above, seven networks are loaded, six green ones with views and one red one without a view.
Certain operations in Cytoscape will create new networks. If a new network is created from an old network, for example by selecting a set of nodes in one network and copying these nodes to a new network (via the File → New → Network option), it will be shown as a child of the network that it was derived from. In this way, the relationships between networks that are loaded in Cytoscape can be seen at a glance. Networks in the top part of the tree in the figure above were generated in this manner.
The available network views are also arranged as multiple overlapping windows in the network view window. You can maximize, minimize, and destroy network views by using the normal window controls for your operating system.
The network overview window shows an overview (or ‘bird’s eye view’) of the network. It can be used to navigate around a large network view. The blue rectangle indicates the portion of the network currently displayed in the network view window, and it can be dragged with the mouse to view other portions of the network. Zooming in will cause the rectangle to appear smaller and vice versa.
Cytoscape recognizes a number of optional command line arguments, including run-time specification of network files, attribute files, and session files. This is the output generated when the cytoscape is executed with the "-h" or "--help" flag:
usage: java -Xmx512M -jar cytoscape.jar [OPTIONS]
-h,--help Print this message.
-v,--version Print the version number.
-s,--session <file> Load a cytoscape session (.cys) file.
-N,--network <file> Load a network file (any format).
-e,--edge-attrs <file> Load an edge attributes file (edge attribute format).
-n,--node-attrs <file> Load a node attributes file (node attribute format).
-m,--matrix <file> Load a node attribute matrix file (table).
-p,--plugin <file> Load a plugin jar file, directory of jar files,
plugin class name, or plugin jar URL.
-P,--props <file> Load cytoscape properties file (Java properties
format) or individual property: -P name=value.
-V,--vizmap <file> Load vizmap properties file (Java properties format).
Any file specified for an option may be specified as either a path or as a URL. For example you can specify a network as a file (assuming that myNet.sif exists in the current working directory): cytoscape.sh -N myNet.sif. Or you can specify a network as a URL: cytoscape.sh -N http://example.com/myNet.sif.
Table 7.
Argument | Description |
-h,--help | This flag generates the help output you see above and exits. |
-v,--version | This flag prints the version number of Cytoscape and exits. |
-s,--session <file> | This option specifies a session file to be loaded. Since only one session file can be loaded at a given time, this option may only specified once on a given command line. The option expects a |
-N,--network <file> | This option is used to load all types of network files. SIF, GML, and XGMML files can all be loaded using the -N option. You can specify as many networks as desired on a single command line. |
-e,--edge-attrs <file> | This option specifies an edge attributes file. You may specify as many edge attribute files as desired on a single command line. |
-n,--node-attrs <file> | This option specifies a node attributes file. You may specify as many node attribute files as desired on a single command line. |
-m,--matrix <file> | This option specifies a data matrix file. In a biological context, the data matrix consists of expression data. All data matrix files are read into node attributes. You may specify as many data matrix files as desired on a single command line. |
-p,--plugin <file> | This option specifies a cytoscape plugin (.jar) file to be loaded by Cytoscape. This option also subsumes the previous "resource plugin option". You may specify a class name that identifies your plugin and the plugin will be loaded if the plugin is in Cytoscape's CLASSPATH. For example, assuming that the class MyPlugin can be found in the CLASSPATH, you could specify the plugin like this: |
-P,--props <file> | This option specifies Cytoscape properties. Properties can be specified either as a properties file (in Java's standard properties format), or as individual properties. To specify individual properties, you must specify the property name followed by the property value where the name and value are separated by the '=' sign. For example to specify the defaultSpeciesName: |
-V,--vizmap <file> | This option specifies a visual properties file. |
All options described above (including plugins) can be loaded from the GUI once Cytoscape is running.
Table 8.
Important! If you have used previous versions of Cytoscape, you will notice that handling of properties has changed. The most important change is that properties are no longer saved by default to the current directory or to your home |
The Cytoscape Properties editor, accessed via Edit → Preferences → Properties…, is used to specify general and default properties. Properties are now stored in Cytoscape session files, so changes to general properties will be saved as part of the current session, but will only carry over to subsequent sessions if they are set as defaults or exported using the File → Export function.
Cytoscape properties are configurable via Add, Modify and Delete operations.
Some common properties are described below.
Table 9.
Property name | Default value | Valid values |
viewThreshold | 10000 | integer > 0 |
secondaryViewThreshold | 30000 | integer > 0 |
viewType | tabbed | tabbed |
defaultWebBrowser |
| A path to the web browser on your system. This only needs to be specified if Cytoscape can’t find the web browser on your system. |
It is possible to alter the default properties for Cytoscape.
Edit the properties via Edit → Preferences → Properties... and check the Make Current Cytoscape Properties Default checkbox. This will save the current properties to the .cytoscape directory, where they will then be applied to all of your Cytoscape sessions from that point on. Otherwise, Cytoscape will automatically save the properties used in a particular session inside its .cys session file, while the default properties will be applied at the beginning of subsequent sessions.
Cytoscape contains a pre-defined list of bookmarks, which point to sample network files located on the Cytoscape web server. Users may add, modify, and delete bookmarks through the Bookmark manager, accessed by going to Edit → Preferences → Bookmarks… .
There are currently two types of bookmarks: network and annotation. Network bookmarks are URLs pointing to network files available on the Internet. These are nomal networks that can be loaded into Cytoscape. The annotation bookmarks are URLs pointing to ontology annotation files. The annotation bookmarks are only used when importing an ontology.
There are 4 different ways of creating networks in Cytoscape:
Network files can be specified in any of the formats described in the Supported Network Formats chapter. Networks are imported into Cytoscape through the "Import Network" window, which can be accessed by going to File → Import → Network (multiple file types). The network file can either be located directly on the local computer, or found on a remote computer (in which case it will be referenced with a URL).
By default, Cytoscape loads networks from the local computer.
The Import Networks dialog shows a default setting of "Data Source Type: Local," meaning that network files from the local computer will be available for importing. Choose the correct file by clicking on the Select button (only file types that Cytoscape recognizes will be shown), and then load the network by clicking on the Import button. Some sample network files of different types have been included in the sampleData folder in Cytoscape.
Network files in SIF, GML, and XGMML formats may also be loaded directly from the command line using the –N option.
The Import Networks dialog is also capable of importing network files using a URL. To do this, set the Data Source Type to Remote and insert the appropriate URL, either manually or using URL bookmarks. Bookmarked URLs can be accessed by clicking on the arrow to the right of the text field (see the Bookmark Manager in Preferences for more details on bookmarks). Also, you can drag and drop links from web browser to the URL text box. Once a URL has been specified, click on the Import button to load the network.
Importing networks from URL addresses has an important caveat. Because Cytoscape determines file type primarily (not exclusively) by file extension, it can have trouble importing networks with URLs that don't end in a human readable file name. If Cytoscape does not recognize a meaningful file name and extension in the URL, it will attempt to guess the type of file based on MIME type. If the MIME type is not recognizable to any of our import handlers, then the import will fail.
Another issue for network import is the presence of firewalls, which can affect which files are accessible to a computer. To work around this problem, Cytoscape supports the use of proxy servers. To configure the proxy server, go to Edit → Preferences→ Proxy Server... . This is further described in the Preferences chapter.
Introduced in version 2.4, Cytoscape now supports the import of networks from delimited text files and Excel workbooks using Edit → Import → Network from Table (Text/MS Excel)... . An interactive GUI allows users to specify parsing options for specified files. The screen provides a preview that shows how the file will be parsed given the current configuration. As the configuration changes, the preview updates automatically. In addition to specifying how the file will be parsed, the user must also choose the columns that represent the Source nodes, the Target nodes, and an optional edge interaction type.
The "Import Network from Table" function supports delimited text files and single-sheet Microsoft Excel Workbooks. The following is a sample table file:
source target interaction boolean attribute string attribute floating point attribute YJR022W YNR053C pp TRUE abcd12371 1.2344543 YER116C YDL013W pp TRUE abcd12372 1.2344543 YNL307C YAL038W pp FALSE abcd12373 1.2344543 YNL216W YCR012W pd TRUE abcd12374 1.2344543 YNL216W YGR254W pd TRUE abcd12375 1.2344543
The network table files should contain at least two columns: source nodes and target nodes. The interaction type is optional in this format. Therefore, a minimal network table looks like the following:
YJR022W YNR053C YER116C YDL013W YNL307C YAL038W YNL216W YCR012W YNL216W YGR254W
One row in a network table file represents an edge and its edge attributes. This means that a network file is considered a combination of network data and edge attributes. A table may contain columns that aren't meant to be edge attributes. In this case, you can choose not to import those columns by clicking on the column header in the preview window. This function is useful when importing a data table like the following (1):
Unique ID A Unique ID B Alternative ID A Alternative ID B Aliases A Aliases B Interaction detection methods First author surnames Pubmed IDs species A species B Interactor types Source database Interaction ID Interaction labels Cross-references Associated Files Experiment files Experiment labels Different techniques Different Pubmed articles Different sources Weight 7205 5747 TRIP6 PTK2 Q15654 Q05397-1 vv|HPRD Currently not available 14688263|15892868(Marcotte) Mammalia Homo sapiens protein|protein HPRD|Marcotte 0 Thyroid hormone receptor interactor 6-FAK-|PTK2-TRIP6 NA(HPRD)|NA(Marcotte) HPRD/02859_psimi.xml|other/ORIGINAL_DATA_MARCOTTE.txt vv(HPRD/02859_psimi.xml)|HPRD(other/ORIGINAL_DATA_MARCOTTE.txt) 17651(ExptRef)|Marcotte 2 2 2 2 4174 7311 MCM5 UBA52 P33992 P62987 neighbouring_reaction Currently not available 15608231(Reactome) Homo sapiens Homo sapiens protein|protein Reactome 1 P33992-P62988 Reaction:68944<->Reaction:68946(Reactome)|Reaction:68946<->Reaction:68944(Reactome) other/ORIGINAL_DATA_MARCOTTE.txt neighbouring_reaction(other/REACTOMEhomo_sapiens.interactions.txt) Reactome 1 1 1 1 7040 7040 TGFB1 TGFB1 P01137 P01137 nmr: nuclear magnetic resonance Currently not available 8679613 Homo sapiens Homo sapiens protein|protein BIND 2 TGFB1-TGFB1- 72085(BIND) BIND/bind_taxid9606.1.psi.xml nmr: nuclear magnetic resonance(BIND/bind_taxid9606.1.psi.xml) NotAvailable 1 1 1 1
This data file is a tab-delimited text and contains network data (interactions), edge attributes, and node attributes. To import network and edge attributes from this table, you need to choose Unique ID A as source, Unique ID B as target, and Interactor types as interaction type. Then you need to turn off columns used for node attributes (Alternative ID A, species B, etc.). Other columns can be imported as edge attributes.
The network import function cannot import node attributes - only edge attributes. To import node attributes from this table, please see the Attributes section of this manual.
Note (1): This data is taken from the A merged human interactome datasets by Andrew Garrow, Yeyejide Adeleye and Guy Warner (Unilever, Safety and Environmental Assurance Center, 12 October 2006). Actual data files are available at http://www.cytoscape.orghttp://cytoscape.org/cgi-bin/moin.cgi/Data_Sets/
To import network text/Excel tables, please follow these steps:
Select File → Import → Network from Table (Text/MS Excel)...
Enable/Disable Attribute Column - By left-clicking on a column header in the preview table, you can enable/disable edge attributes. If the header is checked and entries are blue, the column will be imported as an edge attribute. For example, the table below shows that columns 1 through 3 will be used as network data, column 4 will not be imported, and columns 5 and 6 will be imported as edge attributes.
Change Attribute Name and Data Types - If you right-click on a column header in the preview table, you can modify the attribute name and data type. For more detail, see "Modify Attribute Name/Type" below.
Table Import feature supports list of nodes without edges. If you select source column only, it creates a network without interactions. This feature is usufl with node expansion function available from some web service clients. Please read the section Importing Networks from External Database for more detail.
You can select several options by checking the Show Text File Import Options checkbox.
Attribute names and data types can be modified here.
Cytoscape has a basic data type detection function that automatically suggests the attribute data type of a column according to its entries. This can be overridden by selecting the appropriate data type from the radio buttons provided. For lists, a global delimiter must be specified (i.e., all cells in the table must use the same delimiter).
From version 2.6.0, Cytoscape has a new feature called Web Service Client Manager. Users can access verious kinds of databases through this function. Please read Importing Networks and Attributes from External Database for more detail.
A new, empty network can also be created and nodes and edges manually added. To create an empty network, go to File → New → Network → Empty Network, and then manually add network components using the Editor in CytoPanel 1 (see the Editor chapter for more details).
Cytoscape can read network/pathway files written in the following formats:
The SIF format specifies nodes and interactions only, while other formats store additional information about network layout and allow network data exchange with a variety of other network programs and data sources. Typically, SIF files are used to import interactions when building a network for the first time, since they are easy to create in a text editor or spreadsheet. Once the interactions have been loaded and network layout has been performed, the network may be saved to GML or XGMML format for interaction with other systems. All file types listed (except Excel) are text files and you can edit and view them in a regular text editor.
The simple interaction format is convenient for building a graph from a list of interactions. It also makes it easy to combine different interaction sets into a larger network, or add new interactions to an existing data set. The main disadvantage is that this format does not include any layout information, forcing Cytoscape to re-compute a new layout of the network each time it is loaded.
Lines in the SIF file specify a source node, a relationship type (or edge type), and one or more target nodes:
nodeA <relationship type> nodeB nodeC <relationship type> nodeA nodeD <relationship type> nodeE nodeF nodeB nodeG ... nodeY <relationship type> nodeZ
A more specific example is:
node1 typeA node2 node2 typeB node3 node4 node5 node0
The first line identifies two nodes, called node1 and node2, and a single relationship between node1 and node2 of type typeA. The second line specifies three new nodes, node3, node4, and node5; here "node2" refers to the same node as in the first line. The second line also specifies three relationships, all of type typeB and with node2 as the source, with node3, node4, and node5 as the targets. This second form is simply shorthand for specifying multiple relationships of the same type with the same source node. The third line indicates how to specify a node that has no relationships with other nodes. This form is not needed for nodes that do have relationships, since the specification of the relationship implicitly identifies the nodes as well.
Duplicate entries are ignored. Multiple edges between the same nodes must have different edge types. For example, the following specifies two edges between the same pair of nodes, one of type xx and one of type yy:
node1 xx node2 node1 xx node2 node1 yy node2
Edges connecting a node to itself (self-edges) are also allowed:
node1 xx node1
Every node and edge in Cytoscape has an identifying name, most commonly used with the node and edge data attribute structures. Node names must be unique, as identically named nodes will be treated as identical nodes. The name of each node will be the name in this file by default (unless another string is mapped to display on the node using the visual mapper). This is discussed in the section on visual styles. The name of each edge will be formed from the name of the source and target nodes plus the interaction type: for example, sourceName (edgeType) targetName.
The tag <relationship type> can be any string. Whole words or concatenated words may be used to define types of relationships, e.g. geneFusion, cogInference, pullsDown, activates, degrades, inactivates, inhibits, phosphorylates, upRegulates, etc.
Some common interaction types used in the Systems Biology community are as follows:
pp .................. protein – protein interaction pd .................. protein -> DNA (e.g. transcription factor binding upstream of a regulating gene.)
Some less common interaction types used are:
pr .................. protein -> reaction rc .................. reaction -> compound cr .................. compound -> reaction gl .................. genetic lethal relationship pm .................. protein-metabolite interaction mp .................. metabolite-protein interaction
Whitespace (space or tab) is used to delimit the names in the simple interaction file format. However, in some cases spaces are desired in a node name or edge type. The standard is that, if the file contains any tab characters, then tabs are used to delimit the fields and spaces are considered part of the name. If the file contains no tabs, then any spaces are delimiters that separate names (and names cannot contain spaces).
If your network unexpectedly contains no edges and node names that look like edge names, it probably means your file contains a stray tab that's fooling the parser. On the other hand, if your network has nodes whose names are half of a full name, then you probably meant to use tabs to separate node names with spaces.
Networks in simple interactions format are often stored in files with a .sif extension, and Cytoscape recognizes this extension when browsing a directory for files of this type.
In contrast to SIF, GML is a rich graph format language supported by many other network visualization packages. The GML file format specification is available at:
http://www.infosun.fmi.uni-passau.de/Graphlet/GML/
It is generally not necessary to modify the content of a GML file directly. Once a network is built in SIF format and then laid out, the layout is preserved by saving to and loading from GML. Visual attributes specified in a GML file will result in a new visual style named Filename.style when that GML file is loaded.
XGMML is the XML evolution of GML and is based on the GML definition. In addition to network data, XGMML contains node/edge/network attributes. The XGMML file format specification is available at:
http://www.cs.rpi.edu/~puninj/XGMML/
XGMML is now preferred to GML because it offers the flexibility associated with all XML document types. If you're unsure about which to use, choose XGMML.
The Systems Biology Markup Language (SBML) is an XML format to describe biochemical networks. SBML file format specification is available at:
BioPAX is an OWL (Web Ontology Language) document designed to exchange biological pathways data. The complete set of documents for this format is available at:
The PSI-MI format is a data exchange format for protein-protein interactions. It is an XML format used to describe PPI and associated data. PSI-MI XML format specification is available at:
Cytoscape has native support for Microsoft Excel files (.xls) and delimited text files. The tables in these files can have network data and edge attributes. Users can specify columns containg source nodes, target nodes, interaction types, and edge attributes during file import. Some of the other network analysis tools, such as igraph (http://cneurocvs.rmki.kfki.hu/igraph/), has feature to export graph as simple text files. Cytoscape can read these text files and build networks from them. For more detail, please read the Import Free-Format Tables section section of the Creating Networks chapter.
Network generated by igraph's Watts-Strogatz small-world model (50k nodes and 250k esges) visualized by Cytoscape: You can import networks created by other applications using this Table Import feature.
Typically, genes are represented by nodes, and interactions (or other biological relationships) are represented by edges between nodes. For compactness, a gene also represents its corresponding protein. Nodes may also be used to represent compounds and reactions (or anything else) instead of genes.
If a network of genes or proteins is to be integrated with Gene Ontology (GO) annotation or gene expression data, the gene names must exactly match the names specified in the other data files. We strongly encourage naming genes and proteins by their systematic ORF name or standard accession number; common names may be displayed on the screen for ease of interpretation, so long as these are available to the program in the annotation directory or in a node attribute file. Cytoscape ships with all yeast ORF-to-common name mappings in a synonym table within the annotation/ directory. Other organisms will be supported in the future.
Why do we recommend using standard gene names? All of the external data formats recognized by Cytoscape provide data associated with particular names of particular objects. For example, a network of protein-protein interactions would list the names of the proteins, and the attribute and expression data would likewise be indexed by the name of the object.
The problem is in connecting data from different data sources that don't necessarily use the same name for the same object. For example, genes are commonly referred to by different names, including a formal "location on the chromosome" identifier and one or more common names that are used by ordinary researchers when talking about that gene. Additionally, database identifiers from every database where the gene is stored may be used to refer to a gene (e.g. protein accession numbers from Swiss-Prot). If one data source uses the formal name while a different data source used a common name or identifier, then Cytoscape must figure out that these two different names really refer to the same biological entity.
Cytoscape has two strategies for dealing with this naming issue, one simple and one more complex. The simple strategy is to assume that every data source uses the same set of names for every object. If this is the case, then Cytoscape can easily connect all of the different data sources.
To handle data sources with different sets of names, as is usually the case when manually integrating gene information from different sources, Cytoscape needs a data server that provides synonym information (see the chapter on Annotation). A synonym table gives a canonical name for each object in a given organism and one or more recognized synonyms for that object. Note that the synonym table itself defines which set of names are the "canonical" names. For example, in budding yeast, the ORF names are commonly used as the canonical names.
If a synonym server is available, then by default Cytoscape will convert every name that appears in a data file to the associated canonical name. Unrecognized names will not be changed. This conversion of names to a common set allows Cytoscape to connect the genes present in different data sources, even if they have different names – as long as those names are recognized by the synonym server.
For this to work, Cytoscape must also be provided with the species to which the objects belong, since the data server requires the species in order to uniquely identify the object referred to by a particular name. This is usually done in Cytoscape by specifying the species name on the command line with the –P option (cytoscape.sh -P "defaultSpeciesName=Saccharomyces cerevisiae") or by editing the properties (under Edit → Preferences → Properties...).
The automatic canonicalization of names can be turned off using the -P option (cytoscape.sh -P canonicalizeName=false") or by editing the properties (under Edit → Preferences → Properties...). This canonicalization of names currently does not apply to expression data. Expression data should use the same names as the other data sources or use the canonical names as defined by the synonym table.
Interaction networks are useful as stand-alone models. However, they are most powerful for answering scientific questions when integrated with additional information. Cytoscape allows the user to add arbitrary node, edge and network information to Cytoscape as node/edge/network attributes. This could include, for example, annotation data on a gene or confidence values in a protein-protein interaction. These attributes can then be visualized in a user-defined way by setting up a mapping from data attributes to visual attributes (colors, shapes, and so on). The section on visual styles discusses this in greater detail.
Node and edge attribute files are simply formatted: a node attribute file begins with the name of the attribute on the first line (note that it cannot contain spaces). Each following line contains the name of the node, followed by an equals sign and the value of that attribute. Numbers and text strings are the most common attribute types. All values for a given attribute must have the same type. For example:
FunctionalCategory YAL001C = metabolism YAR002W = apoptosis YBL007C = ribosome
An edge attribute file has much the same structure, except that the name of the edge is the source node name, followed by the interaction type in parentheses, followed by the target node name. Directionality counts, so switching the source and target will refer to a different (or perhaps non-existent) edge. The following is an example edge attributes file:
InteractionStrength YAL001C (pp) YBR043W = 0.82 YMR022W (pd) YDL112C = 0.441 YDL112C (pd) YMR022W = 0.9013
Since Cytoscape treats edge attributes as directional, the second and third edge attribute values refer to two different edges (source and target are reversed, though the nodes involved are the same).
Each attribute is stored in a separate file. Node and edge attribute files use the same format. Node attribute file names often use the suffix ".noa", while edge attribute file names use the suffix ".eda". Cytoscape recognizes these suffixes when browsing for attribute files.
Node and edge attributes may be loaded at the command line using the –n and –e options or via the File → Import menu.
When expression data is loaded using an expression matrix, it is automatically loaded as node attribute data unless explicitly specified otherwise.
Node and edge attributes are attached to nodes and edges, and so are independent of networks. Attributes for a given node or edge will be applied to all copies of that node or edge in all loaded network files, regardless of whether the attribute file or network file is imported first.
Note: In order to import network attributes in Cytoscape 2.4, please go to File → Import → Attribute from Table (text/MS Excel)... or encode them in an XGMML network file (see Supported File Formats for more details).
Every attribute file has one header line that gives the name of the attribute, and optionally some additional meta-information about that attribute. The format is as follows:
attributeName (class=formal.class.of.value)
The first field is always the attribute name: it cannot contain spaces. If present, the class field defines the formal (package qualified) name of the class of the attribute values. For example, java.lang.String for Strings, java.lang.Double for floating point values, java.lang.Integer for integer values, etc. If the value is actually a list of values, the class should be the type of the objects in the list. If no class is specified in the header line, Cytoscape will attempt to guess the type from the first value. If the first value contains numbers in a floating point format, Cytoscape will assume java.lang.Double; if the first value contains only numbers with no decimal point, Cytoscape will assume java.lang.Integer; otherwise Cytoscape will assume java.lang.String. Note that the first value can lead Cytoscape astray: for example,
floatingPointAttribute firstName = 1 secondName = 2.5
In this case, the first value will make Cytoscape think the values should be integers, when in fact they should be floating point numbers. It's safest to explicitly specify the value type to prevent confusion. A better format would be:
floatingPointAttribute (class=Double) firstName = 1 secondName = 2.5
or
floatingPointAttribute firstName = 1.0 secondName = 2.5
Every line past the first line identifies the name of an object (a node in a node attribute file or an edge in a edge attribute file) along with the String representation of the attribute value. The delimiter is always an equals sign; whitespace (spaces and/or tabs) before and after the equals sign is ignored. This means that your names and values can contain whitespace, but object names cannot contain an equals sign and no guarantees are made concerning leading or trailing whitespace. Object names must be the Node ID or Edge ID as seen in the left-most column of the attribute browser if the attribute is to map to anything. These names must be reproduced exactly, including case, or they will not match.
Edge names are all of the form:
sourceName (edgeType) targetName
Specifically, that is
Note that tabs are not allowed in edge names. Tabs can be used to separate the edge name from the "=" delimiter, but not within the edge name itself. Also note that this format is different from the specification of interactions in the SIF file format. To be explicit: a SIF entry for the previous interaction would look like
sourceName edgeType targetName
or
To specify lists of values, use the following syntax:
listAttributeName (class=java.lang.String) firstObjectName = (firstValue::secondValue::thirdValue) secondObjectName = (onlyOneValue)
This example shows an attribute whose value is defined as a list of text strings. The first object has three strings, and thus three elements in its list, while the second object has a list with only one element. In the case of a list every attribute value uses list syntax (i.e. parentheses), and each element is of the same class. Again, the class will be inferred if it is not specified in the header line. Lists are not supported by the visual mapper and so can’t be mapped to visual attributes.
As of Cytoscape 2.4, importing delimited text and MS Excel attribute data tables is now supported. Using this functionality, users can now easily import data that isn't formatted into Cytoscape node or edge attribute file formats (as described above).
Sample Attribute Table 1
Table 12.
Object Key | Alias | SGD ID |
AAC3 | YBR085W|ANC3 | S000000289 |
AAT2 | YLR027C|ASP5 | S000004017 |
BIK1 | YCL029C|ARM5|PAC14 | S000000534 |
The attribute table file should contain a primary key column and at least one attribute column. The maximum number of attribute columns is unlimited. The Alias column is an optional feature, as is using the first row of data as attribute names. Alternatively, you can specify each attribute name from the File → Import → Attribute from Table (text/MS Excel)... user interface.
The user interface of the "Import Attributes from Table" window is similar to that of the "Import Network from Table" window.
Select File → Import → Attribute from Table (text/MS Excel)...
Formerly, Cytoscape only supported mapping between node/edge IDs and the primary keys in attribute files. With the introduction of Cytoscape 2.4, this limitation has been removed, and now both IDs and attributes with primitive data types (string, boolean, floating point, and integer) can be selected as the Key Attribute using the dropdown list provided. Complex attributes such as lists are not supported.
Cytoscape uses a simple mechanism to manage aliases of objects. Both nodes and edges can have aliases. If an attribute is loaded as an alias, it is treated as a special attribute called "alias". This will be used when mapping attributes. If the primary key and key attribute for an object do not match, Cytoscape will search for a match between aliases and the key attribute. To define an alias column in the attribute table, just click on the checkboxes to the left of the column name while importing.
For more detail on these options, please see the "Import Free-Format Table Files" section of the user manual in the Creating Networks chapter.
When Cytoscape is started, the Attribute Browser appears in the bottom CytoPanel. This browser can be hidden and restored using the F5 key or the View → Show/Hide attribute browser menu option. Like other CytoPanels, the browser can be undocked by pressing the little icon in the browser’s top right corner.
To swap between displaying node, edge, and network attributes use the tabs on the bottom of the panel labelled "Node Attribute Browser", "Edge Attribute Browser", and "Network Attribute Browser". The attribute browser displays attributes belonging to selected nodes and/or edges and the currently selected network. To populate the browser with rows (as pictured above), simply select nodes and/or edges in a loaded network. By default, only the ID of nodes and edges is shown. To display more than just the ID, click the Select Attributes
button and choose the attributes that are to be displayed (select various attributes by clicking on them, and then click elsewhere on the screen to close the attribute list). Each attribute chosen will result in one column in the attribute browser. Most attribute values can be edited by double-clicking an attribute cell; list values cannot be edited, and neither can the ID. Attribute rows in the browser can be sorted alphabetically by specific attribute by clicking on a column heading. A new attribute can be created using the Create New Attribute
button, and must be one of four types – integer, string, real number (floating point), or boolean. Attributes can be deleted using the Delete Attributes
button. NOTE: Deleting attributes removes them from Cytoscape, not just the attribute browser! To remove attributes from the browser without deleting them, simply unselect the attribute using the Select Attributes
button.
The right-click menu on the Attribute Browser has several functions, such as exporting attribute information to spreadsheet applications. For example, use the right-click menu to Select All and then Copy the data, and then paste it into a spreadsheet application. Each attribute browser panel also has a button for importing new attributes:
.
The Node Attribute Browser panel has additional buttons for loading Gene Expression attribute matrices (
) as node attributes.
From Cytoscape 2.6, Attribute Browser has new Attribute Batch Editor. This enables you to set and modify attribute values at once. This function changes values for selected nodes or edges. For example, if you want to create a new attribute called Modules and set module names for each group of selected nodes, you can use Set command from this editor.
In addition to normal node and edge attribute data, Cytoscape also supports importing gene expression data. Gene expression data are imported using a different file format than normal attributes; however, the resulting attributes are not treated differently by Cytoscape. Gene expression data (like attribute data) can be loaded at any time, but are (generally) only relevant once a network has been loaded.
Gene expression ratios or values are specified over one or more experiments using a text file. Ratios result from a comparison of two expression measurements (experiment vs. control). Some expression platforms, such as Affymetrix, directly measure expression values, without a comparison. The file consists of a header and a number of space- or tab-delimited fields, one line per gene, with the following format:
Identifier [CommonName] value1 value2 ... valueN [pval1 pval2 ... pvalN]
Brackets [ ] indicate fields that are optional.
The first field identifies which Cytoscape node the data refers to. In the simplest case, this is the gene name - exactly as it appears on the network generated by Cytoscape (case sensitive!). Alternatively, this can be some node attribute that identifies the node uniquely, such as a probeset identifier for commercial microarrays.
The next field is an optional common name. It is not used by Cytoscape, and is provided strictly for the user's convenience. With this common name field, the input format is the same as for commonly-used expression data anaysis packages such as SAM (http://www-stat.stanford.edu/~tibs/SAM/).
The next set of columns represent expression values, one per experiment. These can be either absolute expression values or fold change ratios. Each experiment is identified by its experiment name, given in the first line.
Optionally, significance measures such as P values may be provided. These values, generated by many microarray data analysis packages, indicate where the level of gene expression or the fold change appears to be greater than random chance. If you are using significance measures, then your expression file should contain them in a second set of columns after the expression values. The column names for the expression significance measures need to match those of the expression values exactly.
For example, here is an excerpt from the file galExpData.pvals in the Cytoscape sampleData directory:
GENE COMMON gal1RG gal4RG gal80R gal1RG gal4RG gal80R YHR051W COX6 -0.034 0.111 -0.304 3.75720e-01 1.56240e-02 7.91340e-06 YHR124W NDT80 -0.090 0.007 -0.348 2.71460e-01 9.64330e-01 3.44760e-01 YKL181W PRS1 -0.167 -0.233 0.112 6.27120e-03 7.89400e-04 1.44060e-01 YGR072W UPF3 0.245 -0.471 0.787 4.10450e-04 7.51780e-04 1.37130e-05
This indicates that there is data for three experiments: gal1RG, gal4RG, and gal80R. These names appear two times in the header line: the first time gives the expression values, and the second gives the significance measures. For instance, the second line tells us that in Experiment gal1RG, the gene YHR051W has an expression value of -0.034 with significance measure 3.75720e-01.
Some variations on this basic format are recognized; see the formal file format specification below for more information. Expression data files commonly have the file extensions ".mrna" or ".pvals", and these file extensions are recognized by Cytoscape when browsing for data files.
Load an expression attribute matrix file using File → Import → Attribute/Expression Matrix... to bring up the import window, or by specifying the filename using the -m option at the command line. If you use the command line input, you must enter your expression data by node ID. If you use the dialog box, then you can either load expression data by node ID (the default option), or you can select a node attribute to use in assigning your expression data to your Cytoscape nodes. If you do use a node attribute, then (1) the attribute should already be loaded, and (2) the node attribute value must match the first column in your matrix file.
For the sample network file sampleData/galFiltered.sif:
Option A.
Load a sample gene expression data set by going to File → Import → Attribute/Expression Matrix... . In the resulting window, in the field labeled "Please select an attribute or expression matrix file...", use the Select button to enter sampleData/galExpData.pvals. The identifiers used in this file are the same ones used in the network file sampleData/galFiltered.sif, so you do not need to touch the field labeled "Assign values to nodes using...". A few lines of this file are shown below:
GENE COMMON gal1RG gal4RG gal80R gal1RG gal4RG gal80R YHR051W COX6 -0.034 0.111 -0.304 3.75720e-01 1.56240e-02 7.91340e-06 YHR124W NDT80 -0.090 0.007 -0.348 2.71460e-01 9.64330e-01 3.44760e-01 YKL181W PRS1 -0.167 -0.233 0.112 6.27120e-03 7.89400e-04 1.44060e-01
Option B.
Step 1. After loading the network, load the node attribute file sampleData/gal.probeset.na, using File → Import → Node attributes... . This file is shown in part below:
Probeset YHR051W = probeset2 YHR124W = probeset3 YKL181W = probeset4
Step 2. After loading the node attribute file, select the expression data file sampleData.galExpPvals.probeset.pvals, shown in part below:
GENE COMMON gal1RG gal4RG gal80R gal1RG gal4RG gal80R probeset2 COX6 -0.034 0.111 -0.304 3.75720e-01 1.56240e-02 7.91340e-06 probeset3 NDT80 -0.090 0.007 -0.348 2.71460e-01 9.64330e-01 3.44760e-01 probeset4 PRS1 -0.167 -0.233 0.112 6.27120e-03 7.89400e-04 1.44060e-01
After selecting this file, in the field labeled "Assign values to nodes using...", select Probeset. You will see that this loads exactly the same expression data as in Case 1, but provides extra flexibility in case the node name cannot be used as an identifier.
In all expression data files, any whitespace (spaces and/or tabs) is considered a delimiter between adjacent fields. Every line of text is either the header line or contains all the measurements for a particular gene. No name conversion is applied to expression data files.
The names given in the first column of the expression data file should match exactly the names used elsewhere (i.e. in SIF or GML files).
The first line is a header line with one of the following three header formats:
<text> <text> cond1 cond2 ... cond1 cond2 ... [NumSigConds] <text> <text> cond1 cond2 ... <tab><tab>RATIOS<tab><tab>...LAMBDAS
The first format specifies that both expression ratios and significance values are included in the file. The first two text tokens (in angled brackets) contain names for each gene, such as the formal and common gene names. The condX token set specifies the names of the experimental conditions; these columns will contain ratio values. This list of condition names must then be duplicated exactly, each spelled the same way and in the same order. Optionally, a final column with the title NumSigConds may be present. If present, this column will contain integer values indicating the number of conditions in which each gene had a statistically significant change according to some threshold.
The second format is similar to the first except that the duplicate column names are omitted, and there is no NumSigConds field. This format specifies data with ratios but no significance values.
The third format specifies an MTX header, which is a commonly used format. Two tab characters precede the RATIOS token. This token is followed by a number of tabs equal to the number of conditions, followed by the LAMBDAS token. This format specifies both ratios and significance values.
Each line after the first is a data line with the following format:
FormalGeneName CommonGeneName ratio1 ratio2 ... [lambda1 lambda2 ...] [numSigConds]
The first two tokens are gene names. The names in the first column are the keys used for node name lookup; these names should be the same as the names used elsewhere in Cytoscape (i.e. in the SIF, GML, or XGMML files). Traditionally in the gene expression microarray community, who defined these file formats, the first token is expected to be the formal name of the gene (in systems where there is a formal naming scheme for genes), while the second is expected to be a synonym for the gene commonly used by biologists, although Cytoscape does not make use of the common name column. The next columns contain floating point values for the ratios, followed by columns with the significance values if specified by the header line. The final column, if specified by the header line, should contain an integer giving the number of significant conditions for that gene. Missing values are not allowed and will confuse the parser. For example, using two consecutive tabs to indicate a missing value will not work; the parser will regard both tabs as a single delimiter and be unable to parse the line correctly.
Optionally, the last line of the file may be a special footer line with the following format:
NumSigGenes int1 int2 ...
This line specified the number of genes that were significantly differentially expressed in each condition. The first text token must be spelled exactly as shown; the rest of the line should contain one integer value for each experimental condition.
Cytoscape 2.6.0 has a new feature called Web Service Client Manager. This is a framework to manage various kinds of web service clients in Cytoscape. By using web service clients, users can access remote datasources easily.
A web service is a standardized, platform-independent mechanism for machines to interact over the network. These days, many major biological databases publishes their system with web service API:
List of Biological Web Services: http://taverna.sourceforge.net/services
Web Services at the EBI: http://www.ebi.ac.uk/Tools/webservices/
This enables developers to write a program to access these services. Cytoscape core developer team have developed several sample web service clients using this framework. Currently, Cytoscape supports the following web services:
IntAct: an open source database of protein interaction data, hosted at EMBL-EBI.
Pathway Commons: an open source portal, providing access to multiple integrated data sets, including: Reactome, IntAct, HPRD, HumanCyc, MINT, the MSKCC Cancer Cell Map, and the NCI/Nature Pathway Interaction database.
NCBI Entrez Gene: a public database of genes, including annotation, sequence and interactions.
Biomart: an open source biological database engine. Useful for ID/Name mapping.
All of these clients are available as Plugins and users can install them through Plugin Manager.
In the following sections, users learn how to import network from extrenal databases.
To get started, select: File → Import → Network from web services...
Select: File → Import → Network from web services...
From the pull-down menu, select the IntAct Web Service Client.
After confirming the download of interaction data, the network of BRCA1 will be imported and visualized.
Tip: Expanding the Network: Several of the Cytoscape web services provide additional options in the node context menu. To access these options, right-click on a node and select "Use Web Services." For example, in the screenshot to the right, we have loaded the BRCA1 network from IntAct, and have chosen to merge this node's neighbors into the existing network.
An entry of NCBI Entrez Gene has a section called Interactions. NCBI web service client uses this section to build networks.
Select: File → Import → Network from web services...
From the pull-down menu, select the NCBI Web Service Client.
Enter free-keywords. For example, type human muscular dystrophy.
Network generated from Entrez Gene data: The network above is generated from interaction data matching the keyword human muscular dystrophy. Edge color represents data source type (BIND, BioGRID, or HPRD).
Note: since NCBI client extracts interaction data from a huge dataset, it takes a long time (30 sec. - 5 minutes, depends on machine spec. and network connection) to import large set of interactions.
Select: File → Import → Network from web services...
Then, follow the three-step process outlined below:
In Step 3, you can simply double-click on a pathway of interest, or click on the Interaction Networks tab. The Interaction Networks tab enables you to filter interactions by data source and/or interaction type. For example, you can choose to restrict your network to direct physical interactions from HPRD and MINT only:
You can configure access options from the Options tab. There are two retrieval options:
By default, the simplified binary model is selected.
As additional web service clients become available, they will be made available via the Cytoscape Plugin Manager. Once installed, these web service clients will be centrally accessible via the same steps defined above:
File → Import → Network from web services...
Some of the web service clients can import attributes from external databases. BioMart client is an example. You can install it from Plugin Manager.
Load a network. In this example, we use galFiltered.sif in sampleData directory.
File → Import → Import Attributes from BioMart...
Select Data Source. Since galFiltered.sif is a yeast network, select yeast dataset.
For Key Attribute section, select ID for Attribute and Data Type should be Ensemble Gene ID. Attribute is the list of available attributes in current Cytoscape session and Data Type is the type of ID set of the attribute. In this case, Cytoscape uses ID as the key for mapping. Because the sample network galFiletred.sif uses Ensemble Gene ID for its node ID, like YOR072W, you need to select Ensemble Gene ID for Data Type. So you need to know the type of ID set (Entrez Gene ID, UniProt Unified Acc. Number, Ensemble Gene ID, etc.) of the attribute selected in the Attribute box.
Select attributes you want to import. (Note: You cannot select too many attributes at once because BioMart server has maximum number of selectable annotations.)
Press Import.
Now you can see the newly imported attributes on the Attribute Browser. You may see some attribute names ends with -TOP if there are multiple attribute values for a node. This is an attribute taken from the first entry of the original list attribute.
NCBI Entrez Gene database () can be used as network data source and annotation repository. You can use NCBI web service client to import gene annotations from Entrez Gene.
File → Import → Import attributes from NCBI Entrez Gene...
Data Source is fixed to NCBI Entrez Gene.
Data Type is also fixed to Entrez Gene ID. This means Cytoscape attribute selected in Attribute list should be Entrez Gene ID.
Select Annotation Category. If you select a category, all of the annotation under the category will be imported (i.e., multiple Cytoscape attributes will be created for each category).
Press Import. The following is a sample human network (RUAL.subset.sif) annotated by NCBI client.
Web services are useful when you combine the result from multiple data sources.
Import network from IntAct using keyword. In this example, type p53 AND species:mouse.
Import human orthologs from BioMart.
Show the othologs as the list of Ensemble Gene ID on the Data Panel. Copy them and use them as the query for IntAct.
Import Entrez Gene ID from BioMart. Use ensembl attribute for the mapping key.