LibGuides: Network Analysis with Gephi: Home

Introducing Gephi

Install Gephi

1. Go to the Gephi website to download the software.

Gephi 0.9.2 works on both Mac and PC, but earlier versions of Gephi do not.

2. Open your application.

3. Click on "New Project" on the "Welcome to Gephi" popup window.

Additional resources as you troubleshoot installation:

Official Gephi learning portal

Gephi wiki

Gephi support

Here are the supported data formats that you can use in Gephi: https://gephi.org/users/supported-graph-formats. A spreadsheet in .csv format usually works well for me.

Watch a video tutorial

Here is a fifteen-minute video offering an overview of how to use Gephi.

Add data

Let's add some data.

Click on "Data Laboratory." This is where you'll upload your data.

1. Click on "Import Spreadsheet."

2. Import the "edges.csv" file and the "nodes.csv" file. If you come to my workshop "Introduction to Network Analysis," I will give you sample edges and nodes files. If you are trying this out asynchronously, here is some sample data to use. If you're using your own data, read below:

You will need to create two .csv files: a node table and an edge table. Excel saves files in .xlsx format by default. To get a .csv file, click "Save as" and choose .csv as the file format.

In general, here is a bit about the difference between nodes and edges:

Nodes: The nodes file tells Gephi all the possible nodes in a network. Each node is represented by a circle in the Gephi visualization. The nodes file should have at least the columns "Id" and "Label." In our example, the Id is a number (e.g., 1, 2, 3, 4, 5). The Label is the text you want displayed on the node in the graph itself. Your nodes table might look like this:

The node table can also include attributes. Attributes offer a way for you to distinguish between your nodes by categorizing your data by, for example, color, size, or age.

Edges: The edges table (the second .csv file) tells Gephi how the nodes are connected. It has the columns Source, Target, and Type. Source and Target each refer to a node that you've listed in your nodes.csv file, identified by its Id. Type says whether the connection between the two nodes has a direction. If the source drives the relationship (for example, the sender of a letter versus the receiver), the relationship is "Directed." In this example, the sender of the letter is the source and the receiver of the letter is the target. If the relationship goes both ways, for example when the graph visualizes friendships, the relationship is "Undirected." Here is an example of what your graph will look like:

In the edges table, you can also add a Weight column. Weight lets you show the importance of certain relationships by giving each one a numerical value.
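If you would rather build the two tables with a script than in Excel, here is a minimal sketch using only Python's standard library. The names, states, and weights are made-up sample values, and the helper functions `to_csv` and `check_edges` are hypothetical names chosen for this example. The column headers (Id, Label, Source, Target, Type, Weight) follow the description above.

```python
import csv
import io

# Hypothetical sample data: three letter writers and who wrote to whom.
nodes = [
    {"Id": 1, "Label": "Ada", "State": "NY"},
    {"Id": 2, "Label": "Ben", "State": "NY"},
    {"Id": 3, "Label": "Cara", "State": "OH"},
]
edges = [
    {"Source": 1, "Target": 2, "Type": "Directed", "Weight": 3},
    {"Source": 2, "Target": 3, "Type": "Directed", "Weight": 1},
]


def to_csv(rows):
    """Return the rows as CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def check_edges(nodes, edges):
    """Return the edges whose Source or Target is missing from the nodes table."""
    known = {node["Id"] for node in nodes}
    return [
        edge for edge in edges
        if edge["Source"] not in known or edge["Target"] not in known
    ]


nodes_csv = to_csv(nodes)
edges_csv = to_csv(edges)
print(nodes_csv)
print(edges_csv)
print("Edges pointing at unknown nodes:", check_edges(nodes, edges))

# To save the files for "Import Spreadsheet", write the text to disk, e.g.:
# with open("nodes.csv", "w", newline="") as f:
#     f.write(nodes_csv)
```

Running a check like `check_edges` before importing can save some confusion, since every Source and Target value should match an Id in the nodes table.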

After uploading both your nodes table and your edges table to Gephi, you'll need to tell Gephi you'd like a column for labels. To do that, click on ID and then click on Label. Finally, click "save." Now we can start visualizing!

Let's start visualizing

Click on "Overview" to see your graph.

If you do not see your graph, click on "Window" and then click "Graph." This will open a window for your graph.

What is that?

The initial visualization you see won't look like much, but don't worry! There are three ways we can explore and improve this graph:

1. Overview: This is where we explore the graph visually.

2. Data Laboratory: This is where we can see the spreadsheet view of our data, along with the new data we create as we analyze our graph.

3. Preview: This is where we polish our visualization.

Ways to find out information from your graph

Here are a few ways you can explore and visualize your data:

Layout

The layout feature controls where Gephi positions the nodes and edges of the network. Each layout algorithm prioritizes different properties of the network.

Choose a layout from the drop-down list (e.g., ForceAtlas 2)

Adjust parameters for the layout algorithm

Click the "Run" button

Continue to refine the layout until you are happy with the results
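To get a feel for what a force-directed layout does when you click "Run," here is a rough sketch in plain Python. It is not ForceAtlas 2 and not Gephi's code. It is a simplified version of the general idea (in the style of the Fruchterman-Reingold algorithm): every pair of nodes pushes apart, connected nodes pull together, and the movement shrinks over time. The function name `spring_layout` and its parameters are hypothetical names for this example.

```python
import math
import random


def spring_layout(nodes, edges, iterations=100, seed=42):
    """Tiny force-directed layout sketch; returns {node: (x, y)}."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # rough "ideal" distance between nodes

    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force

        # Attraction: nodes joined by an edge pull together.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force

        # Move each node, capped by a "temperature" that cools each step.
        temp = 0.1 * (1 - step / iterations)
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            move = min(length, temp)
            pos[n][0] += disp[n][0] / length * move
            pos[n][1] += disp[n][1] / length * move

    return {n: (p[0], p[1]) for n, p in pos.items()}


# Hypothetical sample graph: a short chain plus one unconnected node.
layout = spring_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
for node, (x, y) in layout.items():
    print(node, round(x, 3), round(y, 3))
```

The parameters you adjust in Gephi play a similar role to `k` and `iterations` here: they change how strongly nodes push and pull and how long the layout runs.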

Color

Select a "partition" (categorical) node variable from your data. For example, the sample data from the Gephi workshop has a variable called "State."

Click on "Partition"

Click on "Nodes"

Choose "State" from the drop-down list

Click "Apply"
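Coloring by partition boils down to giving every distinct category its own color and then painting each node with its category's color. Here is a small sketch of that idea. The palette, the sample names and states, and the function name `partition_colors` are all assumptions for this example, and Gephi picks its own colors.

```python
# Hypothetical palette of hex colors; Gephi generates its own.
PALETTE = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3", "#ff7f00"]


def partition_colors(node_category, palette=PALETTE):
    """Give each distinct category one color, then color each node by category."""
    categories = sorted(set(node_category.values()))
    # Colors repeat if there are more categories than palette entries.
    color_of = {
        cat: palette[i % len(palette)] for i, cat in enumerate(categories)
    }
    return {node: color_of[cat] for node, cat in node_category.items()}


# Made-up sample: each node's value in the "State" column.
states = {"Ada": "NY", "Ben": "NY", "Cara": "OH"}
colors = partition_colors(states)
print(colors)
```

Nodes that share a state end up with the same color, which is why a partition variable needs to be categorical rather than numerical.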

Filter

Click the "Filters" tab on the right

Expand the "Attributes" folder

Double-click the "Equal" folder

Drag "sex" down to the "Queries" area below.

Click the "Filter" button
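In data terms, an "Equal" filter keeps only the nodes whose attribute matches a value you choose. The sketch below assumes that an edge stays visible only when both of its endpoints are kept. The sample attributes and the function name `filter_equal` are hypothetical.

```python
def filter_equal(node_attrs, edges, column, value):
    """Keep nodes where column == value, plus edges between two kept nodes."""
    kept_nodes = {
        node for node, attrs in node_attrs.items()
        if attrs.get(column) == value
    }
    kept_edges = [
        (source, target) for source, target in edges
        if source in kept_nodes and target in kept_nodes
    ]
    return kept_nodes, kept_edges


# Made-up sample data with a "sex" column, as in the steps above.
node_attrs = {
    "Ada": {"sex": "F", "State": "NY"},
    "Ben": {"sex": "M", "State": "NY"},
    "Cara": {"sex": "F", "State": "OH"},
}
edges = [("Ada", "Ben"), ("Ada", "Cara"), ("Ben", "Cara")]

kept_nodes, kept_edges = filter_equal(node_attrs, edges, "sex", "F")
print(sorted(kept_nodes))
print(kept_edges)
```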

Size

Resize nodes uniformly

Click on the selection box icon on the left vertical toolbar

Draw a box around all nodes to select them all

Click on the diamond icon on the left vertical toolbar

Click on a node, then drag the mouse up and down to increase and decrease the size

Resize nodes according to a numerical variable

Click on the "Ranking" tab

Click on "Nodes"

Select a variable (e.g., Degree) from the drop-down list

Choose a minimum and maximum size as a range for the size of the nodes

Click the "Apply" button
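Here is a sketch of what ranking by Degree amounts to: count how many edges touch each node, then stretch those counts across the minimum and maximum sizes you chose. This assumes a simple straight-line (linear) scaling. The function names `degrees` and `rank_sizes` and the sample edges are made up for this example.

```python
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node (undirected degree)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def rank_sizes(values, min_size=10.0, max_size=50.0):
    """Linearly map each node's value onto the range [min_size, max_size]."""
    lowest, highest = min(values.values()), max(values.values())
    if highest == lowest:
        # Every node has the same value, so every node gets the same size.
        return {node: min_size for node in values}
    scale = (max_size - min_size) / (highest - lowest)
    return {
        node: min_size + (value - lowest) * scale
        for node, value in values.items()
    }


# Made-up sample edges: "a" is the best-connected node.
sample_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
degree = degrees(sample_edges)
sizes = rank_sizes(degree)
print(dict(degree))
print(sizes)
```

In this sample, "a" has the highest degree and gets the maximum size, "d" has the lowest degree and gets the minimum size, and the others fall in between.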

Statistics

Click the Statistics tab on the right-hand side

Run the “modularity” statistic as a first example.

This creates a new way to view your graph. It also adds a new column to your nodes table in the Data Laboratory.

Click into the “Appearance” tab on the left-hand side. Under “nodes” click “modularity class” in the “Partition” tab.

Color your nodes by community

Once you've calculated modularity, you can color nodes according to their communities. Go to the Partition pane (on the left side of the Gephi window) and click on the small Refresh icon. From the drop-down list, select Modularity Class.
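If you are curious what the modularity number means, here is a small sketch that computes the standard modularity score Q for a grouping you supply yourself. It does not find the communities the way Gephi does. It only scores a given grouping, for an undirected graph without weights. Higher values mean more edges fall inside communities than you would expect by chance. The function name `modularity` and the sample graph are assumptions for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of an undirected, unweighted graph for a given grouping.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the same community
    degree_sum = defaultdict(int)  # total degree of each community's nodes
    for source, target in edges:
        degree_sum[community[source]] += 1
        degree_sum[community[target]] += 1
        if community[source] == community[target]:
            internal[community[source]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Made-up sample: two triangles joined by a single edge (c to d).
triangle_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(triangle_edges, two_groups)
q_one = modularity(triangle_edges, one_group)
print(round(q_two, 4))
print(round(q_one, 4))
```

Splitting the sample graph into its two triangles scores higher than lumping every node into one community, which matches the intuition that the triangles are the natural groups.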