A graph consists of a series of nodes connected by edges. The edges can be either weighted (W), in which case they are associated with a real number that indicates the strength of the connection, or binary ( B), in which case they are either 0 (absence of connection) or 1 (existence of connection). Furthermore, the edges can be either directed (D), if the connections have a directionality (e.g. node can be connected to node but not vice versa), or undirected (U), if the connections do not have a preferential directionality (i.e. if node is connected to node , then automatically node is connected to node ). The connections of a graph can be efficiently represented with a connectivity matrix, where the th row represents the out-going connections from node and the th column represents the in-coming connections to node . U graphs are symmetric. Typically, the diagonal elements of the connectivity matrix are set to 0, i.e. a node is not connected to itself.
Graph measures can be classified within two broad categories:
- global measures refer to global properties of a graph and, therefore, consist of a single number for each graph;
- nodal measures refer to properties of the nodes of a graph and, therefore, consist of a vector of numbers — one for each node of the graph.
Furthermore, we will indicate to which kind of graph a given measure belongs by using W (= weighted graphs) or B (= binary graphs), and D (= directed graphs) or U (= undirected graphs). If no letter is indicated it means that the measure applies to both cases.
DegreeDegree (nodal): Total number of edges connected to a node.
Average degree (global): Average of the degrees of all nodes.
In-degree (nodal, D): Number of inward edges going into a node.
Average in-degree (global, D): Average of the in-degrees of all nodes.
Out-degree (nodal, D): Number of outward edges originating from a node.
Average out-degree (global, D): Average of the out-degrees of all nodes.
Methodological notes: For BU graphs, the degree is calculated as the sum of the number of connections across the rows or columns of the connectivity matrix. For BD graphs, the in-degree is calculated as sum over columns, while the out-degree is calculated as sum over the rows; the degree is the sum of in-degree and out-degree. For W graphs, the weights of the connections are ignored in the calculations by binarizing the connectivity matrix so that only edges with nonzero weights are considered connected.
Strength
Strength (nodal, W): Sum of the weights of all edges connected to a node.[1]
Average strength (global, W): Average of the strengths of all nodes.
In-strength (nodal, WD): Sum of the weights of inward edges going into a node.
Average in-strength (global, WD): Average of the in-strengths of all nodes.
Out-strength (nodal, WD): Sum of the weights of outward edges originating from a node.
Average out-strength (global, WD): Average of the out-strengths of all nodes.
Methodological notes: For WU graphs, strengths are calculated as sums over either rows or columns of the weighted connectivity matrix. For WD graphs, in-strengths (out-strengths) are calculated as sums over columns (rows), and strengths are calculated as sums of in-strengths and out-strengths.
Eccentricity
Eccentricity (nodal): Maximal distance between a certain node and any other node.[2]
Average eccentricity (global): Average of the eccentricities of all nodes.
In-eccentricity (nodal, D): Maximal incoming distance from all other nodes to a node.
Average in-eccentricity (global, D): Average of the in-eccentricities of all nodes.
Out-eccentricity (nodal, D): Maximal outgoing distance from a node to all other nodes.
Average out-eccentricity (global, D): Average of the out-eccentricities of all nodes.
Radius (global): Minimum eccentricity of all nodes.
Diameter (global): Maximum eccentricity of all nodes.
Methodological notes: The distances (the shortest path lengths) between a node and any other node in the graph can be calculated and stored in a distance matrix. The eccentricity of a node is the maximum of all distances calculated for this node. For D graphs, the in-eccentricity (out-eccentricity) is the maximum along columns
(rows) of the distance matrix and the eccentricity is the larger value of the the in-eccentricity and out-eccentricity. For disconnected nodes, the eccentricity is set to NaN.
Path length
Path length (nodal): Average distance from a node to all other nodes.
Characteristic path length (global): Average of the path lengths of all nodes.
In-path length (nodal, D): Average distance from all other nodes to a particular node.
Characteristic in-path length (global, D): Average of the in-path lengths of all nodes.
Out-path length (nodal, D): Average distance from a particular node to all other nodes.
Characteristic out-path length (global, D): Average of the out-path lengths of all nodes.
Methodological notes: The distance between two nodes is defined as the length of the shortest path between those nodes (figure 3). For B graphs, the length of a path is the number of edges. For W graphs, the length of an edge is a function of its weight; typically, the edge length is inversely proportional to the edge weight because a high weight implies a shorter connection.[3] For D graphs, the path length of a node is the average of its in- and out-path lengths. The shortest path lengths between all pairs of nodes can be found using Dijkstra’s algorithm on W graphs and using breadth-first search on binary graphs.[4]
Triangles
Triangles (nodal): Number of neighbors of a node that are also neighbors of each other.[5]
Methodological notes: For BU graphs, given a connectivity matrix , the number of triangles is the diagonal of divided by two. For WU graphs, a contribution of the triangles around the node is defined as the geometric mean of the weights of the edges forming the triangle; it can be calculated by taking each element of the connection matrix to the power of and the diagonal entries of the third power of the resulting matrix divided by two. For D graphs, we will consider that there is a triangle only if the directed edges between the three nodes (vertices of the triangle) are arranged so that they form a closed cycle (but other conventions are also possible).
Clustering coefficient
Clustering coefficient (nodal): Fraction of triangles present around a node.[6]
Clustering coefficient (global): Average of the clustering coefficients of all nodes.
Methodological notes: The clustering coefficient is calculated as the ratio between the number of triangles present around a node and the maximum number of triangles that could possibly be formed around that node. See also triangles for how the number of triangles is calculated. For U graphs, the total number of possible triangles is calculated as , where is the degree of a node. For D graphs, we consider a triangle only if the directed edges between any three nodes form a cycle; the total number of possible triangles is calculated as , where and are the in-degree and out-degree of the node, respectively, and is the number of connections that cannot form triangles (i.e. the number of neighboring nodes that are connected with both inward and outward edges).
Transitivity
Transitivity (global): Ratio of total number of triangles to the number of (unordered) triplets in the graph.
Methodological notes: The transitivity is calculated as , where is the total number of triangles and is the total number of triplets in the graph. , where the sum runs over all nodes in the graph, is the total degree of each node and are the false pairs that do not result in triplets.[7]
Closeness centrality
Closeness centrality (nodal): Inverse of the path length of a node.
In-closeness centrality (nodal, D): Inverse of the in-path length of a node.
Out-closeness centrality (nodal, D): Inverse of the out-path length of a node.
Methodological notes: See path length for the calculation of the path length.
Betweenness centrality
Betweenness centrality (nodal): Fraction of all shortest paths in the graph that pass through a node. Nodes with high values of betweenness centrality participate in a large number of shortest paths.
Methodological notes: An algebraic method used to calculate the betweenness centrality is presented by Kintali.[8]
Global efficiency
Global efficiency (nodal): Average of the inverse shortest path length from a node to all other nodes.[9]
Global efficiency (global): Average of the global efficiencies of all nodes.
In-global efficiency (nodal, D): Average of the inverse shortest in-path lengths of a node.
In-global efficiency (global, D): Average of the in-global efficiencies of all nodes.
Out-global efficiency (nodal, D): Average of the inverse shortest out-path lengths of a node.
Out-global efficiency (global, D): Average of the out-global efficiencies of all nodes.
Methodological notes: See path length for the calculation of the path length. After the path lengths from a node to all other nodes are calculated, they are inverted and the average gives the global efficiencies of the nodes. For D graphs, the global efficiencies of the nodes are the average of their in- and out-global efficiencies.
Local efficiency
Local efficiency (nodal): Global efficiency of a node calculated on the subgraph created by the node’s neighbors.
Local efficiency (global): Average of the local efficiencies of all nodes.
Methodological notes: See global efficiency for the calculation of the global efficiency. The local efficiency is calculated by applying the same steps on the subgraph formed by the node’s neighbors. In the case of W graph, the weighted connections of the neighbors of node are calculated as , where the nodes and are two neighbors of , and , and are the weights of the edges.
Modularity
Modularity (global): Extent to which a graph can be divided into clearly separated communities (i.e. subgraphs or modules). Its calculation requires a previously determined community structure.
Methodological notes: The modularity is calculated as
where is the number of edges in the graph, represents the connectivity matrix, () is the degree of the node (), and is if the two nodes belong to the same community and otherwise, while the sum is performed over all pairs of nodes in the graph.
Within-module z-score
Within-module z-score (nodal): Extent to which a node is connected to the other nodes in the same community. It is a within-module version of degree. Its calculation requires a previously determined community structure.
Within-module in-z-score (nodal, D): Z-score calculated only by considering the contribution of in-path lengths.
Within-module out-z-score (nodal, D): Z-score calculated only by considering the contribution of out-path lengths.
Methodological notes: The z-Score is calculated as
where is the degree of the node in the community to which the node belongs, is the average degree of all nodes in the community , and is the standard deviation of the degree of the nodes within the community .
Participation coefficient
Participation coefficient (nodal): Quantifies the relation between the number of edges connecting a node outside its community and its total number of edges. Its calculation requires a previously determined community structure.
Methodological notes: The participation coefficient can be calculated as
where the sum runs over all communities, is the number of edges connecting the node within its community , and is the total number of edges of node . Nodes with a high participation coefficient (known as connector hubs) are connected to many communities and are likely to facilitate global intermodular integration.
Assortativity coefficient
Assortativity coefficient (global): The assortativity coefficient is a correlation coefficient between the degrees/strengths of all nodes on two opposite ends of an edge.[10]
Methodological notes: The assortativity is calculated as
where and are the respective degrees of the nodes and , and is the number of edges in the graph. The corresponding coefficient for directed and weighted networks is calculated by using the weighted and directed variants of degree/strength. A positive assortativity coefficient indicates that nodes tend to link to other nodes with similar degree/strength.
Small-worldness
Small-worldness (global): A small-world graph has a similar characteristic path length as a random graph with the same degree distribution but is significantly more clustered.[11,12]
Methodological notes: The small-worldness coefficient is calculated as
where and are the clustering coefficient and characteristic path length of the graph, and and are the corresponding expected measures calculated on random graphs (typically averaging more than 100 random graphs). If , then the network has a small-world organization.
Footnotes and references
- ^ Barrat, Alain, et al. “The architecture of complex weighted networks.” Proceedings of the National Academy of Sciences of the United States of America 101.11 (2004): 3747-3752.
- ^ Harris, John Michael, Jeffry L. Hirst, and Michael J. Mossinghoff. Combinatorics and graph theory. Vol. 2. New York: Springer, 2008.
- ^ Rubinov, Mikail, and Olaf Sporns. “Complex network measures of brain connectivity: uses and interpretations.” Neuroimage 52.3 (2010): 1059-1069.
- ^ Kepner, Jeremy, and John Gilbert, eds. Graph algorithms in the language of linear algebra. Vol. 22. SIAM, 2011.
- ^ Onnela, Jukka-Pekka, et al. “Intensity and coherence of motifs in weighted complex networks.” Physical Review E 71.6 (2005): 065103.
- ^ Watts, Duncan J., and Steven H. Strogatz. “Collective dynamics of ‘small-world’networks.” nature 393.6684 (1998): 440-442.
- ^ Newman, Mark EJ. “Ego-centered networks and the ripple effect.” Social Networks 25.1 (2003): 83-95.
- ^ Kintali, Shiva. “Betweenness centrality: Algorithms and lower bounds.” arXiv preprint arXiv:0809.1906 (2008).
- ^ Latora, Vito, and Massimo Marchiori. “Efficient behavior of small-world networks.” Physical review letters 87.19 (2001): 198701.
- ^ Newman, M. E. J. “Assortative mixing in networks, 2002.” Phys. Rev. Lett 89: 208701.
- ^ Watts, Duncan J., and Steven H. Strogatz. “Collective dynamics of ‘small-world’networks.” nature 393.6684 (1998): 440-442.
- ^ Humphries, Mark D., and Kevin Gurney. “Network ‘small-world-ness’: a quantitative method for determining canonical network equivalence.” PloS one 3.4 (2008): e0002051.