Abstract
Identifying influential spreaders in complex networks is a widely discussed topic in the field of network science. Numerous methods have been proposed to rank key nodes in the network, and while gravity-based models often perform well, most existing gravity-based methods either rely on node degree, k-shell values, or a combination of both to differentiate node importance without considering the overall impact of neighboring nodes. Relying solely on a node's individual characteristics to identify influential spreaders has proven to be insufficient. To address this issue, we propose a new gravity centrality method called HVGC, based on the H-index. Our approach considers the impact of neighboring nodes, path information between nodes, and the positional information of nodes within the network. Additionally, it is better able to identify nodes with smaller k-shell values that act as bridges between different parts of the network, making it a more reasonable measure compared to previous gravity centrality methods. We conducted several experiments on 10 real networks and observed that our method outperformed previously proposed methods in evaluating the importance of nodes in complex networks.
Similar content being viewed by others
Introduction
Complex networks are a pervasive presence in various domains of both human society and the natural world. In each system, individuals and their relationships can be represented as networks consisting of nodes and edges1,2. Recently, the identification of significant nodes in complex networks has gained significant attention from researchers, providing a new perspective for understanding the objective world and facilitating a better comprehension of the spread of diseases3,4,5, power grid protection6, information dissemination7,8,9, protein discovery10, and immunization strategies11,12, among other fields13,14,15.
To date, numerous centrality methods have been proposed to detect key nodes in complex networks. Centrality measurement methods can be primarily categorised into three types: local indices, global indices, and hybrid indices16. Local-index-based centrality methods include classical measures such as degree centrality17 (DC) and H-index18. Local-index-based methods have low computational complexity and are suitable for large-scale networks as they only consider the local neighbourhood information of nodes. However, their ability to identify influential nodes that are not central but have high impact is limited. To address this limitation, many researchers have proposed improvements, such as extended H-index centrality19 (EHC) and local clustering H-index centrality20 (LCH) methods. Global-index-based centrality methods assess individuals' influence by considering the global structural information of the network, such as closeness centrality21 (CC) and betweenness centrality22 (BC). The main drawbacks of these measurement methods are their high computational complexity and inapplicability to large-scale networks23. Among them, the K-shell decomposition method24 (KS), as a global approach, determines the influence of nodes by differentiating their core levels and operates at a faster speed. However, the main limitation of k-shell is that it assigns the same k-shell value to many nodes, resulting in low differentiation in node influence ranking. Many efforts have been made to address this issue, such as extended neighbourhood coreness25 (CNC+), classifying neighbourhood26 (CN), k-shell iteration factor27 (KSIF), and Mixed Degree Decomposition28 (MDD). The primary limitations of these global methods are their typically high computational costs as they consider the entire topological structure of the network. Hybrid-index-based centrality methods, such as local and global influence29 (LGI), local and global centrality30 (LGC) and global and local information31 (GLI) integrate both local and global information about nodes, aiming to strike a balance between algorithm accuracy and computational complexity.
The gravity model not only considers the attributes held by two nodes but also takes into account the shortest path information between nodes, which represents their mutual interactions and provides a basis for integrating local and global information. Inspired by this formula, Ma et al.32 proposed two models (G and G+) based on the gravity formula. These models adopt the k-shell value of a node as its mass and use the shortest path distance between two nodes as the distance. Building upon this, Wang et al.33 improved the model by considering the degree values of neighbouring nodes, resulting in the improved gravity centrality (IGC). Li et al.34 introduced the gravity model (GM), which employs the degree of nodes as their mass, and developed the local gravity model (LGM), which only considers node pairs within a truncated radius. Furthermore, Li et al.35 combined the local clustering coefficient and degree value as the mass of nodes, proposing the generalized gravity centrality (GGC). In addition, Yang et al.36 introduced a gravity centrality (KSGC) based on the K-shell value of nodes, considering the variations in interactions when nodes are located in different shell layers. Li et al.37 combined the k-shell value and k-shell iteration factor as the mass of nodes, presenting the DK-based gravity model (DKGM) to enhance the model's performance. Subsequently, they considered multiple features of nodes and proposed the multi-characteristics gravity model38 (MCGM). Liu et al.39 introduced the spreading entropy gravity Model (SEGM), incorporating the spreading information entropy of nodes into consideration.
From the above, we can observe that many of the gravity models mentioned are either based on node degree, related to the k-shell value, or a combination of both. However, It is not enough to evaluate the importance of a node solely on the basis of its single attributes; it is also necessary to consider the location of the node and the overall influence of neighbouring nodes on it. For instance, some nodes may have a relatively small k-shell index but possess significant influence since they act as bridges connecting different communities within the network. Similarly, there are nodes with lower degree or k-shell values compared to others but are closer to the most important nodes in the network, surrounded by highly influential nodes, as a result, their importance will also be enhanced. To address this issue, we propose the H-index-based gravity centrality method (HVGC), which not only considers the path information of nodes but also incorporates the overall influence of neighbouring nodes, structural hole position information of nodes, and the differential gravitational impact of nodes positioned at different locations. Experimental results demonstrate that our proposed method exhibits significant competitiveness compared to other advanced gravity models, Particularly in networks with evident community structures, it exhibits outstanding accuracy, unlike other algorithms that are prone to identifying false core nodes.
Preliminaries
Centrality measures
In the context of an undirected and unweighted simple network \(G \, = < V, \, E >\),\(V\) and \(E\) respectively represent the sets of nodes and links. The cardinality of \(V\) and \(E\) can be expressed as \(\left| V \right| = N\) and \(\left| E \right| = M\), indicating the presence of \(N\) nodes and \(M\) links within the network. The network's connectivity structure is typically captured by its adjacency matrix \(A = (a_{ij} )_{N \times N}\), where \(a_{ij} = 1\) if node \(i\) and node \(j\) are linked, and 0 otherwise.
Degree centrality17 of node \(i\) is defined as
where \(k(i) = \sum\limits_{j = 1}^{N} {a_{ij} } .\)
The maximum integer fulfilling that there are at least \(H(i)\) neighbors of node \(i\) whose degrees are all at least \(H(i)\), represented by \(H(i)\), is known as the H-index18 of the node \(i\).
The k-shell decomposition method24(KS), operates through an iterative process of decomposing the network into distinct shells. Initially, KS removes nodes with a degree of 1 from the network, resulting in a decrease in the degree values of the remaining nodes. This process is repeated by removing nodes with residual degrees less than or equal to 1 until all remaining nodes have residual degrees greater than 1. The nodes removed in the first step constitute the 1-shell, and their k-shell values are assigned as 1. This process is then iteratively applied to obtain the 2-shell, 3-shell, and so on. The decomposition process continues until all nodes in the network have been accounted for.
Gravity centrality32 (G) of node \(i\) is defined as
where \(k_{s} (i)\) is the k-shell value of node \(i\), \(d(i,j)\) is the shortest path distance from node \(i\) to node \(j\), and \(\psi_{i}\) is the set of nodes whose distance from node \(i\) does not exceed 3.
Extended gravity centrality32 (G+) of node \(i\) is described as
\(\Lambda_{i}\) is the nearest neighborhood of node \(i\).
The improved gravity centrality33 (IGC) of node \(i\) is measured by
where \(R\) is the truncation radius, and the optimal truncation radius \(R^{*}\) can be estimated by
where \(\langle d\rangle\) is the average distance of the network.
Extended improved gravity centrality33 (IGG+) of node \(i\) is described as
\(\Lambda_{i}\) is the nearest neighborhood of node \(i\).
The local gravity model34 (LGM) of node \(i\) is determined by
The generalized gravity centrality35 (GGC) of node \(i\) is defined as
where \(C_{i}\) is the local clustering coefficient of node \(i\), \(n_{i}\) denotes the number of edges between neighbors of node \(i\), and \(\alpha = 2\).
The k-shell based on gravity centrality36 (KSGC) is defined as
where \(c_{ij}\) is the coefficient of attraction exerted by node \(i\) on node \(j\), \(k_{s} (i)\) and \(k_{s} (j)\) denote the k-shell values of node \(i\) and node \(j\), respectively. \(ks_{\max }\) and \(ks_{\min }\) refer to the largest and smallest k-shell values present in the network. \(d(i,j)\) is the shortest path distance from node \(i\) to node \(j\).
The DK-based gravity model37 (DKGM) is measured by
assume that the value of the k-shell of node \(i\) is \(k_{s} (i).\) For the process of the k-degree iteration, the total iteration number is \(q(k)\), and node \(i\) is removed in the \(p(i)\) iteration of the k-degree process. \(k_{s}^{*} (i)\) is called the improved k-shell index of node \(i\).
The multi-characteristics gravity model38 (MCGM) is measured by
where \(k_{mid}\), \(k_{smid}\) and \(x_{mid}\) denote the median of degree value, k-shell value and eigenvector centrality value, respectively. \(k_{\max }\), \(k_{s\max }\) and \(x_{\max }\) denote the maximum values of degree value, k-shell value, and eigenvector centrality value.
The entropy-based gravity model39 (SEGM) is defined as
where \(E(i)\) is the information entropy of node \(i\), \(\Gamma (i)\) represents the set of neighboring nodes of node \(i\),and \(I(i)\) is the importance of node \(i\).
The SIR model used in this paper
To evaluate the ranking of impact generated by the algorithm and the simulation, we employed the widely used SIR model40. In the beginning, a single node in the network, referred to as the "source node," is in the infected state (I), while the remaining nodes are in the susceptible state (S). An infected node has the potential to infect its susceptible neighbors with a probability of \(\beta\), and the probability of each infected node entering the recovery (R) state is \(\lambda\), after which it ceases to participate in the dynamics. This propagation process continues until no infected nodes remain in the network. The impact of any given node \(i\) can be estimated by
the number of nodes that recover after the diffusion process has stabilized is represented by \(N_{r}\). For the sake of simplicity,\(\lambda\) has been set to 1. Subsequently, the corresponding epidemic threshold41 can be computed by
where \(\langle k\rangle\) and \(\langle k^{2} \rangle\) are the degree distribution's average degree and second-order moments.
Measures
Kendall’s tau coefficient
Kendall's tau coefficient42 is a measure of correlation between two sequences, with a larger value indicating a greater similarity between the sequences. The definition of Kendall's tau coefficient is as follows: given two sequences \(X\) and \(Y\) of the same length, where the \(i\) th values are represented by \(x_{i}\) and \(y_{i}\), respectively. Let each pair of elements \(x_{i}\) and \(y_{i}\) form a set, denoted by \((x_{i} ,y_{i} )\). If \(x_{i} > x_{j}\) and \(y_{i} > y_{j}\), or \(x_{i} < x_{j}\) and \(y_{i} < y_{j}\), the pairs \((x_{i} ,y_{i} )\) and \((x_{j} ,y_{j} )\) are considered concordant. They are considered discordant if \(x_{i} > x_{j}\) and \(y_{i} < y_{j}\), or \(x_{i} < x_{j}\) and \(y_{i} > y_{j}\). If \(x_{i} = x_{j}\) and \(y_{i} = y_{j}\), the pair is neither concordant nor discordant. Therefore, the Kendall's tau coefficient τ is defined as
where \(n_{ + }\) is the number of concordant pairs, and \(n_{ - }\) is the number of discordant pairs.
Jaccard similarity coefficient
In some applications, concentrating on the top-rank nodes rather than all nodes may be appropriate. In contrast to the Kendall correlation coefficient, the Jaccard similarity coefficient is utilized to assess the similarity between the top-k nodes in two ranking lists25,43. The Jaccard similarity is calculated by dividing the number of common nodes by the number of unique nodes in the two lists, and its expression is
where \(X\) and \(Y\) represent the top-k nodes with the highest influence as determined by two different methods. In the context of our experiments, \(X\) represents the top-k nodes identified by HvGC and other baseline methods, while \(Y\) represents the top-k nodes obtained through the SIR simulation. We use the Jaccard similarity coefficient to measure the similarity between these two sets of top-k nodes. The Jaccard similarity coefficient ranges from 0 to 1, where a higher value indicates a greater degree of similarity between the two ranking results. A Jaccard similarity coefficient of 0 indicates completely distinct results, while a value of 1 indicates that the two sets of top-k nodes are identical.
The monotonicity index
The monotonicity25 \(M\) is used to quantitatively measure the resolution of different indices in ranking list \(X\), and can be calculated by
where \(N\) is the size of network, and \(N_{c}\) is the number of nodes with the same index value \(c\).
Results
Algorithms
Previous research has utilized the gravity model approach to analyze node importance in complex networks. Degree and k-shell values are commonly used metrics to consider the number of neighbors a node has and its position within the network, respectively. However, these metrics alone do not capture the overall influence of a node's neighbors. While the H-index considers the importance of a node's neighbors, it may overlook certain information from neighboring nodes, failing to account for the collective impact of all neighbors. We take the toy network shown in Fig. 1 to illustrate the problem for H-index, where the node spreading capacity derived from 1000 independent runs of the SIR model has been numerically labeled in Fig. 1. Obviously, \(H(1) = H(2) = H(3) = H(4) = H(10) = 1\), \(H(5) = H(7) = H(8) = 3\),\(H(6) = H(9) = 2\), where \(H(i)\) represents the H-index of node \(i\). The H-index always assigns the same value to different nodes, which leads to a lack of excellence in the ability to differentiate the influence of nodes.
The same issue exists in DC17 and KS24. Additionally, from Fig. 1, it can be observed that Node 3 has a higher propagation capability compared to Node 9, but Node 3 has a lower H-index than Node 9. This indicates that the H-index overlooks some information from the neighbors of a node. From this, we take out all neighboring nodes in the set of neighbors of node \(i\) with degree values greater than or equal to \(H(i)\) and add up the degree values of these nodes to measure the overall influence of the neighboring nodes on node \(i\). The value obtained is denoted as \(HV(i)\), and the expression is
where \(\Lambda_{i}\) is the nearest neighborhood of node \(i\),\(H(i)\) represents the H-index of node \(i\).
By incorporating the overall influence of node neighbors into the definition, it enhances the discriminative power of node identification compared to the H-index. However, it is still insufficient to accurately distinguish cluster-like nodes, due to their close connections, these nodes can more easily achieve greater HV values, but, their actual influence may not be greater than that of nodes with lower HV values, As shown in Fig. 1. \(HV(6) = 8\),\(HV(9) = 7\),\(HV(3) = 4\), and the actual propagation capacity from high to low is nodes 3, 9, and 6, a similar problem with the k-shell approach was noted by Liu et al.44 In other words, removing node 3 from the network would result in nodes 1, 2, and 4 losing their interactions with the core nodes, while removing node 6 has a minimal impact on information transmission in the network. This finding demonstrates the higher importance of nodes that serve as bridges between different clusters compared to those within individual clusters.
Based on this, we considered the structural hole position of nodes to enhance the algorithm's ability to identify nodes within community networks. This allows us to identify those bridge nodes that may not have high HV values but play a crucial role in facilitating information flow across different parts of the network. The network constraint coefficient measures the level of constraints imposed on nodes forming a structural hole (SH) in a network45, and it can be calculated as follows:
where \(\Gamma (i)\) represents the set of neighboring nodes of node \(i\), and \(w \in \Gamma (i) \cap \Gamma (j)\) indicates the nodes that are common neighbors of both node \(i\) and node \(j\). \(p_{ij}\) represents the proportion of energy invested by node \(i\) to maintain its relationship with node \(j\). where \(z_{ij} = 1 \, (i \ne j)\) if there is a link between nodes \(i\) and \(j\), otherwise \(z_{ij} = 0\). Based on the above discussions, the gravity centrality based on the H-index (HVGC) measure proposed in this paper is defined as follows:
where \(c(i)\) represents the structural hole constraint coefficient in Eq. (29). A smaller value of \(c(i)\) indicates that the node occupies more structural holes and has a stronger ability to bridge different parts of the network. Finally, the metrics, including HVGC, H-index, HV, DC, and KS, were computed for each node in the toy network and compared with the node's spreading capability (SC). The results are presented in Table 1, revealing that HVGC achieves a nearly identical ranking to SC, indicating excellent performance. The algorithmic description of the HVGC is provided in Algorithm 1.
In addition, Fig. 2 depicts a network with a clear community structure, where the four nodes with the strongest propagation capabilities are marked in green. The propagation capabilities of these nodes were determined through 1000 independent experiments using the SIR model. We compared HVGC with other gravity model-based methods in identifying the top 5 nodes in this network, and the results are presented in Table 2.
Data description
This paper evaluates the efficacy of HVGC by analyzing ten real networks from six distinct domains, including a transportation network(USAir46), an infrastructure network (Power47), a communication network (Email48), a technology network (Router49), two collaborative networks (Jazz50and NS51), and four social networks (Facebook52, PB53, WV54, and Sex55). Table 3 presents the fundamental topological properties of these networks. \(N\) represents the number of nodes in the network, and \(M\) represents the number of links. The average degree of nodes is denoted as \(\langle k\rangle\), and the average distance between pairs of nodes is denoted as \(\langle d\rangle\). The clustering coefficient47 of the network is denoted by \(C\), while \(r\) represents the assortative coefficient56. The degree heterogeneity57 of the network is denoted by \(H\). Additionally, \(\beta_{c}\) represents the epidemic threshold58 of the SIR model40 used to simulate the diffusion process.
Empirical results
Based on the aforementioned real network, we conducted simulations and compared the influence rankings of various algorithms utilizing the SIR model. In order to ensure the credibility of our findings and the standard ranking of nodes' influence, we conducted 1000 independent experiments for each given network and transmission probability \(\beta\), with any one node being chosen as the seed node once during each run. The processor and runtime environment used for the calculations are i7-12700H and Python 3. The development platform used for this paper is Anaconda 3, and the code was executed in Jupyter Notebook. Kendall's tau (\(\tau\)) was utilized to evaluate the accuracy of the algorithms, with a higher value indicating a greater correlation between the observed sequences and an improved algorithm performance. Table 4 provides a comparison of the accuracy of the proposed algorithm (HVGC) and ten benchmark algorithms, which include degree centrality17 (DC), k-shell decomposition method24 (KS), the extended version of gravity centrality32 (G+), extended version of improved gravity centrality33 (IGC+), local gravity model34 (LGM), generalized gravity centrality35 (GGC), the improved gravitational centrality based on k-shell values36 (KSGC), the DK-based gravity model37 (DKGM), multi-characteristics gravity model38 (MCGM), and entropy-based gravity model39 (SEGM).Additionally, Fig. 3 displays the accuracy of the different algorithms for varying values of \(\beta\), within the range of \(0.5\beta_{c}\) to \(1.5\beta_{c}\).
According to Table 4, the methods that utilise the gravitational formula (G+, IGC+, LGM, GGC, KSGC, DKGM, MCGM, SEGM, and HVGC) exhibit significant advantages over classical methods (DC and KS). These advantages are especially prominent in the Power, Router, NS, and Sex networks. Furthermore, it is noteworthy that among all gravity-based algorithms tested on the ten networks, HVGC exhibited the best overall performance. Its Kendall coefficient ranked first in six out of ten networks, with a remarkable 70% proportion being in the top two ranks. Specifically, HVGC ranked first in the Jazz, email, Facebook, PB, WV, and USAir networks and second in the Router network. Additionally, as shown in Fig. 3, when \(\beta = \beta_{c}\), although HVGC did not perform best in the NS, Power, and Sex networks, as \(\beta\) increases, its performance becomes very close to or even surpasses the previous best-performing algorithm. Taking into account HVGC's superior performance in community-type networks discussed earlier, it demonstrates a stronger overall performance, affirming the robustness of our findings. Furthermore, Fig. 4 displays the optimal truncation radius of HVGC in the ten real networks, revealing that the majority of networks concentrate their optimal truncation radius at \(R = 1\). This indicates that HVGC achieves remarkably high accuracy by considering only the influence of the first-order neighbouring nodes of a node, while most other gravity model methods require considering information from second- or third-order neighbouring nodes. In other words, HVGC achieves a high level of accuracy while incurring lower time costs.
Discussion
This paper introduces a novel method called HVGC for identifying influential nodes in a network. While the original gravity model considered both neighbourhood and path information, this new method enhances the existing gravity centrality approaches by taking into account the overall influence of a node's neighbourhood, considering the structural hole position of nodes, and incorporating the differences in interactions between nodes. This method addresses the limitations of existing gravity centrality methods and strengthens the ability to identify important nodes in networks with clear community structures. Therefore, this approach demonstrates a high level of comprehensive performance. We conducted an analysis of the SIR dynamic propagation process in 10 real networks to compare the performance of HVGC with previous state-of-the-art methods. The results, as shown in Table 4, indicate the strong competitiveness of our method.
In certain scenarios, it is necessary to identify the top-k influential nodes for controlling information propagation. Therefore, in addition to evaluating the different ranking methods for individual nodes, we also assessed their performance in identifying the top-k influential spreaders. In other words, we compared the ranked lists of node influence obtained from the ranking methods with the ranked lists of node influence obtained from the SIR simulation, both sorted in descending order. Subsequently, we analysed the similarity between the two lists by considering the top-k nodes. Figure 5 illustrates the results of the Jaccard coefficient for identifying the top-k influential spreaders, ranging from 5 to 100 with a step size of 5. The X -axis shows the number of top influential spreaders, and the Y -axis shows the Jaccard similarity coefficients.
We can observe that, except for the Sex, Power, PB, and Router networks, HVGC exhibits the best and most stable overall performance in identifying the top-k influential spreaders in other networks. Specifically, across all networks, as the number of selected top-k nodes increases, HVGC consistently maintains a high-level or steadily increasing Jaccard coefficient, while other methods display varying degrees of fluctuations. Furthermore, we provide detailed plots for the top-25 nodes, revealing that HVGC consistently ranks among the top three in identifying the top-25 influential spreaders and, in some cases, even secures the first position, except for the Sex network. Therefore, we can conclude that HVGC not only accurately ranks the influence of all nodes in the network but also successfully identifies the top-k nodes with the highest impact.
After applying monotonicity25, we assessed the resolution of various algorithms. Table 5 illustrates that HVGC and MCGM demonstrate similar performance in terms of monotonicity. However, HVGC excels in the majority of networks by solely considering the first-order neighbour information of nodes, whereas MCGM, even with the inclusion of second-order neighbour information, does not necessarily outperform HVGC and incurs higher computational complexity. Furthermore, HVGC demonstrates significantly better performance in identifying important nodes in networks with community structure compared to MCGM. Therefore, overall, HVGC surpasses other gravity model algorithms. Based on the results presented in Table 5, HVGC consistently ranks either at the top or very close to the best-performing algorithm in terms of monotonicity.
Based on the above discussion, it is evident that centrality based on the gravitational model is more accurate than classical centrality. However, many of these models tend to identify false core nodes in the network and do not take into account the influence of neighbouring nodes. In our proposed HVGC (H-index-based Gravity Centrality), we address this limitation by comprehensively considering the overall impact of a node's neighbours and its position within the network's structural holes. This approach effectively overcomes the drawbacks of gravity-based methods and demonstrates superior performance compared to other algorithms.
Despite the excellent performance exhibited by HVGC, it shares a common limitation with other gravity-based methods, namely the need to determine the optimal truncation radius \(R\). However, this disadvantage is mitigated by the fact that most real networks exhibit small-world characteristics47,59, and the optimal truncation radius is approximately linearly related to the average distance34. Furthermore, since HVGC is derived from the domain centrality method, even considering only the first-order neighbor nodes in the ten real networks studied can lead to very high performance and accurate results.
In conclusion, while HVGC demonstrates better overall performance compared to other gravity-based methods and introduces improvements to existing gravity models, there are still areas that require further refinement. For example, the current approach does not consider the influence of weight factors associated with different indicators. Instead, it directly operates on the indicator values of the nodes. The weights of HV and the structural hole constraint coefficient \(c(i)\) in the computation process may affect the accuracy of the algorithm. In networks with clear community structures, a higher weight for \(c(i)\) may lead to better performance, while in other types of networks, a lower weight may yield better results. Therefore, future work may involve incorporating adjustable parameters to balance the weights of different indicators, which is a direction for further exploration. Additionally, these algorithms have not been evaluated in weighted networks, where the impact of the path from node \(i\) to node \(j\) may differ from that of the path from node \(j\) to node \(i\), and the link heterogeneity60 in a weighted network may result in varying node impact. Lastly, future research may involve incorporating adjustable parameters to modify the interplay of gravitational forces among nodes and balance the weights of different metrics in order to improve the performance of the algorithm.
Data availability
All relevant data are available at https://github.com/MLIF/Network-Data.
References
Strogatz, S. H. Exploring complex networks. Nature 410, 268–276 (2001).
Newman, M. E. J. The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003).
Barabási, A.-L., Gulbahce, N. & Loscalzo, J. Network medicine: A network-based approach to human disease. Nat. Rev. Genet. 12, 56–68 (2011).
Zhu, P., Zhi, Q., Guo, Y. & Wang, Z. Analysis of epidemic spreading process in adaptive networks. IEEE Trans. Circuits Syst. II Express Briefs 66, 1252–1256 (2019).
Yao, S., Fan, N. & Hu, J. Modeling the spread of infectious diseases through influence maximization. Optim. Lett. 16, 1563–1586 (2022).
Albert, R., Albert, I. & Nakarado, G. L. Structural vulnerability of the North American power grid. Phys. Rev. E 69, 025103 (2004).
Hosni, A. I. E., Li, K. & Ahmad, S. Minimizing rumor influence in multiplex online social networks based on human individual and social behaviors. Inf. Sci. 512, 1458–1480 (2020).
Ahmed, W., Vidal-Alaball, J., Downing, J. & Seguí, F. L. COVID-19 and the 5G conspiracy theory: Social network analysis of Twitter data. J. Med. Internet Res. 22, e19458 (2020).
Xu, W. et al. Identifying structural hole spanners to maximally block information propagation. Inf. Sci. 505, 100–126 (2019).
Csermely, P., Korcsmáros, T., Kiss, H. J. M., London, G. & Nussinov, R. Structure and dynamics of molecular networks: A novel paradigm of drug discovery: A comprehensive review. Pharmacol. Ther. 138, 333–408 (2013).
Liu, Y., Wang, X. & Kurths, J. Framework of evolutionary algorithm for investigation of influential nodes in complex networks. IEEE Trans. Evol. Comput. 23, 1049–1063 (2019).
Morone, F. & Makse, H. A. Influence maximization in complex networks through optimal percolation. Nature 524, 65–68 (2015).
Sui, L. et al. The fractal description model of rock fracture networks characterization. Chaos Solitons Fractals 129, 71–76 (2019).
Huang, Y., Dong, H., Zhang, W. & Lu, J. Stability analysis of nonlinear oscillator networks based on the mechanism of cascading failures. Chaos Solitons Fractals 128, 5–15 (2019).
Zhao, J. & Deng, Y. Complex network modeling of evidence theory. IEEE Trans. Fuzzy Syst. 29, 3470–3480 (2021).
Namtirtha, A., Dutta, A. & Dutta, B. Weighted kshell degree neighborhood: A new method for identifying the influential spreaders from a variety of complex network connectivity structures. Expert Syst. Appl. 139, 112859 (2020).
Bonacich, P. Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol. 2, 113–120 (1972).
Lü, L., Zhou, T., Zhang, Q.-M. & Stanley, H. E. The H-index of a network node and its relation to degree and coreness. Nat. Commun. 7, 10168 (2016).
Zareie, A. & Sheikhahmadi, A. EHC: Extended H-index Centrality measure for identification of users’ spreading influence in complex networks. Phys. Stat. Mech. Appl. 514, 141–155 (2019).
Xu, G.-Q., Meng, L., Tu, D.-Q. & Yang, P.-L. LCH: A local clustering H-index centrality measure for identifying and ranking influential nodes in complex networks. Chin. Phys. B 30, 088901 (2021).
Freeman, L. C. Centrality in social networks conceptual clarification in Hawaii Nets conferences. Cent. Soc. Netw. Concept. Clarification Hawaii Nets Conf. 1, 215–239 (1979).
Freeman, L. C. A set of measures of centrality based on betweenness. Sociometry 40, 35–41 (1977).
Lü, C. et al Identifying Influential Nodes in Complex Networks.pdf (2012).
Kitsak, M. et al. Identification of influential spreaders in complex networks. Nat. Phys. 6, 888–893 (2010).
Bae, J. & Kim, S. Identifying and ranking influential spreaders in complex networks by neighborhood coreness. Phys. Stat. Mech. Appl. 395, 549–559 (2014).
Li, C., Wang, L., Sun, S. & Xia, C. Identification of influential spreaders based on classified neighbors in real-world complex networks. Appl. Math. Comput. 320, 512–523 (2018).
Wang, Z., Zhao, Y., Xi, J. & Du, C. Fast ranking influential nodes in complex networks using a k-shell iteration factor. Phys. Stat. Mech. Appl. 461, 171–181 (2016).
Zeng, A. & Zhang, C.-J. Ranking spreaders by decomposing complex networks. Phys. Lett. A 377, 1031–1035 (2013).
Qiu, L., Zhang, J. & Tian, X. Ranking influential nodes in complex networks based on local and global structures. Appl. Intell. 51, 4394–4407 (2021).
Ullah, A. et al. Identifying vital nodes from local and global perspectives in complex networks. Expert Syst. Appl. 186, 115778 (2021).
Yang, Y.-Z., Hu, M. & Huang, T.-Y. Influential nodes identification in complex networks based on global and local information. Chin. Phys. B 29, 088903 (2020).
Ma, L., Ma, C., Zhang, H.-F. & Wang, B.-H. Identifying influential spreaders in complex networks based on gravity formula. Phys. Stat. Mech. Appl. 451, 205–212 (2016).
Wang, J., Li, C. & Xia, C. Improved centrality indicators to characterize the nodal spreading capability in complex networks. Appl. Math. Comput. 334, 388–400 (2018).
Li, Z. et al. Identifying influential spreaders by gravity model. Sci. Rep. 9, 8387 (2019).
Li, H., Shang, Q. & Deng, Y. A generalized gravity model for influential spreaders identification in complex networks. Chaos Solitons Fractals 143, 110456 (2021).
Yang, X. & Xiao, F. An improved gravity model to identify influential nodes in complex networks based on k-shell method. Knowl. Based Syst. 227, 107198 (2021).
Li, Z. & Huang, X. Identifying influential spreaders in complex networks by an improved gravity model. Sci. Rep. 11, 22194 (2021).
Li, Z. & Huang, X. Identifying influential spreaders by gravity model considering multi-characteristics of nodes. Sci. Rep. 12, 9879 (2022).
Liu, Y., Cheng, Z., Li, X. & Wang, Z. An entropy-based gravity model for influential spreaders identification in complex networks. Complexity 2023, e6985650 (2023).
Hethcote, H. W. The mathematics of infectious diseases. SIAM Rev. 42, 599–653 (2000).
Castellano, C. et al. Thresholds for epidemic spreading in networks. Phys. Rev. Lett. 105, 218701. https://doi.org/10.1103/PhysRevLett.105.218701 (2010).
Kendall, M. G. A new measure of rank correlation. Biometrika 30, 81–93 (1938).
Zareie, A., Sheikhahmadi, A., Jalili, M. & Fasaei, M. S. K. Finding influential nodes in social networks based on neighborhood correlation coefficient. Knowl. Based Syst. 194, 105580 (2020).
Liu, Y., Tang, M., Zhou, T. & Do, Y. Core-like groups result in invalidation of identifying super-spreader by k-shell decomposition. Sci. Rep. 5, 9602 (2015).
Lu, M. Node importance evaluation based on neighborhood structure hole and improved TOPSIS. Comput. Netw. 178, 107336 (2020).
Pajek Datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/.
Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998).
Guimerà, R., Danon, L., Díaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003).
Spring, N., Mahajan, R., Wetherall, D. & Anderson, T. Measuring ISP topologies with Rocketfuel. IEEEACM Trans. Netw. 12, 2–16 (2004).
Gleiser, P. M. & Danon, L. Community structure in jazz. Adv. Complex Syst. 06, 565–573 (2003).
Newman, M. E. J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006).
Leskovec, J. & Mcauley, J. Learning to discover social circles in ego networks. In Advances in Neural Information Processing Systems. Vol. 25 (Curran Associates, Inc., 2012).
Adamic, L. A. & Glance, N. The political blogosphere and the 2004 U.S. election: Divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery. 36–43 (Association for Computing Machinery, 2005). https://doi.org/10.1145/1134271.1134277.
Leskovec, J., Huttenlocher, D. & Kleinberg, J. Predicting positive and negative links in online social networks. In Proceedings of the 19th International Conference on World Wide Web. 641–650 (Association for Computing Machinery, 2010). https://doi.org/10.1145/1772690.1772756.
Rocha, L. E. C., Liljeros, F. & Holme, P. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLOS Comput. Biol. 7, e1001109 (2011).
Newman, M. E. J. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701 (2002).
Hu, H.-B. & Wang, X.-F. Unified index to quantifying heterogeneity of complex networks. Phys. Stat. Mech. Appl. 387, 3769–3780 (2008).
Castellano, C. & Pastor-Satorras, R. Thresholds for epidemic spreading in networks. Phys. Rev. Lett. 105, 218701 (2010).
Amaral, L. A. N., Scala, A., Barthélémy, M. & Stanley, H. E. Classes of small-world networks. Proc. Natl. Acad. Sci. 97, 11149–11152 (2000).
Bellingeri, M., Bevacqua, D., Scotognella, F. & Cassi, D. The heterogeneity in link weights may decrease the robustness of real-world complex weighted networks. Sci. Rep. 9, 10692 (2019).
Author information
Authors and Affiliations
Contributions
S.Z. devised the research project. S.Z. performed the research. S.Z. and J.Z. analyzed the data. S.Z. and X.L. wrote the paper.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhu, S., Zhan, J. & Li, X. Identifying influential nodes in complex networks using a gravity model based on the H-index method. Sci Rep 13, 16404 (2023). https://doi.org/10.1038/s41598-023-43585-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-023-43585-x
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.