« Home « Kết quả tìm kiếm

Influence maximization algorithm based on Gaussian propagation model


Tóm tắt Xem thử

- Influence maximization algorithm based on Gaussian propagation model.
- Social network Influence maximization Gaussian propagation model Greedy algorithm.
- The influence of each entity in a network is a crucial index of the network information dis- semination.
- Further, the paper evaluates the effectiveness of the influence maximization algorithm based on the Gaussian propagation model supported by theoretical proofs.
- The results of the experiments demonstrate that the proposed algo- rithm shows significant improvement in both effectiveness and efficiency..
- The initial solution of influence maximization is based on the framework of the greedy algorithm [10] which requires con- siderable time to model in the Monte Carlo simulation process.
- To solve the defects of the greedy algorithm, heuristic algo- rithms [1] have been proposed.
- To solve the time consumption problem of greedy algorithms, some of the literature [13] has proposed hop-based IC and LT models.
- By modeling the multi-dimensional space of the social network, the influence of different nodes is calculated via the Gaussian propagation model.
- The influence maximization algo- rithm based on the Gaussian propagation model shows the potential of improving efficiency.
- Finally, this paper evaluates the properties of the Gaussian propagation model, theoretically underpinning the theoretical grounding of the greedy algorithm under this model.
- The Gaussian propagation model defines the parameters of the influence prop- agation environment under the social network.
- Gaussian propagation model does not require the use of the Monte Carlo model to calculate influence, which greatly improves the efficiency of the Gaussian model’s influence maximization algorithm;.
- A novel influence maximization algorithm under the Gaussian propagation model and an improved CELF algorithm to accelerate the efficiency of the greedy algorithm is presented;.
- This paper compares the efficiency and effectiveness of the Gaussian model influence maximization algorithm and tra- ditional algorithms on three different sets of social network data..
- Section 3 introduces the Gaussian propagation model and the extension of the Gaussian propagation model in social networks.
- Some properties of the objective function under the Gaussian propagation model are proved, and an improved CELF algo- rithm is proposed.
- Influence maximization.
- [11] proposed the CELF algorithm to improve the efficiency of the greedy algorithm by using the submodules of the objective function.
- But the efficiency of the greedy algorithm was still insufficient..
- Some scholars have proposed a series of heuristic algorithms to solve the problem of low efficiency of the greedy algo- rithms.
- respectively use the structural information of the network.
- The time efficiency of the heuristic algorithm is improved because it does not need to conduct the Monte Carlo simulation process.
- Influence propagation model.
- The Gaussian propagation model calculates the influence concentration in the propagation space to simulate the influence propagation process.
- The precision of the influence maximization algorithm based on the Gaussian propagation model can be proved theoretically to ensure the efficiency of the algorithm..
- Gaussian propagation model of social networks.
- The Gaussian propagation model is formulated as follows..
- h y and h z represent the propagation coefficients of pollutants in the y axis and z axis of the propagation space respectively determined by the propagation environment.
- Therefore, this paper employs the location of the social network structure, motif nat- ure, and node degree, the multi-dimensional space under social network is constructed to address this problem..
- Under the Gaussian propagation model, each node has a fixed coordinate (offset, motif, degree).
- The motif represents the measurement of nodes in the motif dimension, and degree represents the degree of the nodes..
- In this paper, the number of occurrences of a node in different three-node motifs determines the coordinate values of the nodes in the motif dimension..
- The influence intensity of the node is cal- culated as follows:.
- These two parameters repre- sent the standard deviation of the influence intensity distribution in the motif dimension and degree dimension.
- The average clustering coefficient of the network represents the clustering degree of the points in the graph, which can represent the density degree of the network..
- The diameter of the network indicates the size of the network.
- b 2 under the Gaussian propagation model of the social network are determined by the average aggregation coefficient and the network diameter.
- C and D are the average aggregation coefficient and network diameter of the network, respectively..
- In the Gaussian model, the pollutant concentration is calculated from the information of the source and atmospheric con- ditions.
- This threshold can be regarded as the acceptability of the node.
- The calculation formula of the Gaussian propagation model is as follows:.
- In social networks, it can be regarded as a condition indi- cator of the influence propagation in the entire network environment..
- Influence propagation process under the Gaussian propagation model.
- The Gaussian propagation model in the social network can simulate the influence propagation by connecting edges between nodes.
- Specifically, for the traditional Gaussian propagation model, if we can identify any coor- dinate around the source of the pollution, we can find the concentration of the pollutants at this coordinate.
- This process of finding the coordinates of the social network is equivalent to considering the connectivity between the propagation source and the affected nodes.
- affected node, then the coordinates of the offset dimension will not exist.
- Therefore, the Gaussian propagation model in the social network depends on the information of the transmission source and the transmitted point (offset, motif, degree), and it also considers the connection between nodes..
- Gaussian propagation model influence maximization algorithm.
- The influence maximization problem under the Gaussian propagation model is an NP-hard problem.
- For the objective function of the Gaussian propagation model with the above properties, a greedy algorithm can be used to solve the problem, and the optimal result is equivalent to 1 1 e.
- V is the node set of the network, and the number of nodes to be selected is k..
- Objective function under Gaussian propagation model.
- According to the definition of the Gaussian propagation model, its objective function under the influence maximization problem is:.
- The problem of influence maximization under the Gaussian propagation model is NP-hard..
- We can use the special case of the set coverage problem to prove that the influence maximization problem under the Gaussian propagation model is NP-hard..
- Each of the N nodes rep- resents an element in the background set U.
- Each of the M nodes represents a subset S j .
- Under the background of the set cov- erage problem, N nodes in a bipartite graph have at least one edge pointing to it.
- Under the Gaussian propagation model, the objective function satisfies f S ð Þ P 0.
- Under the Gaussian propagation model, the objective function f S ð Þ satisfies f S ð þ v Þ P f s ð Þ.
- According to the definition of the influence concentration function c S ð Þ, the function v c S ð Þ v is non-.
- Under the Gaussian propagation model, the objective function f S ð Þ satisfies f S ð þ v Þ f S ð Þ P f T ð þ v Þ f T ð Þ .
- Proof of Theorem 4 By the definition of the function.
- About 63 % of the optimal results can be obtained by using a greedy algorithm..
- If the condition is not satisfied, we can use the submodularity of the objective function to continue the comparison down- ward.
- The first part is the nodes that have an influence gain in the new round that is larger than that of the second-ranked node.
- the other part is the nodes that have an influence gain in the last round that is smaller than that of the second-ranked node.
- In this case, the algorithm calculates the influence gain of the nodes in the previous part, sorts them, and then selects the node with the maximum gain and puts it into the seed set.
- According to submodularity, the influence gain of the nodes in the latter part is smaller.
- Therefore, it is only necessary to calculate the influence gain of the nodes in the former part in the new round.
- Algorithm 2 Calculates the influence of the node set under the Gaussian propagation model Input: NodeSet.
- Algorithm 2 calculates the influence concentration of the remaining nodes in the network according to the seed node set and then judges whether the node can be affected or not according to the affected threshold of the inactive node..
- Algorithm 3 Gaussian propagation model influence maximization algorithm Input: k(SetSize).
- The f function in Algorithm 1 that calculates the influence of the nodes is replaced by the func- tion in Algorithm 2 that calculates influence using the Gaussian propagation model..
- The first data set is part of the DBLP data set, which contains 954 nodes.
- Reciprocal: The weight of each edge is set to be the inverse of the node degree.
- For convenience, the method proposed in this paper is referred to as GDM, and the names of the other methods have been given above..
- In general, the performance of GDM algorithm is consistent with that of HBIC algorithm, and it is better than that of the remaining six algorithms.
- Whether GDM can be adjusted according to the weight of the edges in the current network to improve the experimental performance will be further studied in future work..
- The running time of the algorithm is mainly compared with the time needed by the GDM algorithm and HBIC algorithm to select the number of seed nodes of different sizes..
- 5 that the running time of the HBIC algorithm increases significantly with the increase of the number of selected seed nodes.
- In contrast, the running time of the GDM algorithm rises only slightly, and even the selection of 50 seed nodes consumes very little time..
- 6, under the Wiki-Vote data set, the runtime of the GDM and HBIC algorithms is basically the same as under the DBLP data set.
- With the increase of the number of selected seed nodes, the running time of the HBIC algorithm increases linearly, while the running time of the GDM algorithm rises very little.
- 7, in the P2P data sets, when the number of seed nodes is less than 30, the running time of the HBIC algorithm is less than that of the GDM algorithm.
- However, after the number of seed nodes exceeds 30, it still takes more time for the HBIC algorithm to select more seed nodes, while the time consumption of the GDM algorithm increases very little.
- Through the analysis of the P2P data sets, it is found that the mean aggregation coefficient of the P2P data sets is very low, from which it can be inferred that the GDM algorithm consumes more time in networks with low mean aggrega- tion coefficient.
- The relationship between the running time of the GDM algorithm and the mean aggregation coefficient of the network needs to be further discussed in future work..
- To find out which dimension of the three dimensions in the three-dimensional space of social network has the greatest influence on the experimental results, ablation experiments are proceeded to verify which dimension is more effective..
- social network construction is used simultaneously with the three dimensions, it will be improved to a certain extent in the random and uniform configurations, while the effect of either the degree dimension or the motif dimension alone is closer to that of the reciprocal setting..
- Based on these properties, a greedy algorithm for influence maximization is proposed under the Gaussian propagation model.
- The experimental results showed that the performance of the seed nodes selected based on the Gaussian propagation model is consistent with traditional algorithms and occasionally exceeds that of tradi- tional algorithms.
- A future research direction includes examining whether each node should specifically set the affected threshold during the implementation of the influence maximization algorithm.
- Sec- ondly, the influence maximization algorithm under the Gaussian propagation model does not perform well in the Wiki-vote data set.
- A possible reason for this relates to the influence caused by the setting of the influence threshold when selecting the seed nodes.
- Richardson, Mining the network value of customers, in: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001, pp.
- Tardos, Maximizing the spread of influence through a social network, in, in: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003, pp.
- Glance, Cost-effective outbreak detection in networks, in: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007, pp

Xem thử không khả dụng, vui lòng xem tại trang nguồn
hoặc xem Tóm tắt