A Hybrid Seed Node Selection and No-Retracing Random Walk in Page Rank Algorithm

Document Type : Original Article

Authors

Department of Computer Engineering, Karaj Branch, Islamic Azad University, Karaj, Iran

Abstract

The random walk technique, which has a reputation for excellent performance, is one method for complex networks sampling. However, reducing the input data size is still a considerable topic to increase the efficiency and speed of this algorithm. The two approaches discussed in this paper, the no-retracing and the seed node selection algorithms, inspired the development of random walk technique. The Google PageRank method is integrated with these different approaches. Input data size is decreased while critical nodes are preserved. A real database was used for this sampling. Significant sample characteristics were also covered, including average clustering coefficient, sampling effectiveness, degree distribution, and average degree. The no-retracing method, for example, performs better. The efficiency increases even further when the no-retracing technique is combined with the Google PageRank. When choosing between public transportation and aircraft, for example, these algorithms might be used since time is crucial. Additionally, these algorithms are more energy-efficient methods that were looked at.

Keywords


  • Lusinchi, "“President” Landon and the 1936 Literary Digest Poll: were automobile and telephone owners to blame?," Social Science History, vol. 36, no. 1, pp. 23-54, 2012.
  • L. Lohr and J. M. Brick, "Roosevelt predicted to win: Revisiting the 1936 Literary Digest poll," Statistics, Politics and Policy, vol. 8, no. 1, pp. 65-84, 2017.
  • ScottArmstrong, "Why the 1936 literary digest poll failed: Peverill Squire, Public Opinion Quarterly 52 (1988) 125-133," International Journal of Forecasting, vol. 5, no. 2, pp. 295-295, 1989.
  • Etikan and K. Bala, "Sampling and sampling methods," Biometrics & Biostatistics International Journal, vol. 5, no. 6, p. 00149, 2017.
  • Taherdoost, "Sampling methods in research methodology; how to choose a sampling technique for research," International Journal of Academic Research in Management (IJARM), vol. 5, no. 2, pp. 18-25, 2016.
  • Zeadally, G. Martinez, and H.-C. Chao, "Securing cyberspace in the 21st century," Computer, vol. 46, no. 04, pp. 22-23, 2013.
  • S. Jalali, A. Rezvanian, and M. R. Meybodi, "Social network sampling using spanning trees," International Journal of Modern Physics C, vol. 27, no. 05, p. 1650052, 2016.
  • Fottovat, H. Izadkhah, and J. Hajipour, "Community Detection in Social Networks Considering the Depth of Relationships," in 2022 8th International Conference on Web Research (ICWR), IEEE, 2022, pp. 24-28.
  • Frank, "Network Sampling," in International Encyclopedia of Statistical Science, M. Lovric Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 941-942.
  • Xie, S. Chang, Z. Zhang, M. Zhang, and L. Yang, "Efficient sampling of complex network with modified random walk strategies," Physica A: Statistical Mechanics and its Applications, vol. 492, pp. 57-64, 2018.
  • Hajarian, A. Bastanfard, J. Mohammadzadeh, and M. Khalilian, "SNEFL: Social network explicit fuzzy like dataset and its application for Incel detection," Multimedia tools and applications, vol. 78, no. 23, pp. 33457-33486, 2019.
  • L. Peterson and B. S. Davie, Computer networks: a systems approach. Elsevier, 2007.
  • R. Rohani and A. Bastanfard, "Algorithm for persian text sentiment analysis in correspondences on an e-learning social website," Journal of Research in Science, Engineering and Technology, vol. 4, no. 01, pp. 11-15, 2016.
  • Hajarian, A. Bastanfard, J. Mohammadzadeh, and M. Khalilian, "Introducing fuzzy like in social networks and its effects on advertising profits and human behavior," Computers in Human Behavior, vol. 77, pp. 282-293, 2017.
  • Dolan, J. Conduit, C. Frethey-Bentham, J. Fahy, and S. Goodman, "Social media engagement behavior: A framework for engaging customers through social media content," European Journal of Marketing, vol. 53, no. 10, pp. 2213-2243, 2019.
  • Hajarian, A. Bastanfard, J. Mohammadzadeh, and M. Khalilian, "A personalized gamification method for increasing user engagement in social networks," Social Network Analysis and Mining, vol. 9, no. 1, pp. 1-14, 2019.
  • J. Watts and S. H. Strogatz, "Collective dynamics of ‘small-world’networks," nature, vol. 393, no. 6684, pp. 440-442, 1998.
  • A.-L. Barabási and R. Albert, "Emergence of scaling in random networks," science, vol. 286, no. 5439, pp. 509-512, 1999.
  • Erdos and A. Rényi, "On the evolution of random graphs," Publ. Math. Inst. Hung. Acad. Sci, vol. 5, no. 1, pp. 17-60, 1960.
  • N. Gilbert, "Random graphs," The Annals of Mathematical Statistics, vol. 30, no. 4, pp. 1141-1144, 1959.
  • Hughes, SAGE Internet Research Methods: Policy, Professionalism and Change (SAGE Internet Research Methods). London: Sage Publications Ltd, 2012.
  • K. Thompson, "Adaptive web sampling," Biometrics, vol. 62, no. 4, pp. 1224-1234, 2006.
  • P. Stumpf and C. Wiuf, "Sampling properties of random graphs: the degree distribution," Physical Review E, vol. 72, no. 3, p. 036118, 2005.
  • K. Moghadam and A. Bastanfard, "Improving Random Walk Sampling, Inspired by Two Methods of Choosing Seed Node And No-Retracing With Combination of them with Page Rank Algorithm," in 2022 8th International Conference on Web Research (ICWR), IEEE, 2022, pp. 195-202.
  • Pearson, "The problem of the random walk," Nature, vol. 72, no. 1865, pp. 294-294, 1905.
  • J. Geyer, "Practical markov chain monte carlo," Statistical science, vol. 7, no. 4, pp. 473-483, 1992.
  • L. Jones and Q. Qin, "Markov chain Monte Carlo in practice," Annual Review of Statistics and Its Application, vol. 9, pp. 557-578, 2022.
  • S. Myers, L. Wallin, and P. Wikström, "An introduction to Markov chains and their applications within finance," MVE220 Financial Risk: Reading Project, 2017.
  • Behrends, Introduction to Markov chains. Springer, 2000.
  • W. Stroock, An introduction to Markov processes. Springer Science & Business Media, 2013.
  • C. Chan, C. Lenard, and T. Mills, "An introduction to Markov chains," in Proceedings of the 49th Annual Conference of Mathematical Association of Victoria, J. Cheeseman, Ed., 2012: Mathematical Association of Victoria, in It's my maths: personalised mathematics learning, pp. 40-47.
  • Ghahramani, Fundamentals of probability: with stochastic processes. Chapman and Hall/CRC, 2018.
  • Sharma, "Pros and cons of different sampling techniques," International journal of applied research, vol. 3, no. 7, pp. 749-752, 2017.
  • M. Adam, "Sample size determination in survey research," Journal of Scientific Research and Reports, vol. 26, no. 5, pp. 90-97, 2020.
  • Phillips, "Sample size and power: What is enough?," in Seminars in Orthodontics, 2002, vol. 8, no. 2: Elsevier, pp. 67-76.
  • Papagelis, G. Das, and N. Koudas, "Sampling online social networks," IEEE Transactions on knowledge and data engineering, vol. 25, no. 3, pp. 662-676, 2011.
  • Rodero-Merino, A. F. Anta, L. López, and V. Cholvi, "Performance of random walks in one-hop replication networks," Computer Networks, vol. 54, no. 5, pp. 781-796, 2010.
  • Page, S. Brin, R. Motwani, and T. Winograd, "The PageRank citation ranking: Bringing order to the web," Stanford InfoLab, 1999.
  • Berkhin, "A survey on PageRank computing," Internet mathematics, vol. 2, no. 1, pp. 73-120, 2005.
  • Perc, "The Matthew effect in empirical data," Journal of The Royal Society Interface, vol. 11, no. 98, p. 20140378, 2014.
  • J. Jansen and A. Spink, "How are we searching the World Wide Web? A comparison of nine search engine transaction logs," Information processing & management, vol. 42, no. 1, pp. 248-263, 2006.
  • Spink, Web search engine research. Emerald Group Publishing, 2012.
  • Wang, F. Ma, and B. Yao, "Arbitrary degree distribution networks with perturbations," AIP Advances, vol. 11, no. 2, p. 025301, 2021.
  • N. Dorogovtsev and J. F. Mendes, "Evolution of networks," Advances in physics, vol. 51, no. 4, pp. 1079-1187, 2002.
  • Yang, X. Huang, X. Cai, X. Zhu, and L. Lu, "ILSR rumor spreading model with degree in complex network," Physica A: Statistical Mechanics and Its Applications, vol. 531, p. 121807, 2019.
  • Ayyappan, C. Nalini, and A. Kumaravel, "A study on SNA: measure average degree and average weighted degree of knowledge diffusion in Gephi," Indian Journal of Computer Science and Engineering, vol. 7, no. 6, pp. 230-237, 2016.
  • Pan, G. Xu, B. Wang, and T. Zhang, "A novel community detection algorithm based on local similarity of clustering coefficient in social networks," IEEE Access, vol. 7, pp. 121586-121598, 2019.
  • Said, R. A. Abbasi, O. Maqbool, A. Daud, and N. R. Aljohani, "CC-GA: A clustering coefficient based genetic algorithm for detecting communities in social networks," Applied Soft Computing, vol. 63, pp. 59-70, 2018.
  • Tong, Y. Lian, J. Niu, Z. Xie, and Y. Zhang, "A novel green algorithm for sampling complex networks," Journal of Network and Computer Applications, vol. 59, pp. 55-62, 2016.
  • Fang, J. Yin, and X. Zhu, "Active exploration for large graphs," Data mining and knowledge discovery, vol. 30, no. 3, pp. 511-549, 2016.
  • Rezvanian and M. R. Meybodi, "Sampling social networks using shortest paths," Physica A: Statistical Mechanics and its Applications, vol. 424, pp. 254-268, 2015.
  • Batagelj and A. Mrvar. Pajek datasets. [Online]. Available: http://vlado.fmf.uni-lj.si/pub/networks/data/
  • Rossi and N. Ahmed, "The network data repository with interactive graph analytics and visualization," in Twenty-ninth AAAI conference on artificial intelligence, USA, AAAI Press, 2015, vol. 29, no. 1, pp. 4292-4293.
  •