Methods of building a model of user behavior

Шаховська, Н. Б.; Мельникова, Н. І.; Shakhovska, N. B.; Melnykova, N. I.

dc.citation.epage	51
dc.citation.issue	1
dc.citation.journalTitle	Український журнал інформаційних технологій
dc.citation.spage	43
dc.citation.volume	2
dc.contributor.affiliation	Національний університет “Львівська політехніка”
dc.contributor.affiliation	Lviv Polytechnic National University
dc.contributor.author	Шаховська, Н. Б.
dc.contributor.author	Мельникова, Н. І.
dc.contributor.author	Shakhovska, N. B.
dc.contributor.author	Melnykova, N. I.
dc.coverage.placename	Львів
dc.coverage.placename	Lviv
dc.date.accessioned	2022-05-24T11:10:11Z
dc.date.available	2022-05-24T11:10:11Z
dc.date.created	2020-09-23
dc.date.issued	2020-09-23
dc.description.abstract	Methods for building a model of user behaviour are presented that make it possible to identify patterns in how friends plan meetings, based on an analysis of their daily movement. To this end, a number of data clustering methods and algorithms were first analysed and the particular features of their application were identified. It was found that the main advantages of density-based clustering methods are the ability to detect clusters of arbitrary shape and varying size and their robustness to noise and outliers. Their drawbacks, however, include high sensitivity to the choice of input parameters, fuzzy class descriptions, and unsuitability for clustering large data sets. It was found that the main problem of all clustering algorithms is their scalability as the volume of processed data grows. It was established that the main problems of most of them are the difficulty of tuning optimal input parameters (for density-, grid-, or model-based algorithms), the identification of clusters of different shapes and densities (partitioning and grid-based algorithms), and unclear termination criteria (hierarchical, partitioning, and model-based algorithms). Since clustering is only one stage of data processing in the system as a whole, the chosen algorithm should be simple to use and its input parameters simple to configure. Studies show that hierarchical clustering methods include a number of algorithms suitable both for processing small volumes of data and for analysing big data, which is relevant in the field of social networks. Based on the data analysis performed, information was collected to populate a smart user profile. Considerable attention is paid to the study of association rules, on the basis of which an algorithm for extracting association rules is proposed; it makes it possible to find statistically significant rules and to search only for dependencies determined by the common set of input data, but it has high computational complexity when there are many classification rules.
An approach has been developed that is oriented toward creating and understanding models of user behaviour and predicting future behaviour using the created template. Methods for modelling data pre-processing (clustering) were investigated, and patterns in how friends plan meetings were identified based on an analysis of the daily movement of people and their friends. Methods for creating and understanding models of user behaviour are presented; the k-means algorithm was applied to group users, which made it possible to determine how well each object fits within its cluster. The concept of association rules is introduced, a method for finding dependencies is developed, and the accuracy of the model is evaluated.
dc.description.abstract	A number of clustering methods and algorithms were analysed and the peculiarities of their application were singled out. The main advantages of density-based clustering methods are the ability to detect free-form clusters of different sizes and robustness to noise and outliers; the disadvantages include high sensitivity to input parameters, fuzzy class descriptions and unsuitability for large data sets. The analysis showed that the main problem of all clustering algorithms is their scalability as the amount of processed data increases. The main problems of most of them are the difficulty of setting optimal input parameters (for density-, grid- or model-based algorithms), the identification of clusters of different shapes and densities (partitioning and grid-based algorithms), and unclear termination criteria (hierarchical, partitioning and model-based algorithms). Since the clustering procedure is only one of the stages of data processing in the system as a whole, the chosen algorithm should be easy to use and its input parameters easy to configure. Research shows that hierarchical clustering methods include a number of algorithms suitable for both small-scale data processing and large-scale data analysis, which is relevant in the field of social networks. Based on the data analysis, information was collected to fill a smart user profile. Much attention is paid to the study of association rules, on the basis of which an algorithm for extracting association rules is proposed. The algorithm makes it possible to find statistically significant rules and to look only for dependencies defined by the common set of input data, but it has high computational complexity when there are many classification rules. An approach has been developed that focuses on creating and understanding models of user behaviour and on predicting future behaviour using the created template.
Methods for modelling data pre-processing (clustering) were investigated, and patterns in how friends plan meetings were identified from an analysis of the daily movement of people and their friends. Methods of creating and understanding models of user behaviour were presented. The k-means algorithm was used to group users, which made it possible to determine how well each object fits in its own cluster. The concept of association rules was introduced, a method for finding dependencies was developed, and the accuracy of the model was evaluated.
dc.format.extent	43-51
dc.format.pages	9
dc.identifier.citation	Шаховська Н. Б. Методи побудови моделі поведінки користувачів / Н. Б. Шаховська, Н. І. Мельникова // Український журнал інформаційних технологій. — Львів : Видавництво Львівської політехніки, 2020. — Том 2. — № 1. — С. 43–51.
dc.identifier.citationen	Shakhovska N. B. Methods of building a model of user behavior / N. B. Shakhovska, N. I. Melnykova // Ukrainian Journal of Information Technology. — Lviv : Vydavnytstvo Lvivskoi politekhniky, 2020. — Vol 2. — No 1. — P. 43–51.
dc.identifier.uri	https://ena.lpnu.ua/handle/ntb/56901
dc.language.iso	uk
dc.publisher	Видавництво Львівської політехніки
dc.relation.ispartof	Український журнал інформаційних технологій, 1 (2), 2020
dc.relation.ispartof	Ukrainian Journal of Information Technology, 1 (2), 2020
dc.rights.holder	© Національний університет “Львівська політехніка”, 2020
dc.subject	вибірка шаблонів
dc.subject	послідовний асоціативний аналіз
dc.subject	кластеризація
dc.subject	pattern sampling
dc.subject	sequential associative analysis
dc.subject	clustering
dc.title	Методи побудови моделі поведінки користувачів
dc.title.alternative	Methods of building a model of user behavior
dc.type	Article

Collections

Ukrainian Journal of Information Technology. – 2020. – Vol. 2, No. 1