Development of a software and hardware solution to identify trends in demand for goods
https://doi.org/10.21822/2073-6185-2023-50-1-114-122
Abstract
Objective. The aim of the research is to develop a software solution that identifies trends in demand for consumer goods by analyzing big data.
Method. To achieve this goal, the study analyzes the current state of the online retail market in Russia, as well as the big data analysis technologies and tools needed to design a software and hardware solution. The effectiveness of the resulting data processing model is evaluated on a sample obtained from open sources.
Result. The study produced a technical solution that analyzes the demand for goods over a given time range using data from open sources.
Conclusion. A software component has been developed to analyze the demand for consumer goods based on order data. The resulting technical solution supports batch processing of data, and the architecture of its infrastructure component allows distributed computing. Testing the tool on a real sample showed that this approach to analyzing consumer demand trends is effective.
About the Authors
A. I. Miftahova
Russian Federation
Albina I. Miftahova, Student
49, lit. A, Kronverksky Ave., St. Petersburg 197101
E. I. Yangirov
Russian Federation
Emil I. Yangirov, Student
49, lit. A, Kronverksky Ave., St. Petersburg 197101
E. I. Karaseva
Russian Federation
Ekaterina I. Karaseva, Cand. Sci. (Econ.), Assoc. Prof.
49, lit. A, Kronverksky Ave., St. Petersburg 197101
A. I. Yangirov
Russian Federation
Adil I. Yangirov, Head of the sector of functional testing of engineering and technical means of protection of the department of technical expertise and functional tests
12B Reutovskaya Str., Moscow 111539
E. Yu. Nikulina
Russian Federation
Ekaterina Yu. Nikulina, Cand. Sci. (Eng.), Assoc. Prof., Department of Automated Information Systems of the Department of Internal Affairs
53 Patriotov Ave., Voronezh 394065
I. G. Drovnikova
Russian Federation
Irina G. Drovnikova, Dr. Sci. (Eng.), Prof., Assoc. Prof., Department of Automated Information Systems of Internal Affairs Bodies
53 Patriotov Ave., Voronezh 394065
Review
For citations:
Miftahova A.I., Yangirov E.I., Karaseva E.I., Yangirov A.I., Nikulina E.Yu., Drovnikova I.G. Development of a software and hardware solution to identify trends in demand for goods. Herald of Dagestan State Technical University. Technical Sciences. 2023;50(1):114-122. (In Russ.) https://doi.org/10.21822/2073-6185-2023-50-1-114-122