Herald of Dagestan State Technical University. Technical Sciences

APPROACH OF PROCESSING, CLASSIFICATION AND DETECTION OF NEW CLASSES AND ANOMALIES IN HETEROGENEOUS AND DIFFERENT STREAMS OF DATA

https://doi.org/10.21822/2073-6185-2018-45-3-85-93

Abstract

Objectives. The aim of the study is to find effective methods and approaches for processing heterogeneous data streams and for handling the problems of infinite length, concept evolution and concept drift. A heterogeneous data stream can be of infinite length and may contain structured or unstructured data. Processing a heterogeneous, multi-scale data stream remains a major challenge for researchers. Most existing research focuses on the problems of infinite length and concept drift.

Method. Strategies for detecting new classes are either parametric or non-parametric. This work takes a non-parametric approach. The classifier is an ensemble of three models. The stream is split into fragments, and the split yields a different number of classes in each fragment. Classes are computed by applying K-medoids clustering to each fragment. K-medoids clustering is well suited to data sets that contain anomalies.
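The clustering step named in the Method paragraph can be illustrated with a minimal K-medoids sketch. It is not the author's implementation: the function names, the attribute-mismatch dissimilarity, the random initialisation and the sample fragment are all assumptions made for illustration.

```python
import random


def mismatch(a, b):
    """Assumed dissimilarity: number of attribute positions that differ."""
    return sum(x != y for x, y in zip(a, b))


def assign(records, medoids, dissimilarity):
    """Attach every record index to its least dissimilar medoid index."""
    clusters = {m: [] for m in medoids}
    for i, rec in enumerate(records):
        nearest = min(medoids, key=lambda m: dissimilarity(rec, records[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(records, k, dissimilarity=mismatch, max_iter=20, seed=0):
    """Cluster one fragment; returns {medoid index: [member indices]}."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(records)), k)  # hypothetical initialisation
    for _ in range(max_iter):
        clusters = assign(records, medoids, dissimilarity)
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # can happen with duplicate records; keep old medoid
                new_medoids.append(m)
                continue

            def total_cost(c):
                return sum(dissimilarity(records[c], records[j]) for j in members)

            # The new medoid is the member with the smallest total dissimilarity
            # to the other members of its cluster.
            new_medoids.append(min(members, key=total_cost))
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return assign(records, medoids, dissimilarity)


# Hypothetical fragment of categorical records.
fragment = [
    ("a", "x", "1"), ("a", "x", "2"), ("a", "y", "1"),
    ("b", "z", "9"), ("b", "z", "8"), ("b", "w", "9"),
]
clusters = k_medoids(fragment, k=2)
```

Because each medoid is an actual record from the fragment rather than an average, a single extreme record shifts the cluster representative less than it would shift a mean, which is the usual reason K-medoids is preferred for data containing anomalies.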

Result. The developed algorithm can process heterogeneous and multi-scale data. Each instance present in the model belongs to exactly one class. Experiments were performed on four samples of stream data of 2000 rows each. After pre-processing, multi-valued attributes were found in the data set.

Conclusion. This paper presents an effective approach to processing heterogeneous data streams and handling the problems of infinite length, concept evolution and concept drift. The approach relies on a string-matching parameter rather than a distance measure to address the four problems of data streams. The false-positive rate of the developed algorithm is low enough to be considered insignificant. The approach does not misclassify an instance of a new class as an existing class, and it can effectively handle feature evolution.
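The idea of using string matching instead of a distance can be sketched as follows. The abstract does not specify the matching parameter, the model structure or the decision rule, so everything here is a hypothetical illustration: `match_score`, the `threshold` value, the representation of a model as a mapping from class label to representative records, and the rule that an instance rejected by all three models is a candidate new class.

```python
from collections import Counter


def match_score(a, b):
    """Assumed matching parameter: fraction of attributes with equal string values."""
    return sum(str(x) == str(y) for x, y in zip(a, b)) / len(a)


def classify(instance, ensemble, threshold=0.5):
    """Return a class label, or None if no model recognises the instance.

    `ensemble` is a list of models; each model maps a class label to a list
    of representative records (for example, cluster medoids). Both the
    structure and the threshold are assumptions, not taken from the paper.
    """
    votes = []
    for model in ensemble:
        candidates = (
            (label, match_score(instance, rep))
            for label, reps in model.items()
            for rep in reps
        )
        label, score = max(candidates, key=lambda pair: pair[1])
        if score >= threshold:
            votes.append(label)
    if not votes:
        return None  # rejected by every model: candidate new class or anomaly
    return Counter(votes).most_common(1)[0][0]  # majority vote among acceptors


# Hypothetical ensemble of three models built from earlier fragments.
model = {"warm": [("high", "low", "dry")], "cold": [("low", "high", "wet")]}
ensemble = [model, model, model]

known = classify(("high", "low", "wet"), ensemble)     # matches "warm" on 2 of 3
unknown = classify(("none", "none", "none"), ensemble)  # matches nothing
```

A match fraction works directly on mixed categorical and textual attributes, where a numeric distance would first require an encoding step; that is one plausible motivation for the choice, although the abstract does not state it.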

About the Author

R. A. Bagutdinov
Tomsk Polytechnic University.
Russian Federation

30 Lenina Ave., Tomsk 634050.

Ravil A. Bagutdinov, Assistant, Department of Automation and Robotics, School of Information Technology and Robotics.




For citations:


Bagutdinov R.A. APPROACH OF PROCESSING, CLASSIFICATION AND DETECTION OF NEW CLASSES AND ANOMALIES IN HETEROGENIOUS AND DIFFERENT STREAMS OF DATA. Herald of Dagestan State Technical University. Technical Sciences. 2018;45(3):85-93. (In Russ.) https://doi.org/10.21822/2073-6185-2018-45-3-85-93



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2073-6185 (Print)
ISSN 2542-095X (Online)