Synthesis of Semantic Object Recognizers
Abstract
The expansion of natural language processing (NLP) and the emergence of new tasks in the field have increased interest both in formalizing natural language processes and in developing formal design tools. One such task is identifying and recognizing objects that are linguistically present in text streams and that fulfill specific goals and intentions. This study addresses that task and develops a formal method for designing a recognizer that identifies semantic objects in natural language text streams from their linguistic traces. A formal model of a semantic object is developed, built on the concepts of behavior function, scenario, and linguistic trace. A formal model of a semantic object recognizer and a recognition function are proposed. A formal method for synthesizing the recognizer is developed, based on the algebra of regular expressions and an automaton model in the form of a transition system. A previously developed computational representation of meaning is used to compare text fragments for semantic proximity. The proposed solutions are intended primarily for detecting and preventing crimes in social networks.
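The idea of synthesizing a recognizer from a regular expression and running it as a transition system can be illustrated with a minimal sketch. It is not the authors' construction: it swaps in Brzozowski derivatives as one standard way to turn a regular expression into transitions, it assumes the linguistic trace has already been reduced to a sequence of symbols, and it leaves out the semantic proximity comparison entirely. All names (`Sym`, `step`, `recognizes`, and the symbols `contact`, `pressure`, `demand`) are hypothetical.

```python
from dataclasses import dataclass

# Regular-expression algebra over trace symbols (hypothetical names).
@dataclass(frozen=True)
class Empty:            # matches no trace at all
    pass

@dataclass(frozen=True)
class Eps:              # matches only the empty trace
    pass

@dataclass(frozen=True)
class Sym:              # matches one trace symbol
    name: str

@dataclass(frozen=True)
class Cat:              # left followed by right
    left: object
    right: object

@dataclass(frozen=True)
class Alt:              # left or right
    left: object
    right: object

@dataclass(frozen=True)
class Star:             # zero or more repetitions of inner
    inner: object

def alt(a, b):
    """Build a union, dropping Empty operands and exact duplicates."""
    if isinstance(a, Empty):
        return b
    if isinstance(b, Empty):
        return a
    return a if a == b else Alt(a, b)

def cat(a, b):
    """Build a concatenation, simplifying Empty and Eps operands."""
    if isinstance(a, Empty) or isinstance(b, Empty):
        return Empty()
    if isinstance(a, Eps):
        return b
    if isinstance(b, Eps):
        return a
    return Cat(a, b)

def nullable(r):
    """True if r accepts the empty trace, i.e. r is an accepting state."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Cat):
        return nullable(r.left) and nullable(r.right)
    return nullable(r.left) or nullable(r.right)  # Alt

def step(r, s):
    """Transition function: the state reached from state r on symbol s."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Sym):
        return Eps() if r.name == s else Empty()
    if isinstance(r, Alt):
        return alt(step(r.left, s), step(r.right, s))
    if isinstance(r, Star):
        return cat(step(r.inner, s), r)
    d = cat(step(r.left, s), r.right)  # Cat
    return alt(d, step(r.right, s)) if nullable(r.left) else d

def recognizes(scenario, trace):
    """Run the transition system over the whole trace and test acceptance."""
    state = scenario
    for s in trace:
        state = step(state, s)
    return nullable(state)

# Hypothetical scenario: contact, then any number of pressure events, then demand.
scenario = cat(Sym("contact"), cat(Star(Sym("pressure")), Sym("demand")))
```

Here each state of the transition system is itself a regular expression describing what the rest of the trace must look like, so `recognizes(scenario, ["contact", "pressure", "demand"])` accepts while a trace that starts with `pressure` is rejected.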
ISSN: 2307-8162