Query understanding via Language Models based on transformers for e-commerce

Fedor Krasnov

Abstract


Determining the user's intent from the text of a search query is one of the information extraction stages in intelligent product search systems on an e-commerce platform. Treating search queries as a collection of short text documents and user intents as classes, the author continues to study approaches to multi-class classification of short texts using models based on the transformer architecture. Training a language model on token sequences and then fine-tuning it to the subject area has recently proven effective. Inspired by this approach, the author models the class label as one of the tokens of a transformer-based language model and estimates the probability of that token appearing. This differs from the usual fine-tuning setup, in which the class probability is obtained from a linear combination of token representations passed through an activation function. One advantage of the proposed approach is that classes acquire compact vector representations (embeddings). The author experimentally examined the advantages and disadvantages of both approaches on search query text data. With optimal hyperparameters, the proposed approach reached a weighted F1-score of 96%. Working with small data sets made it possible to assess the shortcomings characteristic of language models, which will only grow with scale, and to confirm once again that language models are a forced solution under conditions of huge data sets rather than an advantageous alternative.
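The difference between the two approaches compared in the abstract can be illustrated with a toy sketch. This is not the author's implementation: the abstract does not specify the architecture, so the encoder here is only a stand-in (the mean of token embeddings rather than a transformer), the weights are random rather than trained, and every name (`VOCAB`, `LABELS`, `encode`, `classify_with_head`, `classify_with_label_token`) is hypothetical.

```python
import math
import random

random.seed(0)

DIM = 8                                    # hypothetical embedding size
VOCAB = ["red", "dress", "phone", "case"]  # toy query tokens
LABELS = ["<clothing>", "<electronics>"]   # class labels added as extra tokens


def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# One embedding table shared by query tokens and label tokens.
embedding = {tok: rand_vec(DIM) for tok in VOCAB + LABELS}


def encode(query_tokens):
    """Assumed stand-in for a transformer: mean of the token embeddings."""
    vecs = [embedding[t] for t in query_tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Approach A: a separate linear classification head followed by softmax.
head_weights = [rand_vec(DIM) for _ in LABELS]
head_bias = [0.0 for _ in LABELS]


def classify_with_head(query_tokens):
    h = encode(query_tokens)
    logits = [dot(w, h) + b for w, b in zip(head_weights, head_bias)]
    return dict(zip(LABELS, softmax(logits)))


# Approach B: the label is a token; score every token in the vocabulary,
# then keep and renormalize the probabilities of the label tokens.
def classify_with_label_token(query_tokens):
    h = encode(query_tokens)
    all_tokens = VOCAB + LABELS
    probs = softmax([dot(embedding[t], h) for t in all_tokens])
    label_probs = {t: p for t, p in zip(all_tokens, probs) if t in LABELS}
    norm = sum(label_probs.values())
    return {t: p / norm for t, p in label_probs.items()}


print(classify_with_head(["red", "dress"]))
print(classify_with_label_token(["red", "dress"]))
```

In approach B the vector `embedding["<clothing>"]` lives in the same space as the query tokens, which is one way to read the abstract's remark that classes acquire compact embeddings. In approach A the class is represented only by a row of `head_weights`.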





ISSN: 2307-8162