Machine learning model serving system for event streams

Aleksei Starikov, Dmitry Namiot


The paper surveys existing stream event processing systems and analyzes their characteristics, advantages, and disadvantages. Compared with batch processing, stream processing significantly reduces the time between receiving data and reacting to it. With this approach, companies can respond to incoming information in a timely manner and take action when it is needed. The paper analyzes the systems that large companies use to develop machine learning models, from data collection to putting the model into operation. It discusses the issues of deploying machine learning models into a production environment for stream processing applications, including model representation formats, and compares the advantages and disadvantages of the various approaches. An architecture is described, and a practical implementation of an application is developed that allows each of the considered formats to be used.






ISSN: 2307-8162