Least Absolute Deviations Estimation of Regression Models with Integer Floor and Ceiling Functions
Abstract
Among the current research problems in machine learning is the search for new structural specifications of regression models that can successfully handle highly heterogeneous statistical data. In regression analysis, the nature of the dependent variable often dictates the choice of model specification. For example, if the dependent variable takes only the values 0 and 1, it is advisable to construct a logistic regression; if its values are positive counts, a Poisson regression. If the dependent variable is integer-valued, ordinary linear regression is unsuitable because the predicted values will generally not be integers: if the dependent variable is the number of employees in an organization, a forecast of "1200.517 people" can hardly be considered correct. This article addresses what to do in such a situation. Regression models built on the well-known integer-valued floor and ceiling functions are proposed. The problem of estimating the proposed regressions by the least absolute deviations method is reduced to a mixed-integer linear programming problem. It is shown how the proposed models relate to each other, and the embedding of the floor and ceiling functions into linear regression is considered. The developed regressions with integer-valued functions were used to model the number of researchers with academic degrees in the Irkutsk region; the explanatory variables were additionally transformed using the natural logarithm. All models with integer-valued functions were constructed in the LPSolve package in acceptable time and, in terms of the sum of absolute residuals, turned out to be better than the corresponding linear regressions.
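The abstract states that least absolute deviations (LAD) estimation of a floor-function regression reduces to a mixed-integer linear program. The article's exact formulation is not reproduced here, but a plausible minimal sketch for a one-regressor model y ≈ floor(a0 + a1·x) can be written as follows: introduce an integer variable z_i constrained so that z_i ≤ a0 + a1·x_i < z_i + 1 (making z_i the floor of the linear predictor) and a nonnegative variable u_i ≥ |y_i − z_i|, then minimize Σu_i. The data, variable names, and the use of SciPy's MILP solver (rather than LPSolve, which the authors used) are all illustrative assumptions.

```python
# Sketch (not the authors' exact formulation): LAD fit of the floor-regression
# model y_i ~ floor(a0 + a1*x_i), reduced to a mixed-integer linear program.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy integer-valued data (hypothetical, not from the article).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.0, 5.0, 6.0, 8.0, 9.0])
n = len(x)
eps = 1e-6  # enforces the strict inequality a0 + a1*x_i < z_i + 1

# Decision variables, in order: a0, a1 (free continuous),
# z_1..z_n (integer, z_i = floor(a0 + a1*x_i)),
# u_1..u_n (continuous, u_i >= |y_i - z_i|).
nv = 2 + 2 * n
c = np.zeros(nv)
c[2 + n:] = 1.0  # objective: minimize the sum of absolute residuals

A, lb, ub = [], [], []
for i in range(n):
    # 0 <= a0 + a1*x_i - z_i <= 1 - eps   (defines z_i as the floor)
    row = np.zeros(nv); row[0] = 1.0; row[1] = x[i]; row[2 + i] = -1.0
    A.append(row); lb.append(0.0); ub.append(1.0 - eps)
    # u_i + z_i >= y_i   (i.e. u_i >= y_i - z_i)
    row = np.zeros(nv); row[2 + i] = 1.0; row[2 + n + i] = 1.0
    A.append(row); lb.append(y[i]); ub.append(np.inf)
    # u_i - z_i >= -y_i  (i.e. u_i >= z_i - y_i)
    row = np.zeros(nv); row[2 + i] = -1.0; row[2 + n + i] = 1.0
    A.append(row); lb.append(-y[i]); ub.append(np.inf)

integrality = np.zeros(nv)
integrality[2:2 + n] = 1  # only the z_i are integer

bounds = Bounds(
    lb=np.r_[[-np.inf, -np.inf], np.full(n, -np.inf), np.zeros(n)],
    ub=np.full(nv, np.inf),
)

res = milp(c=c, constraints=LinearConstraint(np.array(A), lb, ub),
           integrality=integrality, bounds=bounds)
a0, a1 = res.x[:2]
z = np.round(res.x[2:2 + n])  # fitted integer predictions floor(a0 + a1*x)
print("coefficients:", a0, a1, "sum of absolute residuals:", res.fun)
```

Since ceil(t) = -floor(-t), the companion ceiling-function regression can be estimated with the same program by flipping the signs in the floor-defining constraint, which is consistent with the abstract's remark that the proposed models are related to each other.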
ISSN: 2307-8162