atm cash prediction kaggle

For both strategies, a history of feature matrices and target vectors was constructed to generate the input variables for future observations (the next days); however, to prevent forecasted observations from entering the learning process, this history was not added to the training set when the model was fitted. Alongside the contributions already mentioned, this study aims to fill the gap by providing a comprehensive evaluation of ATM cash demand prediction both before and during the COVID-19 pandemic (i.e., just after a disruption in demand), and to choose the most promising algorithms based on a new performance metric that simultaneously takes both the error and the accuracy of direction changes into account. The former obtained a Fitness of 68.87, and the latter achieved 71.57. XGBoost is a decision-tree-based ensemble algorithm that has been dominating Kaggle competitions and applied machine learning for tabular data. Apart from exhaustive hyperparameter tuning, we evaluated a total of 192 configurations: 12 predictors, 2 iteration strategies, 2 periods (before and during COVID-19), and 4 ATMs with different time series. The MSE is computed to measure how far the predicted values are from the actual values in terms of quantity. Generally, as shown in Figs. 9 and 10, the updated iteration strategy has better predictive performance than the approximate iteration strategy. As shown in Fig. 11, non-parametric models can reliably predict the changes in the withdrawal pattern both before and during COVID-19. [16] and Wichard [46].
The ongoing coronavirus (COVID-19) pandemic and the measures taken to contain it (e.g., total or partial lockdown) have sharply decreased cash demand and significantly changed cash withdrawal patterns. Venkatesh et al. reported improved accuracy of cash demand forecasts, due to reduced computational complexity, when predicting daily cash demand for groups of ATM centers with similar day-of-the-week withdrawal seasonality patterns. Several parametric models (MA, SES, HES, ARIMA, and SARIMA) and non-parametric models (MLP, SVM, RF, and KNN), the latter combined with the data-sequence and regular-features algorithms, were employed in the analysis of the collected datasets. Another takeaway from the table is that analyzing and forecasting ATM cash demand by withdrawal pattern (i.e., three different types) yielded higher predictive performance than treating all ATMs as a single class. In the case of Iran, for example, the numbers of ATMs and debit cards are about 60 thousand and 23 million, respectively [14]. Multiple ATMs can be grouped into main clusters depending on the demand. These models, along with some variations, have been used to find the best models according to the Fitness metric. The non-parametric models employed in this study are well-known ML regressors, namely MLP, SVM, RF, and KNN. Hierarchy of the time series prediction models employed in this study. In the data-sequence algorithm, the feature matrix is constructed by transposing the data sequence with a sliding window of length 7 (the yellow-shaded rows in the corresponding figure).
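The sliding-window construction described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `sliding_window` and the sample withdrawal values are hypothetical, and only the window length of 7 comes from the text.

```python
def sliding_window(series, window=7):
    """Build a feature matrix X and target vector y from a 1-D series.

    Each row of X holds `window` consecutive observations; the matching
    entry of y is the observation that immediately follows that row.
    """
    X, y = [], []
    for start in range(len(series) - window):
        X.append(series[start:start + window])
        y.append(series[start + window])
    return X, y


# Made-up daily withdrawal amounts, for illustration only.
daily_withdrawals = [120, 135, 128, 90, 60, 110, 125, 130, 140, 133]
X, y = sliding_window(daily_withdrawals, window=7)
print(len(X), y)
```

With ten observations and a window of 7, the sketch yields three training rows, each predicting the day that follows its seven-day window.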
Their approach consisted of two essential components, a combination of forecasts from different models (i.e., ANN, linear models, and regression) and seasonality modeling, and achieved a SMAPE of 18.95%. The evaluation metric is the Fitness measure, proposed for the first time in this study. Finally, Section "Conclusion" reports the conclusion and possible directions for future work. The presented metric simultaneously considers the difference between the predicted and actual values and the accuracy of direction changes, particularly when the pattern changes abruptly. For example: (1) people tend to withdraw money on Friday for the weekend; (2) at the end of the month, people receive their salaries; and (3) between the 7th and 10th day of each month, some people receive their pension. CatBoost has shown prediction results almost similar to those of LGB. Therefore, inspired by the case analysis of the Bank of Serbia, where time series analysis and linear models were employed to predict daily cash demand at each ATM, we choose to start with a simple linear model for our use case. Therefore, developing a cash demand forecasting model for an ATM network is a challenging task. This paper strove to conduct a trustworthy comparison and to treat the different models equally.
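The three calendar effects listed above can be turned into binary features. The sketch below is an assumption-laden illustration: the function name `calendar_features`, the feature names, and the choice to treat only the last calendar day as "end of the month" are all hypothetical, not taken from the source.

```python
import calendar
from datetime import date


def calendar_features(day):
    """Return binary calendar flags for one date.

    Assumptions (not from the source): month end means the last calendar
    day, and the pension window is days 7 through 10 inclusive.
    """
    last_day = calendar.monthrange(day.year, day.month)[1]
    return {
        "is_friday": int(day.weekday() == 4),        # Monday is 0, Friday is 4
        "is_month_end": int(day.day == last_day),
        "is_pension_window": int(7 <= day.day <= 10),
    }


print(calendar_features(date(2021, 4, 30)))
print(calendar_features(date(2021, 4, 9)))
```

In a market with a different weekend (for example Thursday and Friday), the weekday test would need to be adapted.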
However, the last month of the third year has a contrary trend because of the beginning of the COVID-19 pandemic and the announcement of a stay-at-home order. The only difference is that the product, cash, needs to be replenished for an a priori set period of time. Some researchers studied the uncertainty and chaos in an ATM's daily cash demand. Therefore, for this typical cash demand forecasting problem, we present time series and regression machine learning models to address the above use case. Figures S1C and 8 show the influence of this feature on the cash withdrawal pattern. However, choosing the most efficient model to appropriately forecast an ATM's cash demand is one of the most important activities. Our last comparison covers all of the datasets (i.e., different cash withdrawal patterns based on different environments and levels of accessibility around the ATMs). However, its scikit-learn implementation still requires all features to be numerical. The KNN regressor, unlike the stochastic models, looks for the k most similar samples observed in the training set for each new sample. The shaded rows on the left panel show the window of length 7 for the data-sequence algorithm, chosen based on the autocorrelation results, which report a lag of 7 for the available data. Selected input variables (features) considered in the regular-features algorithm. The first eight features were considered in previous studies [20, 21], while the last four features were added to model the cash withdrawal pattern accurately (Table 1).
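The KNN idea, finding the k most similar training samples and averaging their targets, can be sketched without any library. The function name `knn_predict`, the Euclidean distance, and the inverse-distance weighting option are assumptions made for this illustration; the source does not specify them.

```python
import math


def knn_predict(train_X, train_y, query, k=3, weighted=False):
    """Predict a target as the (optionally distance-weighted) mean of the
    targets of the k training rows closest to `query` (Euclidean distance)."""
    neighbours = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )[:k]
    if not weighted:
        return sum(target for _, target in neighbours) / len(neighbours)
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (dist + 1e-9) for dist, _ in neighbours]
    total = sum(w * target for w, (_, target) in zip(weights, neighbours))
    return total / sum(weights)


# Toy data: one feature per sample, made-up targets.
train_X = [[1], [2], [3], [10]]
train_y = [10, 20, 30, 100]
print(knn_predict(train_X, train_y, [2], k=3))
```

Setting `weighted=True` lets closer neighbours count more, while the default treats the k neighbours equally.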
(ATM 1) Comparison of performance measures for different models (parametric: MA, SES, HES, ARIMA, and SARIMA; non-parametric-data-sequence: MLP_DS, SVM_DS, RF_DS, and KNN_DS; and non-parametric-regular-features: MLP, SVM, RF, and KNN) in the prediction of cash demands with updated iterations. It is necessary to predict the daily demand for cash at the various ATMs. The MA model uses the arithmetic mean of the last n datapoints to predict future datapoints [30]. The SES model emphasizes the more recent observations by giving them higher weights than datapoints from the more distant past. One primary assumption of models in the literature is that the amount of cash demand and the withdrawal patterns are not overly volatile (though some studies have investigated chaotic time series and uncertainty in demand). It should be noted that in Iran, the weekdays are Saturday to Wednesday, and the weekend is Thursday and Friday. Figure 10 illustrates the results with the updated iteration strategy on ATM 1, (A) before and (B) during the COVID-19 pandemic. The MLP, a well-known class of feedforward artificial neural network, is a popular stochastic technique used for forecasting by performing a non-linear mapping from previous datapoints to future datapoints [33, 43]. However, the POCID metric does not consider how close the prediction is to the actual values.
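To make the difference between equal and recency-based weighting concrete, here is a small sketch of a one-step moving-average forecast next to simple exponential smoothing. The function names, the smoothing factor `alpha`, and the sample series are hypothetical; simple exponential smoothing is used here as an assumed example of a model that weights recent observations more heavily.

```python
def moving_average_forecast(series, n=3):
    """One-step forecast: arithmetic mean of the last n observations."""
    window = series[-n:]
    return sum(window) / len(window)


def ses_forecast(series, alpha=0.5):
    """One-step forecast by simple exponential smoothing.

    The level is updated as alpha * value + (1 - alpha) * level, so older
    observations receive exponentially smaller weights.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


demand = [10, 20, 30, 40]  # made-up values
print(moving_average_forecast(demand, n=2))
print(ses_forecast(demand, alpha=0.5))
```

A larger `alpha` makes the smoothed forecast follow recent values more closely; a larger `n` makes the moving average smoother but slower to react.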
Furthermore, other methods, such as deep learning, hybrid AI, and metaheuristic optimization algorithms, can be utilized and compared. The weighted average can be dependent on or independent of the distance. The ARIMA model makes its prediction using the differences between the values of datapoints rather than their actual values. First, three statistical analyses, namely a Dickey-Fuller test, an autocorrelation function (ACF) plot, and a partial autocorrelation function (PACF) plot, were employed to estimate an acceptable initial range for the parameters of the parametric models. If the forecast is too high, the unused cash stored in the ATM incurs costs to the bank. The rest of this paper is organized as follows. The reasons for such results might be mainly related to the high performance of ARIMA and SARIMA for short-term prediction [44], while avoiding or minimizing overfitting. The model is evaluated for a specific day d. The values l_i are meta-parameters that govern the computation of the statistics h_i, such as the length of the history considered. Another well-known ML regressor is RF, which constructs a combination of multiple decision trees for regression purposes [34].
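The differencing step that ARIMA relies on can be shown on its own. This sketch only illustrates differencing and its inverse, not a full ARIMA fit; the function names and the sample values are hypothetical.

```python
def difference(series, lag=1):
    """Return the differenced series: value at t minus value at t - lag."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def undifference(first_value, diffs):
    """Invert first-order (lag 1) differencing by cumulative summation."""
    restored, current = [], first_value
    for step in diffs:
        current += step
        restored.append(current)
    return restored


levels = [5, 7, 10, 14]  # made-up values
diffs = difference(levels)
print(diffs)
print(undifference(levels[0], diffs))
```

Calling `difference(series, lag=7)` would remove a weekly pattern instead, which is the idea behind the seasonal differencing used by SARIMA.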
Moreover, the results show that during an unprecedented challenge, when a sudden change in the withdrawal pattern occurs, utilizing the preceding days' data (instead of features such as day-of-the-week, month, or year) lets us better map the following datapoints and, in turn, boost the prediction outcome. A time series is considered stationary when the test's p value is lower than 0.05 and the test statistic is lower than the critical values (see Table 2 for more details). We have to keep in mind that the performance of the trained model deteriorates as the size of the training data set shrinks. Because of this, over the past years, the number of ATMs in the world has increased, reaching over 3 million machines [7]. It often performs better than one-hot encoding. However, when there is normal volatility in the time series, these features play a more pivotal role than the preceding days' cash withdrawal information. The business would probably be interested in seeing a final tabular report. SA: methodology, software, formal analysis, and writing (original draft). The former utilized self-organizing fuzzy neural networks and obtained 21.5% SMAPE, while the latter predicted time series with recurring seasonal periods and developed a model based on a combination of forecasting methods via a simple average of forecasts, achieving 22.2% SMAPE. Teddy and Ng [40] incorporated local learning to model the complex dynamics of heteroscedastic time series effectively. As shown in Table 4, the category-wise prediction can enhance the forecasting by at least 4%.
Some banks might store 40% more cash in ATMs than the actual demand, and banks might have thousands of ATMs nationwide. Cash demand in ATMs requires accurate prediction, which is no different from other vending machines. The multi-step prediction approach is an intuitive method for predicting a sequence of values in time series problems using values observed in the past [7]. Later, using the same dataset, Taieb et al. [39] obtained the best results with a multi-input, multi-output forecasting strategy that used autocorrelation selection criteria for input variable selection, deseasonalization, and average weight combination. The number of trees (n-trees) and the fraction of features used to grow each tree (max-features) are the primary hyperparameters that need to be tuned for this method [29]. However, the cash demand from ATM 2 (located in business districts) is lower because fewer people, mostly personnel of the companies and agencies in the vicinity, have access to such ATMs. In machine learning exercises, there are three broad parts: (1) data extraction and mining, which helps to decide on the features (this normally takes around 60 to 70% of the time); (2) deciding on and fitting a model, including hyper-parameter optimization (this normally takes 10 to 15%); and (3) accuracy metrics and testing (10 to 15% of the time). If the forecast is wrong, it induces considerable costs. LightGBM offers good accuracy with integer-encoded categorical features. After one-hot encoding (OHE), we have 2244 observations with 27 columns (features). However, the approximate approach seems to be a more reasonable strategy for forecasting ATM cash demand in the following days because it uses the previously estimated values rather than the actual values. It should be mentioned that improving the performance of these models is not possible without finding the optimal combination of hyperparameters.
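The contrast between the two iteration strategies can be sketched as a rolling multi-step forecast. This is an interpretation, not the study's implementation: the function name `forecast_horizon`, the toy mean model, and in particular the reading of the "updated" strategy as feeding the actual value back into the window once it is known are assumptions.

```python
def forecast_horizon(history, actuals, predict, window=7, strategy="approximate"):
    """Roll a one-step model forward over len(actuals) future days.

    strategy="approximate": each prediction is fed back into the input window.
    strategy="updated": the actual value is fed back instead (assumed reading).
    """
    buffer = list(history[-window:])
    forecasts = []
    for actual in actuals:
        prediction = predict(buffer)
        forecasts.append(prediction)
        next_value = prediction if strategy == "approximate" else actual
        buffer = buffer[1:] + [next_value]
    return forecasts


def mean_model(window_values):
    """Toy one-step predictor: the mean of the current window."""
    return sum(window_values) / len(window_values)


history = [10, 10, 10]   # made-up past demand
actuals = [40, 40]       # made-up future demand
print(forecast_horizon(history, actuals, mean_model, window=3, strategy="approximate"))
print(forecast_horizon(history, actuals, mean_model, window=3, strategy="updated"))
```

In this toy run the approximate strategy never sees the jump to 40, while the updated strategy starts adapting on the second day, which is consistent with the updated strategy scoring better but relying on information that is not available when forecasting several days ahead.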
Also, the chronological cash demand for every ATM fluctuates with time and is often superimposed with the non-stationary behaviour of users. Here, the model parameters are estimated from the data, and h_i are different statistics of the cash demand history. They boosted the prediction quality by modeling the days of the month, labeling each day from 1 to 31. Due to vacillating user demands and seasonal patterns, it is very challenging for financial institutions to keep the optimal amount of cash in each ATM. To address this issue, we proposed a modified fitness metric that simultaneously considers both prediction error (MSE) and trend (POCID). [31] concluded that machine learning models underperform their counterparts in terms of accuracy, which might be disappointing from a scientific perspective. Therefore, a small optimization in business operations would contribute to high earnings. However, this metric's limitation is that the prediction trend in the time series is not clear, specifically when the pattern changes abruptly. They used MAPE (mean absolute percentage error) as a performance criterion and obtained prediction errors of 5 to 10% for simulated data and 25 to 30% for real-case data. A detailed description of each model is beyond the scope of the current study, but each is briefly discussed here for context. To accomplish this, we first collected real data from three different categories of ATMs, based on the accessibility and environmental factors that substantially affect both the daily cash demand and the withdrawal pattern. ATMs in each category have a similar distribution, and the ATMs used in this study are the most representative of their group.
The ATM demand forecasting problem became more popular after the Forecasting Competition for Artificial Neural Networks and Computational Intelligence (NN5 Competition) [17]. The prediction of change in direction (POCID) metric, denoted by Eqs. (2) and (3), addresses the issue by mapping the prediction trend and estimating the accuracy of the direction changes. On the other hand, if banks do not have a proper mechanism to track the usage pattern, then frequently refilling ATMs will reduce freezing and insurance costs but increase logistics costs. The bank pays different refilling costs depending on its policy with the money transportation company. Taking comprehensive hyperparameter tuning into account, the reasons for such impressive results might be mainly related to the high performance of ARIMA and SARIMA for short-term prediction [44] and the fact that we aimed to predict the demand just after the onset of the pandemic, while avoiding or minimizing overfitting. Figure 6 schematically compares these two iteration strategies. Our above exercise is for illustration purposes. The online version contains supplementary material available at 10.1007/s42979-021-01000-0. Fuller test results for the ATMs' time series. Generally, all predictors performed more accurately before COVID-19 than during COVID-19 in terms of MSE (Fig. 9B). In Fig. 12B, the parametric ARIMA method outperformed the other predictors, performing well in both MSE and POCID.
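The two ingredients of the evaluation, error size and direction accuracy, can be computed as follows. This is a sketch under stated assumptions: the source's Eqs. (2) and (3) are not reproduced in this text, so the POCID below uses one common formulation (the share of consecutive steps where the predicted and actual changes have the same sign, divided by the number of steps), and the Fitness combination of Eq. (4) is deliberately not attempted. Function names and sample values are hypothetical.

```python
def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pocid(actual, predicted):
    """Percentage of steps where prediction and actual move in the same direction.

    Assumed formulation: a step counts as a hit when the product of the
    actual change and the predicted change is strictly positive.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_change = actual[t] - actual[t - 1]
        predicted_change = predicted[t] - predicted[t - 1]
        if actual_change * predicted_change > 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


actual = [1, 2, 3, 2]      # made-up values
predicted = [1, 3, 2, 1]   # made-up values
print(mse(actual, predicted), pocid(actual, predicted))
```

The example shows why both numbers matter: the forecast has a modest MSE but gets the direction wrong on one of the three steps.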
As shown in Fig. 1, in the last month of the first 2 years, the cash demand has an upward trend for all ATMs. To build the feature matrix for the machine learning models, some new influential variables are added to those used in the literature. It is quite obvious that daily cash withdrawal amounts form a time series. Section "Methodology" introduces the methodology and research process, followed by the results and discussion in Section "Results and Discussion". With that aim, the models were implemented and compared after performing an exhaustive statistical analysis, coupled with grid search and k-fold cross-validation techniques that led to the highest performance of the models. The Dickey-Fuller test is a well-known test for checking whether the data are stationary. We will work on the demand for a single ATM (a group of ATMs treated as a single ATM can also be handled) to develop a model for the given data set. According to an extensive study in time series prediction conducted by Parmezan et al. In the literature, many studies examined time-related independent variables to capture the seasonality in the data. ATMs can be grouped (e.g., high, medium, low) based on withdrawal amount and number of withdrawal transactions. To achieve this, ATMs are first categorized based on accessibility and the surrounding environmental factors that significantly affect the cash withdrawal pattern. However, cash demand inherently comes with high variance and is a non-stationary stochastic process, which can affect the reliability of many approaches.
In the following section, we review the literature on modeling and analyzing ATM cash withdrawal predictions. As shown in Fig. 2, ATMs 1 and 2 have a high cash demand during weekdays, followed by a low amount of money withdrawn on weekends. Figures S9 to S11 report the results of the other ATMs. The authors of [20] included a location feature as an independent variable of the model and proposed grouping ATMs into nearby-location clusters. It also considers the importance of day-of-the-week and includes it as a dummy exogenous variable. A lot of machine learning models are available to choose from, and deciding where to start can be intimidating. A moving average order (q) of 0 or 1 is good enough. Figure 3 represents the splitting of the data for ATM 1. The minimum and maximum possible values of the Fitness metric are 0 and 100, respectively. This study has revealed that an error measure (e.g., MSE) alone cannot be the best evaluation metric for comparing the performance of predictors of ATM cash demand, especially when the withdrawal pattern changes drastically as a result of preventive measures, such as a stay-at-home order or partial lockdowns, taken to reduce the spread of COVID-19. Though the model missed some of the days, on average the prediction is better than that of the XG model. MA: conceptualization, methodology, software, formal analysis, writing (original draft), and writing (review and editing). Fig. S1A: the day-of-the-week withdrawal pattern from different types of ATMs. See Figs. S1B, C of the supplementary material. The results also showed that the comprehensive analysis conducted in this study led to a high level of accuracy in estimating cash withdrawal from ATMs.