This experiment aims to evaluate the effect of historical data periodization on gold price forecasting performance using the Long Short-Term Memory (LSTM) method.
- The main data used is gold price data downloaded from https://www.kaggle.com/datasets/rizkykiky/gold-price-dataset. The dataset used is daily gold price data in US dollars (file name: Daily_US.csv). This dataset consists of 11,626 data points, representing gold prices from December 29, 1978, to July 21, 2023. For research purposes, the dataset was converted into Excel format and renamed hargaemas.xlsx without changing the attribute names and content of the original data. The attributes of this dataset are Date and USD (containing the gold price in US dollars).
- Running the main program code (file name: modifikasiemas28Mei25.ipynb) creates an Excel file containing the training RMSE and testing RMSE values for every iteration, with 10 iterations for each training data period. The resulting Excel file is named rmse_results.xlsx. This file is not stored in this repository because it is created whenever the program is run.
- For each training data period, the training RMSE and testing RMSE are averaged over the 10 iterations, and the averages are saved in an Excel file (file name: Ringkasan21periode.xlsx). Since there are 21 training data periods, this file contains 21 rows. Its attributes are: Count of data, train RMSE, and test RMSE.
- The RekapRMSETesting.xlsx file is essentially the same as the Ringkasan21periode.xlsx file, but it includes only the data count and test RMSE columns, which are used by the GrafikRMSESmooted.ipynb code.
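The averaging step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names (`count_of_data`, `train_rmse`, `test_rmse`), the helper `summarize_rmse`, and the numeric values are all assumptions made for the example, since the real layout of rmse_results.xlsx is not documented here.

```python
import pandas as pd

# Toy per-iteration results. Column names and values are illustrative
# assumptions, not the actual contents of rmse_results.xlsx.
results = pd.DataFrame({
    "count_of_data": [1000] * 3 + [2000] * 3,
    "train_rmse": [10.0, 12.0, 14.0, 8.0, 9.0, 10.0],
    "test_rmse": [20.0, 22.0, 24.0, 15.0, 16.0, 17.0],
})


def summarize_rmse(df: pd.DataFrame) -> pd.DataFrame:
    """Average train and test RMSE over all iterations of each training period."""
    return (
        df.groupby("count_of_data", as_index=False)[["train_rmse", "test_rmse"]]
        .mean()
    )


# One row per training data period (hypothetical analogue of Ringkasan21periode.xlsx)
summary = summarize_rmse(results)

# Data count and test RMSE only (hypothetical analogue of RekapRMSETesting.xlsx)
test_only = summary[["count_of_data", "test_rmse"]]
```

With real data, `results` would presumably come from `pd.read_excel("rmse_results.xlsx")`, and `summary.to_excel(...)` could write the summary file. Both calls rely on Openpyxl for the .xlsx format.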
Two main program codes are used:
- simulasiemas28Mei25.ipynb: Google Colab notebook code used to record the RMSE value for each training data period with 10 iterations using the hargaemas.xlsx dataset. Each run generates a file named rmse_results.xlsx.
- GrafikRMSE.ipynb: Google Colab notebook code used to display a graph of the train RMSE and test RMSE values for each period using the Ringkasan21periode.xlsx dataset. The graph shows how the test RMSE trends across the training data periods.
Meanwhile, Grafikhargaemasmenyeluruh.ipynb, GrafikRMSEorde2.ipynb, and GrafikRMSESmooted.ipynb are each Google Colab notebook code used to display the corresponding graph.
Some of the Python libraries used are:
- IPython: provides an interactive shell for Python
- Pandas: used for table-based data manipulation and analysis (dataframe)
- NumPy: used for numerical computation
- Matplotlib: used for data visualization in graphical form
- scikit-learn (sklearn): used for machine learning
- TensorFlow (Keras API): framework for deep learning and neural networks
- Openpyxl: used to read and write Excel files (.xlsx)
- scipy.interpolate (make_interp_spline): used to interpolate data to make the graph smoother
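
As a rough illustration of how `make_interp_spline` can smooth an RMSE curve, the sketch below fits a cubic B-spline through a handful of points and evaluates it on a dense grid. The helper `smooth_curve`, the variable names, and the numbers are assumptions for the example only. They are not values from the experiment and not necessarily how the notebooks do it.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Illustrative placeholder values only; not results from the experiment.
data_counts = np.array([500, 1000, 1500, 2000, 2500, 3000], dtype=float)
test_rmse = np.array([30.0, 24.0, 21.0, 19.5, 19.0, 18.8])


def smooth_curve(x, y, num_points=200, k=3):
    """Return a dense (x, y) curve from a degree-k B-spline through the points."""
    spline = make_interp_spline(x, y, k=k)  # x must be sorted in increasing order
    x_dense = np.linspace(x.min(), x.max(), num_points)
    return x_dense, spline(x_dense)


x_smooth, y_smooth = smooth_curve(data_counts, test_rmse)
```

The resulting `x_smooth` and `y_smooth` arrays can be passed to Matplotlib, for example `plt.plot(x_smooth, y_smooth)`. Because the spline interpolates, the smoothed curve still passes through every original point. It changes only how the line looks between them.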