COMPARISON OF SUPPORT VECTOR REGRESSION AND RANDOM FOREST REGRESSION ALGORITHMS ON GOLD PRICE PREDICTIONS

This research tests how well the Support Vector Regression (SVR) and Random Forest Regression (RFR) algorithms predict gold futures prices. The data were taken from the Investing.com website and processed into prediction models so that the SVR and RFR algorithms could be compared. Each prediction model was then tested to assess its performance. The test results show that the Support Vector Regression model is superior in terms of accuracy, with a value of 83%. However, the Random Forest Regression algorithm is superior in terms of error, with an MSE of 270.85 and an MAE of 12.53.


INTRODUCTION
Gold is a precious metal that is widely used in jewelry production and is also a financial asset because it can serve as a store of value. Apart from its high aesthetic value, gold can be a profitable commodity to invest in because it has held up well during world economic crises and is fairly resistant to inflation.[1] Gold is usually held as a long-term investment whose value is stable, liquid, and safe in real terms. Its resistance to changes in the inflation rate, the ease of converting it to cash, and the absence of tax on gold itself are what attract investors.[2]
Gold as an investment instrument comes in several forms: gold jewelry, gold bars, gold pieces, gold certificates, and online gold trading. The data used in this research are historical gold trading data based on futures contracts with the code XAU/USD, quoted per troy ounce.[3]
When investing in gold, investors inevitably try to anticipate rises and falls in the gold price so that they do not lose money. Because prices are always changing, investors must be able to predict them in order to buy and sell at the right time. The purpose of prediction is to minimize the difference between the estimate and what actually occurs. The author therefore conducts research on gold futures price prediction by comparing the Support Vector Regression and Random Forest Regression algorithms.[4]
Predicting the price of gold futures requires a sound calculation method. One option is the Support Vector Regression (SVR) algorithm. SVR is a development of the Support Vector Machine (SVM) that adds regression capability, so its output is a numerical prediction. The algorithm fits a line to the data while maintaining a margin controlled by epsilon, and its output can be shown as a line chart that compares the actual data from the previous period with the calculated values. The algorithm can also handle nonlinear data through the kernel trick, which helps it limit overfitting and aim for the smallest possible error with a maximum-margin hyperplane.[5]
The Random Forest algorithm, when used for regression modeling of the historical gold futures data, is called Random Forest Regression (RFR). Random Forest builds each tree from a different bootstrap sample of the data, which changes how the regression trees are constructed. Random Forest Regression builds many regression trees and then averages the predictions of all of them.[6]

Types of Research
This research takes a qualitative approach, meaning that it uses past data recorded over a consecutive and continuous period of time, with the aim of applying knowledge to solve a problem. Specifically, the research compares the accuracy of the Support Vector Regression and Random Forest Regression algorithms in predicting gold futures prices over the period from 1 June 2021 to 30 June 2023.

Work Procedures
Figure 1. Research process
The stages of comparing the SVR and RFR algorithms in predicting gold futures prices are as follows:
1. Data Collection: The initial stage is to collect gold futures price data. The dataset contains 546 records with 7 attributes: Date, Last, Opening, Highest, Lowest, Vol., and Change (%). The data must be in an appropriate format and ready for processing.
2. Pre-Processing: The collected data is processed further, for example by cleaning it of empty or invalid values.
3. Training Data: The normalized data is passed on for training. The data used for training comes from the "Last" column, which is the last trading price of gold for the day, and is handled as a NumPy array.
4. Training Process: The prepared data is stored in a variable and then divided into 70% training data and 30% testing data.
5. Testing Data: After the training stage, the process continues to data testing, and the data is built into a prediction model.
6. Error Value Evaluation: After testing, the predicted values are compared with the actual values by computing the MSE and MAE.
7. Prediction Results: The results of the testing process are visualized as a graph of the prediction model's output.
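The split and evaluation stages above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the series is a placeholder rather than real prices, and the split keeps chronological order because the source does not say whether the data were shuffled. The record count of 541 is taken from the 378 and 163 record counts reported in the testing sections, and the rounding rule is chosen to reproduce them.

```python
import math


def split_70_30(values, test_fraction=0.3):
    """Split a sequence into training and testing parts, keeping its order."""
    n_test = math.ceil(len(values) * test_fraction)
    n_train = len(values) - n_test
    return values[:n_train], values[n_train:]


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Placeholder series standing in for the cleaned "Last" column (not real prices).
prices = [1800.0 + i for i in range(541)]
train, test = split_70_30(prices)

# Naive baseline for illustration: repeat the last training value.
baseline = [train[-1]] * len(test)
print(len(train), len(test))
print(mean_squared_error(test, baseline), mean_absolute_error(test, baseline))
```

MSE penalizes large deviations more heavily than MAE because the differences are squared, which is why the two measures are reported side by side.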

Data
The data used in this research are historical gold futures data downloaded from the Investing.com website, quoted in USD per troy ounce. The historical data cover the period from 1 June 2021 to 30 June 2023. The dataset contains 546 records with 7 attributes: Date, Last, Opening, Highest, Lowest, Vol., and Change (%).

Method
Support Vector Regression
Support Vector Regression (SVR) is an algorithm that fits a line to the data while maintaining a margin controlled by epsilon. It can also handle non-linear data through the kernel trick, which helps it limit overfitting and aim for the smallest possible error with a maximum-margin hyperplane.
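The two ingredients named above, the epsilon margin and the kernel, can be written out directly. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the epsilon value of 0.1 is an assumption (the source does not report it), and gamma = 0.0001 is the value reported later for the RBF kernel.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Zero inside the epsilon tube, growing linearly outside it."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Two points 100 units apart: similarity is exp(-0.0001 * 100^2) = exp(-1).
print(rbf_kernel([0.0], [100.0], gamma=0.0001))

# A small error inside the tube costs nothing; a larger one is penalized.
print(epsilon_insensitive_loss(1900.0, 1900.05, epsilon=0.1))
print(epsilon_insensitive_loss(1900.0, 1905.0, epsilon=0.1))
```

A small gamma makes the kernel decay slowly with distance, so distant points still influence each prediction and the fitted curve is smoother.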

Random Forest Regression
Random Forest Regression (RFR) is a supervised machine learning algorithm. It combines the predictions of multiple decision trees.
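The way predictions from multiple trees are combined can be checked with scikit-learn's `RandomForestRegressor`, which the testing section names as the regression model. This is a sketch on synthetic data: the trend, the noise, the day-index feature, and `n_estimators=50` are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic series for illustration only (not the gold dataset).
rng = np.random.default_rng(0)
X = np.arange(200, dtype=float).reshape(-1, 1)
y = 1800.0 + 0.5 * X.ravel() + rng.normal(scale=5.0, size=200)

forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(X, y)

# Each fitted tree predicts on its own; the forest output is their mean.
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)
print(np.allclose(averaged, forest.predict(X)))
```

Averaging over trees grown on different bootstrap samples reduces the variance of any single tree's prediction.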

Research Dataset
The data used in this research are historical gold futures data downloaded from the Investing.com website. Investing.com is a website that provides complete information about foreign exchange, indices, and shares, at the URL https://id.investing.com/commodities/gold-historical-data. The data are daily and cover the period from 1 June 2021 to 30 June 2023. The dataset contains 546 records with 7 attributes: Date, Last, Opening, Highest, Lowest, Vol., and Change (%).
The data were initially in CSV (Comma-Separated Values) format, which is commonly used to store datasets. However, this format is not suitable for analysis as it stands, so the historical gold futures price data were loaded into a Pandas DataFrame for processing in a Jupyter Notebook using the Python programming language. In the Pandas DataFrame format, the historical gold futures price data can be visualized easily and processed further by the algorithms whose accuracy is being tested.
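Loading the CSV into a Pandas DataFrame and extracting the "Last" column might look like the sketch below. The column names come from the dataset description; the three rows, the ISO date format, and the number formatting are made-up placeholders, since the layout of the actual export is not given in the source.

```python
import io

import pandas as pd

# Placeholder rows for illustration only; these are not real prices.
csv_text = """Date,Last,Opening,Highest,Lowest,Vol.,Change (%)
2021-06-01,1900.0,1905.0,1910.0,1895.0,100,0.10
2021-06-02,,1900.0,1908.0,1890.0,120,
2021-06-03,1910.5,1902.0,1915.0,1899.0,90,0.55
"""

# In practice the argument would be the path to the downloaded file.
frame = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Pre-processing: drop rows with an empty target value, then order by date.
frame = frame.dropna(subset=["Last"]).sort_values("Date")

# The target column as a NumPy array, ready for the training stage.
last_prices = frame["Last"].to_numpy(dtype=float)
print(frame.shape)
print(last_prices)
```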

Testing the Prediction Model
Testing the prediction model includes splitting the data and modeling with Support Vector Regression and Random Forest Regression. The scope of this research is limited to processing gold futures price data with these two methods in order to find the prediction model with the highest accuracy and to evaluate the resulting error values.

Testing the Support Vector Regression Prediction Model with the Kernel Radial Basis Function
At the testing stage with the Support Vector Regression algorithm, the input gold price data is divided 70:30, with 70% for training data (378 records from the "Last" attribute) and 30% for testing data (163 records from the "Last" attribute). The "Last" attribute is the target column in testing the Support Vector Regression algorithm with the Radial Basis Function (RBF) kernel. The parameters set for the RBF kernel are C = 1000.0 and gamma = 0.0001, which give a prediction accuracy of 0.83. The algorithm is also evaluated on its error, with a Mean Squared Error (MSE) of 1214.
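A configuration like the one described can be expressed with scikit-learn's `SVR`. This is a hedged sketch rather than the authors' notebook: the prices are synthetic, the single input feature (a day index) is an assumption because the source does not state which inputs were used, and `train_test_split` is left at its shuffled default with an arbitrary `random_state`. Only the kernel, C, gamma, and the 70:30 ratio come from the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for the 541 cleaned "Last" values (not real prices).
rng = np.random.default_rng(0)
n_records = 541
days = np.arange(n_records, dtype=float).reshape(-1, 1)
prices = (
    1800.0
    + 50.0 * np.sin(days.ravel() / 60.0)
    + rng.normal(scale=5.0, size=n_records)
)

# 70:30 split; with 541 records this gives 378 training and 163 testing rows.
X_train, X_test, y_train, y_test = train_test_split(
    days, prices, test_size=0.3, random_state=0
)

# RBF kernel with the parameter values reported in the text.
model = SVR(kernel="rbf", C=1000.0, gamma=0.0001)
model.fit(X_train, y_train)

print(X_train.shape[0], X_test.shape[0])
print(model.score(X_test, y_test))  # R^2 on the held-out data
```

For scikit-learn regressors, `score` returns the coefficient of determination R². Whether the reported accuracy of 0.83 is this quantity is not stated in the source.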

Figure 1. Support Vector Regression algorithm testing
The prediction visualization results from testing the Support Vector Regression algorithm on the gold futures price dataset are:

Testing the Random Forest Regression Prediction Model
In testing with the Random Forest Regression algorithm, the gold price data is divided 70:30, with 70% for training data (378 records from the "Last" attribute) and 30% for testing data (163 records from the "Last" attribute). Using the RandomForestRegressor regression model, the algorithm achieves a prediction accuracy of 0.76. The algorithm is also evaluated on its error, with a Mean Squared Error (MSE) of 270.85 and a Mean Absolute Error (MAE) of 12.53.
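The evaluation described here can be sketched with `RandomForestRegressor` and the metric functions in `sklearn.metrics`. As before, the data are synthetic, the day-index feature is an assumption, and the model is left at scikit-learn's default hyperparameters because the source does not report any; the numbers this prints are therefore unrelated to the reported 270.85 and 12.53.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 541 cleaned "Last" values (not real prices).
rng = np.random.default_rng(0)
n_records = 541
days = np.arange(n_records, dtype=float).reshape(-1, 1)
prices = (
    1800.0
    + 50.0 * np.sin(days.ravel() / 60.0)
    + rng.normal(scale=5.0, size=n_records)
)

X_train, X_test, y_train, y_test = train_test_split(
    days, prices, test_size=0.3, random_state=0
)

model = RandomForestRegressor(random_state=0)
model.fit(X_train, y_train)
predicted = model.predict(X_test)

# Error values comparing predictions with the held-out actual prices.
mse = mean_squared_error(y_test, predicted)
mae = mean_absolute_error(y_test, predicted)
r2 = model.score(X_test, y_test)
print(mse, mae, r2)
```

MSE is expressed in squared price units and MAE in price units, so the two values are not directly comparable with each other, only across models.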

Figure 3. Random Forest Regression algorithm testing
The prediction visualization results from testing the Random Forest Regression algorithm on the gold futures price dataset are:

Figure 2. Prediction visualization with the Support Vector Regression algorithm

Figure 4. Prediction visualization with the Random Forest Regression algorithm