
Day-to-Day Stock Market Prediction Using Machine Learning


Machine learning plays a key role in a wide range of critical applications. Linear Regression (LR) and Support Vector Machines (SVMs) are both widely used to predict the stock market, and SVMs are often preferred for their classification accuracy. Stock market movement has always been hard for investors to read because so many factors influence it. Predicting the market's direction can serve as an early recommendation system for short-term investors and as an early financial distress warning system for long-term shareholders. Forecasting accuracy is the most important factor in selecting a forecasting method. Selecting stocks that are suitable for investment is a very difficult task, and every investor's main goal is to earn the maximum profit on their investments. We use the Support Vector Machine (SVM) algorithm to predict the day-to-day direction (up or down) of individual stocks.
Why: Problem statement
Recently, a lot of interesting work has been done on applying machine learning algorithms to analyze price patterns and predict stock prices. Most stock traders nowadays depend on intelligent trading systems that help them predict prices under various situations and conditions. Recent research uses input data from various sources and in multiple forms. Some systems use historical stock data, some use financial news articles, some use expert reviews, and some are hybrids that take multiple inputs to predict the market. A wide range of machine learning algorithms is also available for designing such a system, and these systems take different approaches to the problem. Some perform mathematical analysis on historic data, while others perform sentiment analysis on financial news articles and expert reviews. However, because of the volatility of the stock market, no system predicts it perfectly.
How: Solution description
Investors are familiar with the saying "buy low, sell high", but this does not provide enough context to make proper investment decisions. Before investing in any stock, an investor needs to be aware of how the stock market behaves. Investing in a good stock at a bad time can have disastrous results, while investing in a mediocre stock at the right time can bring profits. Today's investors face this problem because they do not properly understand which stocks to buy or sell in order to get optimum profits. Predicting the long-term value of a stock is relatively easier than predicting it on a day-to-day basis, because stocks fluctuate rapidly every hour in response to world events.
A support vector machine (SVM) is a supervised machine learning model that solves two-group classification problems. After being given a set of labelled training data for each category, an SVM model can categorize new examples.
Compared to newer algorithms like neural networks, SVMs have two main advantages: higher speed and better performance with a limited number of samples (in the thousands). This makes the algorithm well suited to problems such as text classification, where it is common to have access to at most a couple of thousand tagged samples. The two years of daily stock data used here are similarly limited, at roughly 500 trading days per stock.
How is it different from the competition
Our solution uses a different algorithm and different techniques to perform the prediction: Support Vector Machines with C-type classification and a Radial Basis Function (RBF) kernel, alongside a Decision Tree. The project is useful for both long-term and short-term shareholders.
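As a rough illustration of what the RBF kernel computes, the sketch below evaluates K(x, y) = exp(-gamma * ||x - y||^2) in base R. The helper name `rbf_kernel` and the sample vectors are hypothetical and not part of the project code; `gamma` is assumed to play the same role as the `gamma` argument passed to `svm()` in the script.

```r
# Hypothetical helper (not from the project code):
# RBF (Gaussian) kernel between two numeric vectors of equal length.
rbf_kernel <- function(x, y, gamma) {
  stopifnot(length(x) == length(y))
  exp(-gamma * sum((x - y)^2))
}

a <- c(1, 2, 3)   # made-up feature vectors
b <- c(1, 2, 5)

k_same <- rbf_kernel(a, a, gamma = 0.2)  # identical points: similarity is 1
k_diff <- rbf_kernel(a, b, gamma = 0.2)  # squared distance 4: exp(-0.2 * 4)
```

A larger `gamma` makes the similarity fall off faster with distance, so the decision boundary follows the training points more closely.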
SVMs and Decision Trees can outperform neural networks when training samples are limited. Moreover, using an SVM removes the burden of matching the present price pattern against historic patterns, and an SVM trains faster than a neural network (NN) at a lower computational cost. Other solutions also use the financial data as-is, without any indicators, whereas our solution uses many technical indicators, such as EMA, RSI, MACD, SMI, CCI, ROC, CMO, WPR and ADX, to get better results.
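To make the idea of an indicator concrete, here is a minimal base-R sketch of an exponential moving average (EMA) and the "price minus EMA" feature. The helper `ema`, its seeding with the simple mean of the first `n` values, and the price series are assumptions made for illustration; the project itself relies on the indicator functions loaded with quantmod, whose details may differ.

```r
# Hypothetical helper: n-period EMA with smoothing factor 2 / (n + 1),
# seeded with the simple mean of the first n values (an assumed convention).
ema <- function(x, n) {
  stopifnot(length(x) >= n)
  alpha <- 2 / (n + 1)
  out <- rep(NA_real_, length(x))
  out[n] <- mean(x[1:n])
  if (length(x) > n) {
    for (t in (n + 1):length(x)) {
      out[t] <- alpha * x[t] + (1 - alpha) * out[t - 1]
    }
  }
  out
}

open_price <- c(10, 11, 12, 11, 13, 14, 13)  # made-up open prices
ema3 <- ema(open_price, n = 3)
ema_cross <- open_price - ema3               # positive when price is above its EMA
```

The first `n - 1` values are `NA` because the average is not yet defined there, which is why the classifier has to skip the earliest rows of the data set.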
We will implement the system using a machine learning technique. We will train both systems (SVM and Decision Tree) on the first 75% of two years of historic data, then test them on the remaining 25% to check which system yields better output.
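The 75/25 split described above can be sketched as follows. Because the data is a time series, the split is chronological rather than random, so the model is always tested on days that come after its training period. The function name `split_chronological` and the toy data frame are hypothetical.

```r
# Hypothetical helper: split a data frame ordered by time into an
# earlier training part and a later test part.
split_chronological <- function(df, train_frac = 0.75) {
  stopifnot(train_frac > 0, train_frac < 1, nrow(df) >= 2)
  n_train <- floor(nrow(df) * train_frac)
  list(
    train = df[seq_len(n_train), , drop = FALSE],
    test  = df[seq(n_train + 1, nrow(df)), , drop = FALSE]
  )
}

# Made-up data: ten consecutive days with one feature
toy <- data.frame(day = 1:10, feature = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3))
parts <- split_chronological(toy)
```

With ten rows, `floor(10 * 0.75)` puts seven rows in the training set and the last three in the test set.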
Who are your customers
Businesspeople
Shareholders
Entrepreneurs
Project Phases and Schedule
Planning, Analysis, Requirements, Design, Software development, Implementation, Testing, Maintenance.
Resources Required
Hardware specification :
- Processor - Minimum 2GHz
- Micro SD card
- Hard Disk - Minimum 80GB
- RAM - 4 GB
Software specification :
- Environment - RStudio 2010
- OS - Windows 7 and above
- Backend - R
# File name: Svm.r
# Language: R

library(quantmod)
library(lubridate)
library(e1071)
library(rpart)
library(rpart.plot)
library(ROCR)

options(warn = -1)  # Suppress warnings

symbols <- c('AAPL', 'FB', 'GE', 'GOOG', 'GM', 'IBM', 'MSFT')
separator <- '-------------------------------------------------------------------------'
trainPerc <- 0.75

for (SYM in symbols) {
  print(separator)
  print(paste('Predicting the output for', SYM, sep = ' '))

  # Use the two years of daily data ending yesterday
  endDate <- as.Date(Sys.Date() - 1)  # or a fixed date, e.g. as.Date("2016-01-01")
  d <- as.POSIXlt(endDate)
  d$year <- d$year - 2
  startDate <- as.Date(d)

  STOCK <- getSymbols(SYM, env = NULL, src = "yahoo", from = startDate, to = endDate)

  RSI3 <- RSI(Op(STOCK), n = 3)    # 3-period relative strength index (RSI) off the open price
  EMA5 <- EMA(Op(STOCK), n = 5)    # 5-period exponential moving average (EMA)
  EMAcross <- Op(STOCK) - EMA5     # Difference between the open price and the 5-period EMA
  macd <- MACD(Op(STOCK), nFast = 12, nSlow = 26, nSig = 9)  # MACD with standard parameters
  MACDsignal <- macd[, 2]          # Keep just the signal line as the indicator
  smi <- SMI(Op(STOCK), n = 13, nSlow = 25, nFast = 2, nSig = 9)  # Stochastic oscillator with standard parameters
  stochastic <- smi[, 1]           # Keep just the oscillator as the indicator
  wpr <- WPR(Cl(STOCK), n = 14)[, 1]
  adx <- ADX(HLC(STOCK), n = 14)[, "ADX"]  # ADX expects high/low/close; keep the ADX column itself
  cci <- CCI(Cl(STOCK), n = 14)[, 1]
  cmo <- CMO(Cl(STOCK), n = 14)[, 1]
  roc <- ROC(Cl(STOCK), n = 2)[, 1]
  # dpo <- DPO(Cl(STOCK), n = 10)[, 1]

  PriceChange <- Cl(STOCK) - Op(STOCK)            # Difference between the close price and the open price
  Class <- ifelse(PriceChange > 0, 'UP', 'DOWN')  # Binary class variable that we are trying to predict

  # Create the data set and name the columns
  DataSet <- data.frame(Class, RSI3, EMAcross, MACDsignal, stochastic, wpr, adx, cci, cmo, roc)
  colnames(DataSet) <- c("Class", "RSI3", "EMAcross", "MACDsignal", "Stochastic",
                         "WPR", "ADX", "CCI", "CMO", "ROC")
  # svm() needs a factor response for classification; fixing the level order
  # also fixes the row and column order of the confusion matrix below
  DataSet$Class <- factor(DataSet$Class, levels = c('DOWN', 'UP'))
  # DataSet <- DataSet[-c(1:33), ]  # Alternative to na.omit: drop the rows where the indicators are still warming up

  nTrain <- floor(nrow(DataSet) * trainPerc)
  TrainingSet <- DataSet[1:nTrain, ]               # Use the first 75% of the data to train the model
  TestSet <- DataSet[(nTrain + 1):nrow(DataSet), ] # Leave out the last 25% to test the strategy

  # MACDsignal and Stochastic are kept in the data set but are not used as predictors here
  SVM <- svm(
    Class ~ RSI3 + EMAcross + WPR + ADX + CMO + CCI + ROC,
    data = TrainingSet,
    kernel = "radial",
    type = "C-classification",
    na.action = na.omit,
    cost = 1,
    gamma = 1 / 5
  )
  print(SVM)

  predicted <- predict(SVM, TestSet)
  confmat <- table(predicted, TestSet$Class, dnn = list('predicted', 'actual'))
  print(confmat)

  # Accuracy (%) = correct predictions (diagonal) / all predictions
  acc <- sum(diag(confmat)) * 100 / sum(confmat)
  # Optionally report only when acc > 60
  xy <- paste('SVM : Considering the output for', SYM, sep = ' ')
  yz <- paste('Accuracy =', acc, sep = ' ')
  out <- paste(xy, yz, sep = '\n')
  print(out)
  write(out, file = "out", append = TRUE, sep = "\n\n")

  # Encode UP as 1 and DOWN as 0 for ROCR
  predds <- data.frame(
    pred = ifelse(predicted == 'UP', 1, 0),
    truth = ifelse(TestSet$Class == 'UP', 1, 0)
  )
  pred <- prediction(predds$pred, predds$truth)
  perf <- performance(pred, measure = "tpr", x.measure = "fpr")
  auc.perf <- performance(pred, measure = 'auc')
  rmse.perf <- performance(pred, measure = 'rmse')
  print(paste('RMSE =', rmse.perf@y.values, sep = ' '))
  print(paste('AUC =', auc.perf@y.values, sep = ' '))

  # ROC curve with the diagonal of a random classifier for reference
  plot(perf, col = 1:10)
  abline(a = 0, b = 1, col = "red")
  print(separator)
}
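The accuracy figure reported by the script is the share of test days on which the predicted class equals the actual class. The base-R sketch below shows that calculation on made-up labels; the vectors `actual` and `predicted` are invented for illustration and are not results from the model.

```r
# Made-up labels for six test days (not real model output)
lv <- c("DOWN", "UP")
actual    <- factor(c("UP", "DOWN", "UP",   "UP", "DOWN", "DOWN"), levels = lv)
predicted <- factor(c("UP", "DOWN", "DOWN", "UP", "UP",   "DOWN"), levels = lv)

# Rows are predictions, columns are the true classes
confmat <- table(predicted = predicted, actual = actual)

# Correct predictions sit on the diagonal of the confusion matrix
accuracy <- 100 * sum(diag(confmat)) / sum(confmat)
```

Giving both factors the same levels keeps the table square even if the model never predicts one of the classes, so the diagonal always lines up with the correct predictions.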