Stock Prediction using Machine Learning and Python | Machine Learning Training | Edureka


SUBTITLE INFO:

Language: English

Type: Robot

Number of phrases: 630

Number of words: 4392

Number of symbols: 19277


SUBTITLES:

Subtitles generated by robot
00:00
[Music] The art of forecasting stock prices has been a difficult task for many researchers and analysts. In fact, investors are highly interested in the research area of stock price prediction. For a good and successful investment, many investors are keen on knowing the future situation of the stock market. In such a scenario, an effective prediction system for stock markets helps traders, investors, and
00:35
analysts by providing supportive information, like the future value of certain stocks. Hi all, I welcome you to this stock prediction session using machine learning. In this work, I present a recurrent neural network and Long Short-Term Memory, or LSTM, approach to predict stock market indices. So without much ado, let's get started. On our agenda today, I'm gonna start out with a little introduction on this piece, then I'm
01:06
gonna explain to you what an LSTM is, and then we are going to move straight on to a model before concluding this session. Also, don't forget to subscribe to us and hit that bell icon to never miss an update from the Edureka YouTube channel, and if you want to know more on machine learning, data science, or any other related field, do not forget to check out our certification trainings, the link to which I will leave in the description box below. So let's get straight to the introduction, shall we? Now, there's a load of complicated financial indicators, and
01:36
also, the fluctuation of the stock market is very, very violent. However, as technology is getting more advanced, the opportunity to gain a steady fortune from the stock market has increased, and it also helps experts to find the most informative indicators to make a better prediction. For those of you who do not understand stocks: stocks are basically an equity investment that represents part ownership in a corporation or company. It entitles you to a part of that company's earnings and
02:07
assets. Now, the prediction of the market value is of great importance in helping maximize the profit of your stock purchase while keeping the risk low, and this is important because you need to invest your money in a stock which is going to increase in value over time and not decrease. RNNs, or recurrent neural networks, have proven to be one of the most powerful models for processing sequential data, and Long Short-Term Memory is among the most
02:39
successful RNN architectures. The LSTM introduces the memory cell, a unit of computation that replaces the traditional artificial neurons in the hidden layer of the network. With these memory cells, networks are able to effectively associate memories with inputs remote in time, hence their suitability to grasp the structure of the data dynamically over time, with high prediction capacity. Now, this is what the LSTM architecture looks like. We have the
03:10
forget gate. Now, for the sake of illustration, let's assume that we are reading words in a piece of text and want to use an LSTM to keep track of grammatical structures, such as whether the subject is singular or plural. If the subject changes from a singular word to a plural word, we need to find a way to get rid of our previously stored memory value of the singular or plural state. In an LSTM, the forget gate lets us do this. Here, W_f are weights that govern the forget
03:40
gate's behavior. We concatenate the values in the square brackets and multiply them with the weights; the equation results in a vector called Γ_f, with values between 0 and 1. This forget gate vector will be multiplied element-wise by the previous cell state. So if one of the values of Γ_f is 0, or close to zero, it means that the LSTM should remove that piece of information in the corresponding component; if one of the values is 1, then it will keep the information.
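(The forget-gate formula shown on screen isn't captured in the transcript; in the standard notation this walkthrough appears to follow, it would be:)

\[ \Gamma_f^{\langle t \rangle} = \sigma\left(W_f\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f\right) \]

where a⟨t−1⟩ is the previous hidden state, x⟨t⟩ the current input, and the sigmoid σ squashes every component into the range (0, 1).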
04:11
Next, kindly pay attention to the update gate. Once we forget that the subject being discussed is singular, we need to find a way to update the data to reflect that the new subject is plural. Now, there is a set formula for the update gate, similar to the forget gate; here, Γ_u is again a vector of values between 0 and 1, and this will be multiplied element-wise with the candidate value c̃⟨t⟩. In order to compute the value to update the new subject, we need to create a new
04:43
vector of numbers that we can add to our previous cell state, which gives us our final new state. And finally, I have to discuss the output gate. Now, to decide which outputs we are going to use, we will be using two formulae, which are given on your screens right now: the first equation is where you decide what to output, using a sigmoid function, and in the second equation you multiply that by the tanh of the cell state.
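(As with the forget gate, the on-screen formulae aren't in the transcript; in the same standard notation, the update gate, candidate value, new cell state, and output gate would be:)

\[ \Gamma_u^{\langle t \rangle} = \sigma\left(W_u\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u\right), \qquad \tilde{c}^{\langle t \rangle} = \tanh\left(W_c\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c\right) \]
\[ c^{\langle t \rangle} = \Gamma_f^{\langle t \rangle} \odot c^{\langle t-1 \rangle} + \Gamma_u^{\langle t \rangle} \odot \tilde{c}^{\langle t \rangle} \]
\[ \Gamma_o^{\langle t \rangle} = \sigma\left(W_o\,[a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o\right), \qquad a^{\langle t \rangle} = \Gamma_o^{\langle t \rangle} \odot \tanh c^{\langle t \rangle} \]

Here ⊙ denotes element-wise multiplication.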
05:15
Now, I won't go much into it, but let me just conclude this section by saying that LSTMs are basically explicitly designed to avoid long-term dependency problems; remembering information for longer periods of time is practically their default behavior, it is not something they struggle to learn, unlike the more rudimentary model of an RNN. So let's get straight into the model. Now, the model I will be using today is based on Google stock prediction. This is an RNN model that I saw in a couple of
05:46
places floating around, so I'm not sure who originally did it, but it's a great way to start understanding how you can use RNNs and LSTMs to do stock prediction. Okay, so how this is gonna go is: I'm gonna discuss the methodology for a little bit, and then I'm gonna run it on a Jupyter notebook and show you. Alright, so we'll go through this step by step. The first step is the raw data; obviously, in this
06:17
stage, the historical stock data is collected from the Google stock price, and this historical data is used for the prediction of the future stock prices. I hope this much is clear. Now, don't worry: the data set that I will be using, the training and the testing data, as well as the code that you will be seeing right now, I'll leave the links to all of these things in the description box below, so you can follow along if you like, or also run them later to see if you can
06:48
do it yourself. So this is the model. Again, the first thing we're gonna do is import the libraries, so I'll take a little bit of time discussing all the libraries. We are using numpy so that we can apply mathematical functions and operations to our arrays, multi-dimensional arrays; then we are using matplotlib for visualization purposes; we are using pandas, which is basically a data analysis and manipulation tool, and what's cool about pandas is that it
07:19
takes data like a CSV file, an Excel file, or an SQL database and creates a Python object with rows and columns called a DataFrame, which looks very similar to a table in statistical software. And then, finally, we are also going to be importing the datetime library to work with dates as date objects. So, quickly gonna run that.
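(The notebook cell itself isn't reproduced in the transcript; a minimal sketch of the imports just described would be:)

```python
# Minimal sketch of the imports described above.
import numpy as np                # mathematical functions and operations on multi-dimensional arrays
import matplotlib.pyplot as plt   # visualization
import pandas as pd               # data analysis and manipulation (DataFrames)
import datetime as dt             # working with dates as date objects
```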
07:49
Next, what I'm going to do is read the data set; as our step mentioned, here's the raw data set, and I'm also going to go ahead and display the head, the head meaning the top five rows present in my data set. As you can see, we have Open, High, Low, Close, and Volume, along with Date, which are the six columns present in any stock-history type of data set.
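(A sketch of the read-and-inspect step; the file name is an assumption, since this dataset usually circulates as Google_Stock_Price_Train.csv:)

```python
# Read the raw training data and peek at the top five rows.
dataset_train = pd.read_csv("Google_Stock_Price_Train.csv")
print(dataset_train.head())   # Date, Open, High, Low, Close, Volume
```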
08:22
If you go through them: the close, or the listed closing price, is the last price that anybody paid for a share of that stock during the business hours of the exchange where the stock trades, while the open is the price from the first transaction of a business day. Apart from that, there's also high and low: high basically means the highest price in a given period of time, and low means the lowest price in a given period of time. And finally, you might see the last column, which is volume; it's nothing but the number of shares or contracts traded in a security. In the context of a single
08:52
stock trading on a stock exchange, the volume is commonly reported as the number of shares that changed hands during a given day; transactions are measured in stocks, bonds, options contracts, futures contracts, and commodities. Alright, now, since I have used head here, you can see the top five rows; if I had used tail and run it, you could see the bottom five. So now you can see our data starts from the 3rd of January 2012 and goes up to the end
09:24
of December 2016. Okay, so next, what I'm going to do is check if any of my data is not applicable. This right here, this isna function, is used to detect missing values; it returns a boolean, same-sized object indicating whether the values are not applicable. Not-applicable values, such as None or numpy.NaN, get mapped to True, and everything else gets mapped to False.
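(A sketch of that check:)

```python
# isna() marks None / numpy.nan as True, everything else as False;
# any() then reports, per column, whether any value is missing.
print(dataset_train.isna().any())
```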
09:56
Now you can see all of our columns here show False, which basically means we have no not-applicable values. Then we are printing out the basic info of our data set; that's what this function does. We can see all our 5 columns, all their non-null values, and their data types, along with memory usage. Finally, we are plotting the growth of the stock price from 2012 to 2017, which comes up like this. As you can see,
10:30
it has risen quite a lot over five years, and why wouldn't it? We are talking about Google, and if I'm not wrong, Google's parent company saw its stock price rise by almost 85 percent between 2014 and 2017, going from about 820 dollars to 1,519 dollars in three years. The rise was primarily driven by a significant increase in total revenue and a slight decrease in the shares
11:01
outstanding. Now, I'd like you to go back and check the data types: as you can see, three of the columns have float as the data type while two have object, so to homogenize it, what we are gonna do is convert the column types of the data frame. All right.
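(A hedged sketch of the conversion; in this dataset the object columns are usually Close and Volume, loaded as comma-formatted strings, so those names are assumptions:)

```python
# Homogenize the dtypes: strip thousands separators, then cast to float.
for col in ["Close", "Volume"]:
    dataset_train[col] = dataset_train[col].astype(str).str.replace(",", "").astype(float)
```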
11:34
Now, the next thing we are interested in is: what is the seven-day rolling mean of our stock price? This means, in this simple example, that for every single stock prediction we look seven days back, collect all the transactions that fall in this range, and get the average of our column. Luckily, this is extremely easy to achieve with pandas, so here we go. As you can see, the first six rows do not have any output; it is going to start from the seventh row, as we have taken the rolling mean of seven days.
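(A sketch of that rolling mean:)

```python
# Seven-day rolling mean of the close price. The first six rows have no
# complete 7-day window behind them, so they come out as NaN.
rolling_7 = dataset_train["Close"].rolling(window=7).mean()
print(rolling_7.head(20))   # first 20 rows, as discussed below
```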
12:04
All right, and as I had mentioned earlier, the head function here is going to give us the first 20 rows. Here we are basically taking a bracket of seven days and getting the mean of the previous seven days, iterated every day forward after the seventh day. And this is how it compares with the previous graph that we had gotten: we have the rolling mean in orange, and we have the previous plot in the same blue. This right here basically gives you the moving average of the past 30 days.
12:40
Next, I'll plot the close column versus the 30-day moving average of the close column. So when we look at this plot, the blue line here is the close price column and the orange line here is the 30-day rolling mean of the close price column. Now, you also have the option of going ahead and specifying a minimum number of periods for this; so if you just keep the minimum period at one, this is what your graph is going
13:10
to look like. Here we are basically trying to say that the minimum number of observations per window, which is of 30 days, should be one, and this is why our graph looks like this. And with that, we are creating our first data frame, which is of the training set, and that's it: reading the contents of the data set using pandas, taking into account everything from the start of this time series up to the rolling point of the value.
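(A sketch of the min_periods variant just described, with the column choice assumed:)

```python
# 30-day moving average; min_periods=1 makes the window emit a value
# even before 30 observations have accumulated, removing the initial gap.
ma_30 = dataset_train["Close"].rolling(window=30, min_periods=1).mean()
dataset_train["Close"].plot(label="Close")
ma_30.plot(label="30-day rolling mean")
plt.legend()
plt.show()
```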
13:43
Now we are going to do data pre-processing; we are on stage two. The pre-processing stage involves data discretization, where we reduce a part of our data, with particular importance for the numerical data. Next, we are going to do a data transformation, which is basically nothing but data normalization; then we will clean the data and fill in the missing values; and we are going to integrate our data files. After the data set is transformed into a clean data set, it will be divided into training and testing sets
14:13
so as to evaluate the model. So in the end, what we are trying to do here is create a data structure with 60 timestamps and one output in total. We're gonna start out by cleaning our data, doing the same thing we had done before, checking if there are any not-applicable values, and then move on to feature scaling, for which we are going to import MinMaxScaler from scikit-learn, which is nothing but a machine learning library for Python.
14:43
We'll be using the MinMaxScaler to transform features by scaling each of them to a set range, and here our feature range is 0 to 1, for obvious reasons. Then we're going to go ahead and run it.
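(A sketch of the scaling step; training on the Open column is an assumption, taken from the classic version of this notebook:)

```python
from sklearn.preprocessing import MinMaxScaler

# Column 1 is Open if Date is column 0; kept as a 2-D array for the scaler.
training_set = dataset_train.iloc[:, 1:2].values
sc = MinMaxScaler(feature_range=(0, 1))            # scale each feature into [0, 1]
training_set_scaled = sc.fit_transform(training_set)
```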
15:17
Then, finally, we are going to be creating a data structure with 60 timestamps and 1 output. So basically, what we're trying to do here is take the data from day 1 to day 60 and then make a prediction on the 61st day, and then we are going to follow it up by taking data from day number 2 to day number 61 and then predict on the 62nd day. So our i is going to go from 60 to the end of our range, which is 1258, and then we are going to append to X_train and y_train. X_train starts at i minus 60, so if i is 61, then i minus 60 is 1; it
15:49
starts from the first date and ends at 61. And then y_train is basically going to give us our prediction on the i-th day, which, if we take i as 61, is our first prediction from the first 60 days; so y_train is going to give us the prediction on the 61st day. So we will be appending to both of these and then reshape the data.
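(A sketch of that windowing loop; 1258 is the training-set length quoted above:)

```python
# 60 past timesteps as input, the value on day i as the target.
X_train, y_train = [], []
for i in range(60, 1258):
    X_train.append(training_set_scaled[i - 60:i, 0])   # days i-60 .. i-1
    y_train.append(training_set_scaled[i, 0])          # day i, the prediction target
X_train, y_train = np.array(X_train), np.array(y_train)

# Reshape to (samples, timesteps, features), the shape Keras LSTM layers expect.
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
```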
16:20
Next is stage 3, which is feature extraction. Now, in this layer, what we're going to do is choose only the features which are to be fed to the neural network; we'll be choosing features from date, high, low, close, and volume. So to start off with that, we're gonna start by importing the Keras libraries and packages; this is going to be the first step towards building your RNN. Keras is basically TensorFlow's high-level API for building and training deep learning models, and we import four
16:49
things. First is Sequential, which is basically a linear stack of layers; you can create a sequential model by passing a list of layers through it. Then we are going to import something known as Dense. Now, the Dense layer is the regular, deeply connected neural network layer; it is the most commonly and frequently used layer, and it's used to change the dimensions of your output, where the dense layer basically represents a matrix-vector multiplication. So the
17:22
values in the matrices, which are trainable parameters, get updated during backpropagation; so if you have an m-dimensional vector, a dense layer is used to change the dimensions of your vector. Next, we are going to be initializing our RNN. For a time-series problem like this, we are basically going to be using a regression model; so for our regression deep learning model, the first step is to create a sequential model, since our data is sequential data, and we are going to assign this to the model called regressor.
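(A sketch of those imports and the initialization; depending on your setup, these may live under tensorflow.keras instead:)

```python
from keras.models import Sequential        # a linear stack of layers
from keras.layers import Dense, LSTM, Dropout

regressor = Sequential()                    # our regression model
```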
17:54
With that, we are moving on to stage four, which is the most important stage here, and that is training the neural network. In this stage, your data is going to be fed to the neural network and trained for prediction; you're going to assign random biases and weights to your model. Now, this LSTM model is composed of a sequential input layer followed by three LSTM layers and a dense layer with activation, and then finally a dense
18:24
output layer with the linear activation function. So here you can see the first input layer: you have your units, return_sequences, the input shape, and then the Dropout. Dropout, by the way, is basically a regularization technique for reducing overfitting in neural networks; it drops out units in a neural network. Then you have your three layers, yes, and then finally you
18:54
have your output layer, and since you need only one output, units equals one. Next, what you're gonna do is compile your RNN. Here you're going to be using something known as an optimizer, and let me just begin by saying that an optimizer is one of the two arguments that are required for compiling a Keras model. The type of optimizer used can greatly affect how fast the algorithm converges to the minimum value; also, it is important that there is some notion of randomness, to avoid getting stuck in a local minimum and not reaching the global minimum.
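(A sketch of the stack just described; 50 units per LSTM layer and a dropout rate of 0.2 are assumptions, as is the mean-squared-error loss, the usual choice for this regression setup:)

```python
# Three LSTM layers with dropout after each, then a one-unit dense output.
regressor.add(LSTM(units=50, return_sequences=True,
                   input_shape=(X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50))               # last LSTM layer returns no sequences
regressor.add(Dropout(0.2))
regressor.add(Dense(units=1))               # one output unit

regressor.compile(optimizer="adam", loss="mean_squared_error")
```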
19:27
Now, there are a few great algorithms, but I have chosen to use the Adam optimizer, which combines the perks of the AdaGrad and the RMSProp optimizers. The AdaGrad optimizer essentially uses a different learning rate for every parameter at every step; the reasoning behind it is that the parameters that are infrequent
19:57
must have larger learning rates, while parameters that are frequent must have smaller learning rates. Makes sense, right? In other words, the learning rate is calculated based on the past gradients that have been computed for each parameter. Now, RMSProp considers fixing the diminishing learning rate by only using a certain number of previous gradients. Now that we understand how both of these optimizers work, we can look at how Adam works. The adaptive
20:28
moment estimation, or Adam, is another method that computes adaptive learning rates for each parameter by considering an exponentially decaying average of the past squared gradients and an exponentially decaying average of past gradients. Now, what you see on the screen is how they are represented: the m and the v can be considered as the estimates of the first and second moments of the gradients, respectively, hence the name adaptive moment estimation. When this was first used, researchers observed
21:00
that there was an inherent bias towards zero, and they countered this by using the two bias-corrected estimates you see on your right, which leads us to the final gradient update rule. This is the optimizer that I'm using, and the benefits are quite a few: because of this, the learning rate is different for every parameter and every iteration, and it does not diminish as with AdaGrad, and the gradient update uses moments of the distribution of weights, allowing for a more statistically sound descent.
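(The representations shown on screen aren't captured in the transcript; in the standard notation of the Adam paper, the decaying averages, their bias-corrected estimates, and the final update rule read:)

\[ m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2 \]
\[ \hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_{t+1} = \theta_t - \frac{\eta\,\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon} \]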
21:32
Now, another important aspect of training the model is making sure the weights do not get too large and start focusing on one data point, hence overfitting. So we should always include a penalty for larger weights, where, of course, the definition of "large" would depend upon the type of regularizer used. I have chosen to use Tikhonov regularization, which can be thought of as a minimization problem; the fact that the function space is in a reproducing kernel Hilbert space ensures that the notion of a norm
22:03
exists. This allows us to encode the notion of the norm into our regularizer. And finally, we can talk a little bit about dropouts. Now, a newer method of preventing overfitting considers what happens when some of the neurons are suddenly not working: this forces the model not to be over-dependent on any group of neurons and to consider all of them. Dropouts have found their use in making the neurons more robust, and hence allowing them to predict the trend without focusing on
22:34
any one neuron. Here is the result: classification error percentages with and without dropout; as you can see, without dropout they are way higher than with dropout. So, upon fitting the RNN to the training set, we have our epochs equal to 100 and batch size equal to 32. An epoch is nothing but a frame of time in machine learning: it indicates the number of passes through the entire training data set the machine learning algorithm has completed. And here we also have
23:11
something known as a batch size, which refers to the number of training examples utilized in a single iteration; usually this is a number that divides the total data set size. In general, 32 is a good starting point for a batch size; you could also go for 64 or 128, you get it, the multiples of 32, but 32 is a good place to start experimenting with, let's put it that way.
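(The fitting call itself:)

```python
# 100 passes over the training set, 32 samples per gradient update.
regressor.fit(X_train, y_train, epochs=100, batch_size=32)
```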
23:41
So in this layer, you can see the output value has been generated by the output layer of the RNN and is compared with the target value. The error, or the difference between the target and the obtained output value, is minimized by using the backpropagation algorithm, which adjusts the weights and the biases of the network. We follow the same steps for the test data pre-processing. Finally, we are going to be making the predictions and visualizing the results. We are getting the real stock price of 2017, for which
24:14
we are doing exactly what we did for the data pre-processing part. Here we are using iloc, or "i-loc", depending on what you guys call it, to select rows and columns by number, in the order that they appear in the data frame. You can imagine that each row has a row number from 0 to the total number of rows, and iloc allows selections based on these numbers.
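(A sketch of that selection; the file name and column position are assumptions:)

```python
# iloc selects by integer position: all rows, second column (Open).
dataset_test = pd.read_csv("Google_Stock_Price_Test.csv")
real_stock_price = dataset_test.iloc[:, 1:2].values
```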
24:48
So now, basically, we have our test data; the head of the data, or the first 5 rows, are from the 3rd to the 9th of January 2017. We are again doing the same thing that we did previously: we're checking out the data set info; we have all the columns, we have the datetime index of all the entries, we have the memory usage, and we also have the data types. Here, again, volume comes up as object instead of float, so we are again
25:19
converting it to float. Here, again, we are reading the test set, putting it in a data frame, and getting out the info. And finally, to get the predicted stock price of 2017, we're gonna merge the training set and the test set on the zeroth axis, as you can see right here; we're gonna set the time step as 60, as we had done previously, and reshape the data.
25:50
Our X_test we are going to pass to the model to get the predicted stock price, and then take an inverse transform of it to get it out of the scaled form. Here you can see the information on the predicted stock price data frame, and finally we are going to use matplotlib to visualize the result: the predicted stock price and the real stock price plot.
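(A sketch of those final steps, with variable names assumed:)

```python
# Merge train and test on axis 0, rebuild 60-step windows over the test
# horizon, predict, and undo the scaling.
dataset_total = pd.concat((dataset_train["Open"], dataset_test["Open"]), axis=0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = sc.transform(inputs.reshape(-1, 1))      # scale with the scaler fit on training data

X_test = []
for i in range(60, 60 + len(dataset_test)):
    X_test.append(inputs[i - 60:i, 0])
X_test = np.reshape(np.array(X_test), (len(X_test), 60, 1))

predicted_stock_price = sc.inverse_transform(regressor.predict(X_test))

plt.plot(real_stock_price, color="red", label="Real Google stock price")
plt.plot(predicted_stock_price, color="blue", label="Predicted Google stock price")
plt.legend()
plt.show()
```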
26:23
You can see that the real stock price went up, while our model also predicted that the price of the stock would go up. This clearly shows how powerful LSTMs are for analyzing time series and sequential data, and the analysis here has been implemented with relative ease thanks to Keras and its API. Now, in conclusion, I'd like to say that the popularity of stock market trading is growing extremely rapidly, which is encouraging researchers to find new methods for prediction using new techniques. The forecasting technique is not only helpful to researchers but also
26:53
helps investors or any person dealing with the stock market. In order to help predict the stock indices, a forecasting model with good accuracy is required. In this work, we have used one of the most precise forecasting technologies, using RNNs with LSTM units, which helps investors, analysts, or any person interested in investing in the stock market by providing them a good knowledge of the future situation of the stock market. In such little time, at least we were able to get the trend
27:23
right. Right, so that's it for today, guys. I'll be really happy to hear any questions or feedback in the comment section below; I'd leave you with that thought. Thank you, and have a great day. I hope you have enjoyed listening to this video; please be kind enough to like it, and you can comment any of your doubts and queries, and we will reply to them at the earliest. Do look out for more videos in our playlist, and subscribe to the Edureka channel to learn more. Happy learning!
