Are some companies easier to predict than others?
I am beginning some research into stock market prediction. This will serve two primary goals.
First, I want Encog to really "flex its muscles" by training on some large input data. Work on Encog 2.1 will begin soon, and I hope to increase training efficiency and to introduce concurrency that should really speed up Encog on dual, quad and dual-quad machines. Grid is a goal as well.
Second, I want to create quite a few examples of financial forecasting using Encog 2.x. This article series is the beginning of that.
I decided to start simple. I am using an input window of 10 days of trading to predict the next day (day 11). I trained the neural network with 10 years' worth of market data, using only the stock price. In the future I want to experiment with sectors, where other stocks in the same sector may be used as indicators for a given stock in that sector. More on that later.
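The windowing described above can be sketched as follows. This is a minimal illustration in plain Python, not Encog code; the function name `make_windows` and the sample prices are made up for the example, and real data would normally be normalized before training.

```python
def make_windows(prices, window=10):
    """Split a price series into (input window, next-day target) pairs."""
    inputs, targets = [], []
    # Each window of `window` consecutive prices is paired with the price
    # on the following day (day 11 when window is 10).
    for start in range(len(prices) - window):
        inputs.append(prices[start:start + window])
        targets.append(prices[start + window])
    return inputs, targets


# 15 made-up closing prices, for illustration only.
prices = [float(p) for p in range(100, 115)]
inputs, targets = make_windows(prices)
print(len(inputs), inputs[0], targets[0])
```

With 15 prices and a 10-day window this yields 5 training pairs; ten years of daily data would yield a few thousand pairs per stock.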
For this experiment I wanted to determine two things. First, what is the optimal hidden layer structure? Encog 2.0 has a brute-force method for this. You give the pruning algorithm the minimum and maximum number of hidden layers and neuron counts, and it simply trains every combination and sees which one has the best error drop-off. Using 500 iterations of RPROP training over the ten years of data, I determined that 29 neurons in the first hidden layer and 13 in the second worked well. This took nearly 2 days of processing on a dual-core machine. Even in Encog 2.0, the pruning algorithm is fully threaded and takes advantage of multicore.
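The brute-force idea can be sketched roughly as below. This is not Encog's pruning code: `search_hidden_layers` is a hypothetical helper, and `toy_score` is a stand-in for "train a network with this structure and return its error". The sketch also simplifies by keeping the candidate with the lowest returned error, where the post describes comparing error drop-off.

```python
from itertools import product


def search_hidden_layers(h1_range, h2_range, train_and_score):
    """Try every (h1, h2) combination and keep the one with the lowest error."""
    best = None
    for h1, h2 in product(h1_range, h2_range):
        error = train_and_score(h1, h2)  # the expensive step in practice
        if best is None or error < best[0]:
            best = (error, h1, h2)
    return best


def toy_score(h1, h2):
    # Stand-in only: a real version would build and train a network with
    # h1 and h2 hidden neurons and return its training error.
    return (h1 - 3) ** 2 + (h2 - 2) ** 2


best_error, best_h1, best_h2 = search_hidden_layers(range(1, 6), range(0, 4), toy_score)
print(best_error, best_h1, best_h2)
```

Because every combination is trained independently, the loop parallelizes naturally across cores, which is why a threaded search helps so much here.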
Then, using this feedforward neural network, I wanted to see if certain members of the S&P 500 were easier to predict than others, again using ONLY the historical price of the stock as an indicator. To do this I collected ten years of data on each of the S&P stocks and trained each for 1000 iterations of RPROP. That is quite a few training sets, given that each covers 10 years of data. This took almost 4 days of training on my dual core, again using threaded code. All Encog 2.0 stuff.
You can see the results here:
http://www.heatonresearch.com/node/1020
Some stocks performed quite a bit better than others. Stocks such as CBS and UPS seemed to train much more easily than stocks on the other end of the spectrum, such as Dell. I have no explanation for this, other than that some stocks seem to exhibit more "rhythmic" price fluctuation than others.
This is just the beginning of this article series, and more will be coming as I continue the research. Part of the delay will be training times, as it can take considerable time to train on these large datasets.
Jeff




For the next experiment, I took the last 60 days of prices for each of the S&P 500 stocks and used the 1000-iteration trained neural networks to attempt to predict prices over those 60 days. Some stocks clearly performed better than others, and performance correlated with how well each company had trained. For example, CBS's direction was predicted with 66% accuracy, and CBS had trained to a 1.4% error. You can see the prediction results here:
http://www.heatonresearch.com/node/1020
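A directional-accuracy figure like the one quoted for CBS could be computed along these lines. The post does not say exactly how direction was scored, so this is only an assumed definition: a prediction counts as a hit when the predicted price and the actual price lie on the same side of the previous day's close. The function name and sample numbers are made up.

```python
def direction_accuracy(previous, predicted, actual):
    """Fraction of days where the predicted and actual moves share a direction."""
    if not previous:
        raise ValueError("need at least one day of data")
    hits = 0
    for last, pred, act in zip(previous, predicted, actual):
        # "Up" means strictly above the previous close; anything else is "not up".
        if (pred > last) == (act > last):
            hits += 1
    return hits / len(previous)


# Made-up numbers, for illustration only.
previous = [10.0, 11.0, 12.0, 11.0]
predicted = [10.5, 11.5, 11.5, 11.2]
actual = [11.0, 12.0, 11.0, 10.5]
accuracy = direction_accuracy(previous, predicted, actual)
print(accuracy)
```

Under this definition a network can have a noticeable price error and still be useful, as long as it gets the direction right more often than chance.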
The problem with stock market predictions is that there are too many variables. I don't think price and time alone will be enough for accurate predictions. There are a lot of other variables: the transports index, oil price, gold price, home prices, volume, open interest, the Commitment of Traders report, currency pairs, economic indicators like consumer spending and industrial production, and a lot more. Another issue is the timeframe interval. For example, the uptrend in the stock market began 200 years ago, so the network has to predict prices within a small interval while using a bigger timeframe as a reference. If you try to predict daily prices, you have to use monthly prices as a reference for where the trend is going. You also have to identify trading patterns on the chart. There are two of them, "trading range" and "trending", so you can use one network to identify the pattern and another to make predictions based on what the first network suggested. Very complicated stuff. Goldman Sachs uses large clusters to do all this automatically.
Hi,
I am new to this forum. I am really excited to use the Encog API, and I have started exploring it.
For the last year I have worked days and nights on building a trading model based on neural networks. I primarily test the networks on Forex. Predicting a price or a percentage accurately is a very hard task. I would say it's an impossible task. But we don't need the network to be that accurate. The output of one network should just give a clue, and this clue should be used as an input to another neural network which classifies the market as bull or bear.
Predict the future value of a leading technical indicator and provide that output as an input to another network which classifies the market as bull or bear (+1, -1). To avoid overtraining, optimize your model for a smooth equity curve. You could do this by basing the fitness criterion on drawdown percent.
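As a rough sketch of the drawdown-based fitness idea: maximum drawdown percent is the largest peak-to-trough drop in the equity curve, relative to the peak. The function name, the sample equity curve, and the choice of negated drawdown as the fitness value are all assumptions made for illustration, not something taken from a specific trading library.

```python
def max_drawdown_percent(equity):
    """Largest percentage drop from a running peak of the equity curve."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                  # highest equity seen so far
        drawdown = (peak - value) / peak * 100.0
        worst = max(worst, drawdown)
    return worst


# Made-up equity curve, for illustration only.
equity = [100.0, 110.0, 99.0, 120.0, 90.0]
drawdown = max_drawdown_percent(equity)

# One possible fitness: smaller drawdown gives higher fitness.
fitness = -drawdown
print(drawdown, fitness)
```

Here the worst drop is from 120 to 90, a 25% drawdown. A model selected on this criterion is rewarded for a smooth equity curve, not for fitting every price move.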