Machine Learning and Algorithms Supporting Trading

Since there’s no thread like this here yet, I thought I’d start one. I’ve delved into the world of machine learning and algorithms out of personal interest (and partly for my profession). In short: I’ve long been, and will remain, a proponent of technical analysis, but one can’t notice everything with the naked eye, so I see harnessing machine learning and algorithms for data acquisition and visualization as important.

For algo strategies, I mainly use the Turtle Trading strategy, the trend-following approach made famous by Richard Dennis, a superstar of his time. I supplement this with the ABCD strategy.

I don’t fully trust algorithms and still base my decisions on TA. Even so, by analyzing data from different perspectives, such as returns, volatility, sentiment, momentum, cyclicality, trend, seasonality, linear analysis, and probability calculations, I believe we can lift some of the veil of uncertainty from trading. I use Jupyter Notebook as my platform because of its graphing features and code segmentation.

Machine learning has progressed rapidly in recent years but is still in its infancy, so the claims I make in this thread should be viewed critically and the results questioned.

Data has been retrieved from Yahoo Finance at daily resolution, and the primary timeframe used is five years. This means that in certain cases the machine has not been taught what a crisis means (2008). Data can of course be retrieved from elsewhere and at finer granularity, but the problem with other sources is that the Finnish stock exchange usually isn’t covered. So far, the most pleasant API I’ve encountered for exploring, for example, US stock exchanges is Alpha Vantage.

There’s already a separate thread here for TA, which I use as a supplement to this, so I recommend posting TA-related questions in that thread.

I won’t reveal all the secrets behind the methods, but if you have any questions, wishes, or requests, I’m happy to help.

16 Likes

As a small showcase of what this thread is about for those who don’t feel like reading the introduction but would rather look at pictures, I’ll give two examples. The first example relates to portfolios. In portfolios, it’s important that investments are well-diversified and that stock-specific volatility is at a tolerable level for one’s own risk tolerance. By using return (+/-), price development, average, and volatility as aids, stocks can be compared to each other from a technical perspective.

For the example, I grabbed a handful of Finnish companies; the stock market names are in the pictures:

Apparently Orion, Kone, Nokian Renkaat, Sampo, Elisa, etc. are low-volatility stocks, while Vaisala, Valmet, Metsä Board, etc. are the higher-volatility names in the comparison group. The machine gives the above recommendations by setting returns in ratio to volatility and separating the groups with a threshold. The data covers five years.
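For anyone wanting to reproduce the idea, here’s a minimal sketch of the return-to-volatility ratioing. The two price series, the drift and volatility figures, and the seed are all made up for illustration; the real comparison used Yahoo Finance closes:

```python
import numpy as np

rng = np.random.default_rng(0)

def score(close):
    """Annualized return divided by annualized volatility (a Sharpe-like ratio)."""
    rets = np.diff(np.log(close))
    return (rets.mean() * 252) / (rets.std() * np.sqrt(252))

# Two synthetic five-year daily price series: a calm compounder and a choppy one
calm = 10 * np.exp(np.cumsum(rng.normal(0.001, 0.005, 1250)))
choppy = 10 * np.exp(np.cumsum(rng.normal(0.0002, 0.03, 1250)))

print(round(score(calm), 2), round(score(choppy), 2))
```

Ranking stocks by this score and cutting the list at some threshold gives exactly the kind of low-vol/high-vol split shown in the pictures.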

If one wasn’t shopping but wanted to do a risk assessment of their own portfolio (all stocks in the example combined), it could be visualized with the same data as follows:


The portfolio is thus fairly well diversified.

If, instead, one were shopping and hypothetically focused on companies in the “Buy” area, the risk assessment would look like this:

Later tonight, I’ll post another example of a stock-specific analysis using the methods from the thread’s intro message to demonstrate how to get a comprehensive technical overview of a stock.

18 Likes

This is an interesting set. Please also show the machine analysis for Nokia.

2 Likes

I initially thought about using Metsä Board for this example, but let’s quickly put together a rushed Nokia example here. A reminder that this does not consider social media, funds, or other external factors beyond the price at all.

I fed the last 4 years of Nokia’s stock data (<18.7.2019) into the machines. To get an idea of where Nokia stands compared to other companies, I first ran a price comparison for Nokia. In an optimal situation, this would be done against competitors, but I ran this with other random stocks whose data I had already captured. I colored Nokia red for better visibility.

Once you know the approximate position of a stock relative to others, you can start analyzing it through visible data. The simplest way to assess performance is to compare years with each other to find strong and weak periods. Below are the last 4 years in one image.

Broken down, the years look like this:


The turn of the year has often served as a turning point in price. This information is good to keep in mind, but it’s just an observation that we won’t be doing much with in the end.

What interests us is the stock’s price profile. The profile is obtained by calculating the most common price level for the stock:


Apparently, Nokia’s most common price level is around five euros. A more precise figure is obtained by calculating CloseMean, i.e., the average closing price over the four years.
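The price profile is essentially a histogram of closing prices. A sketch on synthetic data (the series hovering around five euros is fabricated to mimic the Nokia example; bin count is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic 4 years of daily closes hovering around five euros (illustrative only)
close = 5 + 0.02 * rng.normal(0, 0.4, 1000).cumsum() + rng.normal(0, 0.15, 1000)

# Price profile: a histogram of closes; the tallest bin is the most common level
counts, edges = np.histogram(close, bins=30)
mode_level = (edges[np.argmax(counts)] + edges[np.argmax(counts) + 1]) / 2

close_mean = close.mean()   # "CloseMean": the average closing price of the period
print(round(mode_level, 2), round(close_mean, 2))
```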

By adding the linear average to the chart, we notice that, on average, Nokia has been declining for the past 4 years.

Using the linear average as an aid, we can also illustrate what the stock price looks like compared to the average decline:
This calculation can also be used to create an indicator for monitoring prices if needed.
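The “linear average” is just an ordinary least-squares trend line. A sketch with a built-in downtrend (the slope and noise level are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
days = np.arange(1000)
# Synthetic declining price series (the downtrend is built in for illustration)
close = 6.0 - 0.001 * days + rng.normal(0, 0.2, 1000)

# "Linear average": an ordinary least-squares trend line through the closes
slope, intercept = np.polyfit(days, close, 1)
trend = slope * days + intercept

# Deviation of the price from the average decline -> usable as an indicator
deviation = close - trend
print(round(slope * 252, 2))   # average change per ~trading year
```

A negative slope confirms the “declining on average” observation, and `deviation` is the raw material for the price-monitoring indicator mentioned above.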

By adding MA12 to the chart, we get the following image, whose results we will use in the future:

And that future is now. Using MA12 and the closing prices, we can calculate and visualize the average direction of the price and the average price deviations (the Finnish terminology gets tricky here). Technically, in the image, black is the baseline, blue the closing price, and orange the MA12. The downtrend is visible.

So what does this mean for the stock itself? Can this be made to sound more sensible for the average person? Thanks to dates, prices, and algorithms, seasonal price changes can also be filtered from the data:


The clearest pattern is likely the downward trend, and the most interesting is the seasonal fluctuation. According to the graph, we’ll apparently hit temporary lows again in a couple of days.
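One simple way to extract a seasonal pattern like that is to group returns by calendar month and average them. A toy sketch (the December effect is injected on purpose so there is something to find; a real run would use the actual dated returns):

```python
import numpy as np

rng = np.random.default_rng(3)
n_years, n_per_month = 4, 21          # ~21 trading days per month
months = np.tile(np.repeat(np.arange(12), n_per_month), n_years)

# Synthetic daily returns with a deliberately strong December effect
returns = rng.normal(0, 0.01, months.size)
returns[months == 11] += 0.01

# Seasonal profile: average return for each calendar month
seasonal = np.array([returns[months == m].mean() for m in range(12)])
best_month = int(np.argmax(seasonal)) + 1
print(best_month)
```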

But! Does this mean one should buy?

That’s a personal question for everyone and depends on the individual. I personally prefer to wait until all the stars align if a buying decision needs to be made. Identifying overreactions in the stock price is useful in some cases:


The previous overreaction occurred in the spring when the price reached 4.23. This would have been a good time to cast a line.

How does the stock behave at different price levels? Price volatility/scatter can be described as follows:
The price is most stable between 4.2 and 5.8 euros.

Turned into a 3D image:

How would an algo identify buying opportunities?

This depends on the algorithm being used. If the algorithm is based on price anomalies and aims for a normal level, i.e., it identifies price extremes and sells based on that, the output could look like this:

Translated into a bar graph, this shows the most favorable buying and selling opportunities over 4 years:


So, the less red there is, the calmer and less risky the movement has been. The red dots represent the points in the image above, and blue represents the price level.

The Monte Carlo simulation method can also be used to map results, which to my knowledge is used, for example, in weather forecasting. You can read more about the method on Wikipedia.
In short, the simulation generates a given number of random forecasts that can be filtered by probability. I will not use the method in this analysis other than for demonstration purposes. Below is a demonstration of the 100 most likely routes for the stock price based on 4 years of data.
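A minimal Monte Carlo sketch of the idea: estimate drift and volatility from history, simulate a bundle of random paths, then filter by percentiles. Everything here (the synthetic “history”, the 60-day horizon, the percentile levels) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(4)

# Estimate daily drift/volatility from synthetic "historical" log returns
hist = rng.normal(0.0002, 0.02, 1000)
mu, sigma = hist.mean(), hist.std()

# Simulate 100 random 60-day price paths from the last price (GBM-style)
last_price, horizon, n_paths = 5.0, 60, 100
shocks = rng.normal(mu, sigma, (n_paths, horizon))
paths = last_price * np.exp(np.cumsum(shocks, axis=1))

# Distribution of end prices; percentiles filter out the "most likely" range
p5, p50, p95 = np.percentile(paths[:, -1], [5, 50, 95])
print(round(p50, 2))
```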

This is also much easier to read from a bar chart:

Not strictly an investment strategy, but it gives a good idea of what to expect from the future by combining it with all the other data discussed above.

But, but! At what point do we look at the algorithm results?
Let’s look at them now.

I personally like to use a combination of two different algorithms. One is based on the teachings of the legend himself, Richard Dennis (see Turtle strategy), i.e., changes in trends, and the other on the ABCD pattern in the stock market.

The first mentioned is below. Rules: Max buys: 200, Max sells: 1000:
day 314: buy 200 units at price 887.600000, total balance 9112.400000 (INV 200 )
day 324: buy 200 units at price 864.800000, total balance 8247.600000 (INV 400 )
day 325: buy 200 units at price 830.000000, total balance 7417.600000 (INV 600 )
day 326: buy 200 units at price 813.600000, total balance 6604.000000 (INV 800 )
day 327: buy 200 units at price 794.400000, total balance 5809.600000 (INV 1000 )
day 329: buy 200 units at price 787.200000, total balance 5022.400000 (INV 1200 )
day 330: buy 200 units at price 771.600000, total balance 4250.800000 (INV 1400 )
day 337: buy 200 units at price 762.000000, total balance 3488.800000 (INV 1600 )
day 451, sell 1000 units at price 5370.000000, investment 40.944882 %, total balance 8858.800000, (INV 600 )
day 452, sell 600 units at price 3231.000000, investment 41.338583 %, total balance 12089.800000, (INV 0 )
day 453: cannot sell anything, inventory 0
day 454: cannot sell anything, inventory 0
day 455: cannot sell anything, inventory 0
day 456: cannot sell anything, inventory 0
day 466: cannot sell anything, inventory 0
day 603: buy 200 units at price 802.800000, total balance 11287.000000 (INV 200 )
day 605: buy 200 units at price 784.000000, total balance 10503.000000 (INV 400 )
day 607: buy 200 units at price 780.000000, total balance 9723.000000 (INV 600 )
day 608: buy 200 units at price 774.000000, total balance 8949.000000 (INV 800 )
day 886, sell 800 units at price 4344.000000, investment 40.310078 %, total balance 13293.000000, (INV 0 )
day 887: cannot sell anything, inventory 0
day 958: buy 200 units at price 872.800000, total balance 12420.200000 (INV 200 )
day 959: buy 200 units at price 868.400000, total balance 11551.800000 (INV 400 )
day 960: buy 200 units at price 854.200000, total balance 10697.600000 (INV 600 )
day 961: buy 200 units at price 853.700000, total balance 9843.900000 (INV 800 )
day 962: buy 200 units at price 846.500000, total balance 8997.400000 (INV 1000 )
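The classic Turtle rules are breakout-based (buy a new N-day high, exit on a new M-day low). The log above comes from my own implementation; as a minimal, hypothetical sketch of the underlying Donchian-channel idea (window sizes and the toy price series are made up):

```python
import numpy as np

def donchian_signals(close, entry_window=20, exit_window=10):
    """Turtle-style breakout signals: +1 when price makes a new
    entry_window-day high, -1 on a new exit_window-day low, else 0."""
    signals = np.zeros(len(close), dtype=int)
    for i in range(entry_window, len(close)):
        if close[i] > close[i - entry_window:i].max():
            signals[i] = 1
        elif close[i] < close[i - exit_window:i].min():
            signals[i] = -1
    return signals

# A rising staircase should trigger buys, the final drop a sell
close = np.array([1.0] * 21 + [2.0, 3.0, 4.0, 5.0] + [0.5])
sig = donchian_signals(close)
print(sig[21:])   # [1 1 1 1 -1]
```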

And below is an image of the ABCD results on the map:


And in indicator form:
The difference between the two, as the most observant have already noticed, is that the first is suited to long timeframes and the second to medium timeframes. For the first, I use 4 years of data; for ABCD, 1 year.

I’ll post Metsä B (Metsä Board) here at some point.

And by all means, if any good strategies or observations come to mind, shout them out.

18 Likes

I didn’t find anything in your description that I couldn’t get from TA myself. TA can’t directly predict; it’s based on optimally timing buy and sell moments, which also depend on the position’s time horizon. But I didn’t see how prediction of the future happens here, or how much data there is on the success of those predictions.
Since you’re familiar with TA yourself, could you explain how this differs from the TA methods you’ve applied? The calculations/algo would also need to work in real time to be useful, for example, in trading.

2 Likes

You’re partially right about that. Let me break it down a bit. If you want to transition to pure algo-trading and take a backseat yourself, it requires access to live exchange data via an API, such as Alpha Vantage. Unfortunately, I haven’t found any API that pulls live data from the Finnish exchanges I trade on, so I’ve had to settle for feeding “static” data via a CSV file. This of course rules out live algo-trading and restricts me to day- and week-level trading. If you’re interested in trading the German DAX, it’s available on Alpha Vantage as the exchange closest to Finland. Alpha Vantage’s API is free.

If you’re interested in building a fully automated algo that executes trades for you with millisecond reaction times, the best way to get into the hobby right now is to focus on cryptocurrencies, because many exchanges offer their own APIs through which you can both get live data and execute orders, often completely free. In the stock market, full automation is more challenging and mainly the domain of large players. For crypto, there’s a good library for setting up an algo here.

Nordnet here seems to be a pioneer in offering its own API, but apparently not completely free. Additionally, the latest news is from 4 years ago, so there’s no guarantee of its functionality. :confused:

To answer your question about the differences between TA and machine-executed day trading: the advice and strategies learned from TA can be taught to algos through deep learning. The process, in a nutshell, is to code a set of indicators and rules and then, through iterations, teach the model what to do in which situations. Once a model is trained, you can either keep it or change the training data and settings; a model worth keeping should be saved and put into use. Thanks to repeated iterations over the training data, models learn to identify correct buy and sell moments far more reliably than humans, which in live trading is crucial precisely because of reaction time. Popular model families include variations of ARIMA, CNN, and LSTM, used separately or together. The most important aspect of a model’s operation is probability calculation and weighing different possibilities, so a “perfect model” doesn’t exist.

A visual example of how a model “sees” the given rules, e.g., for MA-crosses:


Above is the data given to the machine with different types of Moving Averages. Below is the data showing what the model expects to happen in different cross situations.
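As a rough illustration of the MA-cross rules being encoded, here is a minimal crossover detector, not the model itself; the window sizes and the V-shaped test series are arbitrary:

```python
import numpy as np

def ma_crosses(close, fast=5, slow=20):
    """Find golden (+) and death (-) crosses between two simple moving averages."""
    def sma(x, w):
        return np.convolve(x, np.ones(w) / w, mode="valid")
    f = sma(close, fast)[slow - fast:]   # trim so both series end-align
    s = sma(close, slow)
    state = (f > s).astype(int)          # 1 while the fast MA is above the slow
    change = np.diff(state)
    golden = np.where(change == 1)[0] + slow   # fast crosses above slow
    death = np.where(change == -1)[0] + slow   # fast crosses below slow
    return golden, death

# V-shaped series: decline, bottom, recovery -> one golden cross after the turn
close = np.concatenate([np.linspace(10, 5, 37), np.linspace(5, 10, 43)])
golden, death = ma_crosses(close)
print(golden, death)
```

A model is then trained on what tends to happen after each cross type, which is what the lower plot visualizes.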

If we want to go into future prediction, it’s not as reliable information as indicator-based predictions. Here, we often no longer talk about algos, but pure machine learning, which has just reached its early teens. I wouldn’t trust 100% ML-based models too much yet, but one example of “future prediction” is shown in the image below. The technology used is a JavaScript version of TensorFlow.


Image explanation:
The lines plotted are MAs, and the light green is the price development predicted by TensorFlow with one year of training data, which has been processed 50 times during training. Additionally, there’s the predicted volume. If you want somewhat more reliable results than the demo plot above of Metsä Board’s stock, the recommended number of iterations/epochs is around a thousand.
The graph on the bottom left shows the model’s predictions side-by-side with the actual price in the training data. The bottom right shows the number of training iterations and how well the learning has progressed each time, i.e., the lower the loss in the data, the better. I cannot emphasize enough that if relying solely on ML for future prediction, the data must be from a long period and there must be many iterations.

In an optimal model, the AI is given tools that it is taught to use. Machine learning teaches the model the behavior of a certain stock price and certain ‘special features’ if they are present in the training data. For future prediction, a variation of the Monte Carlo model should be linked, from which the most probable outcomes could be selected using machine learning and AI with the help of indicators. Additionally, an NLP processor capable of examining news data should be linked to the model. The simplest NLP model to implement is a Twitter sentiment analyzer, e.g., with Tweepy and TextBlob, but ideally, data should also be obtained directly from relevant news sources, probably with a web scraper like BeautifulSoup4.
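To show the shape of the sentiment idea without external dependencies, here’s a toy polarity scorer. The word list, weights, and example tweets are all invented; in practice TextBlob’s lexicon-based `sentiment.polarity` would replace the scoring step, with Tweepy supplying the tweets:

```python
# Toy stand-in for a TextBlob-style polarity scorer; the word list
# and scoring here are illustrative, not TextBlob's actual lexicon.
POLARITY = {"beat": 1, "strong": 1, "growth": 1, "miss": -1, "weak": -1, "cut": -1}

def polarity(text):
    """Average polarity of known words in the text; 0.0 if none match."""
    hits = [POLARITY[w] for w in text.lower().split() if w in POLARITY]
    return sum(hits) / len(hits) if hits else 0.0

tweets = [
    "Nokia beat estimates, strong growth in networks",
    "Nokia to cut jobs after weak quarter",
]
scores = [polarity(t) for t in tweets]
print(scores)   # [1.0, -1.0]
```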

I’m not much of a teacher (never have been), but I hope this segmented answer provided some answers to your questions :slight_smile:

11 Likes

Thanks, your answer pretty much aligned with my understanding of what you’ve described. Therefore, the benefit of a bot/algo (calculating historical data based on a chosen model) for up-to-date trading is not so straightforward if one is skilled in TA. The human brain is a powerful processor, and in addition to analyzing patterns/indicators, a person can process a lot of other related information (e.g., news, chat discussions, etc.).

Processing and utilizing online data is currently only possible if you can afford to pay for the data (which, for example, in US stock exchanges is really expensive). That’s why crypto exchanges are quite heavily manned by bots. Since there isn’t much fundamental analysis yet, bots heavily monitor and manipulate various indicator models of TA that TA traders also follow. But because bots are fast, humans lose in this battle, which often forces trading to be done on a slightly larger time frame. A few years ago, I used to trade a lot according to the 1-minute flow.

I could think of many things that would be useful in my own trading. I wouldn’t want an auto-bot that makes trades, but rather new computational assistance for interpreting situations when looking for entry/exit points. But with the condition that the data used would be up-to-date, combined with historical data. I have a long programming background, so the logic/algorithmic thinking of bots is quite understandable in that sense.

But just processing historical data and deriving new insights from it might not be beneficial if one knows TA. I might be wrong, of course, but this is the impression I’ve gotten so far. A truly interesting topic, though. Please continue describing it. It also occurred to me that, of course, the programmer himself locks down what the bot does. For example, with MA/EMA crosses – how much weight does one assign to them in the first place.

9 Likes

Yup. The biggest benefit of this would be if you could directly access live data from the stock market. However, that doesn’t prevent leveraging machine learning for analytics on large day/week timeframes - especially once the model is built and it’s enough to feed it a couple of CSV files without having to think about anything else as the model already outputs the results.

I was contemplating whether to start a new thread or continue under TA because of the analytics side, but we ended up here :slight_smile:

edit. You can see the differences and similarities here

Very interesting stuff!

How real-time do you need the data to be? Would a delay of a few seconds be a problem? I’ve built various data collection methods for my own master’s thesis data, where my idea is to combine various unconventional data sources with machine learning for making investment decisions. In this context, I’ve also developed a bot that fetches real-time exchange rates from a certain marketplace. However, in my own calculations, I’ve tentatively outlined that I give the algorithm about 30 seconds to tolerate unrelated delays and perform calculations.

5 Likes

The acceptable delay in intraday data probably depends on the volatility and volume of what is being traded. However, if one wants to outsource intraday trading to an algo, the stocks with the highest volatility and volume are chosen as trading targets; it is simply too difficult to actively trade low-volatility, low-volume names.

High volatility and high volume stocks often involve other algos with whom one competes in the same league. Assuming that the algos use the same strategy, the biggest competitive advantage in this state-of-the-art algo trading is with those algos that have the freshest data.

Considering that, for example, Outokumpu and Stora Enso are heavily algo-traded, one wonders what the source for these anonymous algos is?

edit: That feeling when Swedish stocks are found on AV but not Finnish ones :neutral_face:

The importance of parsing various news sources is constantly growing. News sources can also be bought, meaning the big players buy news with a delta of a few milliseconds compared to others (I don’t know the pricing, but it’s certainly not cheap). Then there are Twitters, news video broadcasts, etc. There are starting to be many types of sources. It’s quite an evolution to get parsing to a point where it doesn’t lead to errors, but I know that bots are already doing this today.

3 Likes

That’s also true. However, some strategies and algorithms (perhaps the less successful ones) focus directly on price development, ignoring news. Too often, there’s good or bad news that has no impact on anything (see US stock markets).

I started to test an evolutionary model, based on the “survival of the fittest” principle, to see how such a model would suit number crunching.

The target was OMXH25 on a daily chart (maybe I’ll start running indexes). Dataframe from the beginning of the year, with 2000 iterations.


+5.929870 %

I thought that perhaps there were too many iterations in the training, so I tried a lighter batch of 500.


+5.168139 %

Not much difference in the result. Apparently, in these evolutionary models where a few ‘individuals’ are spawned and only the ‘fittest’ are allowed to survive and multiply for the next round, these iterations have a bit more significance than in other AI models.

The AI model in question could be explored a bit more, even if it still doesn’t beat basic strategies.

1 Like

Hi. That’s really great; you’re a genius. I also have TA (technical analysis) under control, at least partly, but I see this as an excellent opportunity to take things to the next level.

  1. How are the overbought and oversold limits formed in the picture?
  2. Could buy (sell) signals be generated from three days’ (or two weeks’) worth of data, simply price and macro data? I’d like to try.
  3. Could you create buy and sell signals, what kind of price data do you need, and in what format?
  4. I know there’s a “machine learner” that’s fed 4000 companies’ quarterly reports, and the machine gives buy and sell suggestions, but it just seems old-fashioned. The reports have been fed for 20 years (?).
  5. I’d gladly buy those interfaces, or rather, they’re already in use with a millisecond delay.
  6. This works as a Finnish exercise, but the target is the US market.
1 Like

The overbought/oversold indicator in the earlier post was built by feeding in volume, time, and price data (open, high, low, close), computing a smoothed graph, and plotting green and red horizontal lines on top calculated with NumPy’s percentile function, with Matplotlib handling the drawing. It illustrates quite well how markets overreact in different situations. If the above is Greek to you, you can think of it as a stock-market earthquake meter.
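The percentile part looks roughly like this. The 2.5/97.5 levels and the synthetic daily changes are assumptions for the sketch; the post doesn’t say which levels the original used:

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic daily price changes standing in for real OHLC-derived moves
daily_change = rng.normal(0, 1.0, 1000)

# Green/red lines: moves beyond these percentiles count as overreactions
low, high = np.percentile(daily_change, [2.5, 97.5])
overreactions = np.where((daily_change < low) | (daily_change > high))[0]

print(len(overreactions))   # ~5 % of days by construction
```

Plotting `low` and `high` as horizontal lines over the smoothed series gives the “earthquake meter” look.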

Buy and sell points can be generated from any kind of data as long as there is enough of it. For example, a lot can be gained from a day’s worth of data during the training phase if the data is broken down into one-minute or three-minute timeframes. This means breaking a big loaf into smaller pieces. I’ve tried it a couple of times, but it’s quite tedious if you always have to get the data into a CSV. For training, however, it’s preferable to have as much data as possible. Intraday trading works best if you can directly feed reasonably good data from an API into a live-refreshing DataFrame.
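Breaking the loaf into smaller pieces is just resampling. A small pandas sketch (the timestamps, session start, and prices are made up; in practice the per-minute data would come from an API or CSV):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
# Two hours of synthetic per-minute prices for one trading day
idx = pd.date_range("2019-07-18 10:00", periods=120, freq="min")
ticks = pd.Series(100 + rng.normal(0, 0.05, 120).cumsum(), index=idx)

# Aggregate the minutes into 3-minute OHLC bars for training
bars = ticks.resample("3min").ohlc()
print(len(bars))   # 120 minutes -> 40 bars
```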

If it’s an evolutionary model and we focus directly on a functional model, decision-making in this model is largely based on predicting trends from price movements. In the previous evolutionary example, both cases used a window size of 30 days for training, meaning the data was processed 2000 times in one session and 500 times in the other, in 30-day cycles. The evolutionary model uses a reward system, so the best-performing model is given a reward and a virtual pat on the head before being thrown into the next gladiatorial arena, and the same repeats. In the example, the reward is given to the model that best predicts the trend, which is able to (somewhat) predict the correct trend reversal and hopefully sell or buy at the right time.
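As a toy illustration of that reward loop, not the actual model: here a single threshold parameter stands in for the model’s genome, the data is synthetic, and the reward is trend-prediction accuracy, with the best performer surviving each round:

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic price series with a mild built-in uptrend
close = np.cumsum(rng.normal(0.05, 1.0, 500)) + 100
window = 30   # 30-day momentum window, as in the examples above

def fitness(threshold):
    """Reward: how often 'momentum above threshold -> price rises next day' is right."""
    momentum = close[window:] - close[:-window]
    pred_up = momentum[:-1] > threshold
    actual_up = np.diff(close[window:]) > 0
    return (pred_up == actual_up).mean()

# Tiny (1+4) evolution strategy: mutate, evaluate, keep the fittest
best = 0.0
for generation in range(50):
    offspring = best + rng.normal(0, 1.0, 4)   # mutated children
    candidates = np.append(offspring, best)    # the parent survives too
    best = candidates[np.argmax([fitness(c) for c in candidates])]

print(round(fitness(best), 2))
```

Because the parent always survives, the reward can only improve from generation to generation, which is the “virtual pat on the head” in miniature.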

An example of how to improve and tweak the model according to your own risk tolerance is by adding trading rules to it. For instance, even if the model has been trained to trade according to its own buy and sell estimates, it won’t execute trades until it receives confirmation from some other calculation. For example, it could look like the image below (great Paint copy-paste skills, I know) where the evolutionary model and the Turtle Strategy are overlaid. Can you spot where the signals are consistent for both?


This means that the AI sees an upcoming change (uncertain, but more certain than an average person’s guess) and gets confirmation from a much more cautious indicator.

I haven’t considered earnings reports or similar factors for these yet. I’m largely doing these for self-entertainment during the holiday season. In the winter, I’ll probably have more private life commitments again.

If you need help or tips, feel free to ask.

4 Likes

Good morning!

A few questions:

Is it possible to create a trading bot for an equity savings account (osakesäästötili), or is it only possible for an investment account (AO-tili)?

Does anyone have experience creating a bot for a Nordnet/Nordea account? Or is it even possible to create a bot for these platforms?

I dream of creating my own bot with either C# or Python. The goal of the project is to develop my coding skills with an interesting theme. Any pro tips are, of course, welcome! Regards, Read my username

Building a bot would require an API. Currently, Nordnet only offers a test API, which I believe is also relatively old. I haven’t been able to get these “dummy” orders to work through that API; all orders seem to get stuck in some kind of limbo. Otherwise, the API works according to the instructions.

I sent a question to Nordnet’s customer service asking if they had an API under development. The reply was simply that they don’t, and there are no plans to implement a functional one.

1 Like

Thanks for the answer. Maybe that project will remain a dream then. Do you have any other ideas on how to combine coding and investing?

I once set up a link from Google Finance to my own pages; it has an open API with 15-minute delayed quotes. This wasn’t really coding, but rather data collection for my own pages where I combined quote data with the latest balance sheet, etc., information and calculated key figures.

Google Finance’s operation was just a bit strange/unreliable in WordPress, so I stopped the project.

3 Likes

Interesting. I wonder if I could, for example, create my own website where I’d collect news related to my investments using a bot.

I might question the trading bots presented in the thread a little, in the sense that can we be sure that the bot hasn’t been overfitted during training to respond correctly to the data? I can’t say I know much more about AI than what I’ve seen in a couple of entertaining YouTube videos, but generally, AI shouldn’t be trained with the data that is used to test its performance. And for example, a linear regression for Nokia stock is, of course, easy to do in retrospect. But will the stock follow that same trend in the future?

There were some interesting observations, though. For example, the seasonality of prices. Those could be used well in trading.

Regarding the API: I’ve noticed that Nordnet actually has pretty neat APIs behind the user interface, even if there’s no official documentation for them. It wouldn’t be very difficult to build something on top of them. It probably breaks Nordnet’s terms of service, but that’s not a huge deal.

Sometimes I’ve thought about making a terminal interface for Nordnet so I could quickly make a few trades and check the portfolio status from the command line :smiley:. Not that a text-based interface would work very well for stocks where graphics are quite central. But it could give a bit of a retro vibe - trading with style.

Some small things could also be automated. For example, today I made a lot of pairs trades against the index (stock long, index short), so I could have easily had a bot take a short position in the portfolio enough to keep the net stock weighting below a threshold.

Another place for automation would be stop-loss orders for certificates and other leveraged products.

4 Likes