Using Artificial Intelligence/Machine Learning to improve Decision Making Skills and enhance Predictive Analysis

A Short Introduction to AI/ML
I would like to start this new thread as I could not find a relevant thread on the above topic. On a day-to-day basis we hear about the impact of Artificial Intelligence and Machine Learning on our lives. If you use a smartphone, you are already affected by various segments of AI/ML. As per its latest announcements, Google has come up with some enhancements to its Assistant platform, which uses AI at the backend; it is now more powerful and able to make human-like interactions over a voice call. We are also familiar with Amazon Alexa for taking voice commands. When we get recommendations in our mobile apps to purchase a product on an e-commerce or social networking platform, AI is working in the background.
In this thread I would like to add some pointers on how we can use AI/ML to improve our decision-making process, whether it is stock selection, sector analysis, or prediction of future stock/index movement. I will try to summarize in layman’s language as much as possible. I would like to disclose that I am also just tinkering with AI and ML and am not an expert.

Some info on Machine Learning
Machine Learning is a subset of Artificial Intelligence. We have to feed data to ML programs to derive outputs. ML uses various types of algorithms to produce results. It is an adaptive system that can keep learning as more and more data is added, much like the human brain. It can be used for Predictive Analysis, Anomaly Detection, Fraud Detection, and Recommendations. Anomaly and Fraud Detection are mainly used in the financial sector, for credit card fraud or loan fraud. Recommendations are mainly used in e-commerce to recommend products depending on user behavior. In the financial sector, Predictive Analysis can be used to predict the future performance of a stock price, index, or currency.
There are also other use cases of ML, like Image Recognition and Natural Language Processing (NLP). In Image Recognition, objects can be identified in a picture or video, and the captured data can be used for taking other decisions. In NLP, human language (speech or text) is captured and that data is used for taking decisions. An example of NLP is analyzing people's sentiment about a new product a company has launched. The data source for NLP can be news feeds from various newspapers or a Twitter feed.

Some concepts in AI/ML can be useful for Decision Making and Predictive Analytics. Right now I am tinkering with two ideas: the first uses NLP to analyze sentiment related to a company, sector, or product, and the second is Predictive Analysis of stock price.

Now coming to the actual use case of NLP: Amazon launched a product called Amazon Comprehend for Natural Language Processing a few months back. I am trying to use it to analyze the sentiment of a promoter/owner of a company regarding the company's future strategy and the industry overview. Amazon Comprehend right now accepts about 5,000 characters per request (the limit is defined in bytes of UTF-8 text, not words) and detects the Entities, Key Phrases, and Sentiment in the submitted text. The sentiment can be Neutral, Positive, Negative, or Mixed. It supports an API as well, so continuous feeds can also be given to analyze data on a real-time basis. The quality of the insights depends on the data you submit for analysis. In my model I tried to analyze the sentiment of the promoter/owner of the company during the quarterly concall.
Analysis of Bajaj Finance and Rain Industries Q4 2018 Concall using Amazon Comprehend NLP
I used two companies here. One disclosure: I am not recommending anything positive or negative about these companies; I just want to showcase the use case. The results may well be wrong, but in the near future we should be able to use the tool more aggressively as it adds more features and functionality.
Why did I use these two companies? Bajaj Finance is showing good growth quarter on quarter and the overall mood seems bullish, while Rain Industries came up with not-so-good results in the recent quarter and its stock price has also come down.

Bajaj Finance Test
As you know, Bajaj Finance is posting good results quarter on quarter. I wanted to test whether that is reflected in the speech of its MD (Rajeev Jain). I copied Mr. Rajeev Jain's statements from the Q4 2018 concall, pasted them into the Amazon Comprehend dialogue box, and analyzed them. I had to submit the data in two sets, as Comprehend currently accepts only about 5,000 characters at a time. Amazon Comprehend shows 0.23 Positive for the first set and 0.36 Positive for the second. I am pasting the results here.
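
A minimal sketch of how the manual split-and-paste step above could be automated through the API. The 5,000-byte limit and the helper names (`split_into_chunks`, `score_chunks`) are assumptions for illustration; the `client` is expected to behave like a boto3 Comprehend client, whose `detect_sentiment` call returns a `Sentiment` label and a `SentimentScore` dictionary. Check the current service limits before relying on the numbers.

```python
from typing import Dict, List

MAX_BYTES = 5000  # assumed per-request size limit; verify against current AWS limits


def split_into_chunks(text: str, max_bytes: int = MAX_BYTES) -> List[str]:
    """Greedily pack whole words into chunks of at most max_bytes (UTF-8)."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single word longer than max_bytes is not handled in this sketch.
            current = word
    if current:
        chunks.append(current)
    return chunks


def score_chunks(client, chunks: List[str], language_code: str = "en") -> List[Dict]:
    """Call detect_sentiment once per chunk and flatten label plus scores."""
    results = []
    for chunk in chunks:
        response = client.detect_sentiment(Text=chunk, LanguageCode=language_code)
        results.append({"Sentiment": response["Sentiment"], **response["SentimentScore"]})
    return results


# Usage with real AWS credentials (not executed here):
#   import boto3
#   client = boto3.client("comprehend", region_name="us-east-1")
#   scores = score_chunks(client, split_into_chunks(transcript_text))
```

Splitting on word boundaries keeps each request under the size limit, but it also means each chunk is scored on its own, which is why the two sets above produced two different Positive scores.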

Rain Industries Test
In the Q4 concall of Rain Industries, Mr. Jagan Mohan Reddy gave a Company Overview and Mr. Gerald gave an Industry Overview. I submitted the statements of each speaker to Comprehend separately. I got 0.19 Positive for Mr. Jagan Mohan Reddy's Company Overview, and 0.03 and 0.06 Negative for Mr. Gerald's Industry Overview. I am not concluding that there is something negative about the company; I am just reporting the results I got during the tests. These results may also be wrong, as I am just exploring the possibilities with AI/ML.
Mr. Jagan Reddy's Statement on Company Overview


Mr. Gerald's Statement on Industry Overview

The second use case I am working on is Predictive Analysis of a company's stock price. The first thing to admit here is that both the short-term and long-term movement of any company's stock price is hard to predict, as many factors affect it, ranging from micro to macro. Right now I am working on some concepts with one of my friends who is active in trading. I am using Amazon Machine Learning, as I find it easy compared to other products. If any data scientists have used or are using other tools, please share them so we can work jointly.

Some of the resources which may be useful to explain the concepts
Stock Prediction using Machine Learning - Suchit Majumdar
https://www.youtube.com/watch?v=JgC9BEVS9Tk&t=4710s

Machine Learning meets Stock Trading

Disc: I do not have holdings in Bajaj Finance or Rain Industries. The tests are shown for learning purposes only.

Hi @nityanandparab, very nice write-up.

Your post is really interesting and I’m very keen to learn these concepts. Could you share any books, tutorials, etc. for a beginner?

Thanks

@initin As I mentioned in my post, I am not an expert in this. I have learned about AI/ML by searching the Internet/YouTube and joining various dots.

Hi,
Nice to see this thread opened here. I have done some work in the machine learning area and implemented models to predict stock prices etc., using both classical ML and deep learning. Not to discourage you, but they don’t seem to work well. I have come to the conclusion that this is because most ML models make a fundamental assumption about the data: they assume that all the examples you feed to the model follow a fixed distribution. Unfortunately, if you use stock price as one of the features (along with many others), the distribution changes continuously and drifts away from the assumed one depending on various conditions. In other words, the “fixed distribution” assumption your ML model makes during training no longer holds, and soon the model becomes unusable for prediction or predicts very poorly.
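
To make the "fixed distribution" problem concrete, here is a small sketch on synthetic data. The two regimes and all their parameters are hypothetical numbers chosen for illustration, not real market data. It measures how many later returns fall outside the range the training period would consider normal.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic daily returns (hypothetical numbers, not real market data).
calm = [rng.gauss(0.0005, 0.01) for _ in range(500)]      # regime seen in training
stressed = [rng.gauss(-0.001, 0.03) for _ in range(500)]  # regime seen after deployment

train_mean = statistics.mean(calm)
train_std = statistics.stdev(calm)
later_std = statistics.stdev(stressed)


def share_outside(returns, center, width):
    """Fraction of returns further than `width` away from `center`."""
    return sum(abs(r - center) > width for r in returns) / len(returns)


# Band of two training standard deviations around the training mean.
band = 2 * train_std
outside_train = share_outside(calm, train_mean, band)
outside_later = share_outside(stressed, train_mean, band)

print(f"training std: {train_std:.4f}, later std: {later_std:.4f}")
print(f"outside 2-sigma band: training {outside_train:.1%}, later {outside_later:.1%}")
```

A model fitted only on the calm period has effectively never seen most of the values it meets later, which is one simple way to picture why its predictions degrade.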

One way to overcome this is some kind of meta-learning model, where you give the change in price distribution (and other features, of course) as input to the model and predict the price. You can also try doing sentiment analysis in real time on Twitter/Facebook/news articles about the company you are predicting and use this sentiment as one of the features. I personally think it's a huge amount of work and don’t think one can take it up as a hobby project or as a project without funding. But I know there are people doing such amazing things and sharing what they learn!

Nice thread. I am new to AI/ML and have always pondered over this idea. Good to see it in action. I haven't got much to add right now, but I will surely follow this thread.
Thanks for starting this.

@initin Python Machine Learning by Sebastian Raschka.
Andrew Ng’s course on Coursera

Have you attempted an RNN with LSTM? I am currently working on an RNN model that uses LSTM cells (and some other layers as well). I have been able to get an accuracy of 72%. This is when it makes a prediction every time, regardless of how sure it is about the prediction.

How do you define accuracy? 72% is something I would die for, as it is 72% : 28%, better than 2.5:1 odds. :slight_smile:

I have tried all sorts of ML and neural networks (CNN, LSTM combos with autoencoders) on financial data. The problem is the bias of the training data. If the training set itself has 70% positive results, an algorithm that simply predicts "positive" every time already scores 70% accuracy without any real skill. A proper testing framework is essential for this. My best algos give about 60-63% accuracy.
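
A minimal sketch of the check this implies: compare any reported accuracy against the majority-class baseline, and also look at balanced accuracy (the mean of per-class recall). The labels below are made up for illustration, with 1 meaning "up" and 0 meaning "down".

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true):
    """Most common label and the accuracy obtained by always predicting it."""
    label, count = Counter(y_true).most_common(1)[0]
    return label, count / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical test set: 70 "up" days and 30 "down" days.
y_true = [1] * 70 + [0] * 30
always_up = [1] * 100  # a "model" that has learned nothing

baseline_label, baseline_acc = majority_baseline(y_true)
naive_acc = accuracy(y_true, always_up)
naive_balanced = balanced_accuracy(y_true, always_up)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"always-up accuracy: {naive_acc:.2f}, balanced accuracy: {naive_balanced:.2f}")
```

On this data the always-up predictor reaches 70% plain accuracy but only 50% balanced accuracy, so a figure like 72% only means something once the class mix of the test set is known.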

I know machine learning and natural language processing very well. Can you suggest what I should look for in annual reports and concalls?
Could you/anyone please provide a corpus of words or ideas to look for in these bulky texts?

Thanks in advance

There are already tools available, like needl.ai and Finsight.ai, which extract data from conference calls and annual reports. You can help them.

Is there any Python script for downloading the annual reports/earnings calls of all stock symbols available on NSE?

The problem I am facing in creating such a script is that the link parameters do not contain the symbol, as in www.example.com/sbin.
When downloading the bhavcopy, the link was like that (it had the date and symbol), so you could easily run a for loop in Python.

Now it is like example.com/100045/2366, because the link comes from the company’s website.

So is there any way to download all annual reports/earnings calls?

I was thinking of doing web scraping, but if there is any better way then please tell.
Thanks in advance
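
One possible shape for the scraping approach, as a sketch only: since the URLs are opaque IDs, match on the visible link text instead of the URL. The sample HTML, the keywords, and the helper names are all hypothetical; a real investor-relations page will need its own keywords and possibly a different parsing strategy.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # normalise whitespace
            self.links.append((self._href, text))
            self._href = None


def find_report_links(html, base_url, keywords=("annual report", "earnings call")):
    """Return (link text, absolute URL) for links whose text mentions a keyword."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        (text, urljoin(base_url, href))
        for href, text in collector.links
        if any(keyword in text.lower() for keyword in keywords)
    ]


# Hypothetical page fragment using the opaque URL style described above.
SAMPLE_HTML = """
<ul>
  <li><a href="/100045/2366">Annual Report 2017-18</a></li>
  <li><a href="/100045/2401">Q4 FY18 Earnings Call Transcript</a></li>
  <li><a href="/careers">Careers</a></li>
</ul>
"""

report_links = find_report_links(SAMPLE_HTML, "https://www.example.com")
for text, url in report_links:
    print(text, "->", url)
```

The page itself could be fetched with `urllib.request.urlopen` and the same function run once per company, with a hand-maintained list of investor-page URLs standing in for the missing symbol-based pattern.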

Is there any way to make baskets of stocks on the basis of what goes up and down together? For example, commodity-related stocks like steel will go up and down together most of the time. So our algorithm should separate these stocks into different baskets.

I was thinking of implementing it with the DBSCAN algorithm for clustering (which is part of ML). Is there any other approach we could use?
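
As one simple alternative, here is a sketch that groups stocks whose daily returns are strongly correlated. It is deliberately not DBSCAN: it links any two stocks whose Pearson correlation is above a threshold and takes the connected groups as baskets. The tickers, the synthetic returns, and the 0.7 threshold are assumptions for illustration. With scikit-learn available, the same idea could be tried by passing a `1 - correlation` distance matrix to `DBSCAN(metric="precomputed")`.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def make_baskets(returns, min_corr=0.7):
    """Group names into baskets: linked if correlation >= min_corr (transitively)."""
    unvisited = set(returns)
    baskets = []
    for name in sorted(returns):
        if name not in unvisited:
            continue
        unvisited.discard(name)
        basket, stack = [], [name]
        while stack:
            current = stack.pop()
            basket.append(current)
            linked = [
                other for other in sorted(unvisited)
                if pearson(returns[current], returns[other]) >= min_corr
            ]
            for other in linked:
                unvisited.discard(other)
                stack.append(other)
        baskets.append(sorted(basket))
    return baskets


# Synthetic returns: two hidden sector factors plus stock-specific noise.
rng = random.Random(7)
days = 250
steel_factor = [rng.gauss(0, 1) for _ in range(days)]
it_factor = [rng.gauss(0, 1) for _ in range(days)]


def noisy(factor):
    return [f + rng.gauss(0, 0.2) for f in factor]


returns = {
    "STEEL_A": noisy(steel_factor),
    "STEEL_B": noisy(steel_factor),
    "IT_A": noisy(it_factor),
    "IT_B": noisy(it_factor),
}

baskets = make_baskets(returns)
print(baskets)
```

Because links are followed transitively, one stock that correlates with two otherwise unrelated groups will merge them, so the threshold needs tuning on real data.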

Hi,

Sorry to bump this quite old thread, but I have done some research and implementation on this topic and thought it might be beneficial for the community to share some of it.

First of all, I use data from the interim reports of listed firms, collected from various sources (mainly from Europe). The reports are condensed into high-dimensional vectors, which represent the textual information in the documents. These can be used in multiple ways, for example in clustering (with DBSCAN, as you mentioned), which leads to similar companies appearing close to each other. There is also some academic research showing that companies that change the content of their reporting tend to have lower alpha in future quarters.

I have compiled this information into a website, finsim, where the results can be found. If you find this interesting, feel free to give suggestions and feedback. The project is still quite rough around the edges, thanks.

A few thoughts about financial reporting and the possibilities of ML models:

  • Research has shown that companies that introduce a lot of changes to their reports between quarters are more likely to have negative stock price performance during the following period, see Cohen, 2020 and Adosoglou, 2021
  • The downward price effect usually does not happen right at disclosure time but some time after it, when other news emerges, such as the publication of a negative outlook.
  • In other words, the information about risks etc. might have been provided in quarterly reporting, but in a more subtle way than a straight headline, for example. Therefore, it might not have been registered by investors, and so is not reflected in their decision making, i.e. in the stock price
  • The implication is that an investor should watch for sudden changes in the content of a company’s reporting if the stock is intended for a long holding period. This is especially the case when a change has happened but hasn’t affected the pricing immediately.
  • ML models can be used to quickly determine whether a semantically meaningful content change has happened.
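
A minimal sketch of the similarity idea from the list above. It uses plain word-count vectors and cosine similarity as a stand-in; this is an assumption for illustration and not necessarily how the vectors behind finsim or the cited studies are built, since a semantic embedding model would capture meaning better. The report snippets are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented snippets standing in for the same section of consecutive reports.
previous_report = "Demand remained stable and margins improved during the quarter."
unchanged_report = "Demand remained stable and margins improved during the quarter."
changed_report = "Demand weakened and we see increased risk to margins from raw material costs."

same_score = cosine_similarity(previous_report, unchanged_report)
changed_score = cosine_similarity(previous_report, changed_report)

print(f"unchanged wording: {same_score:.2f}")
print(f"changed wording:   {changed_score:.2f}")
```

A low score between consecutive quarters does not say what changed, only that the wording did; it is a prompt to read the disclosure in detail.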

Here is one case example, from the Finnish large cap Neste, using the tool at finsim:

The graph shows some developments in Neste’s year 2024.

  • Dots represent quarterly reports, from Q4/23 to Q2/24. The brown reference line shows a similarity-score that 50% of companies don’t surpass during the quarter.
  • It can be seen that in Q1 the share price plummeted after the release, most likely due to subpar numbers. Note also the rise of negative tone.
  • However, a bigger slide in price happened some time after the release, when a lowering of the outlook for the year was announced. This could have been conveyed in the Q4/23 report, which had a slightly lower similarity-score than usual (84% similarity compared to Q3/23)
  • In Q2, a more prominent effect was seen. At first, the markets responded with optimism, and the share price had an upward trend after the first days. Then, about a month later, another lowering of guidance happened.
  • Here the similarity-score is around 82%, lower than usual, along with a growing proportion of negative tone. Again, the Q2 report might have had some different wording about risks etc., which were realized later during Q3.

At least in hindsight, the possibility of additional risks related to the company could have been picked out by using the similarity metric and then delving into the details of the disclosure. The Cohen study has a similar example: they highlight the changes line by line and show another turn of events. Here is also an interview in which the study is discussed in more detail:

Interesting thread. I am a complete novice at investing but have been building AI for fintechs for the last 14 years, so I am happy to answer any questions about the subject.

Autoregressive models fail horribly at predicting stocks due to the inherent nature of the domain, but information retrieval and profiling is still a big area where AI/ML is actively used, in credit scoring or stock analysis. With GenAI it’s actually easier now to condense information faster, especially data related to sentiment and key developments.

Looking forward to brainstorming ideas of where AI could augment investing.

Cheers.
A

Good list of tools to utilize in stock and industry research covered in detail by Ishmohit.