Open-source GitHub Repo | Paper Describing the Process

Aside: If you want to take the course I did online, the full course is available for free on YouTube.

When I was a graduate student at Carnegie Mellon University, I took this course called Intro to Deep Learning. Don't let the name of this course fool you; it was absolutely one of the hardest and most interesting classes I've taken in my entire life. In that class, I fully learned what "AI" actually means. I learned how to create state-of-the-art AI algorithms – including training them from scratch using AWS EC2 clusters.

But, I loved it. At this time, I was also a trader. I had aspirations of creating AI-Powered bots that would execute trades for me.

And I had heard of "reinforcement learning" before.. I took an online course at the University of Alberta and received a certificate. But I hadn't worked with "Deep Reinforcement Learning" – combining our most powerful AI algorithm (deep learning) with reinforcement learning

So, when my Intro to Deep Learning class had a final project in which I could create whatever I wanted, I decided to make a Deep Reinforcement Learning Trading Bot.

Background: What is Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) involves a series of structured steps that enable a computer program, or agent, to learn optimal actions within a given environment through a process of trial and error. Here’s a concise breakdown:

Initialize: Start with an agent that has no knowledge of the environment, which could be anything from a game interface to financial markets.
Observe: The agent observes the current state of the environment, such as stock prices or a game screen.
Decide: Using its current policy, which initially might be random, the agent selects an action to perform.
Act and Transition: The agent performs the action, causing the environment to change and generate a new state, along with a reward (positive or negative).
Receive Reward: Rewards inform the agent about the effectiveness of its action in achieving its goals.
Learn: The agent updates its policy using the experience (initial state, action, reward, new state), typically employing algorithms like Q-learning or policy gradients to refine decision-making towards actions that yield higher returns.
Iterate: This cycle repeats, with the agent continually refining its policy to maximize cumulative rewards.

This iterative learning approach allows DRL agents to evolve from novice to expert, mastering complex decision-making tasks by optimizing actions based on direct interaction with their environment.

How I applied it to the stock market

My team implemented a series of algorithms that modeled financial markets as a deep reinforcement learning problem. While I won't be super technical in this post, you can read exactly what we did here. Some of the interesting experiments we tried included using convolutional neural networks to generate graphs, and use the images as features for the model.

However, despite the complexity of the models we built, none of the models were able to develop a trading strategy on SPY that outperformed Buy and Hold.

I'll admit the code is very ugly (we were scramming to find something we could write in our paper and didn't focus on code quality). But if people here are interested in AI beyond Large Language Models, I think this would be an interesting read.

Open-source GitHub Repo | Paper Describing the Process

Happy to get questions on what I learned throughout the experience!

submitted by /u/Starks-Technology
[link] [comments]