How I built my Personal Robo-Advisor on AWS with Open Source Tools — 69% Expected Annual Return

Disclaimer: I am a software person and not a finance or investment expert. This post is for informational purposes only and should not be relied upon as investment advice.
UPDATED: I used the Yahoo Finance Portfolio feature to further validate the portfolio's performance. See the data below:
Learning during Lockdown
One of the biggest lessons from the pandemic and the lockdown periods is that almost everything can be expressed in 0s and 1s and made accessible to everyone across the globe, especially education. In the last 12 months, I have completed two executive education programmes from two universities on two continents, North America and Europe. The first was AI for Business by Wharton Online, and the second and most recent was the INSEAD FinTech Programme.
In both courses, we learned the value of data, the transformative effect of technologies such as cloud and AI, and the abundance of opportunities to put this knowledge to use. AI for Business by Wharton Online is a self-paced course, so I rushed through it with so much excitement that I finished a supposed 4-week curriculum in 4 days. The INSEAD FinTech Programme, however, took a classroom-like approach delivered digitally, so there was clearly no way to rush through it. The programme covered 4 themes in FinTech: Payments, Lending & Digital Banks, Blockchain & Cryptocurrencies, and Investment, the last of which is the basis for this post.
Focus — Investment as FinTech Theme
The Investment theme set a strong foundation for assessing the role of technology in investment management and the disruption that platforms such as RobinHood and e-Toro have caused the incumbents. This was the first time I learned about robo-advisors, exchange traded funds (ETFs) and the relationship between the two. As I learned from the course, it is far more convenient and lower risk to follow the market rather than try to beat it. I never believed that phrase until I signed up on Trade Republic, Europe’s version of RobinHood, and lost between 10% and 20% of my invested capital in 3 weeks, and kept losing.
In a bid to save my sinking investment ship, I put my programme learnings to use and bought some ETFs (sorry, not sharing the tickers). With a bit of research, I invested in 10 exchange traded funds. My purchase criterion: an increase of between 30% and 60% in the last year (52-week period). I settled on this simplified condition because of the cognitive work I would otherwise have to do to build a steady portfolio. Investing is not my full-time job; I already have two full-time jobs (permit me here): raising a toddler and paid employment.
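The purchase criterion can be written down as a small filter. The sketch below is only an illustration: the function names, the labels and the sample prices are made up, and it assumes the "increase" is the simple price change between the first and last close of the 52-week window.

```python
def trailing_return(closes):
    """Simple return between the first and last close in the window."""
    if len(closes) < 2 or closes[0] <= 0:
        raise ValueError("need at least two closes and a positive starting price")
    return closes[-1] / closes[0] - 1.0


def meets_criterion(closes, low=0.30, high=0.60):
    """True when the window's return lies within [low, high]."""
    return low <= trailing_return(closes) <= high


# Hypothetical 52-week windows (only first and last close shown).
sample = {
    "ETF_A": [100.0, 145.0],  # +45%: inside the band
    "ETF_B": [100.0, 112.0],  # +12%: below the band
    "ETF_C": [100.0, 190.0],  # +90%: above the band
}
selected = [name for name, closes in sample.items() if meets_criterion(closes)]
print(selected)
```

Only the first made-up fund passes; the upper bound is what excludes the third one, even though it gained the most.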
Application — Build a Robo-Advisor
As a software person, knowing that my choice of ETFs to buy was not free of bias, I sought out ways to use software and data to improve the outcomes of my decisions. As a scientist, I wanted to validate my hypothesis as quickly as possible. I started by reading a number of blog posts (mostly from the AWS Blog and Towards Data Science) and about a new service from AWS, Amazon FinSpace. Honestly, Amazon FinSpace is very promising, but the fact that it only contained US stock data was quite limiting for my experiment.
Data Collection
To proceed, I decided to go with an Amazon SageMaker Notebook Instance to rapidly prototype my idea. Thanks to Yahoo Finance, historical data was readily available, but I needed to figure out the tickers for ETFs traded in Germany. For the ETF tickers, the ETF Database website is amazing, with a large amount of information about ETFs across the world. The code snippet below shows how to scrape the list of ETFs with German exposure from ETF Database:
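import requests
from bs4 import BeautifulSoup


def get_etf_tickers():
    """Collect ETF ticker symbols from the paginated ETF Database listing."""
    symbols = []
    # The offset advances in steps of 25 (presumably one page of rows).
    for offset in range(0, 50, 25):
        url = 'https://etfdb.com/?offset=' + str(offset)  # url is incorrect
        response = requests.get(url)
        tickers = response.json()['rows']
        for ticker in tickers:
            # The symbol field holds HTML markup; keep only its text.
            raw_symbol = ticker['symbol']
            symbol = BeautifulSoup(raw_symbol, 'html.parser').text
            symbols.append(symbol)
    print('Number of symbols: {}'.format(len(symbols)))
    return symbols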
With the tickers in place (530 ETF tickers), I needed to download the historical data from Yahoo Finance. The yfinance Python module from Ran Aroussi is unbelievably good and works out of the box. With the data in place, I needed to get some work done.
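For readers who want to try the download step, here is a minimal sketch. It assumes the third-party yfinance package is installed and that `yf.download` returns a table from which the "Close" column group can be selected; the wrapper name `download_close_prices` and the injectable `download` argument are my own additions, there only so the function can be exercised offline with a stub.

```python
def download_close_prices(tickers, period="1y", download=None):
    """Return closing prices for `tickers` over `period`.

    `download` defaults to yfinance.download; passing a stand-in makes it
    possible to run this function without network access.
    """
    if download is None:
        import yfinance as yf  # third-party: pip install yfinance
        download = yf.download
    data = download(tickers, period=period)
    # Selecting "Close" leaves one column of closing prices per ticker.
    return data["Close"]


# Offline demonstration with a stub in place of the real downloader.
def fake_download(tickers, period):
    return {"Close": {ticker: [100.0, 101.5] for ticker in tickers}}


print(download_close_prices(["AAA.DE"], period="6mo", download=fake_download))
```

With the real downloader, a call would look like `download_close_prices(["AAA.DE", "BBB.DE"])`, where the tickers are placeholders: Yahoo Finance uses a `.DE` suffix for XETRA listings, and the post does not disclose which symbols were actually used.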
Approach
Quite a number of Python libraries exist in the stock trading space, so developing my own would be a total waste of time. I started with the Amazon SageMaker Reinforcement Learning example (available on GitHub), which turned out to be far more complex than I expected. Next, I found BackTrader, a Python module for developing trading strategies. It was simpler to use but did not fit the scope of my experiment. Finally, I found PyPortfolioOpt, a library that implements portfolio optimization methods, including classical efficient frontier techniques and Black-Litterman allocation. PyPortfolioOpt was the simplest to use and fit my desired experiment outcomes.
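To make the quantities behind efficient frontier methods concrete, the sketch below computes an annualized return, volatility and Sharpe ratio for a fixed set of weights using only the standard library. It is hand-rolled and is not PyPortfolioOpt's code: the 252 trading days, the 2% risk-free rate, the daily rebalancing and the sample prices are all assumptions. An optimizer's job is then to search over the weights to improve numbers of this kind.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed number of trading days per year


def daily_returns(prices):
    """Simple day-over-day returns for one price series."""
    return [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]


def portfolio_performance(price_history, weights, risk_free_rate=0.02):
    """Annualized (return, volatility, Sharpe ratio) for fixed weights.

    price_history maps ticker -> list of daily closes (equal lengths);
    weights maps ticker -> portfolio weight, expected to sum to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    per_asset = {t: daily_returns(price_history[t]) for t in weights}
    n_days = min(len(r) for r in per_asset.values())
    # Weighted sum of asset returns each day (assumes daily rebalancing).
    portfolio = [sum(w * per_asset[t][d] for t, w in weights.items())
                 for d in range(n_days)]
    annual_return = statistics.mean(portfolio) * TRADING_DAYS
    annual_volatility = statistics.stdev(portfolio) * math.sqrt(TRADING_DAYS)
    sharpe = (annual_return - risk_free_rate) / annual_volatility
    return annual_return, annual_volatility, sharpe


# Hypothetical closes for two made-up funds.
history = {
    "ETF_A": [100.0, 101.0, 100.5, 102.0, 103.1],
    "ETF_B": [50.0, 50.2, 50.1, 50.6, 50.4],
}
print(portfolio_performance(history, {"ETF_A": 0.6, "ETF_B": 0.4}))
```

Annualizing a handful of daily returns exaggerates everything, so the printed figures are only useful for seeing how the three quantities relate.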
Hypothesis Validation
Holding a portfolio of all 530 ETFs is not practical for a retail investor like myself. All I wanted was to create a portfolio of 10 ETFs with an annual return between 30% and 60%. If I had to select 10 ETFs at random by hand, I would still express some bias, and I did not want my computer laughing at me since this could be easily automated. It turns out there is a simple high school math technique for this, combinations:
In mathematics, a combination is a selection of items from a collection, such that the order of selection does not matter (unlike permutations). For example, given three fruits, say an apple, an orange and a pear, there are three combinations of two that can be drawn from this set: an apple and a pear; an apple and an orange; or a pear and an orange. — Wikipedia.

Following the Wikipedia example, my problem space can be defined as: given 530 ETFs, how many distinct sets of 10 can be drawn? The Python standard library's itertools.combinations generates such sets, and the count X is 442485757298998966190 possible 10-ETF combinations.

Running the experiment on every combination would be computationally prohibitive, so I selected 2 sets at random for the purpose of this post. Please see the results below.
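Counting the combinations and drawing a couple of them at random does not require enumerating all of them. The sketch below is one way to do it with the standard library, not necessarily how the experiment was run: `math.comb` (Python 3.8+) gives the count directly and `random.sample` draws a set without replacement. The ticker labels, the seed and the function names are placeholders.

```python
import math
import random


def count_portfolios(universe_size, portfolio_size):
    """Number of distinct portfolios of a given size (order ignored)."""
    return math.comb(universe_size, portfolio_size)


def sample_portfolios(tickers, portfolio_size, how_many, seed=None):
    """Draw `how_many` distinct random portfolios without listing them all."""
    tickers = sorted(set(tickers))  # de-duplicate and fix an order
    if how_many > math.comb(len(tickers), portfolio_size):
        raise ValueError("asked for more portfolios than exist")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < how_many:
        # Sorting each draw makes the tuple independent of draw order,
        # so the set can detect a repeated portfolio.
        chosen.add(tuple(sorted(rng.sample(tickers, portfolio_size))))
    return sorted(chosen)


# Placeholder labels stand in for the 530 real tickers.
universe = [f"ETF{i:03d}" for i in range(530)]
print(count_portfolios(len(universe), 10))
for combo in sample_portfolios(universe, 10, how_many=2, seed=42):
    print(combo)
```

The count is on the order of 4.4 × 10^20, which is why iterating over `itertools.combinations(universe, 10)` to the end is not an option and random sampling is used instead.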
Results
Below are the results of the experiment using the PyPortfolioOpt library:

To be sure, I checked the ETF with the highest weight in Combination 2 on the Yahoo Finance web page: this ETF grew by 64.78% from Oct ’20 to Jun ’21.

At this point, I am reasonably convinced that as a retail investor, I can leverage compute to create a profitable ETF portfolio with little human bias. In addition to this, I tried out the Yahoo Finance Portfolio feature for both combinations and the results were outstanding using GSPC as a benchmark. What is GSPC? YourDictionary.com defines GSPC as:
This Standard & Poor’s Index is a capitalization-weighted index of 500 stocks. It is a popular index and is used to measure the performance of the large cap U.S. stock market. … Money managers often index their portfolios to match or beat the S&P 500. GSPC is the ticker symbol.
Combination 1’s annual performance follows the same trend as GSPC.

Combination 2’s expected annual return beats GSPC by roughly 213.6% in relative terms, taking 22% as the base expected annual return and 69% as the new one. How can this be true?
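The relative figure comes from dividing the gap between the two expected returns by the base. A minimal sketch of that arithmetic, using the 22% and 69% quoted above (the function name is mine):

```python
def relative_outperformance(benchmark_return, portfolio_return):
    """Gap between the two returns, expressed as a fraction of the benchmark."""
    if benchmark_return == 0:
        raise ValueError("benchmark return must be non-zero")
    return (portfolio_return - benchmark_return) / benchmark_return


base, new = 0.22, 0.69
print(f"relative: {relative_outperformance(base, new):.2%}")
print(f"absolute: {(new - base) * 100:.0f} percentage points")
```

In absolute terms the gap is 47 percentage points; the much larger relative number is that same gap measured against a 22% base.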

More Experiments?
As a scientist, are there further experiments I can run to gather more evidence? Definitely. I am already considering experiments such as backtesting. According to Investopedia:
Backtesting assesses the viability of a trading strategy by discovering how it would play out using historical data. If backtesting works, traders and analysts may have the confidence to employ it going forward.
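One simple form of backtesting is a hold-out test: fix the weights using an early part of the price history, then measure how a buy-and-hold portfolio with those weights would have done on the later part. The sketch below illustrates that idea only; it is not a full backtesting framework, it ignores fees and rebalancing, and the prices, weights and function names are hypothetical.

```python
def buy_and_hold_return(price_history, weights, start, end):
    """Return of a portfolio bought at index `start` and held to index `end`."""
    growth = sum(
        weight * price_history[ticker][end] / price_history[ticker][start]
        for ticker, weight in weights.items()
    )
    return growth - 1.0


def holdout_backtest(price_history, weights, split):
    """Compare the in-sample return with the return after the split point."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    last = min(len(prices) for prices in price_history.values()) - 1
    if not 0 < split < last:
        raise ValueError("split must fall strictly inside the price history")
    in_sample = buy_and_hold_return(price_history, weights, 0, split)
    out_of_sample = buy_and_hold_return(price_history, weights, split, last)
    return in_sample, out_of_sample


# Hypothetical closes for two made-up funds; weights assumed fixed in advance.
history = {"ETF_A": [100.0, 120.0, 150.0], "ETF_B": [50.0, 50.0, 40.0]}
print(holdout_backtest(history, {"ETF_A": 0.5, "ETF_B": 0.5}, split=1))
```

A large gap between the two numbers would suggest that the weights were fitted to the early period rather than capturing something that persists.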
If it works, Why not Scale It?
While working on this, I realized that if it is beneficial to me as I move towards financial freedom, I could open it up to a broader audience, just like any other recommendation engine. Users would still be free to make their own decisions or seek further advice from investment experts.
In addition, this experiment was only run on ETFs traded in Germany. It can be scaled to more markets. We can even combine instruments from different markets (e.g. Frankfurt and New York). Possibilities and opportunities are abundant, and technology could help many more people move towards financial freedom.
I like it, is it free?
Yes it is. So, watch this space!
Near Future Work
We have put a lot of work into developing this solution and we strongly believe that not everyone should be involved in undifferentiated heavy-lifting, unless you really want to hack this together. Therefore, we are actively working on making this available as a managed service. Kindly drop a comment if you’re interested in a live demo.