Two sides of the same coin? Visualising crypto-currency price correlations

As of April 2024, there are around 10,000 crypto-currencies out there. Seemingly plenty of choice if you are looking to invest in something. But how can you go beyond simply clicking buy on something with a funny name? And is there in fact that much to choose from, or are all coins ultimately the same?

In this article, I will focus on the question of how different the biggest crypto coins are from a statistical perspective. I will leave the specific value propositions of different coins aside for now, and focus purely on a key statistical technique for comparing price trends: correlation analysis.

The illusion of diversification

I should start by saying I am not a financial analyst nor in any way qualified to give advice. This is not advice, but only my findings from a brief statistical analysis on crypto-currencies.

Let me start by clarifying my understanding of the term diversification, i.e. the practice of spreading money across multiple investments.

First and foremost, the purpose of diversification is to reduce risk and provide more stability to a portfolio overall. If one of the investments goes south, this should only account for part of the money, and hopefully other investments will make up for the loss.

By spreading investments across different stocks, different industries, different global regions, and different assets, we reduce the impact of any one event on our bottom line.

But we can also create the illusion of diversification. If I buy a stake in several different vineyards all from the same region, I may feel like I have spread out my risk, but if that region is hit by a cold snap, all those vineyards are likely to suffer, and it’s still my whole portfolio that’s affected.

Instead, we could invest in vineyards from different regions, so that not all of them will be affected by the same weather events. But this still leaves us exposed to changes in, for example, the global demand for wine. Better to also invest in other kinds of agriculture, or better yet, something completely unrelated, like a stake in some real estate, a tech company, or government bonds.

But let’s say we’ve got our eye on crypto, and we have set aside some (not all) of our capital to invest. What do we buy and to what extent can we still diversify our portfolio?

A statistical angle

Once you make it onto a crypto exchange, there are several hundred tokens available to buy at the click of a button.

How do we know which one to buy, and does the choice even matter? That second question is where the stats come in.

Let’s focus on the top 30 cryptocurrencies by market cap. Together, they account for about 90% of the crypto market. After removing near-identical tokens such as staked Ether (StETH) and wrapped Bitcoin (wBTC), we are left with 25 top tokens. They are presented below, sized by market cap and coloured by type of token (blue for transactions, green for smart contracts, etc.).

./market_caps.png
Top 25 crypto-currencies sized by market cap

The first thing to note is that the market is still dominated by BTC, which accounts for about 60% of the market. This should already raise some concerns about diversification, because whatever happens to Bitcoin is bound to affect the rest of the market.

Working off the last 3 years of daily prices of these tokens, we can start looking for strong correlations, i.e. tokens that systematically move up together, and move down together.
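As a rough illustration of this step, here is a minimal sketch of how a correlation matrix could be computed with pandas. It is not necessarily how the charts in this article were produced: the data is synthetic, the names COIN_A, COIN_B and STABLE are placeholders, and correlating daily returns rather than raw prices is an assumption on my part.

```python
import numpy as np
import pandas as pd


def correlation_matrix(prices: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between tokens, computed on daily returns.

    `prices` is assumed to hold one row per day and one column per token.
    """
    # Assumption: correlate day-on-day percentage changes rather than raw
    # prices, because raw prices that merely trend together look correlated.
    returns = prices.pct_change().dropna()
    return returns.corr()


# Synthetic stand-in data (NOT real market prices), for illustration only.
rng = np.random.default_rng(0)
n_days = 200
days = pd.date_range("2024-01-01", periods=n_days, freq="D")
market = rng.normal(0, 0.03, size=n_days)  # moves shared by both coins

prices = pd.DataFrame(
    {
        "COIN_A": 100 * np.cumprod(1 + market + rng.normal(0, 0.01, n_days)),
        "COIN_B": 50 * np.cumprod(1 + market + rng.normal(0, 0.01, n_days)),
        "STABLE": 1 + rng.normal(0, 0.001, n_days),  # hovers around 1
    },
    index=days,
)

corr = correlation_matrix(prices)
print(corr.round(2))
```

Because COIN_A and COIN_B share the same simulated market moves, they should come out strongly correlated, while STABLE should show little relationship with either.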

Let’s check correlations between top tokens. When exploring correlations across several features, a popular way to present them is as a heatmap, where a red square indicates a strong relationship between a pair of tokens. But when we try to compare many tokens against each other, this becomes quite a mess:

./heatmap.png
Correlations between pairs of the top 25 tokens. Deep red indicates a strong correlation

This is not easy to look at, at all, but a few things do stand out:

  • the whole graph is quite red, indicating that, in general, all crypto coins are quite strongly correlated with each other
  • there are a few tokens that are relatively immune to correlations, typically the stablecoins (USDT, USDC, DAI)

At this stage, we already get the sense that, in terms of price fluctuations, we are exposing ourselves to very similar patterns. Take for example the very top-left corner. The boxes for Bitcoin-Ether are deep red. These coins behave very similarly when it comes to price fluctuations. Splitting your investment 50/50 between those two would hardly reduce your risk at all.
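To put a rough number on that last claim, here is a small sketch using the standard two-asset portfolio variance formula. It assumes risk is measured as the standard deviation (volatility) of returns; the 4% daily volatility and the correlation values are hypothetical inputs, not figures measured from the data.

```python
import math


def two_asset_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a portfolio with weight w1 in asset 1 and the rest in asset 2."""
    w2 = 1.0 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2 * w1 * w2 * rho * sigma1 * sigma2
    )
    return math.sqrt(variance)


# Hypothetical inputs: both assets have the same 4% daily volatility.
sigma = 0.04
for rho in (0.9, 0.5, 0.0):
    vol = two_asset_volatility(0.5, sigma, sigma, rho)
    print(f"rho={rho:.1f}: portfolio volatility is {vol / sigma:.0%} of a single asset")
```

With a correlation of 0.9, the 50/50 split still carries about 97% of the volatility of holding a single coin. Only as the correlation falls towards zero does the figure drop to about 71%.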

If we want to properly spread our risk, we need to identify groups of tokens that behave similarly, and spread our investment across the groups. Those groups are hard to find in a heatmap, but we can do better.

A hierarchical approach to clustering

Rather than plotting correlations by colour in a grid, we can treat correlations as a measure of distance. We can think of strongly correlated coins as ‘close together’ and weakly correlated coins as ‘far away’. Treated like this, coins can form groups of similarity, which we can then visualise clearly. Let’s use a tree diagram:

./tree.png
Crypto-currency correlation clusters

In a tree diagram like this, all the tokens are listed at the bottom. The closer to the bottom the branches of currencies join up, the more strongly correlated they are. This allows us to detect groups of currencies that tend to behave similarly.
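For readers who want to try this themselves, a tree of this kind could be built with SciPy's hierarchical clustering, roughly as sketched below. The correlation values and token names are made up for illustration, and both the distance definition (1 minus correlation) and the average-linkage method are assumptions rather than the exact settings behind the chart above.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import squareform

# Made-up correlation matrix for five placeholder tokens (not real data).
tokens = ["TX_1", "TX_2", "TX_3", "STABLE_1", "STABLE_2"]
corr = pd.DataFrame(
    [
        [1.00, 0.85, 0.80, 0.05, 0.02],
        [0.85, 1.00, 0.82, 0.03, 0.04],
        [0.80, 0.82, 1.00, 0.01, 0.06],
        [0.05, 0.03, 0.01, 1.00, 0.40],
        [0.02, 0.04, 0.06, 0.40, 1.00],
    ],
    index=tokens,
    columns=tokens,
)

# Assumption: distance = 1 - correlation, so strongly correlated pairs are close.
distance = 1.0 - corr.to_numpy()
np.fill_diagonal(distance, 0.0)  # guard against rounding on the diagonal

# linkage() expects the condensed (upper-triangle) form of the distance matrix.
condensed = squareform(distance)
tree = linkage(condensed, method="average")

# Cut the tree at a distance of 0.7 to turn it into flat groups.
labels = fcluster(tree, t=0.7, criterion="distance")
groups = dict(zip(tokens, labels))
print(groups)

# no_plot=True returns the left-to-right leaf order without drawing anything.
leaf_order = dendrogram(tree, labels=tokens, no_plot=True)["ivl"]
print(leaf_order)
```

In this toy example the three TX tokens end up in one group and the two STABLE tokens in another, and the two groups only join near the top of the tree. Dropping `no_plot=True` draws the diagram with matplotlib.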

Encouragingly, the natural correlations we see in price fluctuations seem to have some reasonable overlap with the different purposes / value propositions of the coins:

  • The left-most group, in orange, contains the stablecoins. These are somewhat correlated with each other, as they are all linked to the US dollar, but they don’t join up with the other currencies until the very top of the tree. They are very independent from the rest when it comes to price fluctuations.

  • Most of the transaction-focused tokens appear in the green group: ADA, DOT, BCH, XRP, LTC. These are all very strongly correlated with each other, to the point where investing in several of them is an illusion of diversification: you might as well just buy one.

Interestingly, DogeCoin, a meme coin with really no value proposition at all, is very strongly linked to the transactional tokens. As transactional tokens make up a significant portion of the market, this might indicate interest in DogeCoin follows general interest in the crypto market.

Let’s see one more alternative to heatmaps, visualising the ‘coin network’ of the crypto market.

Correlations as a network

Similarly to the tree diagram, we can interpret correlations as links between coins, so that more strongly linked coins can be visualised as closer together. This is an ideal scenario for the use of network graphs:

./network.png
Crypto-currency correlations as a network. Three major groups form.

Each coin is now a dot (a node), which is linked to its most strongly correlated partners. The thicker the connection, the stronger the correlation. For clarity, the network only shows the strongest correlations, which is why the stablecoins don’t show: their correlations were too weak to be included.
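A network like this could be assembled with NetworkX along the following lines. This is only a sketch: the correlation values and token names are made up, the helper `correlation_network` is a hypothetical name, and the 0.7 cut-off is an arbitrary assumption rather than the threshold used for the chart above.

```python
import networkx as nx
import pandas as pd

# Made-up correlation matrix for five placeholder tokens (not real data).
tokens = ["TX_1", "TX_2", "TX_3", "STABLE_1", "STABLE_2"]
corr = pd.DataFrame(
    [
        [1.00, 0.85, 0.80, 0.05, 0.02],
        [0.85, 1.00, 0.82, 0.03, 0.04],
        [0.80, 0.82, 1.00, 0.01, 0.06],
        [0.05, 0.03, 0.01, 1.00, 0.40],
        [0.02, 0.04, 0.06, 0.40, 1.00],
    ],
    index=tokens,
    columns=tokens,
)


def correlation_network(corr: pd.DataFrame, threshold: float) -> nx.Graph:
    """Link every pair of tokens whose correlation is at least `threshold`."""
    graph = nx.Graph()
    names = list(corr.columns)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            weight = float(corr.loc[first, second])
            if weight >= threshold:
                # Nodes are created implicitly, so tokens without any strong
                # link never appear in the graph at all.
                graph.add_edge(first, second, weight=weight)
    return graph


graph = correlation_network(corr, threshold=0.7)
print(sorted(graph.nodes))
print(sorted(graph.edges(data="weight")))
```

Here the two STABLE tokens drop out because none of their correlations reach the cut-off. To draw the result, one option is `nx.draw_networkx` with positions from `nx.spring_layout(graph, weight="weight")` and the `width` argument scaled by each edge's weight.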

We can see that the same groups as before form: utility tokens, transaction tokens, and big players.

Wrap up

Correlation analysis is a very quick way to identify (linear) relationships and to spot patterns or groups within a dataset. Beyond giant heatmaps, there are many ways of visualising correlations that are more intuitive and more readable, and that allow us to spot patterns more easily.

This is only one angle of analysis. A quantitative analysis like this should be balanced with research into the actual value propositions of the different coins. What purpose do they serve? Do experts see them as equivalent? Are there logical reasons why some coins are more linked than others? Many coins, for example, are derivatives of the original Bitcoin and Ether concepts, so we should expect them to have stronger ties to those coins.

Following this exploration, we are one (small) step closer towards being better informed about how to spread a crypto investment. As we combine different analyses, we gain a better view on what is likely to best reduce some of the inherent risk of a very volatile market.


Got questions? Spotted any issues in the code? Or do you want to share your own examples of implementations? Drop me a message on LinkedIn.


Photo Credit: Harrison Broadbent, Unsplash