Here I will be creating a simple price correlation heat map that compares the two most popular cryptocurrencies, Bitcoin and Ethereum, using Python. Before getting started with the code, let's give a quick rundown of what a correlation heat map is and how it works.
A correlation heat map uses colored cells to show a 2D correlation matrix between pairs of variables. In this project, the variables are the Open, Close, High, Low, Volume and Adj Close of both Bitcoin and Ethereum.
This project uses 5 years' worth of historical prices. The heat map here is scaled from 0.0 to 1.0, with 1.0 being the highest correlation between variables and 0.0 the lowest. (A correlation coefficient can also be negative, down to -1.0, when two variables move in opposite directions.) As a rough rule of thumb, anything greater than 0.5 would be considered highly correlated.
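To make the scale concrete, here is a minimal sketch of what a correlation matrix looks like before it is colored in. The column names and numbers are invented purely for illustration and are not real market data; the "inverse" column is deliberately built to move exactly opposite to the first one.

```python
import pandas as pd

# Hypothetical toy prices, made up for illustration only.
prices = pd.DataFrame({
    "btc_close": [100.0, 102.0, 101.0, 105.0, 107.0],
    "eth_close": [10.0, 10.4, 10.2, 10.9, 11.3],
    # Constructed as 100 - 0.5 * btc_close, so it moves exactly opposite.
    "inverse": [50.0, 49.0, 49.5, 47.5, 46.5],
})

# DataFrame.corr() computes pairwise Pearson correlation by default.
corrmat = prices.corr()
print(corrmat.round(2))
```

Each variable correlates perfectly with itself, so the diagonal is 1.0, and the "inverse" column comes out negative against "btc_close". A heat map whose color scale starts at 0 will not distinguish negative values from zero, which is worth keeping in mind when reading the colors.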
Let's move on to the code portion of this project! To get the heat map, the Seaborn library must be imported. Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. I have recently started using Seaborn and definitely recommend checking it out if you do a lot of data analysis.
# Libraries used in this project
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Reading in Data File
df = pd.read_csv(r'C:\Users\comp\Desktop\BTC.csv')
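The post does not show how the single CSV with both coins was put together, so the following is only one possible sketch, assuming each coin starts out as its own table with a shared Date column. The frames, column names and values below are hypothetical stand-ins for what two separate `pd.read_csv` calls might return.

```python
import pandas as pd

# Hypothetical stand-ins for two separately downloaded price files.
# All values are made up for illustration only.
btc = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03"],
    "Close": [29000.0, 32000.0, 33000.0],
    "Volume": [40.0, 67.0, 78.0],
})
eth = pd.DataFrame({
    "Date": ["2021-01-02", "2021-01-03", "2021-01-04"],
    "Close": [770.0, 975.0, 1040.0],
    "Volume": [19.0, 45.0, 56.0],
})

# Inner join on Date keeps only days present in both tables; the suffixes
# tell the two coins' columns apart after the merge.
combined = btc.merge(eth, on="Date", suffixes=("_BTC", "_ETH"))

# Moving Date into the index keeps it out of the correlation matrix.
combined = combined.set_index("Date")
print(combined)
```

With the date moved into the index, only numeric price and volume columns remain, so a later call to `corr()` has nothing non-numeric to trip over.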
Creating Heat Map
# Correlation matrix
corrmat = df.corr()
fig = plt.figure()
hmap = sns.heatmap(corrmat, vmax=1, annot=True, vmin=0, cmap='Accent', fmt='.1g',
                   linewidths=1, linecolor='black', xticklabels=True, yticklabels=True)
hmap.set_xticklabels(corrmat, rotation=30)
plt.show()
And there you have it! With six lines of code, a very simple and elegant heat map. So let's take a look: notice the pink square. This is the only spot of high correlation between Bitcoin and Ethereum. The area where the two cryptos share the highest correlation is their Volume. Logically this does make sense, because if society as a whole is purchasing cryptocurrencies, you can expect the smaller cryptos to follow trendsetters like Bitcoin and Ethereum, which are arguably the most popular cryptocurrencies. It is similar to the stock market: if the Nasdaq were at an all-time high, you would probably expect Apple and Microsoft to be trading at higher prices too.
However, while this heat map does look really cool, it's not the most accurate. A good rule of thumb when working with historical prices or volatility is to convert the changes to natural logs. The reason is that a change in the natural log of a series is approximately equal to the percentage change in the original series, and the slope of a trend line fitted to logged data approximates the average percentage growth in the original series. This helps provide better accuracy when calculating correlation, because both coins are expressed in the same percentage terms rather than in raw prices. Think of it as converting Fahrenheit to kelvins so that two readings sit on a common scale before you compare them.
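As a rough sketch of that log conversion, the snippet below turns prices into log returns and correlates those instead of the raw prices. This is not code from the original project: the column names and prices are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical toy closing prices, made up for illustration only.
prices = pd.DataFrame({
    "Close_BTC": [100.0, 110.0, 99.0, 108.9],
    "Close_ETH": [10.0, 10.5, 10.08, 10.584],
})

# Log return for each day: ln(P_t / P_{t-1}). The first row has no
# previous price, so it is NaN and gets dropped.
log_returns = np.log(prices / prices.shift(1)).dropna()

# Simple percentage change, shown only to compare against the log version.
pct_returns = (prices / prices.shift(1) - 1).dropna()

# Correlate the returns rather than the raw price levels.
corr_log = log_returns.corr()
print(corr_log.round(2))
```

For moves of a few percent the log returns and percentage changes are close but not identical, which is the sense in which one approximates the other. The resulting `corr_log` matrix could be passed to `sns.heatmap` the same way as `corrmat` above.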
Thanks for reading! Follow me to never miss a post. Also check out Code It on Instagram and Twitter.