This post is the 9th of 16 in a series of notes on visualization.
It looks at the correlation plot, which shows the correlations between variables at a glance.
Practice Kaggle notebook
# Useful for:
# The correlation plot helps us compare how strongly each pair of variables is correlated
# More info:
# https://en.wikipedia.org/wiki/Covariance_matrix#Correlation_matrix
# ----------------------------------------------------------------------------------------------------
# import the libraries used below
import matplotlib.pyplot as plt
import pandas as pd
# get the data
PATH = '/kaggle/input/the-50-plot-challenge/mtcars.csv'
df = pd.read_csv(PATH)
# ----------------------------------------------------------------------------------------------------
# instantiate the figure
fig = plt.figure(figsize = (10, 5))
ax = fig.add_subplot()
# plot using matplotlib
# https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.axes.Axes.imshow.html
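# numeric_only = True skips any non-numeric columns (pandas >= 2.0 raises an error on them otherwise)
ax.imshow(df.corr(numeric_only = True), cmap = 'viridis', interpolation = 'nearest')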
# set the title for the figure
ax.set_title("Heatmap using matplotlib");
A plot like this on its own is hard to read, so let's label the x and y axes with the variable names.
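One possible way to do this with matplotlib alone is to place one tick per variable and use the column names as tick labels. The sketch below assumes a small randomly generated DataFrame (`demo`, hypothetical) in place of mtcars.csv so that it runs on its own; the column names are only illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# hypothetical stand-in data; in the post this would be the mtcars DataFrame
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(30, 4)), columns=["mpg", "cyl", "hp", "wt"])
corr = demo.corr()

fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(corr, cmap="viridis", interpolation="nearest")

# one tick per variable, labelled with the column name
ticks = np.arange(len(corr.columns))
ax.set_xticks(ticks)
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(ticks)
ax.set_yticklabels(corr.index)

# a colorbar makes the color-to-value mapping readable
fig.colorbar(im, ax=ax, label="correlation")
ax.set_title("Heatmap using matplotlib, with labels")
```

seaborn's `heatmap`, used in the next example, takes these labels from the DataFrame automatically.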
# Useful for:
# The correlation plot helps us compare how strongly each pair of variables is correlated
# More info:
# https://en.wikipedia.org/wiki/Covariance_matrix#Correlation_matrix
# ----------------------------------------------------------------------------------------------------
# import the additional libraries used below
import numpy as np
import seaborn as sns
# get the data
PATH = '/kaggle/input/the-50-plot-challenge/mtcars.csv'
df = pd.read_csv(PATH)
# ----------------------------------------------------------------------------------------------------
# prepare the data for plotting
# calculate the correlation between all variables
corr = df.corr(numeric_only = True) # skip any non-numeric columns
# create a mask to pass it to seaborn and only show half of the cells
# because corr between x and y is the same as between y and x
# it's only for aesthetic reasons
mask = np.zeros_like(corr, dtype = bool) # start from an all-False matrix
mask[np.triu_indices_from(mask)] = True # set the upper triangle (including the diagonal) to True
# ----------------------------------------------------------------------------------------------------
# instantiate the figure
fig = plt.figure(figsize = (10, 5))
# plot the data using seaborn
ax = sns.heatmap(corr,
                 mask = mask,
                 vmax = 0.3, # cap the color scale at 0.3; larger correlations share the top color
                 square = True,
                 cmap = "viridis")
# set the title for the figure
ax.set_title("Heatmap using seaborn");
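To see what the mask in the seaborn example does, here is a minimal sketch on a small 3x3 matrix. The matrix values are made up for illustration and stand in for `df.corr()`. Cells where the mask is True are the ones `sns.heatmap` leaves blank.

```python
import numpy as np

# hypothetical symmetric matrix standing in for df.corr()
corr = np.array([[ 1.0, 0.8, -0.2],
                 [ 0.8, 1.0,  0.5],
                 [-0.2, 0.5,  1.0]])

# True on the diagonal and above, False below
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
print(mask)
# [[ True  True  True]
#  [False  True  True]
#  [False False  True]]

# the values that remain visible are the strictly lower triangle
visible = corr[~mask]
print(visible)  # [ 0.8 -0.2  0.5]

# variant: k=1 starts one step above the diagonal, so the diagonal stays visible
mask_no_diag = np.zeros_like(corr, dtype=bool)
mask_no_diag[np.triu_indices_from(mask_no_diag, k=1)] = True
```

Because the correlation matrix is symmetric, hiding the upper triangle loses no information; whether to also hide the diagonal (always 1) is a matter of taste.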