이 포스팅은 시각화 정리 시리즈 16 편 중 15 번째 글 입니다.
하나 또는 두개의 데이터에 대한 흐름을 파악할 때 유용한 Area chart를 알아보자.
연습 kaggle notebook
# Useful for:
# Area chart is really useful when you want to drawn the attention about when a series is below a certain point.
# The area between axis and line are commonly emphasized with colors, textures and hatchings.
# Commonly one compares two or more quantities with an area chart.
# More info:
# https://en.wikipedia.org/wiki/Area_chart
# ----------------------------------------------------------------------------------------------------
# get the data
PATH = "/kaggle/input/the-50-plot-challenge/economics.csv"
df = pd.read_csv(PATH)
# ----------------------------------------------------------------------------------------------------
# prepare the data for plotting
# create the variation between 2 consecutive rows
df["pce_monthly_change"] = (df["psavert"] - df["psavert"].shift(1))/df["psavert"].shift(1)
# convert todatetime
df["date_converted"] = pd.to_datetime(df["date"])
# filter our df for a specific date
df = df[df["date_converted"] < np.datetime64("1975-01-01")]
# separate x and y
x = df["date_converted"]
y = df["pce_monthly_change"]
# calculate the max values to annotate on the plot
y_max = y.max()
# find the index of the max value
x_ind = np.where(y == y_max)
# find the x based on the index of max
x_max = x.iloc[x_ind]
# ----------------------------------------------------------------------------------------------------
# instanciate the figure
fig = plt.figure(figsize = (15, 10))
ax = fig.add_subplot()
# ----------------------------------------------------------------------------------------------------
# plot the data
ax.plot(x, y, color = "black")
ax.scatter(x_max, y_max, s = 300, color = "green", alpha = 0.3)
# annotate the text of the Max value
ax.annotate(r'Max value',
xy = (x_max, y_max),
xytext = (-90, -50),
textcoords = 'offset points',
fontsize = 16,
arrowprops = dict(arrowstyle = "->", connectionstyle = "arc3,rad=.2")
)
# ----------------------------------------------------------------------------------------------------
# prettify the plot
# fill the area with a specific color
ax.fill_between(x, 0, y, where = 0 > y, facecolor='red', interpolate = True, alpha = 0.3)
ax.fill_between(x, 0, y, where = 0 <= y, facecolor='green', interpolate = True, alpha = 0.3)
# change the ylim to make it more pleasant for the viewer
ax.set_ylim(y.min() * 1.1, y.max() * 1.1)
# change the values of the x axis
# extract the first 3 letters of the month
xtickvals = [str(m)[:3].upper() + "-" + str(y) for y,m in zip(df.date_converted.dt.year, df.date_converted.dt.month_name())]
# this way we can set the ticks to be every 6 months.
ax.set_xticks(x[::6])
# change the current ticks to be our string month value
# basically pass from this: 1967-07-01
# to this: JUL-1967
ax.set_xticklabels(xtickvals[::6], rotation=90, fontdict={'horizontalalignment': 'center', 'verticalalignment': 'center_baseline'})
# add a grid
ax.grid(alpha = 0.3)
# set the title
ax.set_title("Monthly variation return %");