이 포스팅은 시각화 정리 시리즈 16 편 중 15 번째 글 입니다.

  • Part 1 - 01: 라이브러리
  • Part 2 - 02: matplotlib 알아보기
  • Part 3 - 03: 산점도
  • Part 4 - 04: Circling을 통한 버블 플롯
  • Part 5 - 05: 선형 회귀 선을 포함한 산점도
  • Part 6 - 06: Strip plot
  • Part 7 - 07: Counts plot
  • Part 8 - 08: Marginal Histogram
  • Part 9 - 09: Correlation plot
  • Part 10 - 09: Marginal Boxplot
  • Part 11 - 10: Pair plot
  • Part 12 - 11: Diverging Bars
  • Part 13 - 12: Diverging lines with text
  • Part 14 - 13: Diverging Lollipop Chart with Markers
  • Part 15 - This Post
  • Part 16 - 15: Ordered Bar Chart
▼ 목록 보기

하나 또는 두개의 데이터에 대한 흐름을 파악할 때 유용한 Area chart를 알아보자.
연습 kaggle notebook

# Useful for:
# Area chart is really useful when you want to drawn the attention about when a series is below a certain point.
# The area between axis and line are commonly emphasized with colors, textures and hatchings.
# Commonly one compares two or more quantities with an area chart.

# More info:
# https://en.wikipedia.org/wiki/Area_chart

# ----------------------------------------------------------------------------------------------------
# get the data

PATH = "/kaggle/input/the-50-plot-challenge/economics.csv"
df = pd.read_csv(PATH)

# ----------------------------------------------------------------------------------------------------
# prepare the data for plotting
# create the variation between 2 consecutive rows
df["pce_monthly_change"] = (df["psavert"] - df["psavert"].shift(1))/df["psavert"].shift(1)

# convert todatetime
df["date_converted"] = pd.to_datetime(df["date"])

# filter our df for a specific date
df = df[df["date_converted"] < np.datetime64("1975-01-01")]

# separate x and y
x = df["date_converted"]
y = df["pce_monthly_change"]

# calculate the max values to annotate on the plot
y_max = y.max()

# find the index of the max value
x_ind = np.where(y == y_max)

# find the x based on the index of max
x_max = x.iloc[x_ind]

# ----------------------------------------------------------------------------------------------------
# instanciate the figure
fig = plt.figure(figsize = (15, 10))
ax = fig.add_subplot()

# ----------------------------------------------------------------------------------------------------
# plot the data
ax.plot(x, y, color = "black")
ax.scatter(x_max, y_max, s = 300, color = "green", alpha = 0.3)

# annotate the text of the Max value
ax.annotate(r'Max value',
             xy = (x_max, y_max),
             xytext = (-90, -50),
             textcoords = 'offset points',
             fontsize = 16,
             arrowprops = dict(arrowstyle = "->", connectionstyle = "arc3,rad=.2")
           )
# ----------------------------------------------------------------------------------------------------
# prettify the plot
# fill the area with a specific color
ax.fill_between(x, 0, y, where = 0 > y, facecolor='red', interpolate = True, alpha = 0.3)
ax.fill_between(x, 0, y, where = 0 <= y, facecolor='green', interpolate = True, alpha = 0.3)

# change the ylim to make it more pleasant for the viewer
ax.set_ylim(y.min() * 1.1, y.max() * 1.1)

# change the values of the x axis
# extract the first 3 letters of the month
xtickvals = [str(m)[:3].upper() + "-" + str(y) for y,m in zip(df.date_converted.dt.year, df.date_converted.dt.month_name())]

# this way we can set the ticks to be every 6 months.
ax.set_xticks(x[::6])

# change the current ticks to be our string month value
# basically pass from this: 1967-07-01
# to this: JUL-1967
ax.set_xticklabels(xtickvals[::6], rotation=90, fontdict={'horizontalalignment': 'center', 'verticalalignment': 'center_baseline'})

# add a grid
ax.grid(alpha = 0.3)

# set the title
ax.set_title("Monthly variation return %");

다운로드 (18)

Reference

Plotting with Python: learn 80 plots STEP by STEP