Replication Exercise

Replication Exercise#

In this exercise we show the validity of the package in replicating Phillips et al. (2015).

We download the CAPE data from the Shiller website to perform the exercise.

[1]:
import pandas as pd

url: str = (
    "https://img1.wsimg.com/blobby/go/e5e77e0b-59d1-44d9-ab25-4763ac982e53/downloads/02d69a38-97f2-45f8-941d-4e4c5b50dea7/ie_data.xls?ver=1743773003799"
)

data: pd.DataFrame = (
    pd.read_excel(
        url,
        sheet_name="Data",
        skiprows=7,
        usecols=["Date", "P", "D", "E", "CAPE"],
        skipfooter=1,
        dtype={"Date": str, "P": float},
    )
    .rename(
        {
            "P": "sp500",
            "CAPE": "cape",
            "Date": "date",
            "D": "dividends",
            "E": "earnings",
        },
        axis=1,
    )
    .assign(
        date=lambda x: pd.to_datetime(x["date"].str.ljust(7, "0"), format="%Y.%m"),
    )
    .set_index("date", drop=True)
)

We look for the existence of bubbles in the Price-Dividend ratio.

[2]:
pdratio: pd.Series = data["sp500"] / data["dividends"]
pdratio = pdratio.dropna()
[3]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(
    context="notebook",
    style="whitegrid",
    font_scale=1.5,
    rc={"figure.figsize": (12, 6)},
)

plt.plot(pdratio)
plt.title("Historic P/D Ratio")
plt.show()
../_images/notebooks_replication_4_0.png

Using the psytest package, we first initialize the object using the PSYBubbles function.

[4]:
from psytest import PSYBubbles
from numpy import datetime64

psy: PSYBubbles[datetime64] = PSYBubbles.from_pandas(
    data=pdratio, minwindow=None, lagmax=0, minlength=None
)

Then we calculate the test statistics and critical values. We will be using a significance level of 5% using the available tabulated data by setting fast=True.

[5]:
stat: dict[datetime64, float] = psy.teststat()
cval: dict[datetime64, float] = psy.critval(alpha=0.05, fast=True)

Using these objects, we find the occurances of bubbles in the data:

[6]:
bubbles: list[tuple[datetime64, datetime64 | None]] = psy.find_bubbles(alpha=0.05)
[7]:
plt.plot(stat.keys(), stat.values(), label="Test Stat.")
plt.plot(cval.keys(), cval.values(), linestyle="--", label="Crit. Val(95%)")
for i, bubble in enumerate(bubbles):
    plt.axvspan(
        bubble[0],
        bubble[1] if bubble[1] is not None else pdratio.index[-1],
        color="gray",
        alpha=0.3,
        zorder=-1,
        label="Bubble" if i == 0 else None,
    )
plt.legend()
plt.title("Test Stat. and Critical Value")
plt.xlabel("Date")
plt.ylabel("Test Stat.")
plt.show()
../_images/notebooks_replication_11_0.png
[8]:
plt.plot(pdratio, label="P/D Ratio")
for i, bubble in enumerate(bubbles):
    plt.axvspan(
        bubble[0],
        bubble[1] if bubble[1] is not None else pdratio.index[-1],
        color="gray",
        alpha=0.3,
        zorder=-1,
        label="Bubble" if i == 0 else None,
    )
plt.legend()
plt.title("Historic P/D Ratio with Bubbles")
plt.xlabel("Date")
plt.ylabel("P/D Ratio")
plt.show()
../_images/notebooks_replication_12_0.png
[9]:
bubbles_table: pd.DataFrame = pd.DataFrame(bubbles, columns=["start", "end"]).assign(
    duration=lambda x: x["end"] - x["start"],
)
bubbles_table
[9]:
start end duration
0 1879-11-01 1880-06-01 213 days
1 1917-11-01 1918-05-01 181 days
2 1929-01-01 1929-11-01 304 days
3 1955-07-01 1956-03-01 244 days
4 1959-02-01 1959-10-01 242 days
5 1987-02-01 1987-11-01 273 days
6 1996-11-01 2001-10-01 1795 days

Which match with the ones on the original paper (p.p. 1066).