This tutorial assumes that you already know the basics of object-oriented programming in general and have a working knowledge of Python. If you are not familiar with object-oriented programming, you should read the tutorial on Python classes first.
Useful links: https://realpython.com/python3-object-oriented-programming/
Dataclasses are a way to create classes that are lightweight and have a lot of the functionality of a class without the overhead of a class. They are useful for creating classes that are used to store data. They are also useful for creating classes that are used to store data that is used to create other classes. The dataclasses are available through built-in dataclasses module in Python 3.7+.
from dataclasses import dataclass
Let us create a dataclass that represents a bond. This can be achieved as follows:
@dataclass
class Bond:
"""
The dataclass that represents a bond.
## Parameters:
rate: The rate of interest on the bond.
duration: The duration of the bond.
face_value: The face value of the bond.
## Properties:
price: The present value of the future cash flows.
"""
rate: float
duration: float
face_value: float
@property
def price(self) -> float:
return self.face_value / (1 + self.rate) ** self.duration
The property decorator is used to store a computed attribute within a dataclass. The same code logic as above but without the dataclass decorator.
class Bond:
def __init__(self) -> None:
self.rate = 0.0
self.duration = 0.0
self.face_value = 0.0
def CalculatePresentValue(self) -> None:
self.price = self.face_value / (1 + self.rate) ** self.duration
One of the most powerful features with the Python dataclasses are that they support many features of object oriented programming. This would become clearer with the example of mortgages below. Mortgages are also a subclass of bonds, therefore we will simply inherit Bonds into a new dataclass called mortgages.
@dataclass
class Mortgage (Bond):
pass
Now you can probably see that we have a structure. We have implemented programmatically that a mortgage is actually a bond and it can also have additional features. Mortgages will also have attributes that are specific to mortgages. We will add the down payment and the monthly payment to the mortgage class. Let us add those features and define the mortgage class again.
@dataclass
class Mortgage(Bond):
down_payment: float
@property
def monthly_payment(self) -> float:
monthly_rate = self.rate / 12
return (self.face_value - self.down_payment) * monthly_rate / (1 - (1 + monthly_rate) ** (-self.duration * 12))
# We also need to override the price property. The price of the mortgage is the present value of the monthly payments plus the down payment. This is equal to the face value or the amount borrowed.
@property
def price(self):
return self.face_value
Now let us create a mortgage object. Note that we need to supply 4 parameters: the rate, duration, face value and down payment. Three of these parameters are inherited from the Bond class. The down payment is a new parameter that is specific to the Mortgage class. Mortgage class also has an additional property attribute called monthly payment.
fixed_mortgage = Mortgage(rate=0.05, duration=30, face_value=100000, down_payment=20000)
fixed_mortgage.monthly_payment
But mortgages can either be fixed rate or floating rate. We will now inherit from mortgage and create a floating rate mortgage. Now you could be probably getting a feel of how powerful dataclasses could be.
@dataclass
class FloatingRateMortgage(Mortgage):
@property
def monthly_payment(self) -> float:
if(len(self.rate) != self.duration * 12):
raise ValueError("The number of monthly rates is not equal to the number of months in the mortgage duration.")
# Now we will have a list of monthly payments. We will have to calculate the monthly payment for each month based on the monthly rate.
return [(self.face_value - self.down_payment) * (rate/12) / (1 - (1 + (rate/12)) ** (-self.duration * 12)) for rate in self.rate]
We will now create a floating rate mortgage object. We will assume that the monthly annualised rate of interest is uniformly distributed between 4% and 12%. Note that the rate of interest is annualised and this annualised rate changes every month.
from numpy.random import uniform
floating_mortgage = FloatingRateMortgage(duration=30, face_value=100000, down_payment=20000, rate=uniform(0.04, 0.12, 30*12))
floating_mortgage.monthly_payment[0:12] # We will print the first year of monthly payments for sake of brevity.
In the above example, we assumed that the rate of interest is uniformly distributed between 4% and 12%. This is quite unrealistic, but that is not the point. The point is to show, how powerful and convenient datatclasses can be. We used dataclasses to create a class (eg: Mortgage) that inherits from another class (eg: Bond) and add additional attributes and methods to the inherited class. By the way, you can also use your own interest rates or source the list of interest rates from any API such as FRED. You can also extend this code to create a useful application that would simulate different interest rate environments and show you how the mortgage payments would change or maybe something else (imagination is the limit!). That is the beauty of Python! You can do powerful things with simple and readable tools.
Now let us look at other features that dataclasses provide. Suppose you are a small bank and you have given out 5 housing loans, with amounts of 50000, 90000 and $100000 respectively. All of these loans have a fixed interest rate of 5% and and a fixed duration of 30 years. The down payment for each loan is 20% of the loan amount.
loans = [10000, 50000, 60000, 90000, 100000]
mortgage_portfolio = [Mortgage(rate=0.05, duration=30, face_value=face_value, down_payment=0.2 * face_value) for face_value in loans]
mortgage_portfolio
We have created a list of mortgage objects. Suppose we want to filter out the mortgages that have a face value greater than $50,000. Filtering the list of dataclasses is same as filtering any other list.
filtered_portfolio = list(filter(lambda loan: loan.face_value > 50000, mortgage_portfolio))
filtered_portfolio
Or maybe you want to filter only those portfolios that have a monthly payment greater than $100
filtered_portfolio = list(filter(lambda loan: loan.monthly_payment > 100, mortgage_portfolio))
filtered_portfolio
We can also sort dataclasses using the sorted function. We will sort the mortgage portfolio based on the monthly payment in descending order.
sorted_portfolio = sorted(mortgage_portfolio, key=lambda loan: loan.monthly_payment, reverse=True)
sorted_portfolio
Anything that you can do with python lists, you can do with the list of dataclasses. For example, we can use the map function to add an attribute of amortization schedule to each mortgage object in the portfolio.
Let us define a function that will calculate the amortization schedule for a mortgage.
def CalculateAmortizationSchedule(mortgage: Mortgage) -> list[dict]:
monthly_rate = mortgage.rate / 12
monthly_payment = mortgage.monthly_payment
balance = mortgage.face_value - mortgage.down_payment
amortization_schedule = []
for i in range(0, mortgage.duration * 12):
interest = balance * monthly_rate
principal = monthly_payment - interest
balance = balance - principal
amortization_schedule.append({"period": i+1, "interest":interest, "principal": principal, "unpaid_balance":balance})
return amortization_schedule
Now let us use the map function to add the amortization schedule to each mortgage object in the portfolio.
for mortgage in mortgage_portfolio:
mortgage.amortization_schedule = CalculateAmortizationSchedule(mortgage)
mortgage_portfolio[0].amortization_schedule[0:4] # We will print the first four months of the amortization schedule for the first mortgage in the portfolio.
import matplotlib.pyplot as plt
schedule_plots = [plt.plot([schedule["period"] for schedule in mortgage.amortization_schedule], [schedule["unpaid_balance"] for schedule in mortgage.amortization_schedule], label=f"Loan {i+1}") for i, mortgage in enumerate(mortgage_portfolio)]
plt.title("Amortization Schedule for all the loans in the portfolio")
plt.xlabel("Months")
plt.ylabel("Unpaid Balance")
plt.legend()
plt.show()
There are multiple ways to achieve the same thing in Python. Dataclasses are useful but are not a substitute for pandas dataframes. Dataclasses do not have many out-of-the-box features as pandas dataframes provide such as rich support for date-time libraries and plotting. If the data makes more sense in a tabular format, then pandas dataframes are the way to go. For example, if you need time-series features like moving averages, differencing, etc., then do not use dataclasses, unless you are willing to write the code to implement these features.
However, dataclasses are useful when we want to create a simple data structure that is easy to understand and maintain.