#Example Loss Reserving data sets can be found here http://www.casact.org/research/index.cfm?fa=loss_reserves_data
#This example uses the other liability data from the site given above.
#This example uses the ChainLadder and plyr packages. Both need to be installed before they can be loaded.
#Loads ChainLadder package
library(ChainLadder)
#Loads plyr package
library(plyr)
#Read the data (comma separated format) from the above website.
#The first argument is the file (or URL) to be read.
#header is TRUE if the first row of the file contains the column names.
#sep is the character that separates each individual value being read in.
OthLiabData <- read.csv("http://www.casact.org/research/reserve_data/othliab_pos.csv", header = TRUE, sep = ",")
###Summarizing the Data###
#Instead of keeping the data separated by insurance company, this call sums the reported incurred losses, cumulative
#paid losses, and earned premiums across companies for each combination of accident year, development year, and
#development lag.
#Note: The incurred loss column in the data file is the total incurred loss, which includes the bulk and IBNR reserve
#held in the bulk loss column. Therefore, to get the reported incurred data, subtract the bulk loss from the incurred
#loss.
#Arguments in the function:
#OthLiabData is the data frame that will be split.
#.(...) lists the variables by which the data frame is split.
#summarise (not a misspelling) is the function that creates new columns for the summed data.
#The remaining arguments sum each subset of the data and name the new columns of output.
#Note: Column names are case sensitive; if the call fails, compare them against names(OthLiabData).
SumData <- ddply(OthLiabData, .(AccidentYear, DevelopmentYear, DevelopmentLag), summarise,
                 IncurLoss = sum(IncurLoss_H1 - BulkLoss_H1),
                 CumPaidLoss = sum(CumPaidLoss_H1),
                 EarnedPremDIR = sum(EarnedPremDIR_H1))
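#A minimal sketch of what the ddply call above does, using a made-up data frame. ToyData and its column names are
#hypothetical and are not part of the CAS file. Two companies report the same accident year, and the rows are summed
#across companies after the bulk amount is subtracted.
library(plyr)
ToyData <- data.frame(Company = c("A", "A", "B", "B"),
                      AccidentYear = c(1988, 1988, 1988, 1988),
                      DevelopmentLag = c(1, 2, 1, 2),
                      Incurred = c(100, 150, 200, 260),
                      Bulk = c(40, 30, 80, 60))
ToySum <- ddply(ToyData, .(AccidentYear, DevelopmentLag), summarise, Reported = sum(Incurred - Bulk))
#One row per accident year and development lag, with the reported amounts summed across companies
ToySum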
###Changing data into a triangle format###
#To make the loss triangle, exclude any development years after 1997
OL <- SumData[SumData$DevelopmentYear<1998,]
#as.triangle organizes a data table into triangle format. Its arguments are:
#origin is the name of the column whose values become the row names of the triangle
#dev is the name of the column whose values become the column names of the triangle
#value is the name of the column whose values fill the triangle, generally the claim amounts, paid losses, etc.
LossTri <- as.triangle(OL, origin="AccidentYear", dev="DevelopmentLag", value="IncurLoss")
PaidTri <- as.triangle(OL, origin="AccidentYear", dev="DevelopmentLag", value="CumPaidLoss")
#Prints data in triangle format
LossTri
PaidTri
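#A small illustration of the long-to-triangle conversion, assuming a made-up long table (ToyLong and its values are
#hypothetical). Cells that have not been observed yet are filled with NA.
library(ChainLadder)
ToyLong <- data.frame(origin = c(2001, 2001, 2001, 2002, 2002, 2003),
                      dev = c(1, 2, 3, 1, 2, 1),
                      value = c(100, 150, 175, 110, 168, 120))
ToyTri <- as.triangle(ToyLong, origin = "origin", dev = "dev", value = "value")
#Three origin years by three development lags, with NA below the latest diagonal
ToyTri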
###Mack Chain Ladder Function###
#The MackChainLadder model uses the chain ladder approach to predict ultimate and IBNR values for each row (in this
#case accident year) of a cumulative loss triangle, together with the standard error of those predictions. By default
#the model predicts the ultimate values using chain ladder ratios with no tail factor, and the variability parameter
#(sigma) of the last development period, which enters the standard error, is estimated using a log linear regression.
#The model also has the option to use two other ratios in place of the chain ladder ratio: the simple average of the
#development ratios and the ratio from a regression through the origin. Mack's approximation is also available to
#estimate the last sigma; it is described in detail in the paper found at
#http://www.actuaries.org/LIBRARY/ASTIN/vol29no2/361.pdf. A tail factor can also be included in the predictions of the
#ultimate values, either entered manually or estimated by a log linear regression. Below is an example of the
#MackChainLadder model used to predict the ultimates, IBNRs, chain ladder ratios, and their standard errors for the
#CAS other liability data.
#Note: In order to use Mack's estimation for standard error, three assumptions need to be true. These are found in the
#Details section of the MackChainLadder help page, which is accessible by entering ?MackChainLadder into R.
#MackChainLadder can take in the following arguments for the model:
#Triangle - the cumulative loss triangle (LossTri in this example).
#alpha - the ratio used in the prediction of ultimate values: alpha=1 (default) is the chain ladder ratio, alpha=0
#is the simple average of the development ratios, and alpha=2 is the ratio from an ordinary regression through the
#origin (a weighted average of the development ratios with the squared cumulative losses as weights).
#weights - the default is 1, which sets the weight for each triangle loss amount to 1. Alternatively, the weight for a
#particular entry can be set to 0 to exclude that loss amount from the calculation of the chain ladder ratios. To do
#this, a separate triangle with the same dimensions as the loss triangle must be created with the desired weight
#values, 1 or 0, entered for each claim amount.
#tail - can be logical or a numeric value. If tail=FALSE no tail factor will be applied (default), if tail=TRUE a tail
#factor will be estimated via a linear extrapolation of log(chain ladder ratios - 1), and if tail is a numeric value
#(>1) then this value will be used instead.
#tail.se and tail.sigma - the standard error and the variability of the tail factor. Both are only needed if a numeric
#tail factor (>1) is entered. Otherwise they are NULL and the model estimates them by a log linear regression.
#est.sigma - the method used to estimate sigma for the last development period, which feeds into the standard error
#of the IBNRs. Default is "log-linear", or Mack's approximation mentioned above can be used by setting the argument
#to "Mack".
#Keep in mind that any argument not passed in takes its default value.
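#The three alpha choices can be illustrated by hand on a toy triangle. This is a base R sketch, not the package's own
#code; ToyCum and ToyDevFactor are hypothetical names. Each ratio is a weighted average of the individual development
#ratios, with weights equal to the cumulative loss raised to the power alpha.
ToyCum <- matrix(c(100, 150, 175,
                   110, 168,  NA,
                   120,  NA,  NA), nrow = 3, byrow = TRUE)
ToyDevFactor <- function(tri, k, alpha) {
  seen <- !is.na(tri[, k + 1])             #rows where development k to k+1 has been observed
  loss <- tri[seen, k]                     #cumulative loss at development k
  ratio <- tri[seen, k + 1] / loss         #individual development ratios
  sum(loss^alpha * ratio) / sum(loss^alpha)
}
ToyDevFactor(ToyCum, 1, alpha = 1)  #chain ladder ratio: (150 + 168) / (100 + 110)
ToyDevFactor(ToyCum, 1, alpha = 0)  #simple average of 150/100 and 168/110
ToyDevFactor(ToyCum, 1, alpha = 2)  #regression through the origin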
#The argument is spelled out in full as est.sigma
MCL <- MackChainLadder(LossTri, est.sigma="Mack")
MCL
##Prints the following columns of information per accident year (origin period):
##Latest - the latest observed cumulative claim amount (the last diagonal of the triangle)
##Dev.To.Date - the development to date, or the ratio of the latest over the predicted ultimate
##Ultimate - the predicted ultimate claim
##IBNR - the predicted IBNR reserve (ultimate minus latest)
##Mack.S.E - the standard error of the predicted IBNR. Because the latest amount is known, it is also the standard
##error of the predicted ultimate (see Mack's 1999 paper). Combined with a distributional assumption (for example
##normal or lognormal), it can be used to build an approximate confidence interval around the predicted ultimate.
##CV(IBNR) - the coefficient of variation, or the ratio of the standard error over the predicted IBNR
#The bottom output gives the totals of the latest amounts, ultimates, and IBNRs. It also gives the standard error of
#the total ultimate (this is not the total of the standard errors). The development to date factor is the ratio of the
#total latest over the total ultimate, and the CV(IBNR) is the total standard error as a proportion of the total IBNR.
#As a rule of thumb, if the absolute value of the CV is greater than 25%, then another model or a log linear
#regression should be considered.
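#The output columns can be reproduced by hand on a toy triangle. This is a base R sketch with hypothetical names and
#values (ToyObs, ToySE, and so on), using plain chain ladder ratios and no tail factor. The standard error of 30 is
#made up, and the interval assumes a normal distribution, which is only one possible choice.
ToyObs <- matrix(c(100, 150, 175,
                   110, 168,  NA,
                   120,  NA,  NA), nrow = 3, byrow = TRUE)
ToyFull <- ToyObs
for (k in 1:(ncol(ToyFull) - 1)) {
  seen <- !is.na(ToyObs[, k + 1])
  ratio_k <- sum(ToyObs[seen, k + 1]) / sum(ToyObs[seen, k])   #chain ladder ratio for development k
  fill <- is.na(ToyFull[, k + 1])
  ToyFull[fill, k + 1] <- ToyFull[fill, k] * ratio_k            #project the missing cells forward
}
ToyLatest <- apply(ToyObs, 1, function(x) tail(x[!is.na(x)], 1))
ToyUltimate <- ToyFull[, ncol(ToyFull)]
ToyIBNR <- ToyUltimate - ToyLatest
ToyDevToDate <- ToyLatest / ToyUltimate
data.frame(Latest = ToyLatest, Dev.To.Date = ToyDevToDate, Ultimate = ToyUltimate, IBNR = ToyIBNR)
ToySE <- 30                                             #hypothetical standard error of the total IBNR
ToyCV <- ToySE / sum(ToyIBNR)                           #coefficient of variation of the total IBNR
ToyInterval <- sum(ToyIBNR) + c(-1, 1) * qnorm(0.975) * ToySE  #approximate 95% interval, normal assumption
ToyCV
ToyInterval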
plot(MCL)
#Plots six different graphs.
#Starting from the top left: a stacked bar chart of the latest claims position plus IBNR and Mack's standard error by
#origin period; to its right, a plot of the forecasted development patterns for all origin periods (numbered, starting
#with 1 for the oldest origin period); and four residual plots. The residual plots show the standardised residuals
#against fitted values, origin period, calendar period, and development period.
#For Mack's method of calculating the standard error to apply, the residual plots should be scattered with no pattern
#or direction. A pattern could be the result of a trend that should be investigated further. More information on that
#can be found at this address http://www.casact.org/pubs/proceed/proceed00/00245.pdf
###Information contained in the MackChainLadder Model###
#The MackChainLadder also stores other valuable data that you can print such as the chain ladder ratios, the standard
#error of the chain ladder ratios, explained and unexplained variability, etc.
#This lists the data contained in the MackChainLadder variable
names(MCL)
#Any values listed under the "names" can be accessed and printed. Simply type the model name(MCL) followed by
#the "$" sign and then type the name like the examples below.
#Prints the chain ladder ratios
MCL$f
#This prints the entire forecasted triangle
MCL$FullTriangle
###Munich Chain Ladder###
#The Munich chain ladder model predicts ultimate claims based on a cumulative paid claims triangle and a cumulative
#incurred claims triangle. This model uses the correlation between incurred losses and paid losses to make future
#projections for both the total paid and incurred ultimate. The "Munich" ratios are calculated using chain ladder
#ratios, paid/incurred ratios, and the slope of the regression line in the residual plot of the incurred (or paid)
#losses. For a better idea of how these ratios are calculated, please read this paper:
#http://www.variancejournal.org/issues/02-02/266.pdf. The sigma parameter for the last development period of the
#incurred and paid triangles can be estimated either by a log linear regression (default) or by Mack's method, as in
#the MackChainLadder model, and the two triangles may use different methods. This model also has the option to
#include a tail factor in either ultimate calculation.
#MunichChainLadder takes in the following arguments:
#Paid and Incurred - a cumulative paid loss triangle and a cumulative incurred loss triangle, in that order
#est.sigmaI - how sigma for the incurred loss triangle is estimated, either "log-linear" (default) or "Mack"
#est.sigmaP - how sigma for the paid loss triangle is estimated, either "log-linear" (default) or "Mack"
#tailP - defines how the tail for the paid loss triangle is calculated. If TRUE, a log linear regression is used to
#estimate the tail factor, or a numeric value (>1) can be entered as the tail factor. Otherwise it is FALSE (default)
#and no tail is included in the model.
#tailI - defines how the tail for the incurred loss triangle is calculated. If TRUE, a log linear regression is used
#to estimate the tail factor, or a numeric value (>1) can be entered as the tail factor. Otherwise it is FALSE
#(default) and no tail is included in the model.
#Recall that if any argument is not present then the default value is assumed.
MuCL <- MunichChainLadder(PaidTri,LossTri)
MuCL
#Prints the following output for each accident year(origin year):
#Latest Paid/Incurred - It is the latest development for the paid and incurred triangles
#P/I Ratio - It is the latest paid over the latest incurred developments
#Ult. Paid/Ult. Incurred - They are the predicted ultimate values in the paid and incurred triangles
#Ult. P/I Ratio - It is the ratio of the predicted paid ultimate over the predicted incurred ultimate
#At the bottom under "Totals", the output gives the sums of the incurred and paid last developments and the predicted
#total ultimate for each. It also gives the total latest P/I ratio and the predicted total ultimate P/I ratio.
#It is important to note that the ultimate P/I ratios should be close to 1. If they are not, then further investigation is
#needed in order to determine if this is a good model for the data.
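#A sketch of the P/I check described above, using made-up ultimates. ToyPI and ToyTolerance are hypothetical; the 10%
#tolerance is an arbitrary illustration of "close to 1" and is not a value taken from the package or the paper.
ToyPI <- data.frame(UltPaid = c(980, 1500, 700),
                    UltIncurred = c(1000, 1490, 900))
ToyPI$UltPIRatio <- ToyPI$UltPaid / ToyPI$UltIncurred
ToyTolerance <- 0.10
#Flag origin years whose ultimate P/I ratio is further from 1 than the tolerance
ToyPI$Investigate <- abs(ToyPI$UltPIRatio - 1) > ToyTolerance
ToyPI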
#Plotting the Munich results serves to give a quick overview of the data, and to check the residual plots
plot(MuCL)
#Plots four graphs.
#Starting from the top left: a bar chart of forecasted ultimate claims costs by Munich chain ladder (MCL) on paid and
#incurred data by origin period; the bar chart next to it compares the ratio of forecasted ultimate claims cost on
#paid and incurred data based on the Mack chain ladder and Munich chain ladder methods; the two residual plots at the
#bottom show the correlation of "Munich" (P/I) ratios against the paid chain ladder ratios and the correlation of
#"Munich" (P/I) ratios against the incurred chain ladder ratios.
#The residual plots should be random and show no pattern or direction. Otherwise, a better model might be needed and
#the matter should be investigated further.
#Like the Mack model, multiple values are stored in the Munich model.
#For example, the Munich forecasts of the paid and incurred triangles:
MuCL$MCLPaid
MuCL$MCLIncurred
#There are several other values contained in the model as well; these can be found with the function names(MuCL).
#To print any of the values, add the name after "MuCL$" as in the Mack example above.
###BootChainLadder###
#BootChainLadder is a model that provides a predictive distribution for the IBNR values of a claims triangle. However,
#this model predicts IBNR values by a different method than the previous two models. First, the development factors
#are calculated and then used in a backwards recursion to predict values for the past loss triangle. Then the
#predicted values and the actual values are used to calculate Pearson residuals. The residuals are adjusted by a
#formula specified in appendix 3 of the following paper
#(http://www.actuaries.org.uk/system/files/documents/pdf/sm0201.pdf). Using the adjusted residuals and the predicted
#losses from before, the model solves for the actual losses in the Pearson formula and forms a new loss triangle. The
#steps for predicting past losses and residuals are then repeated for this new triangle. After that, the model uses
#chain ladder ratios to predict the future losses and then calculates the ultimate and IBNR values as in the Mack
#model. This cycle is performed R times, where R is an argument of the model (default is 999). The IBNR for each
#origin period is calculated from each of the R triangles and used to form a predictive distribution, from which
#summary statistics such as the mean, prediction error, and quantiles are obtained.
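#The Pearson residual step can be sketched in base R on a handful of made-up cells. ToyActual and ToyFitted are
#hypothetical values, and this sketch uses the unadjusted residual (actual - fitted) / sqrt(fitted); the adjustment
#from the paper's appendix is left out.
set.seed(1)
ToyActual <- c(100, 50, 25, 110, 58, 120)
ToyFitted <- c(101.5, 51.2, 22.3, 108.5, 56.8, 120.0)
ToyResid <- (ToyActual - ToyFitted) / sqrt(ToyFitted)
#Resample the residuals with replacement, then solve the Pearson formula for a new set of pseudo losses
ToyResampled <- sample(ToyResid, length(ToyResid), replace = TRUE)
ToyPseudo <- ToyFitted + ToyResampled * sqrt(ToyFitted)
ToyPseudo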
#The BootChainLadder model takes in the following arguments:
#Triangle - the cumulative claims triangle
#R - the number of bootstrap replicates (the default is 999)
#process.distr - the distribution used to simulate the process error for each predicted IBNR value, with the options
#"gamma" (default) and "od.pois" (over-dispersed Poisson)
B <- BootChainLadder(LossTri, R=5000)
B
#The output has some of the same values as the Munich and Mack models.
#Mean IBNR and SD IBNR are the mean and the standard deviation of the predictive distribution of the IBNRs for each
#origin year.
#The output also gives the 75% and 95% quantiles of the predictive distribution of IBNRs; in other words, 75% or 95%
#of the simulated IBNRs lie at or below the given values.
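#How these summary statistics relate to a set of simulated IBNRs can be seen on a made-up sample. ToySimIBNR is
#hypothetical and is drawn from a gamma distribution purely for illustration; it is not output from BootChainLadder.
set.seed(42)
ToySimIBNR <- rgamma(5000, shape = 4, rate = 0.004)
ToyMeanIBNR <- mean(ToySimIBNR)
ToySDIBNR <- sd(ToySimIBNR)
ToyQuantiles <- quantile(ToySimIBNR, probs = c(0.75, 0.95))
ToyMeanIBNR
ToySDIBNR
ToyQuantiles
#Share of simulated IBNRs at or below the 95% quantile
mean(ToySimIBNR <= ToyQuantiles[[2]])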
###Actual Developments###
#Actual incurred loss triangle, built from all development years, to compare with the models
#The ActualUlt variable is the sum of all the actual ultimates (development lag 10) to compare to the predicted total
#ultimate
actualTri <- as.triangle(SumData, origin="AccidentYear", dev="DevelopmentLag", value="IncurLoss")
ActualUlt <- sum(actualTri[,10])
actualTri
ActualUlt
#The actual losses can be compared to the predicted losses by the chain ladder method
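#One possible way to make that comparison, sketched under the assumption that the predicted total ultimate is the sum
#of the last column of the forecasted triangle. CompareUltimates is a hypothetical helper, not a ChainLadder function.
CompareUltimates <- function(predicted, actual) {
  c(Predicted = predicted, Actual = actual, Error = predicted - actual,
    PctError = 100 * (predicted - actual) / actual)
}
#Toy numbers: a prediction of 1050 against an actual of 1000 overshoots by 5 percent
CompareUltimates(1050, 1000)
#Applied to the Mack model fitted earlier in this script (runs only if MCL and ActualUlt exist)
if (exists("MCL") && exists("ActualUlt")) {
  print(CompareUltimates(sum(MCL$FullTriangle[, ncol(MCL$FullTriangle)]), ActualUlt))
}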