#Example loss reserving data sets can be found at http://www.casact.org/research/index.cfm?fa=loss_reserves_data
#This example uses the other liability data from that site.
#This example uses the ChainLadder and plyr packages. Both must be installed before they can be loaded.

#Loads the ChainLadder package
library(ChainLadder)
#Loads the plyr package
library(plyr)

#Reading in data (comma-separated format) from the above website
#The first argument is the file to be read.
#header is TRUE if the file being read has column names in its first row
#sep is the character that separates each individual value being read in
OthLiabData <- read.csv("http://www.casact.org/research/reserve_data/othliab_pos.csv", header = TRUE, sep = ",")

###Summarizing the Data###
#Instead of keeping the data separated by insurance company, ddply sums the reported incurred losses,
#cumulative paid losses, and earned premiums across companies, then arranges the results by accident year,
#development year, and development lag.
#Note: The incurred loss data in the data file is the total loss reserve, and the bulk loss data is the IBNR data.
#Therefore, to get the reported incurred data, subtract the bulk loss data from the incurred loss data.
#Arguments in the function:
#OthLiabData is the data frame that will be split.
#.(…) lists the variables used to split the data frame
#summarise (not a misspelling) is a function that creates new columns for the summed data
#Each remaining argument sums one column within each subset and names the new output column
SumData <- ddply(OthLiabData, .(AccidentYear, DevelopmentYear, DevelopmentLag), summarise,
                 IncurLoss = sum(IncurLoss_H1 - BulkLoss_H1),
                 CumPaidLoss = sum(CumPaidLoss_H1),
                 EarnedPremDIR = sum(EarnedPremDIR_H1))

###Changing Data into a Triangle Format###
#To make the loss triangle, exclude any developments after 1997
OL <- SumData[SumData$DevelopmentYear < 1998, ]

#as.triangle organizes a data table into triangle format. Its arguments are:
#origin is the column that supplies the row names of the triangle
#dev is the column that supplies the column names of the triangle
#value is the column that fills the cells of the triangle, generally the claim amounts, paid losses, etc.
LossTri <- as.triangle(OL, origin = "AccidentYear", dev = "DevelopmentLag", value = "IncurLoss")
PaidTri <- as.triangle(OL, origin = "AccidentYear", dev = "DevelopmentLag", value = "CumPaidLoss")

#Prints data in triangle format
LossTri
PaidTri

###Mack Chain Ladder Function###
#The MackChainLadder model uses the chain ladder approach to predict ultimate and IBNR values for each
#row (in this case accident year) of a cumulative loss triangle. This model provides several methods for predicting
#the ultimates, IBNRs, and the standard error of the IBNRs. The default method predicts the ultimate values
#using chain ladder ratios with the assumption of no tail factor, and the standard errors of the ultimates are
#approximated using a log-linear model. The model also has the option to use two other ratios in place of the chain
#ladder ratio, which are the simple average and the weighted average of the development ratios. The Mack
#method is also available to estimate the standard error of the ultimates, and this method is described in detail in
#the following paper found at the address http://www.actuaries.org/LIBRARY/ASTIN/vol29no2/361.pdf.
#A tail factor can also be included in the predictions of the ultimate values, with the options of entering the tail
#estimate manually or using a log-linear regression to estimate it. Below is an example of the MackChainLadder model
#used to predict the ultimates, IBNRs, chain ladder ratios, and their standard errors for the CAS other liability data.
#Note: In order to use Mack's estimation for standard error, three assumptions need to be true. These are found on the
#MackChainLadder info page in the details section, which is accessible by entering ?MackChainLadder into R.

#MackChainLadder can take in the following arguments for the model:
#LossTri is the cumulative loss triangle.
#alpha - the ratio used in the prediction of ultimate values. alpha=1 (default) is the chain ladder ratio (the
#volume-weighted average), alpha=0 is the simple average of the development ratios, and alpha=2 is the weighted
#average of the development ratios obtained from an ordinary regression through the origin.
#weights - the default is 1, which sets the weight for each triangle loss amount to 1. Alternatively, the weight for a
#particular entry can be set to 0 to exclude that loss amount from the calculation of the chain ladder ratios. To do
#this, a separate triangle with the same dimensions as the loss triangle must be created with the desired weight
#values, 1 or 0, entered for each claim amount.
#tail - can be a logical or a numeric value. If tail=FALSE (default), no tail factor will be applied. If tail=TRUE, a
#tail factor will be estimated via a linear extrapolation of log(chainladderratios - 1). If tail is a numeric
#value (>1), then this value will be used instead.
#tail.se and tail.sigma - the standard error and the variation of the tail factor. Both are only needed if there
#is a tail factor and you have a numeric (>1) value to enter. Otherwise they are NULL and the model estimates them by
#a log-linear regression.
#est.sigma - the method used to estimate the standard error of the IBNRs.
#The default is "log-linear", or it can be estimated by Mack's method mentioned above by setting the
#argument to "Mack".

#Keep in mind that any argument not passed in takes its default value
MCL <- MackChainLadder(LossTri, est.sigma = "Mack")
MCL
##Prints the following columns of information per accident year (origin period):
##Latest - the claim amount for the last development period
##Dev.To.Date - the development to date, or the ratio of the latest over the predicted ultimate
##Ultimate - the predicted ultimate claim
##IBNR - the predicted IBNR reserve
##Mack.S.E. - the standard error of the predicted ultimate and IBNR. Since the estimate is unbiased (shown in Mack's
##1999 paper) and the S.E. given is equal to one standard deviation, a confidence interval for the true ultimate value
##can be found using the standard error and the predicted ultimate.
##CV(IBNR) - the coefficient of variation, or the ratio of the standard error over the predicted IBNR
#The bottom output gives the totals of the latest, ultimates, and IBNRs. It also gives the standard error of the total
#ultimate (this is not the total of the standard errors). The development to date factor is the ratio of the total
#latest to the total ultimate, and the CV(IBNR) is the total standard error as a percentage of the total IBNR.
#If the absolute value of the CV is greater than 25%, then another model or a log-linear regression should be used.

plot(MCL)
#Plots six different graphs
#Starting from the top left: a stacked bar chart of the latest claims position plus IBNR and Mack’s standard error
#by origin period; to its right, a plot of the forecasted development patterns for all origin periods (numbered,
#starting with 1 for the oldest origin period); and four residual plots. The residual plots show the standardised
#residuals against fitted values, origin period, calendar period and development period.
#The residual plots should be scattered with no pattern or direction for Mack's method of calculating the standard
#error to apply. Patterns could be the result of a trend that should be investigated further. More information on that
#can be found at this address http://www.casact.org/pubs/proceed/proceed00/00245.pdf

###Information contained in the MackChainLadder Model###
#The MackChainLadder model also stores other valuable data that you can print, such as the chain ladder ratios, the
#standard error of the chain ladder ratios, explained and unexplained variability, etc.
#This lists the data contained in the MackChainLadder variable
names(MCL)
#Any value listed under the "names" can be accessed and printed. Simply type the model name (MCL) followed by
#the "$" sign and then the name, as in the examples below.
#Prints the chain ladder ratios
MCL$f
#Prints the entire forecasted triangle
MCL$FullTriangle

###Munich Chain Ladder###
#The Munich chain ladder model predicts ultimate claims based on a cumulative paid and a cumulative incurred claims
#triangle. This model uses the correlation between incurred losses and paid losses to make future projections for both
#the total paid and incurred ultimate. The "Munich" ratios are calculated using chain ladder ratios, paid/incurred
#ratios, and the slope of the regression line in the residual plot of the incurred (or paid) losses. For a better idea
#of how these ratios are calculated, please read this paper http://www.variancejournal.org/issues/02-02/266.pdf.
#The standard error of the incurred and paid ultimate for this model can be calculated either by a log-linear
#regression (default), by Mack's method, which is the same as in the MackChainLadder model, or by a combination of
#the two. This model also has the option to include a tail factor in either ultimate calculation.
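#A minimal base-R sketch of two ingredients named in the paragraph above: volume-weighted chain ladder ratios
#and paid/incurred (P/I) ratios. The toy triangles and the helper name toy_cl_ratios are hypothetical and are
#not part of the ChainLadder package. This illustrates the idea only and is not the package's implementation.
toyPaid <- matrix(c(100, 150, 175,
                    110, 165,  NA,
                    120,  NA,  NA), nrow = 3, byrow = TRUE)
toyIncurred <- matrix(c(200, 210, 200,
                        220, 231,  NA,
                        240,  NA,  NA), nrow = 3, byrow = TRUE)
#For each pair of adjacent development periods, divide the summed later-period losses by the summed
#earlier-period losses, using only the origin periods observed in both
toy_cl_ratios <- function(tri) {
  sapply(seq_len(ncol(tri) - 1), function(k) {
    both <- !is.na(tri[, k]) & !is.na(tri[, k + 1])
    sum(tri[both, k + 1]) / sum(tri[both, k])
  })
}
#Chain ladder ratios of the toy paid and incurred triangles
toy_cl_ratios(toyPaid)
toy_cl_ratios(toyIncurred)
#P/I ratio for every observed cell; unobserved cells stay NA
toyPaid / toyIncurred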

#MunichChainLadder takes in the following arguments:
#A cumulative paid triangle followed by a cumulative incurred loss triangle
#est.sigmaI - how the standard error for the incurred loss triangle is calculated, either "log-linear" (default)
#or "Mack"
#est.sigmaP - how the standard error for the paid loss triangle is calculated, either "log-linear" (default) or "Mack"
#tailP - defines how the tail for the paid loss triangle is calculated. If TRUE, a log-linear regression is used to
#estimate the tail factor, or a numeric value (>1) can be entered as the tail factor. Otherwise it is FALSE (default)
#and no tail is included in the model.
#tailI - defines how the tail for the incurred loss triangle is calculated. If TRUE, a log-linear regression is used
#to estimate the tail factor, or a numeric value (>1) can be entered as the tail factor. Otherwise it is FALSE
#(default) and no tail is included in the model.

#Recall that if any argument is not present, then the default value is assumed.
MuCL <- MunichChainLadder(PaidTri, LossTri)
MuCL
#Prints the following output for each accident year (origin year):
#Latest Paid/Incurred - the latest development for the paid and incurred triangles
#P/I Ratio - the latest paid over the latest incurred development
#Ult. Paid/Ult. Incurred - the predicted ultimate values in the paid and incurred triangles
#Ult. P/I Ratio - the ratio of the predicted paid ultimate over the predicted incurred ultimate
#At the bottom under "Totals", the output gives the sums of the incurred and paid latest developments and the predicted
#total ultimate for each. It also gives the total latest P/I ratio and the predicted total ultimate P/I ratio.
#It is important to note that the ultimate P/I ratios should be close to 1. If they are not, then further investigation
#is needed in order to determine if this is a good model for the data.
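#A hedged base-R sketch of the "ultimate P/I ratios should be close to 1" check described above. The ultimate
#values and the 10% tolerance are hypothetical choices made for illustration only. They are not output of
#MunichChainLadder and not a threshold recommended by the package.
toyUltPaid <- c(1000, 1180, 900)
toyUltIncurred <- c(1010, 1200, 1150)
toyUltPI <- toyUltPaid / toyUltIncurred
#Assumed tolerance: flag origin years whose ultimate P/I ratio is more than 10% away from 1
toyTolerance <- 0.10
toyFlag <- abs(toyUltPI - 1) > toyTolerance
#One row per origin year; Investigate is TRUE where the paid and incurred ultimates disagree
data.frame(UltPaid = toyUltPaid, UltIncurred = toyUltIncurred,
           UltPI = round(toyUltPI, 3), Investigate = toyFlag)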

#Plotting the Munich results gives a quick overview of the data and lets you check the residual plots
plot(MuCL)
#Prints four graphs
#Starting from the top left: a bar chart of forecasted ultimate claims costs by Munich chain ladder (MCL) on paid and
#incurred data by origin period; the bar chart next to it compares the ratio of forecasted ultimate claims cost on
#paid and incurred data based on the Mack chain ladder and Munich chain ladder methods; the two residual plots at the
#bottom show the correlation of "Munich" (P/I) ratios against the paid chain ladder ratios and the correlation
#of "Munich" (P/I) ratios against the incurred chain ladder ratios.
#The residual plots should be random and show no pattern or direction. Otherwise, a better model might be needed and
#this matter should be investigated more.

#Like the Mack model, the Munich model stores multiple values.
#For example, the Munich forecasts of the paid and incurred triangles:
MuCL$MCLPaid
MuCL$MCLIncurred
#There are several other values contained in the model as well. These can be found with names(MuCL).
#To print any of the values, add the name after "MuCL$" as in the Mack example above.

###BootChainLadder###
#The BootChainLadder is a model that provides a predictive distribution for the IBNR values of a claims triangle.
#However, this model predicts IBNR values by a different method than the previous two models. First, the development
#factors are calculated and then used in a backwards recursion to predict values for the past loss triangle. Then the
#predicted values and the actual values are used to calculate Pearson residuals. The residuals are adjusted by a
#formula specified in appendix 3 of the following paper
#(http://www.actuaries.org.uk/system/files/documents/pdf/sm0201.pdf). Using the adjusted residuals and the predicted
#losses from before, the model solves for the actual losses in the Pearson formula and forms a new loss triangle.
#The steps for predicting past losses and residuals are then repeated for this new triangle. After that, the model
#uses chain ladder ratios to predict the future losses, then calculates the ultimate and IBNR values as in the
#previous Mack model. This cycle is performed R times, depending on the argument value in the model (the default is
#999 times). The IBNR for each origin period is calculated from each triangle (999 by default) and used to form a
#predictive distribution, from which summary statistics such as the mean, prediction error, and quantiles are obtained.

#The BootChainLadder model takes in the following arguments:
#The cumulative claims triangle
#R - the number of bootstraps (the default is 999)
#process.distr - the way the process error is calculated for each predicted IBNR value, with the options of
#"gamma" (default) and "od.pois" (over-dispersed Poisson)
B <- BootChainLadder(LossTri, R = 5000)
B
#The output has some of the same values as the Munich and Mack models.
#The Mean and SD IBNR are the average and the standard deviation of the predictive distribution of the IBNRs for each
#origin year.
#The output also gives the 75% and 95% quantiles of the predictive distribution of IBNRs. In other words, 75% or 95%
#of the predicted IBNRs lie at or below the given values.

###Actual Developments###
#Actual incurred loss triangle to compare with the models
#The ActualUlt variable is the sum of all the actual ultimates, to compare to the predicted total ultimate
actualTri <- as.triangle(SumData, origin = "AccidentYear", dev = "DevelopmentLag", value = "IncurLoss")
ActualUlt <- sum(actualTri[, 10])
actualTri
ActualUlt
#The actual losses can be compared to the predicted losses from the chain ladder method
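
#One possible way to make that comparison, as a small base-R sketch. compare_ultimates is a hypothetical helper
#written for this illustration (it is not a ChainLadder function), and the numbers in the first call are
#made-up toy values rather than results from the data above.
compare_ultimates <- function(predicted, actual) {
  error <- predicted - actual
  #A positive Error means the model over-predicted the ultimate
  c(Predicted = predicted, Actual = actual, Error = error, PctError = 100 * error / actual)
}
compare_ultimates(predicted = 1050, actual = 1000)
#With the objects created earlier in this script, and assuming the 10th column of MCL$FullTriangle holds the
#predicted ultimates (a 10-column triangle fitted with no tail factor), the call would be:
#compare_ultimates(sum(MCL$FullTriangle[, 10]), ActualUlt)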