Adopting Innovative Insurance Data Models

Thomas Fletcher, PhD, VP, Data Analytics, North America Life, PartnerRe [NYSE:PRE-J]

While data has always been used in insurance underwriting, much of the industry has undergone a significant transformation. In the life insurance industry, for example, underwriters have long drawn on medical exams, medical records, prescription histories, motor vehicle records, and so forth. In more recent years, many more sources of data have emerged and are being leveraged across more cases: lower face amounts, acceleration of the underwriting process, and forgoing some medical tests.

Many years ago, my then-boss said of evaluating a vendor-produced model: “Protect the company; but, while they may build the model differently than you would, that doesn’t make it incorrect.” There are seemingly conflicting messages here, so below I share lessons from a couple of decades spent evaluating the innovative tools brought to us. Data can be viewed much like penicillin. Revolutionary in its potential impact, but not all mold is valuable. In a world where barbers are more heavily regulated than data scientists, what should you do if your boss asks you to protect the company and ensure the product you are evaluating is worthwhile?

Data Availability and Consistency

It is not uncommon to start a project only to realize that the data used for modeling aren’t available in production. Availability in production is, of course, a necessary condition. Likewise, the source of the data and the data themselves must be consistent. Imagine tracking some phenomenon while the measuring stick keeps changing (imperial vs. metric): the same reading from day to day might have different meanings. When evaluating whether a new source (data, model, or tool) contributes value, we should ensure that the lineage and reliability of the data are known.
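
As a concrete illustration, a simple drift check such as the population stability index (PSI) can flag when a field no longer means what it meant at model-build time. The sketch below is illustrative only: the DataFrame names are hypothetical, and the 0.25 cutoff is a common rule of thumb, not a standard.

```python
import numpy as np
import pandas as pd

def population_stability_index(expected, actual, bins=10):
    """Compare a feature's distribution at model-build time vs. production.

    A large PSI suggests the 'measuring stick' has moved: the same field
    no longer means what it meant when the model was built.
    """
    expected, actual = expected.dropna(), actual.dropna()
    # Bin edges come from the development (expected) data
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid log(0) in sparse bins
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Hypothetical usage: dev_df holds the modeling data, prod_df the production feed
# psi = population_stability_index(dev_df["bmi"], prod_df["bmi"])
# A PSI above ~0.25 is a common rule of thumb for a meaningful shift.
```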

Data Relevance

To the extent possible, one should make an effort to link any data used to the outcome in a conceptual sense. The causal chain may be long and full of dependencies, but the data should be relevant nonetheless, and relevance can usually be determined by domain experts. As a stretch example, consider contextual factors such as poverty, family history, and access to healthcare. The chain of events may be long, but one can work through the logic to show relevance: contextual factors lead to certain lifestyle characteristics, which lead to specific behaviors (exercise, eating habits) that, in turn, affect the body (BMI, cholesterol, blood sugar). I have seen many people argue that the correlation alone is good enough. That may hold until one realizes that other factors are driving the correlation, or until one has to defend the use of such data to a regulator.
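
One quick way to test whether other factors are driving a correlation is a partial correlation: remove the linear effect of a suspected confounder and see whether the relationship survives. The sketch below is a simplified illustration; the column names are hypothetical.

```python
import numpy as np
import pandas as pd

def partial_corr(df, x, y, control):
    """Correlation between x and y after removing the linear effect of `control`.

    If the raw correlation collapses once the confounder is controlled for,
    the 'other factor' was likely doing the work.
    """
    def residuals(col):
        # Regress the column on the control variable; keep what's left over
        slope, intercept = np.polyfit(df[control], df[col], 1)
        return df[col] - (slope * df[control] + intercept)
    return residuals(x).corr(residuals(y))

# Hypothetical usage with made-up column names:
# raw = df["magazine_subscriptions"].corr(df["mortality_score"])
# adj = partial_corr(df, "magazine_subscriptions", "mortality_score", "income")
# If adj is near zero while raw was large, income was driving the correlation.
```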

Is the tool valid? 

Validity simply refers to whether the tool or model measures what it purports to measure. If the model is designed to identify non-disclosure (e.g., smokers who claim to be non-smokers), then it should do so. Generally, a model is designed with one target in mind, and there should be solid empirical evidence that the model indeed captures that target (and not something else). Validity can be demonstrated in both an empirical and a rational sense. Likewise, it should be considered in concert with other tools: will the model or tool have incremental validity above any current processes?
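
Incremental validity can be checked empirically by comparing discrimination with and without the new score. The sketch below uses scikit-learn with a logistic regression as the combining model; the input names and the 70/30 split are assumptions for illustration, not a prescribed method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical inputs: X_base holds features of the current process,
# new_score is the vendor tool's output, y flags the target
# (e.g., confirmed smoker non-disclosure).
def incremental_auc(X_base, new_score, y, seed=0):
    """Does the new score improve discrimination beyond the current process?"""
    X_aug = np.column_stack([X_base, new_score])
    Xb_tr, Xb_te, Xa_tr, Xa_te, y_tr, y_te = train_test_split(
        X_base, X_aug, y, test_size=0.3, random_state=seed, stratify=y)
    base = LogisticRegression(max_iter=1000).fit(Xb_tr, y_tr)
    aug = LogisticRegression(max_iter=1000).fit(Xa_tr, y_tr)
    auc_base = roc_auc_score(y_te, base.predict_proba(Xb_te)[:, 1])
    auc_aug = roc_auc_score(y_te, aug.predict_proba(Xa_te)[:, 1])
    return auc_base, auc_aug  # the gap is, roughly, the incremental validity
```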

"Data can be viewed much like penicillin. Revolutionary in its potential impact, but not all mold is valuable" 

Utility – will it be useful? 

Not all models have to be empirically strong to yield value. To the extent that a model adds unique information (incremental validity, as above), it may prove useful. I have seen strong models go unused because no case was made for what they would solve or the benefits gained from their use, and I have seen moderate models put into place because of the utility they bring.

A model’s usefulness also depends on how any thresholds are set. The same model could be deployed at two different companies, each with different goals and different thresholds, and one company may find utility where the other does not. Likewise, threshold setting can affect fairness concerns – see below. Goals for model impact may also vary widely by application or function (e.g., marketing vs. underwriting). In life insurance, the model would typically help to (a) achieve similar mortality results with greater efficiency and, therefore, more throughput (a higher number of applications resulting in underwritten policies) or (b) achieve lower mortality results from a similar number of applications. Other goals can be met, of course, but the point is that careful attention to threshold setting directly determines which of these goals is achieved, as the sketch below illustrates. To be useful, the model should be sound enough to have the flexibility to achieve its stated goals.
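
A minimal way to see the trade-off is to sweep candidate cutoffs and watch throughput and mortality move against each other. The variable names and the retrospective mortality indicator below are hypothetical.

```python
import numpy as np

# Hypothetical inputs: score is the model's risk score (higher = riskier),
# death is a 1/0 early-mortality indicator from a retrospective study.
def threshold_tradeoff(score, death, thresholds):
    """For each cutoff, report throughput vs. mortality among accelerated cases.

    Goal (a): maximize throughput while holding today's mortality.
    Goal (b): lower mortality while holding today's throughput.
    """
    rows = []
    for t in thresholds:
        accelerated = score <= t                    # cases skipping full underwriting
        throughput = accelerated.mean()             # share going straight through
        mortality = death[accelerated].mean() if accelerated.any() else float("nan")
        rows.append((t, throughput, mortality))
    return rows

# for t, thru, mort in threshold_tradeoff(score, death, np.quantile(score, [.5, .7, .9])):
#     print(f"cutoff={t:.3f}  throughput={thru:.0%}  mortality={mort:.2%}")
```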

Are there any fairness concerns?

Finally, to ensure a data source, model, or tool will advance an organization’s goals, one must ascertain the extent to which it may introduce unfair discrimination. While insurance companies historically have not collected protected-class information, doing so is an emerging requirement. Does the model work differently for some protected classes? Does the model contain data that proxy for certain classes without actually being linked to the outcome of interest (e.g., more strongly related to race than to the target)? Will the use of the model yield disparate outcomes that are not justified by the underlying risk? These are all considerations when adopting a new model or tool.
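
One common screening step is to compare favorable-outcome rates across classes, in the spirit of the four-fifths rule borrowed from employment law. The sketch below assumes a 1/0 decision flag and a protected-class column are available; it is a screen, not a legal test, and any disparity still needs to be tied back to underlying risk.

```python
import pandas as pd

# Hypothetical inputs: df has an 'approved' decision column (1/0) and a
# 'protected_class' column; the 0.8 cutoff is the four-fifths heuristic.
def adverse_impact_ratio(df, decision="approved", group="protected_class"):
    """Favorable-outcome rate of each group relative to the best-off group."""
    rates = df.groupby(group)[decision].mean()
    return (rates / rates.max()).sort_values()

# ratios = adverse_impact_ratio(df)
# print(ratios[ratios < 0.8])  # groups flagged under the four-fifths heuristic
```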

Hopefully, these key considerations will allow you to evaluate new tools and data products in a way that protects your company, whether they are developed internally or sourced from a vendor. For the latter, the evaluation should be a partnership.
