Machine learning for the insurance industry: predictive models, fraud detection and fairness
At the heart of its mission, the insurance industry strives to satisfy its customers and to offer them the insurance products that most adequately match their needs. Thanks to the vast amount of corporate data accumulated through the years, to the availability of impressive computational resources, and to the current state of machine learning research, insurance companies can now attempt to build effective predictive models of some aspects of client behaviour and client needs. However, insurance companies are also accountable to society; in particular, they should not offer any service or coverage that is, in some way, discriminatory in terms of race, skin colour, ethnic origin, or other irrelevant characteristics that are, arguably, immoral to use. In that sense, the insurance industry should also be fair in the services it provides.
Consequently, this research proposal aims at advancing the current state of knowledge in the areas of machine learning research that are most relevant to the insurance industry. More precisely, from the corporate data at SSQ, we aim at building the most accurate, and fair, predictive models of customer needs for insurance products and of some aspects of customer behaviour, such as the likelihood that a client will not renew a given insurance policy. We also aim at building accurate, and fair, fraud detectors with the ability to detect fraud at an early stage and to detect new types of fraud. To meet these objectives, we will need to adapt existing machine learning algorithms in novel ways, and to design new ones, so that they can use and combine different data sources during learning, some of which are sequential in nature. Moreover, we will also need to find ways to enforce fairness in machine learning algorithms, so that the predictors they output do not use irrelevant sensitive attributes (such as race, ethnic origin, religion, etc.) in a way that makes them perform unevenly across different groups of individuals.
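To make the fairness objective above concrete, one widely used criterion is demographic parity: the rate of positive predictions (e.g., predicted policy non-renewal) should be similar across groups defined by a sensitive attribute. The following minimal sketch illustrates how such a disparity could be measured; the function name, the two-group setup, and the toy data are illustrative assumptions, not SSQ's actual data or methodology.

```python
# Illustrative sketch only: measuring demographic parity on churn-style
# predictions. Data and names are hypothetical.

def demographic_parity_difference(preds, groups):
    """Absolute difference in positive-prediction rates between two groups.

    preds  : list of 0/1 model predictions (e.g., 1 = "will not renew")
    groups : list of group labels ("A" or "B") for a sensitive attribute
    """
    rate = {}
    for g in ("A", "B"):
        selected = [p for p, gr in zip(preds, groups) if gr == g]
        rate[g] = sum(selected) / len(selected)
    return abs(rate["A"] - rate["B"])

# Toy example: predictions for 8 clients, 4 in each group.
# Group A is flagged at rate 3/4, group B at rate 1/4.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(demographic_parity_difference(preds, groups))  # 0.5
```

A fairness-enforcing learning algorithm would aim to keep this difference small (ideally near zero) while preserving predictive accuracy, which is precisely the kind of trade-off the proposed research would study.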