Automation at Allstate: using Large Language Models to transform customer experiences

by Barry Dillon, Machine Learning Engineer.

Published: August 10, 2023

About Barry

Barry joined Allstate NI in March of this year after ten years working in academia. His research was in theoretical particle physics, focusing on the application of machine learning algorithms to physics analyses at the Large Hadron Collider. During his research he developed expertise in topic models, anomaly detection, self-supervised models, and transformer networks. Since joining Allstate he has been working on applying Large Language Models such as GPT-4 to solving business problems.

Barry Dillon presenting at Digital DNA Belfast.

An Introduction to Automation

Businesses have been using simple automation tools for a long time. There is Robotic Process Automation (RPA), which lets users automate repetitive, rule-based tasks on a computer, for example data entry. We have automated pricing models which take customer data and estimate the cost of insurance, getting prices to customers as quickly as possible. Then we have basic rule-based chatbots that can, for example, help users find answers to frequently asked questions. With these tools, engineers explicitly program the automation to complete the tasks it needs to do. For many years now Allstate have also been using more advanced automation tools based on machine-learning technology. Here, engineers provide the tools with data containing examples of how a task is done, and the tools learn the automation process on their own. There are many examples of machine-learning automation tools in use within Allstate: voice transcription, sentiment modelling, and legal automation tools (GIA), to name a few.

The release of OpenAI's ChatGPT in November 2022 has opened up many new opportunities for automation in business. The technology underpinning ChatGPT is the Large Language Model. While language models have been in development for some time and are widely known in the tech community, the modern variety has only been around for about six years. The leap forward made recently by OpenAI, followed closely by the open-source community, has demonstrated the importance of these tools and the impact they will have on the future of enterprise-level automation.

What is a Large Language Model?

A Language Model is a function that takes a sequence of words and predicts the next word. It is not programmed to do this explicitly; instead it learns how to do this from texts that are fed into it. Language models are able to learn from texts in this way because they are built from neural networks, a widely used tool in machine learning. Learning to predict the next word in a sequence might sound like a pointless task, far short of ChatGPT's capabilities. But in learning to do this, language models pick up fluency, context understanding, synonyms, conversation topics, and so on. Once a language model has learned this, it can then learn from more specific, task-focused datasets to do various other things: follow instructions, participate in conversation, write computer code, etc. These task-focused datasets are expensive to curate, which is why language models first learn the basics of language fluency by learning to predict the next word in a sequence.
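To make the "predict the next word" idea concrete, here is a minimal sketch in Python using the openly available GPT-2 model from the Hugging Face transformers library. It is purely illustrative; it is not the model or tooling Allstate uses.

```python
# A minimal sketch of next-word prediction, using the open-source GPT-2
# model from the Hugging Face `transformers` library (illustrative only).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "The claim was filed after the storm damaged the"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # The model outputs a score (logit) for every word in its vocabulary
    # at every position in the input sequence.
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The scores at the final position rank candidates for the *next* word.
next_token_id = int(logits[0, -1].argmax())
print(tokenizer.decode(next_token_id))
```

Taking the highest-scoring candidate gives a single next-word prediction; sampling from these scores and feeding the result back in, one word at a time, is how longer passages of text are generated.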

We measure the size of a language model by the number of parameters it has, and more parameters typically leads to a better-performing, higher-quality model. In recent years we have seen an explosion in the size of language models being built, leading to the coining of the term Large Language Model. Take OpenAI for example: in June 2018 they released GPT-1 with 117 million parameters. In February 2019 they released GPT-2 with roughly thirteen times more parameters, at 1.5 billion. Then, with Microsoft's backing, they released GPT-3 in June 2020 with 175 billion parameters; this is the model that ChatGPT is based on. OpenAI have not said how many parameters the recently released GPT-4 model has, but given the trend, it's likely in the trillions.
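A quick back-of-the-envelope calculation shows why "in the trillions" is a plausible extrapolation. The parameter counts below are the published figures quoted above; the GPT-4 estimate is speculation, not an announced number.

```python
# Growth factors between successive GPT releases (published figures),
# and a speculative extrapolation for GPT-4.
sizes = {"GPT-1 (2018)": 117e6, "GPT-2 (2019)": 1.5e9, "GPT-3 (2020)": 175e9}

names = list(sizes)
for prev, curr in zip(names, names[1:]):
    factor = sizes[curr] / sizes[prev]
    print(f"{prev} -> {curr}: roughly {factor:.0f}x more parameters")

# Even the smaller jump (~13x) applied to GPT-3's 175 billion parameters
# lands above 2 trillion.
print(f"Extrapolated GPT-4 size: {175e9 * 13:.2e} parameters")
```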

Automation with Large Language Models (LLMs) at Allstate

The first LLM application we are using in Allstate is called Intelligent Search, where we use LLMs as AI-based search engines. As we said before, LLMs are neural networks, and we can think of neural networks as consisting of layers or blocks. So we can literally just cut the end off an LLM by removing the last layer of the neural network. What we have left is an embedding model: the LLM no longer predicts the next word in a sequence, instead it simply takes a chunk of text and converts it (or embeds it) into a vector of numbers. From these vectors we can calculate a distance between any two documents or chunks of text. The key feature of the LLM is that it embeds documents which are similar to each other close together, and this is very important.

Let's say we have a collection of documents of different types: compliance documents, contact information, policy documents, and pricing information. We pass every document through the model separately and get a vector of numbers (i.e. an embedding) for each one. Documents with similar context, similar language, and similar topics get embedded close to each other by the LLM. This is, in effect, a kind of sorting. Now say a user has a question, such as "Does home insurance cover flooding?". If we embed the question using the same model, it lands close to the documents that discuss the topics appearing in the question. The "search" is then as simple as finding the documents closest to the question in the embedding space and sorting them by distance. This is how we can use LLMs as search engines, as the sketch below shows.

We can go further by asking an LLM to read the documents returned by the search, answer the user's question, and provide references to where it found the information. We already have several use-cases within Allstate for this technology.
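The sketch below shows the search step under some stated assumptions: it uses the open-source sentence-transformers library with its all-MiniLM-L6-v2 embedding model, and a handful of made-up one-line documents standing in for real policy and compliance documents. It illustrates the technique, not Allstate's Intelligent Search implementation.

```python
# A minimal sketch of embedding-based search, assuming the open-source
# sentence-transformers library and its all-MiniLM-L6-v2 model; the
# documents below are made up for illustration.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Home insurance policies cover flood damage up to the policy limit.",
    "Contact the claims team on the number listed in your policy pack.",
    "Motor insurance prices are estimated from driver and vehicle data.",
]

# Embed every document once: each becomes a vector of numbers, and
# documents discussing similar topics end up close together.
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# Embed the user's question with the same model, then rank the documents
# by cosine similarity (closer in embedding space = more relevant).
question = "Does home insurance cover flooding?"
query_embedding = model.encode(question, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, doc_embeddings)[0]

for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.2f}  {documents[idx]}")
```

The final step described above, having an LLM read the top-ranked documents and compose an answer with references, is typically done by placing those documents alongside the user's question in the LLM's prompt, a pattern commonly known as retrieval-augmented generation.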

What's next for LLMs?

In the near future we expect to see much more impressive LLMs from OpenAI, from competitors such as Google, Amazon, and Anthropic, and from the open-source community. For example, we will see LLMs that can process much more text at once, and multi-modal LLMs that can process text, audio, and video simultaneously. The opportunities for enterprise-level automation will grow quickly in the coming years, and within Allstate we are well placed to take advantage of them.