A Large Language Model (LLM) is a kind of artificial intelligence model that uses neural networks trained with self-supervised learning to understand and generate human language. Tasks such as language generation, summarization, and conversational AI (chatbots) are commonly built on LLMs these days. ChatGPT by OpenAI and Bard by Google are popular examples powered by LLMs.
Off-the-shelf LLMs are trained on enormous amounts of data across many domains, so they tend to perform poorly on domain-specific tasks. There are, however, several ways to coax what you need from them. Custom LLMs are built on top of off-the-shelf, often free, pre-trained LLMs, but they stay within the organization's premises, such as a local server or private cloud, so organizational data never leaves its secure environment.
Building a custom LLM requires an in-depth understanding of the NLP space and its evolution over time, as well as of the current offerings in the landscape. Y Point set up a Center of Excellence as early as 2017, even before LLMs, to keep pace with this fast-changing space. The research findings are continuously fed back into our solutions and know-how. Some examples of our research so far are listed here:
This research was about gathering useful information from publicly available textual data on the internet and making it available for analysis using BI tools. It used Word2Vec and GloVe word embeddings, which pre-date LLMs. These two models were among the first widely used approaches to encode semantic relationships between words as vectors. The following picture shows that similar words are placed closer together in these word embeddings.
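To make the idea of "closeness" concrete, here is a minimal sketch of how similarity between word vectors is commonly measured, using cosine similarity. The three-dimensional vectors and the helper names (TOY_EMBEDDINGS, cosine_similarity, most_similar) are hypothetical and chosen by hand purely for illustration; they are not taken from Word2Vec, GloVe, or our research, where vectors are learned from text and have many more dimensions.

```python
import math

# Hypothetical hand-picked vectors, for illustration only.
# Real Word2Vec or GloVe vectors are learned from a text corpus.
TOY_EMBEDDINGS = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other word whose vector has the highest cosine similarity."""
    target = embeddings[word]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in embeddings.items()
        if other != word
    ]
    return max(scored)[1]


if __name__ == "__main__":
    # With these toy vectors, "king" sits much closer to "queen" than to "apple".
    print(most_similar("king", TOY_EMBEDDINGS))
```

With pre-trained embeddings the same comparison applies; only the source and size of the vectors change.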
Y Point has built chatbots that use ChatGPT underneath. See Chatbot for more information.
LLMs can be treated as the next generation of word embeddings. Researchers came to realize that as the number of parameters increases, the models give much better results. The term LLM came to refer to these models with a large number of parameters. As shown in the figures below, the term has recently become popular in Google searches. An elbow point is visible around the release of ChatGPT, where interest suddenly rose sharply.
From "A Survey of Large Language Models" by Wayne Xin Zhao et al. [arXiv:2303.18223v11 [cs.CL], 29 Jun 2023]
Even though ChatGPT caught the public imagination, it is not the only model. Google came up with Bard, and LLaMA and Llama 2 from Meta have caught up. Stanford researchers have developed HELM, a benchmark for these LLMs.