Emerging Large Language Model (LLM) Application Architecture

I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.

Why do I say LLMs are unstructured? LLMs are to a large extent an extension of Conversational AI.

Due to the unstructured nature of human language, the input to LLMs is conversational and unstructured, in the form of Prompt Engineering.

And the output of LLMs is also conversational and unstructured, a highly succinct form of natural language generation (NLG).

LLMs introduced functionality to fine-tune and create custom models, and fine-tuning was an initial approach to customising LLMs.

This approach has fallen into disfavour for three reasons:

  1. LLMs have both a generative and a predictive side, and the generative power of LLMs is easier to leverage than the predictive power. If the generative side of an LLM is presented with contextual, concise and relevant data at inference time, hallucination is largely negated.
  2. Fine-tuning LLMs involves training data curation, transformation and cost. Fine-tuned models are frozen with a definite time-stamp and will still demand innovation around prompt creation and data presentation to the LLM.
  3. When classifying text based on pre-defined classes or intents, NLU still has an advantage with built-in efficiencies.
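To make the first point concrete, here is a minimal sketch of presenting contextual, concise and relevant data to an LLM at inference time. Everything in it is an assumption for illustration: the function names (`tokenize`, `select_context`, `build_prompt`), the naive word-overlap scoring and the prompt wording are all hypothetical, and no real LLM API is called. A production setup would more likely use semantic search over embeddings, but the principle is the same.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_context(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents sharing the most words with the question.

    Word overlap is a stand-in (an assumption) for a real relevance measure.
    Documents with no overlap at all are dropped.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return [doc for doc in ranked[:top_k] if question_words & tokenize(doc)]


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble a prompt that grounds the LLM in the selected context."""
    context = select_context(question, documents, top_k)
    context_block = "\n".join(f"- {c}" for c in context) or "- (no relevant context found)"
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say you do not know.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string is what would be sent to the LLM of your choice. The design choice here is that the model is left unchanged, and only the data presented in the prompt varies per request.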

The aim of fine-tuning LLMs is to engender more accurate and succinct reasoning and answers. It is also meant to address one of the big problems with LLMs: hallucination, where the LLM returns highly plausible but incorrect answers.
