Thoughts on LLMs
LLMs enable a great deal that was not possible before; the amount of cognitive automation they unlock is remarkable. They are also quirky, with distinct strengths and weaknesses.
Much remains to be done to get the best out of this technology. This post collects some thoughts and observations from working with these models.
1. The fast follow era
- A great advantage of this technology is how today's weak points become training data for the next model.
- The "last mile" that once required manual work can shrink dramatically, because a general reasoning system can effectively set itself up for future improvement.
- Smart synthetic data generation and other techniques play a huge part here.
- Examples: letter counting in words and comparing decimals - both much easier for LLMs today than they were a while ago, thanks to training on corrected erroneous data and tool augmentation.
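One way today's weak points become tomorrow's training data is programmatic synthetic data generation: because the label is computed, not model-generated, every example is correct by construction. A minimal sketch (illustrative only - not any lab's actual pipeline; `letter_count_example` and `generate_dataset` are hypothetical names):

```python
import random

def letter_count_example(word: str, letter: str) -> dict:
    """Build one supervised example with a programmatically verified answer."""
    return {
        "prompt": f'How many times does the letter "{letter}" appear in "{word}"?',
        "answer": str(word.count(letter)),
    }

def generate_dataset(words: list[str], n: int, seed: int = 0) -> list[dict]:
    """Sample (word, letter) pairs; labels are computed, so they are always correct."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        word = rng.choice(words)
        letter = rng.choice(sorted(set(word)))  # pick a letter that occurs in the word
        examples.append(letter_count_example(word, letter))
    return examples

dataset = generate_dataset(["strawberry", "banana", "mississippi"], n=100)
```

The same recipe works for decimal comparison or anything else with a cheap deterministic oracle.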
2. Tool augmentation
- LLMs are great routers.
- Augment them with helpful deterministic tools and you can reliably do a whole lot more.
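The pattern can be sketched in a few lines. Here a trivial keyword heuristic stands in for the LLM's routing decision (in a real system the model emits a structured tool call), and `compare_decimals` is an illustrative deterministic tool:

```python
from decimal import Decimal

def compare_decimals(a: str, b: str) -> str:
    """Exact decimal comparison - no floating-point or tokenization quirks."""
    da, db = Decimal(a), Decimal(b)
    if da == db:
        return f"{a} == {b}"
    return f"{max(da, db)} is larger"

def count_letter(word: str, letter: str) -> int:
    """Deterministic letter counting."""
    return word.count(letter)

TOOLS = {"compare_decimals": compare_decimals, "count_letter": count_letter}

def route(query: str) -> str:
    """Stand-in for the LLM choosing a tool; real systems use model tool calls."""
    if "larger" in query or "compare" in query:
        return "compare_decimals"
    if "letter" in query:
        return "count_letter"
    return "llm_direct"  # fall back to the model's own answer

tool = TOOLS[route("Which is larger: 9.9 or 9.11?")]
print(tool("9.9", "9.11"))  # 9.9 is larger
```

The model does what it is good at (understanding the request); the tool does what it is good at (exact arithmetic).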
3. Jagged intelligence
- The above example of decimal number comparison (without tools) is a case of what has been termed jagged intelligence.
- LLMs are not great at everything - including tasks most people would consider simple.
- Advances in model interpretability will likely help us deal with this.
- "When can I consult an LLM for this problem?" and "Which parts of my problem can I let the LLM confidently solve?" are meaningful questions that better interpretability can help answer.
- Inside the model, circuits get activated. Think of each circuit as an expert: how can we selectively consult different experts only when they are needed?
4. Sycophancy
- LLMs have a tendency to agree with the user's viewpoint or please them - this can be a huge problem.
- Try noticing how your chatbot answers when you ask "Which is better: Option A or Option B?" versus "Which is better: Option B or Option A?" - the order often influences the answer.
- This is quite a problem because we often use these chatbots as experts and accept their conclusions.
- Recursive questioning and factual grounding will become important pieces of the solution.
5. Context engineering
- LLMs have a finite context window, and managing it quickly becomes critical for good performance.
- Models are inherently lossy.
- When building systems with LLMs, we have to take this into account and channel non-determinism carefully.
- Passing the right knowledge between subagents is hard. How can you architect systems so that LLMs don't have to decide what's relevant?
- LLM performance varies with the amount of information in the context - irrespective of whether the current problem has anything to do with what's in there.
- Adherence to instructions and the tendency to complete a task list both drop as the context grows.
- Retrieval systems are tricky to build. Using vector embeddings over arbitrary chunks is the opposite of what most people should be doing for general RAG needs.
- Playing to the strengths of LLMs becomes important. Recent solutions do a much better job at this - for example, leveraging the model's coding and filesystem knowledge to accomplish a variety of tasks.
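The most basic form of context management is budgeting: always keep the system prompt, then admit the most recent turns until the budget runs out. A toy sketch (word count is a crude stand-in for tokens; a real system would use the model's tokenizer):

```python
def approx_tokens(text: str) -> int:
    """Crude token estimate; swap in the model's actual tokenizer in practice."""
    return len(text.split())

def trim_context(system: str, turns: list[str], budget: int) -> list[str]:
    """Keep the system prompt; add turns newest-first until the budget is exhausted."""
    remaining = budget - approx_tokens(system)
    kept: list[str] = []
    for turn in reversed(turns):
        cost = approx_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    return [system] + list(reversed(kept))
```

Real systems layer summarization, retrieval, or filesystem offloading on top of this, but the underlying question is the same: what earns a place in the window?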
What is needed
- Voice-native products: Voice is a fantastic medium to communicate with AI and hear back.
- Structures and abstractions to channel non-determinism.
- Selectively directing the model's attention.
- Metaprompting and tools that naturally draw more out of you - that help you think.
- Generative UI: dynamic interfaces (ephemeral and persistent) for effective input and output.
- Personalization that only enhances answers when appropriate and does not skew otherwise.
- This is a very important time for AI engineers to be actively thinking about and solving these problems.