I hold a belief that LLMs can do practically anything, despite the fact that they're actually quite limited.

Problem

LLMs are very limited. They can reliably do some small, simple reasoning tasks.
But they struggle with large, complex ones.

Solution

Break big, complex tasks down into an orchestrated set of small tasks.
Identify limitations of the LLM by writing a naive, one-shot instruction for a complex task.
Allow failure cases to inform the breakdown of the complex task into a series of simple tasks.
Arrange the simple tasks in sequence with traditional code, managing the flow of information between them.
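
Here's a minimal sketch of that last step, with plain Python doing the sequencing. Everything named here is made up for illustration: `run_pipeline`, the step format, and `fake_llm`, which stands in for whatever model client you actually use. The prompts are placeholders too.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: an "LLM" is any function that takes a prompt and returns text.
LLM = Callable[[str], str]
# Each step is (name to store the output under, prompt template).
Step = Tuple[str, str]


def run_pipeline(steps: List[Step], inputs: Dict[str, str], llm: LLM) -> Dict[str, str]:
    """Run small prompts in order, feeding earlier outputs into later prompts."""
    state = dict(inputs)  # copy so the caller's dict is not mutated
    for name, template in steps:
        prompt = template.format(**state)  # plain code decides what each task sees
        state[name] = llm(prompt).strip()
    return state


def fake_llm(prompt: str) -> str:
    """Hypothetical placeholder so the sketch runs offline; swap in a real client."""
    return f"[reply to: {prompt}]"


steps = [
    ("outline", "List the main points needed to explain: {topic}"),
    ("draft", "Write one paragraph covering these points: {outline}"),
]
result = run_pipeline(steps, {"topic": "why the sky is blue"}, fake_llm)
print(result["draft"])
```

The point of keeping the loop in ordinary code is that each prompt only sees the fields it names, so a failure can be traced to one small task instead of one giant instruction.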

Limitations

With enough elbow grease applied to decomposing complex tasks into series of simple tasks, LLMs can do practically anything.
But just because they can doesn't mean they should.
Breaking down big, complex tasks in a way that respects LLM limitations is a lot of work.
And the final result may be too costly and/or slow to be worth the effort.

I've prodded at this belief in my own experiments, orchestrating a dozen-plus prompts to create a text-adventure, escape-room type of game.

To simulate real-world constraints, a task checks whether or not the User Action is possible given current conditions.
To enable player exploration, a set of tasks checks whether or not the User Action involves movement to a new location and generates that location (and an accompanying image).
To enable interactivity between the player and environment, a set of tasks generates the result of the action, updates health and inventory, and checks whether or not the player has died.
To enable a dynamic win condition, a task checks whether or not the result of the action fulfills a loosely defined win condition.
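
To give a feel for how those tasks chain together, here's a rough sketch of a single turn. This is not the actual game code. `GameState`, `ask_yes_no`, `play_turn`, `scripted_llm`, and all the prompt wording are hypothetical, and it simplifies a lot: no image generation, no inventory updates, and death is a plain health check in code rather than its own LLM task.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: an "LLM" is any function that takes a prompt and returns text.
LLM = Callable[[str], str]


@dataclass
class GameState:
    location: str
    health: int = 10
    inventory: List[str] = field(default_factory=list)
    status: str = "playing"  # "playing", "dead", or "won"


def describe(state: GameState) -> str:
    """Context string shared by every small task."""
    return f"Location: {state.location}. Health: {state.health}. Inventory: {state.inventory}."


def ask_yes_no(llm: LLM, question: str) -> bool:
    """One small task: a question whose answer is parsed as yes/no."""
    return llm(question + " Answer yes or no.").strip().lower().startswith("yes")


def play_turn(state: GameState, action: str, win_condition: str, llm: LLM) -> str:
    # 1. Real-world constraints: is the action possible right now?
    if not ask_yes_no(llm, f"{describe(state)} Is this action possible: {action}?"):
        return "You can't do that."
    # 2. Exploration: does the action move the player somewhere new?
    if ask_yes_no(llm, f"{describe(state)} Does this action move the player: {action}?"):
        state.location = llm(f"{describe(state)} Name the new location reached by: {action}").strip()
    # 3. Interactivity: narrate the result, then update health.
    result = llm(f"{describe(state)} Narrate the result of: {action}").strip()
    delta = llm(f"By how much does health change after: {result}? Reply with one integer.")
    try:
        state.health += int(delta.strip())
    except ValueError:
        pass  # unparseable reply: leave health unchanged
    if state.health <= 0:
        state.status = "dead"
        return result
    # 4. Dynamic win condition: judged loosely against the narrated result.
    if ask_yes_no(llm, f"Does this fulfill the win condition '{win_condition}': {result}?"):
        state.status = "won"
    return result


def scripted_llm(replies: List[str]) -> LLM:
    """Hypothetical test double that returns canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


state = GameState(location="locked cell")
llm = scripted_llm(["yes", "no", "The lock clicks and the door swings open.", "0", "yes"])
print(play_turn(state, "pick the lock with the hairpin", "escape the cell", llm))
print(state.status)
```

Each numbered comment maps to one of the task groups above, and the early returns are where traditional code, not the model, decides what happens next.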

I won't link to either my full writeup or the game itself, since this is my first post here, but I am curious about where the community has seen this approach fail.

submitted by /u/nitroviper