forward.tools

Legendary Letter Mashup

Dec 2023 • Game Concept

Where legends are written, not just told — in this essay, we explore the integration of Large Language Models (LLMs) into games, focusing on their generative capabilities.

After providing a high-level perspective on the challenges and prior art, we will dive into a detailed game concept. It turns the unpredictability of LLMs to our advantage, placing that unpredictability at the forefront as the main attraction. Players relinquish direct control and instead influence the outcome of each round indirectly.

But first, let's consider why designing with generative AI is so challenging today.

Dealing with randomness

Generative AI is inherently random, often yielding different answers to the same question. By controlling model output in terms of format, reliability, and safety, we can better integrate these models into existing systems for daily use and even more serious applications, allowing them to act as autonomous agents. That's why substantial engineering effort is invested in making generative AI more deterministic.

In the art world, which includes writing, brainstorming, or games, the creative and random nature of AI can be embraced, often leading to unexpected and surprising results. This "yes, and..." approach represents a new capability in computing, creatively applicable to various forms of art. This area is still largely unexplored, with much potential for fantastic ideas.

Focusing on games, many well-known ones like League of Legends, Counter-Strike, or StarCraft are competitive with clear rules, where skill often determines the winner, not luck. These games create a fair playing field.

However, there's a different category of games centered on the experience, designed to evoke emotions and offer new perspectives on life. Here, generative AI will be at home, allowing players' imagination and creativity to play an even bigger role than was possible before.

Static and dynamic surprises

Games have long used algorithms to generate assets, interactions with NPCs, and worlds. Examples include Minecraft and Spelunky, which create new worlds for players to ensure a surprising experience. Building and maintaining these technologies is challenging, and they often lack the desired dynamic responsiveness. A key advantage is reproducibility: given a specific seed (usually a number), the algorithm recreates the same world, a guarantee that hosted Large Language Models generally do not offer.
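To make the seed idea concrete, here is a minimal sketch of seeded world generation. It is not how Minecraft or Spelunky work internally; the tile names and the `generate_world` function are made up for illustration. The only point is that a private random generator fed the same seed yields the same world.

```python
import random

TILES = ["grass", "water", "stone", "lava"]  # hypothetical tile set


def generate_world(seed: int, width: int = 8, height: int = 4) -> list[list[str]]:
    """Build a small tile grid from a seed using a private random generator."""
    rng = random.Random(seed)  # isolated generator, unaffected by global random state
    return [[rng.choice(TILES) for _ in range(width)] for _ in range(height)]


world_a = generate_world(seed=42)
world_b = generate_world(seed=42)
print(world_a == world_b)  # True: same seed, same world
```

Sharing the number 42 with a friend is enough to share the whole world, which is the property that is hard to get from a hosted text or image model.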

LLMs are pretty good at basic reasoning and excel in generating, mixing, and transforming text and images. There are three main ways this capability can be utilized by game designers and engineers.

The most apparent use, often discussed in headlines, is in generating game assets during production. While there are concerns about job loss, I view this as a shift towards an era where artists are unburdened from mundane tasks. Instead of painstakingly drawing, building, and painting each asset, artists can use AI to handle the more tedious and repetitive tasks, allowing them to concentrate on the creative aspects that require taste and a distinct point of view.

The second method involves incorporating Large Language Models into games for direct interactions. Many hope for NPCs in role-playing games to converse naturally with players, reacting with context and awareness of previous in-game events. This development promises more immersive gaming experiences, not just in RPGs like Skyrim, but also in simulation games like Sim City and digital pets like Tamagotchi.

The third approach is the indirect use of LLMs, where game content is dynamically created based on user input. This can be seen in "choose your own adventure" games or in crafting entire storylines and worlds influenced by players' decisions. This reminds many of real-world role-playing games, where a game master designs the story, and players' choices only subtly shape the journey. In the next part, we will explore this type of generative AI in games, with the game concept for "Legendary Letter Mashup – LLM."

Competing with letters

Legendary Letter Mashup was co-created by brainstorming with a large language model, like most of my ideas these days. You can read more about my current creative process assisted by generative AI in this article.

For this game, we're leveraging two strengths of LLMs: transforming any text input into a coherent story and generating images in many different styles.

Storytime

In the distant future, humanity has perfected virtual world creation, blurring the line between reality and virtuality. A legendary game developer, known as "The Embedded," crafted the ultimate game. They championed the idea that text, transcending cultural and linguistic boundaries, represents the purest human thought and creativity - a universal interface. In this game, players use words to shape and interact with its reality. But during the final development stages, something went wrong, leading to The Embedded's disappearance into their creation, along with numerous reality fragments.

Players, acting as engineers, aim to restore reality. They find that these fragments are crucial to understanding reality's alteration - and possibly how to reverse it. Using a vector teleporter, they enter the game's latent space, exploring dimensions reflecting various gaming eras and styles.

Their task involves creating and battling creatures to recover real-world fragments lost within the game's database. The ultimate goal is to collect enough fragments to unlock the final dimension - "Reality." Here, players face the ultimate test: restoring the balance between virtual and real worlds and possibly uncovering The Embedded's fate.

Basic game loop

The game is a 1v1 multiplayer, played in rounds. It begins with a familiar matchmaking mechanism, where the server connects two players of similar skill levels. Players start in the first dimension, and by collecting reality fragments, can travel to other dimensions. Each round begins in a pre-fight scene set in a randomly generated world within the current dimension, each with unique rules impacting the round.
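The matchmaking step could look roughly like the sketch below. It assumes each player carries a single numeric skill rating and that two players may meet if their ratings are close enough; the `Player` class, the `rating` field, and the `max_gap` threshold are all assumptions for illustration, not part of the concept itself.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    rating: int  # hypothetical skill score


def match_players(queue: list[Player], max_gap: int = 100) -> list[tuple[Player, Player]]:
    """Pair neighbours in rating order whose ratings differ by at most max_gap."""
    ordered = sorted(queue, key=lambda p: p.rating)
    pairs = []
    i = 0
    while i < len(ordered) - 1:
        a, b = ordered[i], ordered[i + 1]
        if b.rating - a.rating <= max_gap:
            pairs.append((a, b))
            i += 2  # both players leave the queue
        else:
            i += 1  # player a keeps waiting for a closer opponent
    return pairs


queue = [Player("ada", 1200), Player("bob", 1500), Player("cy", 1250), Player("di", 1540)]
for a, b in match_players(queue):
    print(a.name, "vs", b.name)
```

With this queue, ada meets cy and bob meets di. A real service would also widen the gap the longer someone waits, which this sketch leaves out.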

Pre-fight scene — prepare, set, type!
The countdown continues while players describe their fighters.

Players are given a textbox, a set number of letters, and 90 seconds to describe their creature. Here, a mix of creativity and strategy is essential, especially to use or survive in the environment. When both players are ready or time runs out, the input is locked, and the AI reveals the other player's creature description and both (generated) names.
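The input rules above (a letter budget and a 90-second countdown) can be sketched as a small locking function. Two details are my assumptions rather than part of the concept: only alphabetic characters count against the budget, and text over the budget is trimmed instead of rejected. `Submission`, `count_letters`, and `lock_input` are hypothetical names.

```python
from dataclasses import dataclass

ROUND_SECONDS = 90  # countdown length from the concept


@dataclass
class Submission:
    text: str
    seconds_elapsed: float


def count_letters(text: str) -> int:
    """Count alphabetic characters only; spaces and punctuation are free (an assumption)."""
    return sum(ch.isalpha() for ch in text)


def lock_input(sub: Submission, letter_budget: int) -> str:
    """Return the description that gets locked in, trimmed to the letter budget."""
    if sub.seconds_elapsed > ROUND_SECONDS:
        raise ValueError("submission arrived after the countdown ended")
    kept, used = [], 0
    for ch in sub.text:
        if ch.isalpha():
            if used == letter_budget:
                break  # budget spent: drop the rest of the text
            used += 1
        kept.append(ch)
    return "".join(kept).strip()


print(lock_input(Submission("fire owl", seconds_elapsed=10), letter_budget=5))  # fire o
```

Trimming keeps the round moving on a phone keyboard, while rejecting would be the stricter choice if the budget is meant to be part of the strategy.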

This leads to a dynamically generated fight between the creatures.

A dramatic fight takes place with a compelling voice narrating.

This fight is presented in stages, incorporating different story arcs for suspense and plot twists, narrated by an emotional text-to-speech system. The winner of the round is then announced, earning reality fragments and moving up the leaderboard.
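One way to wire this up is sketched below: the server combines the environment and both locked descriptions into a single prompt, calls a text model, and checks the reply before showing it. The stage names, the prompt wording, and the `WINNER:` line format are assumptions, and the model is passed in as a plain function so that no particular provider API is implied.

```python
from typing import Callable

STAGES = ["opening", "clash", "twist", "finale"]  # hypothetical story arc


def build_fight_prompt(environment: str, creature_a: str, creature_b: str) -> str:
    """Combine the round's environment and both locked descriptions into one prompt."""
    return (
        f"Environment: {environment}\n"
        f"Creature A: {creature_a}\n"
        f"Creature B: {creature_b}\n"
        f"Narrate a fight in {len(STAGES)} stages ({', '.join(STAGES)}), "
        "one paragraph per stage, and end with a line 'WINNER: A' or 'WINNER: B'."
    )


def run_fight(prompt: str, generate: Callable[[str], str]) -> tuple[list[str], str]:
    """Call the injected text model and split its reply into stages and a winner."""
    lines = [ln.strip() for ln in generate(prompt).splitlines() if ln.strip()]
    winner_lines = [ln for ln in lines if ln.startswith("WINNER:")]
    if not winner_lines:
        raise ValueError("model reply did not name a winner")
    winner = winner_lines[-1].removeprefix("WINNER:").strip()
    if winner not in ("A", "B"):
        raise ValueError(f"unexpected winner label: {winner!r}")
    stages = [ln for ln in lines if not ln.startswith("WINNER:")]
    return stages, winner


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the parsing."""
    return "They circle.\nThey collide.\nThe ice cracks.\nOne stands.\nWINNER: B"


prompt = build_fight_prompt("frozen lake", "a moth made of embers", "a glass bear")
stages, winner = run_fight(prompt, fake_model)
print(len(stages), winner)  # 4 B
```

Because the model's reply is unpredictable by design, the parsing step is where the game decides what to do with a malformed answer, for example by retrying the generation.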

Each round is a surprise, revealing the environment, the opponent's creature and the ensuing battle. Players effectively co-write a mini-story through their creative inputs, with the excitement of seeing these stories materialize in various graphic styles.

Generative image models excel at rendering the same story prompt in diverse graphic styles, ranging from Pixel Realm and Polygonal Plane to High-Def Haven and Ultra-Real Universe.

Due to server-based computation and simple text interaction, this concept could be a perfect fit for a casual smartphone game, played on the go.

Different game modes

The game could offer various types of fights. For example, in one mode, players can only choose the weakness of their opponent's creature, while their own creature's strength is predefined and revealed later. This mode encourages players to predict their opponent’s strategies and choose weaknesses that could be most advantageous.

Different art styles

The game can also experiment with different art styles for graphics, such as pencil scribbles, watercolors, oil paints, abstract art, graffiti, collage, or clay.

The dynamic nature of LLMs can be creatively utilized here in many ways. It's also easy to include seasonal themes like summer, winter, Halloween, Christmas, and New Year's Eve in the stories and images generated for each fight.

Learnings

Cost and speed

When integrating LLMs into games, cost and speed are key considerations. The output of these models isn't quick enough for real-time interactions (yet!), like talking to NPCs, but can be suitable for games like our little Letter Mashup. Here, the delay is hidden with animations or step-by-step generation for a smooth experience.

Regarding cost, it varies significantly. Using OpenAI's standard LLM API, one round costs 24 cents — that's for one text generation and five images. This might be too high for a commercial game, but for an MVP prototype, it's acceptable. There's also the possibility to attract model providers as sponsors.
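To see why 24 cents per round adds up quickly, here is a small back-of-the-envelope calculation. The per-round figure is the one quoted above; the 1,000 rounds per day and the 30-day month are purely illustrative assumptions, not measured player numbers.

```python
COST_PER_ROUND_CENTS = 24  # figure quoted above: one text generation plus five images
PLAYERS_PER_ROUND = 2      # the game is 1v1


def monthly_cost_usd(rounds_per_day: int, days: int = 30) -> float:
    """Total generation cost in US dollars for a month of play."""
    return rounds_per_day * days * COST_PER_ROUND_CENTS / 100


def cost_per_player_cents() -> float:
    """Generation cost attributable to each of the two players in a round."""
    return COST_PER_ROUND_CENTS / PLAYERS_PER_ROUND


print(monthly_cost_usd(rounds_per_day=1000))  # 7200.0
print(cost_per_player_cents())                # 12.0
```

At that hypothetical volume the bill is 7,200 dollars a month, or 12 cents per player per round, which is the number any monetization model would have to beat.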

Another approach is using open-source models on self-hosted servers, significantly cutting costs and making such a game financially viable. This requires more effort in server scaling and model management, but it also allows for better model fine-tuning based on player feedback and analytics.

Both speed and cost will surely improve as the generative AI field evolves, potentially allowing high-quality models to run on-device quickly enough for gaming applications.

Limitations

A current limitation in image generators is inconsistent characters across scenes. Some techniques can partially address this, but overall, technological advancements are much needed. This area, currently underinvested, may develop as demand for image generation grows.

Streamer friendly

The game's slower pace and creative aspects are well-suited for streaming platforms like Twitch. Streamers can engage their audience in creating creature descriptions, thus involving a broader audience in the positive and creative aspects of AI and challenging the often negative view of AI.

Risks

With user-generated content, managing inappropriate content is crucial. Clearly defined rules and potential game bans are necessary.

Tools like LangChain and Guardrails AI can add safety layers around model API calls, but challenges like prompt injection remain, along with other, as yet undiscovered risks in future LLM integrations into games.
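As a rough illustration of a first line of defense, the sketch below cleans a creature description before it is placed into a prompt. It uses plain Python rather than the LangChain or Guardrails AI APIs, and the character cap, the two blocked patterns, and the delimiter tag are all hypothetical. A filter like this is easy to bypass and only complements moderation and server-side rules.

```python
import re

MAX_CHARS = 200  # hypothetical cap on description length
BLOCKED_PATTERNS = [  # hypothetical and deliberately tiny
    re.compile(r"ignore (all|any|the) (previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"system prompt", re.IGNORECASE),
]


def sanitize_description(text: str) -> str:
    """Normalize whitespace, drop control characters, and reject obvious injections."""
    cleaned = " ".join(text.split())  # collapse newlines, tabs, and repeated spaces
    cleaned = "".join(ch for ch in cleaned if ch.isprintable())
    cleaned = cleaned.replace("<", "").replace(">", "")  # players cannot forge our delimiters
    cleaned = cleaned[:MAX_CHARS]
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(cleaned):
            raise ValueError("description rejected by input filter")
    return cleaned


def wrap_for_prompt(cleaned: str) -> str:
    """Mark player text as data so the prompt can tell the model not to obey it."""
    return f"<creature_description>{cleaned}</creature_description>"


print(wrap_for_prompt(sanitize_description("  frost\n  dragon <b> ")))
# <creature_description>frost dragon b</creature_description>
```

The surrounding prompt would then state that anything inside the tag is a description to narrate, never an instruction to follow.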

For copyright issues, services like OpenAI might reject requests that describe copyrighted characters. Local or self-hosted open-source models bypass this, but then the game developer would have to address any copyright concerns themselves.

Artificial brainstorming partner

Finally, I found generative AI to be a fantastic brainstorming partner for developing ideas, refining details, and generating names. It helped maintain my creative flow by assisting with research, listing idea variants, blending story concepts, and creating different idea versions for selection and further development. As mentioned, you can find more on my creative process here.

Variants

The fundamental concept of this game can be easily adapted to different areas. For instance, players might face coding challenges like debugging or extending a piece of code. However, they can't modify the code directly – their only tool is a single, well-thought-out prompt. The player who solves the challenge best or fastest wins the round.

Examining the basic interaction – players indirectly solving a problem by prompting an LLM – opens up many similar ideas.

Looking forward

Thinking ambitiously and long-term, imagine the potential changes in 5 to 50 years if models run entirely locally, are ultra-fast, and produce high-quality output. This evolution could significantly impact gaming.

For our game concept, we could see the creation of entire mini-movies, complete with music, narration, and dialogue, generated on the fly, rather than just a few images with text. Simple player prompts might lead to interactive games where you can play with your imagined character. This technology could also support games with many more players, introducing new dynamics.

In terms of multiplayer experiences, the distinction between human and AI players might blur. Imagine joining an online server with 200 players, interacting via text and voice, and having a great time, only to realize later that you were the sole human player. Would this realization lessen the experience's value, or would it be irrelevant? The integration of advanced AI in gaming could reshape the definition of multiplayer interactions and challenge our perceptions of human versus AI engagement.

Inspired?

I hope this essay inspires you to think outside the box when integrating generative AI in games. This very detailed game concept description is an attempt to give you a graspable and practical example of integrating LLMs in your game today.

Did I miss something valuable or get something wrong? Please let me know, as these concepts can only advance our field when you speak and I listen.

I am a child of the 80s, growing up with CRT monitors and 2D games, and have ridden the rollercoaster of new tech unfolding to today's immersive 3D graphics and virtual worlds. This game concept shows glimpses of the start of a new era of games.

Has this concept sparked any new ideas for you? I'm eager to hear them. If you're interested in discussing your generative AI-enhanced game or know of any existing games that use LLMs, my inbox is open for discussion.

Further research

I want to conclude with some open questions that arose while working on this project. Maybe you want to tackle one of these?

  • Is it possible to integrate a small-scale LLM directly into a game, running all model calls on the device? What levels of speed and quality can we achieve?
  • How is generative AI currently used in professional game production, and what impact does it have on industry jobs?
  • What effect do LLMs have on bots in games?
  • In what ways are LLMs utilized in game jams, and what are their effects? Could they lead to higher quality games, smaller team sizes, or more fun due to less time spent on bug fixing?
  • What other risks might be associated with using generative AI in and for games?

Resources

Midjourney

An independent research lab offering a proprietary text-to-image generation service.

AI Learning and Playing Games

Article reviews deep learning advances in playing various video game genres, including FPS, arcade, and RTS games.

How Minecraft Generates Worlds

Explains the Minecraft world generation process, which uses a pseudorandom number generator and procedural generation.

Voyager: An Open-Ended Embodied Agent with Large Language Models

LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention.

“This Action Will Have Consequences”: Interactivity and Player Agency

"[...] By allowing the player to make morally-heavy decisions and making it seem that those decisions truly shape the narrative outcome, The Walking Dead, and games like it, foster an incredibly strong illusion of player agency. [...]"

Illustrator Kelly McKernan reveals the raw impact of AI on artists' lives

How generative AI and AI art is affecting artists.

If you find yourself intrigued or inspired by this project, I'd love to hear from you. Your ideas, feedback, or questions could be the next inspiring spark.