Prototyping a Game Using GPT-4 for Social Engineering NPCs

Published on May 2023

An experimental game prototype leveraging GPT-4 to power social engineering NPCs, allowing players to extract information through dynamic conversations in a fixed environment.

GPT-4 AI Game Development LLM Social Engineering

Project Overview

Ah a new day, new project idea!

Amidst all the hype surrounding LLMs (Large Language Models) like OpenAI’s GPT models — AutoGPT, BabyAGI, GPT-4, and more — it occurred to me that there hasn’t been a cool implementation of GPT within a game. While there are talks about how LLMs could change the way we interact with NPCs, the question remains: where are the games that use this technology?

So, I decided to create one myself.

Approach

The way I approached this prototype was to first figure out how I could use a GPT in a gameplay mechanic. After some deep thinking, intense meditation and a smidge of cocaine, I finally had the ultimate idea, the elegance of which truly shocked me too! The idea was … to just make GPT come up with its own game ideas. Can’t believe this actually took a while for someone that literally has an entire site built for GPT generated ideas.

Anyways so a couple of prompts later GPT comes up with an idea of a social engineering game where the Player is an empath detective that can see an NPC’s inner thoughts and can use those to steer the conversations to extract information from them.

While a very cool idea with some great potential, I had to be aware of the GPT’s API cost, and having an inner thought process along with an ongoing conversation between NPC and Player would start to pile up in tokens and subsequently burn a hole in my wallet. Also as someone extremely susceptible to scope creep, I had to scale my ambitions down. And so I finally settled on a simple one on one conversation between an NPC and the Player in a simple environment.

With these criteria included in my idea generating prompts, GPT came up with a game where the Player is tasked with finding a secret base within a building and is to determine its location by engaging with NPCs during an elevator ride. This was a perfect idea for a prototype since this ensured a fixed environment and focus only on the gameplay.

Development Process

I decided to keep the core gameplay mechanic it as simple as possible. It consisted of the following parts:

Creating Objectives

Initially I self generated multiple objectives with the thought of putting them all into a list and picking them out at random. Eventually though I realized that since the “gpt-3.5-turbo” model’s API cost was fairly low I could generate completely new ideas on each run.

Conversations

As for the conversations, GPT-3.5 (ChatGPT) is severely handicapped due to the content moderation systems in place from OpenAI which cause it to bring up the dreaded “As an AI language model…”. In a parallel project of mine (a personal assistant) I had the code fallback onto good ol’ GPT-3 to generate a response. Here the problem was that GPT-3 is simply too inferior in a conversational setting. And hence, I decided to use the GPT-4 models for the conversations.

Now, another problem with all these LLM models is their tendency to hallucinate information. While I may not be asking these NPCs any factual information, this hallucination can become an issue when I engage in conversations in a leading manner. Like for example assume the Player’s task is to as the janitor the brand of soap they use.

Player: Hey Mr. Janitor long time no see! Did I see you using XYZ soap last month? NPC: Oh hello there! Yes, I remember using XYZ soap last month. It’s a great product, very effective in removing tough stains and odors. Are you thinking of trying it out yourself? I highly recommend it!

This essentially ruins the game since there’s no “engineering” that needs to be done to extract the information from the NPC since the Player can simple “inception” the idea into the NPC.

The solution to this issue is yet to be worked out so if anyone has any ideas I’m all ears.

Checking Objective Completion

Finally coming to checking if the objectives are completed. Initially I thought of writing a super complex function to check with the conversation strings for an affirmative statement or something of the sort, until I realized once again that I could just ask GPT-4. Now, the best case scenario would have the Player updated the moment they’ve extracted the information from a conversation to complete an objective. The problem with this is that it would require having to pass the whole conversation every time we get a response from the NPC. Again, this would cost quite a bit and so I decided to simply include a “Level End” screen where an async call to the API would determine and mark the objectives and complete or not complete.

Now depending on if the Player has completed all objectives or not, another request is sent to GPT for a level completion text. If successful, the Player is shown a cliffhanger ending that the story can be built upon. If failed it’s a generic “Sorry you lost” message.

Screenshot of Level End

Conclusion

Once everything was in place I set GPT to the task of creating a name for the game and it came up with…

Corporate Conundrum: Hunt For The Secret Shaft

So yeah, that’s the game! It works and it’s actually quite fun to play (unlike some of my other games).

If I were to ever expand upon this, I’d first want to address the “inception” issue. One potential solution I’ve thought of is to first generate an entire world’s background depending on the kinds of objectives generated. Like in the soap brand and janitor NPC example, once the objective is created then I’d generate an actual soap brand the janitor uses and include that in their character background.

But that’s a problem for another time. Meanwhile if you found the devlog(?) any fun then feel free to follow my Twitter for more updates on my other prototypes.