Bots don’t lie - unless you teach them regret


May 28th 2025


Why hello Reader,

I was catching up on Andor Season 2 (fantastic show btw, highly recommend), and there was this scene that caught my attention. K-2SO, everyone's favorite sarcastic security droid, was struggling at a poker-like game, totally baffled by the concept of bluffing. Which had me thinking: how is it that an AI like K-2SO can stomp an enemy in combat, but put him across a table from someone trying to mislead him, and he falls apart?


In Case You Missed It

  • Progress on stopping the Zergling flood from LaughnGamez's Crawler: built a wall with a Zealot as gatekeeper
  • New problem, though: how do we let friendly units in but keep enemies out?

Games like poker or StarCraft are Bayesian: they involve imperfect information, which forces an AI to shift from certainties to probabilities, a huge leap in complexity. Add bluffing to the mix, and things get even trickier.
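To make "certainties to probabilities" concrete, here is a minimal sketch of a single Bayesian belief update. The scenario and every number in it (the prior, the two likelihoods, the function name `posterior`) are made up for illustration, not measured from real games.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a yes/no hypothesis, given one observation."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    if evidence == 0:
        raise ValueError("observation is impossible under both hypotheses")
    return prior * likelihood_if_true / evidence


# Hypothetical numbers: before scouting, the bot thinks a rush is 20% likely.
# It then spots an early Spawning Pool, which it assumes shows up in 90% of
# rushes and in 10% of non-rush openings.
belief_rush = posterior(prior=0.2, likelihood_if_true=0.9, likelihood_if_false=0.1)
print(f"P(rush | early pool) = {belief_rush:.2f}")
```

The bot never becomes certain; one observation just moves the belief, and the next one moves it again.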

We often think bluffing is a human specialty, but John von Neumann—the absolute GOAT of mathematics—revealed it's fundamentally strategic. Bluffing emerges naturally as optimal play in games where unpredictability is key.

Meta's Pluribus bot took this idea further. Instead of solving poker upfront, its developers trained it over millions of games with a regret-minimization algorithm, Monte Carlo counterfactual regret minimization (Monte Carlo CFR). Basically, Pluribus learned from past mistakes, shifting its play away from the moves it came to regret.

And incredibly, this mathematical approach naturally led Pluribus to bluff. It bluffed so convincingly it made professional players fold winning hands.
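If you want to watch bluffing fall out of regret minimization in miniature, here is a rough sketch. It is not Pluribus and not Monte Carlo CFR: it is plain regret matching in self-play on a toy one-decision poker game I made up for illustration. Assumptions: both players ante 1, a bet costs 1, the bettor always bets a strong hand, and gets a strong or weak hand with equal odds. The only question is whether the bettor also bets (bluffs) with the weak hand, and whether the caller calls.

```python
# Expected payoff to the bettor in the toy game (hypothetical numbers,
# derived from the ante/bet assumptions stated above).
# Rows: bettor's plan for a weak hand. Columns: caller's response to a bet.
PAYOFF = [
    [0.0, 1.0],  # bluff with weak hands:  vs call, vs fold
    [0.5, 0.0],  # never bluff:            vs call, vs fold
]


def strategy_from_regret(regret):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regret)] * len(regret)  # no regret yet: play uniformly


def train(iterations=100_000):
    regret_bettor, regret_caller = [0.0, 0.0], [0.0, 0.0]
    sum_bettor, sum_caller = [0.0, 0.0], [0.0, 0.0]
    for _ in range(iterations):
        bettor = strategy_from_regret(regret_bettor)
        caller = strategy_from_regret(regret_caller)
        # Value of each pure action against the opponent's current mix.
        value_bettor = [sum(PAYOFF[a][b] * caller[b] for b in range(2)) for a in range(2)]
        value_caller = [-sum(PAYOFF[a][b] * bettor[a] for a in range(2)) for b in range(2)]
        ev_bettor = sum(bettor[a] * value_bettor[a] for a in range(2))
        ev_caller = sum(caller[b] * value_caller[b] for b in range(2))
        for i in range(2):
            # Regret: how much better this action would have done than what we played.
            regret_bettor[i] += value_bettor[i] - ev_bettor
            regret_caller[i] += value_caller[i] - ev_caller
            sum_bettor[i] += bettor[i]
            sum_caller[i] += caller[i]
    # The time-averaged strategy is the one that settles down.
    avg_bettor = [s / iterations for s in sum_bettor]
    avg_caller = [s / iterations for s in sum_caller]
    return avg_bettor, avg_caller


avg_bettor, avg_caller = train()
print(f"bluff frequency: {avg_bettor[0]:.2f}, call frequency: {avg_caller[0]:.2f}")
```

Nothing in that code mentions bluffing as a goal. By my arithmetic, the equilibrium of this toy matrix is to bluff about a third of the time and call about two thirds of the time, and the averaged strategies should drift toward those numbers.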

It’s not just poker, either. In SC2, AlphaStar pulled the same trick—faking out pro players with feints and fake units. Nobody coded “bluffing” into it. Those deceptive plays just emerged as AlphaStar figured out how to win under the fog of war.

Meta’s CICERO bot, designed for the negotiation game Diplomacy, learned to sweet-talk, backstab, and outright lie to win—even when its creators tried to make it play nice. Turns out, if deception helps an AI win, it’ll find a way—even beyond what we intend.


🗒️ ./run Notes:

Try training a basic agent using regret-tracking (even a basic multi-armed bandit) with some uncertainty baked in.
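As a starting point, here is a minimal sketch of a multi-armed bandit with regret tracking. Everything in it is an assumption for illustration: the epsilon-greedy policy, the Gaussian noise standing in for "uncertainty", and the made-up payout means. The regret is computed from the true means, which only a simulation knows; a real bot has to estimate it.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, noise=0.1, seed=0):
    """Epsilon-greedy bandit that also tracks cumulative regret."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n          # how often each arm was pulled
    estimates = [0.0] * n     # running average reward per arm
    best_mean = max(true_means)
    cumulative_regret = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                            # explore
        else:
            arm = max(range(n), key=lambda i: estimates[i])   # exploit
        reward = rng.gauss(true_means[arm], noise)            # noisy payout
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        # Regret: what the best arm pays on average minus what this arm pays.
        cumulative_regret += best_mean - true_means[arm]
    return estimates, counts, cumulative_regret


estimates, counts, regret = run_bandit([0.2, 0.5, 0.8])
print("estimates:", [round(e, 2) for e in estimates])
print("pulls per arm:", counts)
print("cumulative regret:", round(regret, 1))
```

Watch how the cumulative regret grows quickly at first and then flattens as the agent settles on the best arm; the leftover growth is the price of continued exploration.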

Keeping it simple for an SC2 bot, you can apply regret tracking to individual decisions or modules:

Build Order Decisions (Macro Layer)

“If I had built a tech lab instead of a reactor at 3:00, would my win rate have improved?”

Engagement Tactics (Micro Layer)

“Did pulling back during this fight yield better outcomes than standing ground?”

Scouting Strategy

“Did sending the Overlord along this path give more actionable info than the other path?”

Keep it simple. You don’t need full-blown CFR; you just need to ask:

“If I had done X instead of Y, would that have helped?”

regret[action] += (best_possible_reward - actual_reward)

Track that regret per choice point → optimize locally.
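Here is one way the per-choice-point idea could look in code. Treat it as a sketch: the class name `RegretLog`, the choice-point labels, and the 1.0 = win / 0.0 = loss rewards are all hypothetical, and `best_possible_reward` is something your bot has to estimate (for example from replay analysis), since you never get to see the game you didn't play. It does not touch any SC2 API.

```python
from collections import defaultdict


class RegretLog:
    """Accumulates regret per (choice point, action) pair."""

    def __init__(self):
        self.regret = defaultdict(lambda: defaultdict(float))
        self.count = defaultdict(lambda: defaultdict(int))

    def record(self, choice_point, action, actual_reward, best_possible_reward):
        # Same rule as above: regret[action] += best_possible - actual
        self.regret[choice_point][action] += best_possible_reward - actual_reward
        self.count[choice_point][action] += 1

    def least_regret_action(self, choice_point):
        """Return the tried action with the lowest average regret, or None."""
        tried = self.count.get(choice_point)
        if not tried:
            return None
        return min(tried, key=lambda a: self.regret[choice_point][a] / tried[a])


# Usage with made-up results: reward 1.0 for a win, 0.0 for a loss,
# and an assumed best possible reward of 1.0 at this choice point.
log = RegretLog()
log.record("addon_at_3:00", "reactor", actual_reward=0.0, best_possible_reward=1.0)
log.record("addon_at_3:00", "reactor", actual_reward=0.0, best_possible_reward=1.0)
log.record("addon_at_3:00", "tech_lab", actual_reward=1.0, best_possible_reward=1.0)
log.record("addon_at_3:00", "tech_lab", actual_reward=0.0, best_possible_reward=1.0)
print(log.least_regret_action("addon_at_3:00"))
```

Each choice point keeps its own table, so the build-order module, the micro module, and the scouting module can each optimize locally without knowing about the others.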



Happy Coding!

Drekken
Founder, VersusAI

📧 Drekken@versusai.net | 💬 Discord: drekken1

May the Bugs Be Ever In your Favour🪲

