Daniel Balcarek

Posted on Jun 18

Tower Before Dusk: I Built a Puzzle Game for Humans and AI

#devchallenge #gamechallenge #gamedev #ai

June Solstice Game Jam Submission

This is a submission for the June Solstice Game Jam

It's interesting how the most exciting ideas always arrive when I have basically no time to work on them.

A few weeks earlier, I had finished my submission for the GitHub challenge by bringing an old WinForms game back to life. That project turned out to be a lot of fun. Then Sylwia Laskowska published a great article about Google's WebMCP. The idea fascinated me, but I wasn't sure where I could actually use it. Then the June Solstice Game Jam was announced. The idea hit me like a lightning bolt: What if I made a game that both humans and AI could play?

Let's do it.

What I Built

I created a puzzle game with a solstice theme called Tower Before Dusk. The goal is simple: reach your home tower before sundown. Every action costs time. Every step brings sunset a little closer. Rivers block your path, rocks force detours and the only way across water is to collect enough wood and build bridges. Move too much, collect unnecessary resources, or choose the wrong path, and night will arrive before you make it home.

The challenge isn't just solving the puzzle. It's solving it efficiently.

And apparently, that's difficult for both humans and AI.

Video Demo

In this demo, Gemini 3.1 Flash-Lite tries to solve the level using the exposed game tools. It fails, then I restart the level and solve it manually. That failure is part of the point: the tools worked, but reasoning through the puzzle was still hard for the lightweight model.

tower-before-dusk.gramli.workers.dev

Code

Gramli / tower-before-dusk

A TypeScript puzzle game demonstrating WebMCP, where humans and AI solve the same challenges under the same rules.

Tower Before Dusk

Tower Before Dusk is a tile-based puzzle game about reaching the tower before sunset. Plan each route carefully: every move spends daylight, trees provide wood, and water can only be crossed by building bridges.

The game is built as a modern browser app with TypeScript, HTML canvas, and Vite. It also exposes a small model-context interface so an assistant can read the current map and submit a complete action plan for replay in the UI.

Features

Three handcrafted puzzle levels with different tower layouts and move limits
Daylight system that tracks the move budget from morning through sunset
Trees that are collected automatically for wood when entered
Bridge building over water, consuming two wood per water tile
Rocks, water, bridges, towers, and sprite-based terrain rendering
Responsive canvas scaling for different browser sizes
Keyboard-driven play with restart and help shortcuts
HUD for level name, wood count, moves used…


How I Built It

Since WebMCP was completely new to me, I didn't want to jump straight into building a game without understanding how it worked first.

So I generated a simple Vite application and experimented with a tiny counter tool:

const incrementCounterTool = {
  name: "incrementCounter",
  description: "Increments the counter by a specified value.",
  inputSchema: {
    type: "object",
    properties: { value: { type: "number" } },
  },
  execute: async ({ value }: { value: number }) => {
    const counter = document.getElementById('counter');
    if (counter) {
      const currentValue = parseInt(counter.innerText, 10) || 0;
      counter.innerText = (currentValue + value).toString();
    }
  },
  annotations: {
    readOnlyHint: false,
    untrustedContentHint: true
  },
};

When the AI successfully incremented the counter and I saw the value changing in the browser, I knew I could continue.

Of course, my game would be a little more complicated than a counter. At first, I considered letting the AI inspect the game state after every move, but then I realized I would burn through tokens incredibly fast. So I came up with another approach.

Instead of playing move by move, the AI would receive the entire game state, understand the rules, and generate one complete plan to reach the goal. But then another thought appeared:

"How do I make it look like the AI is actually playing?"

The answer was surprisingly simple. The AI would return a sequence of actions and my game loop would replay them with a short delay between moves. From the player's perspective, it would look like the AI was thinking and playing in real time.

Even better, it fit perfectly with the game's architecture, because human players already interact through keyboard actions that modify the game state.
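The replay idea can be sketched as a small queue that the game loop drains over time. This is only an illustration, not the code from the repository: `PlanReplayer`, `applyAction`, `update`, and the 300 ms default delay are all hypothetical names and values, and the sketch assumes the loop calls `update` once per frame with the elapsed milliseconds.

```typescript
type Action = "MOVE_UP" | "MOVE_DOWN" | "MOVE_LEFT" | "MOVE_RIGHT";

// Hypothetical helper: holds a submitted plan and releases one action
// each time the configured delay has elapsed.
class PlanReplayer {
  private queue: Action[] = [];
  private elapsedMs = 0;
  private applyAction: (action: Action) => void;
  private delayMs: number;

  constructor(applyAction: (action: Action) => void, delayMs = 300) {
    // applyAction is assumed to be the same handler that keyboard input uses.
    this.applyAction = applyAction;
    this.delayMs = delayMs;
  }

  // Replace any pending plan with a new one.
  load(actions: Action[]): void {
    this.queue = actions.slice();
    this.elapsedMs = 0;
  }

  // Called from the game loop; applies at most one action per call
  // so every step stays visible on screen.
  update(dtMs: number): void {
    if (this.queue.length === 0) return;
    this.elapsedMs += dtMs;
    if (this.elapsedMs < this.delayMs) return;
    this.elapsedMs = 0;
    const next = this.queue.shift();
    if (next !== undefined) this.applyAction(next);
  }

  get pending(): number {
    return this.queue.length;
  }
}
```

Because the replayer only calls the same action handler a key press would, the AI cannot do anything a human player cannot.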

With that idea in mind, I built the MVP.

I did it the "old-fashioned" way: player first. (Almost like mobile-first, except with fewer trendy conference talks.)

I also have to admit that I stole some core ideas from my previous EasterGame project. At this point, I'm starting to suspect I accidentally built the beginnings of a tiny puzzle game engine.

The first playable level looked like this:

The game worked. You could reach the tower and win. It was finally time to bring AI into the picture.

Based on the original idea, I created two MCP tools:

getGameState
submitPlan

getGameState provides the complete state of the current level, including objectives, rules, available actions, and the visible map:

export const gameState: GameState = {
  objective: "Reach G before sunset using as few moves as possible. Do not collect unnecessary wood.",

  legend: {
    P: "player start position",
    ".": "land / walkable tile",
    W: "wood / walkable tile, can be collected",
    "~": "water / blocked unless player has enough wood",
    R: "rock / blocked tile",
    B: "bridge / walkable tile created after entering water",
    G: "goal / walkable tile",
  },

  rules: {
    map:
      "visibleMap is an array of map rows from top to bottom. The first symbol in each row is x=0, and rows start at y=0. Symbols are separated by spaces for readability.",

    movement:
      "The player can move one tile up, down, left, or right. Each movement costs 1 move.",

    rock:
      "Rock tiles marked R are blocked and cannot be entered.",

    wood:
      "Tree tiles marked W are walkable, but entering W automatically collects the tree. This costs 1 extra move, adds 1 wood, and removes W from the map. Because collecting wood costs an extra move, avoid W unless the wood is needed to cross water.",

    water:
      "Water cannot be entered unless the player has at least 2 wood.",

    bridge:
      "When the player moves into a water tile with at least 2 wood, a bridge is built automatically on that single water tile. This costs 1 extra move, consumes 2 wood, and changes only that one water tile to B. Other connected water tiles remain water.",

    bridgeLimit:
      "Each bridge covers only one water tile. If there are multiple water tiles in a row, the player needs enough wood to build one bridge per water tile.",

    strategy:
      "Use the minimum number of actions needed to reach G. Do not collect wood unless it is required to build enough bridges. Avoid stepping on W unless that wood is necessary. Extra wood has no value at the end.",

    goal:
      "The player wins immediately when reaching G using no more than the maximum allowed moves.",

    lose:
      "The player loses if the move budget is exhausted before reaching G, or if no valid action can reach G.",
  },

  actions: [
    "MOVE_UP",
    "MOVE_DOWN",
    "MOVE_LEFT",
    "MOVE_RIGHT",
  ],

  remainingMoves: 30,
  wood: 0,

  visibleMap: [
    "P . W W W W ~ ~ G",
  ],
};
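To make the coordinate convention from the map rule concrete, here is a minimal sketch of how a client might turn visibleMap into a grid and locate P and G. The names `parseVisibleMap`, `Point`, and `ParsedMap` are assumptions made for illustration; they do not come from the game's source.

```typescript
interface Point {
  x: number;
  y: number;
}

interface ParsedMap {
  grid: string[][]; // grid[y][x], with y = 0 as the top row
  start: Point;     // position of P
  goal: Point;      // position of G
}

// Hypothetical helper: splits each row on whitespace and records P and G.
function parseVisibleMap(visibleMap: string[]): ParsedMap {
  const grid = visibleMap.map((row) => row.trim().split(/\s+/));
  let start: Point | undefined;
  let goal: Point | undefined;

  for (let y = 0; y < grid.length; y++) {
    const row = grid[y];
    for (let x = 0; x < row.length; x++) {
      if (row[x] === "P") start = { x, y };
      if (row[x] === "G") goal = { x, y };
    }
  }

  if (!start || !goal) {
    throw new Error("visibleMap must contain both P and G");
  }
  return { grid, start, goal };
}
```

For a two-row map such as ["P . W ~", ". R . G"], this places P at x=0, y=0 and G at x=3, y=1.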

The second tool, submitPlan, accepts the AI's proposed solution:

    inputSchema: {
      type: "object",
      properties: {
        actions: {
          type: "array",
          items: {
            type: "string",
            enum: [
              "MOVE_UP",
              "MOVE_DOWN",
              "MOVE_LEFT",
              "MOVE_RIGHT",
            ],
          },
        },
        summary: { type: "string" },
      },
      required: ["actions"],
      additionalProperties: false,
    },
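On the receiving side, it can help to re-check the payload before replaying it, since a model may still send something that does not match the schema. The following hand-written guard mirrors the schema above; `isSubmitPlanInput` and `SubmitPlanInput` are hypothetical names, and whether the real game performs such a check is not stated in the article.

```typescript
const ALLOWED_ACTIONS = ["MOVE_UP", "MOVE_DOWN", "MOVE_LEFT", "MOVE_RIGHT"] as const;
type Action = (typeof ALLOWED_ACTIONS)[number];

interface SubmitPlanInput {
  actions: Action[];
  summary?: string;
}

// Hypothetical runtime guard that mirrors the JSON schema by hand.
function isSubmitPlanInput(value: unknown): value is SubmitPlanInput {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;

  // additionalProperties: false -> only "actions" and "summary" are allowed.
  for (const key of Object.keys(record)) {
    if (key !== "actions" && key !== "summary") return false;
  }

  // required: ["actions"], and every item must be one of the enum values.
  const actions = record["actions"];
  if (!Array.isArray(actions)) return false;
  const allowed: readonly string[] = ALLOWED_ACTIONS;
  for (const item of actions) {
    if (typeof item !== "string" || allowed.indexOf(item) === -1) return false;
  }

  // summary is optional but must be a string when present.
  const summary = record["summary"];
  return summary === undefined || typeof summary === "string";
}
```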

The AI returns an array of actions such as:

["MOVE_UP","MOVE_DOWN","MOVE_LEFT","MOVE_RIGHT"]

Then submitPlan feeds those actions into the game loop, which replays them with a short delay so players can watch the AI attempt to solve the puzzle.
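The cost rules from getGameState can also be checked without the UI. Below is a rough sketch of a plan simulator based only on the rules listed in the state; `simulatePlan` and `PlanResult` are hypothetical names. It makes two assumptions the article does not confirm: a move into a rock, off the map, or into water with fewer than 2 wood ends the plan as "blocked", and an action that would exceed the budget is not executed.

```typescript
type Action = "MOVE_UP" | "MOVE_DOWN" | "MOVE_LEFT" | "MOVE_RIGHT";

interface PlanResult {
  status: "won" | "incomplete" | "blocked" | "out_of_moves";
  movesUsed: number;
  wood: number;
  failedAtIndex?: number; // index of the action that could not be executed
}

const DELTAS: Record<Action, [number, number]> = {
  MOVE_UP: [0, -1],
  MOVE_DOWN: [0, 1],
  MOVE_LEFT: [-1, 0],
  MOVE_RIGHT: [1, 0],
};

// Hypothetical simulator: replays a plan against the published rules.
function simulatePlan(
  visibleMap: string[],
  actions: Action[],
  maxMoves: number,
  startWood = 0,
): PlanResult {
  const grid = visibleMap.map((row) => row.trim().split(/\s+/));

  // Locate the player start tile P.
  let x = -1;
  let y = -1;
  for (let ry = 0; ry < grid.length; ry++) {
    const rx = grid[ry].indexOf("P");
    if (rx !== -1) {
      x = rx;
      y = ry;
    }
  }
  if (x < 0) throw new Error("visibleMap has no P tile");

  let movesUsed = 0;
  let wood = startWood;

  for (let i = 0; i < actions.length; i++) {
    const [dx, dy] = DELTAS[actions[i]];
    const nx = x + dx;
    const ny = y + dy;
    const row = grid[ny];
    const tile = row ? row[nx] : undefined;

    // Assumption: an illegal move ends the plan instead of being skipped.
    if (!row || tile === undefined || tile === "R" || (tile === "~" && wood < 2)) {
      return { status: "blocked", movesUsed, wood, failedAtIndex: i };
    }

    // Collecting a tree or building a bridge costs 1 extra move.
    const cost = tile === "W" || tile === "~" ? 2 : 1;
    if (movesUsed + cost > maxMoves) {
      return { status: "out_of_moves", movesUsed, wood, failedAtIndex: i };
    }
    movesUsed += cost;

    if (tile === "W") {
      wood += 1;
      row[nx] = "."; // the tree is removed from the map
    } else if (tile === "~") {
      wood -= 2;
      row[nx] = "B"; // only this one water tile becomes a bridge
    }

    x = nx;
    y = ny;
    if (tile === "G") return { status: "won", movesUsed, wood };
  }

  // The plan ended without reaching G.
  return { status: "incomplete", movesUsed, wood };
}
```

For the single-row map in the getGameState example, eight MOVE_RIGHT actions cost 1 (land) + 8 (four trees) + 4 (two bridges) + 1 (goal) = 14 moves under these rules, and the four collected wood are exactly enough for the two bridges.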

Pretty neat, right?

Well... it worked. The AI successfully called both tools, and that immediately exposed another problem: my level design was too difficult. Even Level 1 turned out to be surprisingly challenging for the models I tested.

For development and testing, I used the WebMCP Inspector with the Gemini models available through the free API tier:

Gemini 3 Flash Preview
Gemini 3.1 Flash-Lite
Gemini 3.5 Flash

All three models correctly called both tools, but none of them managed to generate a valid solution, even for Level 1. At that moment, I realized I had been a little too optimistic about my puzzle design, so I lowered the difficulty. Eventually, the AI managed to reach the tower and complete the first level.

Victory! Well... a small victory. I'm fairly sure stronger models would perform better on the harder levels, but I also didn't want to discover how much puzzle-solving curiosity could cost in API tokens.

If you'd like to try it yourself, this is the prompt I used:

You are playing Tower Before Dusk.

First call getGameState. Study the objective, legend, rules, remainingMoves, wood, and visibleMap.

Create one complete plan to reach G before sunset. Use only the listed actions. Account for move costs, automatic bridge building, wood collection, rocks, water, and remainingMoves.

Then call submitPlan exactly once with the full action list. Do not submit partial plans.

Interesting Thoughts

Going into this project, I assumed the hardest part would be integrating WebMCP into the game, but it wasn't.

The real surprise was discovering that even simple puzzle levels weren't trivial for AI models. The tools worked almost immediately, but designing levels that felt straightforward to humans while making AI struggle turned out to be an interesting challenge.

It made me realize that puzzles we consider "easy" often rely on intuition and reasoning patterns that aren't as obvious to language models as I had expected.

The Sunset Arrives

And that's how Tower Before Dusk came to life. I set out to build a game for the June Solstice Game Jam and explore an experimental technology. Along the way, I discovered that simple-looking puzzle games aren't necessarily simple for AI, and I ended up with something that both humans and language models can struggle to beat.

Honestly, I think that's a pretty fitting result for a game about racing against the setting sun.

Top comments (36)

Hemapriya Kanagala • Jun 18

Daniel, this is a really creative idea. Making a game that both humans and AI can play is not something you see every day.

I'll definitely give it a try when I get some time. Curious to see whether I can beat the AI on the harder levels 😄

Daniel Balcarek • Jun 18

Thanks, Hemapriya! ❤️

The first three levels are intentionally easier because the lightweight models were already struggling with them. Levels 4 and 5 should feel more like normal puzzle difficulty.

And I’m hoping to add a few genuinely hard ones over the weekend too. 😅

Sylwia Laskowska • Jun 18

Wow, another addictive game! 😄 Saving this one for after work. BTW, Google should probably send us some stickers for all the free webMCP promotion we're doing 😂

Daniel Balcarek • Jun 18

That would actually be amazing! 😄 But without your article, I probably would not even know WebMCP existed, so most of the credit goes to you! ❤️

Sylwia Laskowska • Jun 18

Deal! 😄 In that case, I'll take two stickers! 😂❤️

Daniel Balcarek • Jun 18

Absolutely! You definitely deserve both of them!🏆 😂

Aliaksei Zelianouski • Jun 18

The "AI struggled" result might be model tier plus format more than AI in general. A move-budgeted tile puzzle leans on exactly what trips LLMs up - spatial reasoning, counting, and one-shot planning with no feedback loop - and you tested the lightweight Flash models, which is where that breaks first. Worth a frontier run before lowering difficulty: I've seen people report Fable 5 is genuinely strong at spatial reasoning in games now, so the gap might be tier, not a ceiling.

Either way it comes back to balance, and that's why I build conversational games. Mixing human and AI players is just easier when the game advances through dialogue - the challenge becomes language, the one thing these models are genuinely good at, so they sit near human level instead of way below or way above. Assuming you can prompt them to stay on track, of course.

Daniel Balcarek • Jun 18

Yep, I agree, and I actually mentioned in the article that this is a limitation of the models I used. Stronger models would probably perform much better.

And as you said, conversational games are a much more natural fit for these models, but then it would not be as much fun for me to challenge them with something outside their comfort zone. 😄

Utkarsh Bansal • Jun 18

Loved the game, it's really addictive. I have a few suggestions though.
Right now, the game feels a bit too much like calculating moves, similar to a puzzle like checkers. Instead of giving a strict move limit and placing the castle far away, you could move the castle closer and give players some extra moves.

Example: if reaching the castle takes 32 moves, give the player 40 moves.
The extra 8 moves could be used to collect extra resources that could be used in the future levels. This will add a layer of strategy on top of it instead of forcing a single optimal path.

With this you can also create a leader board or personal best score on the min moves taken to reach the castle.

Daniel Balcarek • Jun 18

Thanks, Utkarsh, I really appreciate it and I’m glad you like it!

Those are both great ideas. The game would definitely become much more interesting. My only concern is the AI side: the lightweight models already struggle with the simplest levels and adding optional objectives plus long-term resource decisions could make them lose track even more easily.

But for a human-focused mode, I think this would be a very fun direction. 👍️

Eryc Tri Juni S • Jun 18

the hard part wasn't WebMCP. it was the puzzle. 🎯
one question though — did the model actually count the moves, or just vibes-based sequence and pray?
because those are very different failure modes. 👀

Daniel Balcarek • Jun 18

I included remainingMoves in the game state, and Gemini 3.1 Flash-Lite especially often did not use the full budget. For example, it could have 28 moves available but submit only around 20 actions and stop before reaching the goal.

So it was not always a case of running out of moves, sometimes it simply produced an incomplete plan. The models I tested were lightweight, though, so stronger models would probably perform much better.

Eryc Tri Juni S • Jun 18

so it had budget left and still stopped short — that's not a counting problem, that's the model not knowing it failed until after it submitted.

it thought it was done. that's the scarier failure mode.

Harsh • Jun 18

Cool 😎 A puzzle game for humans and AI is such a unique angle. Curious: what's one puzzle type that AI solves faster than humans, and one that humans consistently beat AI at? Would love to hear about the design process.

Thanks for sharing! 🚀

Daniel Balcarek • Jun 18

Thanks, Harsh!

I developed and debugged it with Google’s WebMCP Model Context Tool Inspector. It currently offers only three model options: Gemini 3 Flash Preview, Gemini 3.1 Flash-Lite, and Gemini 3.5 Flash, so there was not much room for broader experimentation.

From what I tested, Flash-Lite often fails even on level 1. The other two can finish level 1, but they already struggle with level 2.

So, for now, humans win. 😄

I think stronger models would probably do better, but testing those would mean building my own agent outside the Inspector. That could be a fun next step. 🤔

Web Developer Hyper • Jun 18

The idea of AI thinking for itself and playing the game is unique and fun. Good game! There might be many other great ways to use AI that we haven't discovered yet. 🤔

Daniel Balcarek • Jun 18

Thanks! Glad you like it. ❤️

Yeah, you’re right. Discovering those new possibilities is one of the most fascinating and interesting things to do.

Web Developer Hyper • Jun 18

Looking forward to seeing what unique AI idea you come up with next! 😄

Daniel Balcarek • Jun 19

❤️A lot of my ideas are inspired by articles from the DEV Community, so I’m always curious to see what interesting things people build and write about next! 😄😅

Web Developer Hyper • Jun 19

You are surely one of the most creative and highly skilled engineers in the DEV Community! 👍

Daniel Balcarek • Jun 19

Oh, thank you! That’s really encouraging and it warms my heart. 😊

I wouldn’t go that far, though, there are plenty of excellent engineers in the DEV Community, including you! 🙌

Mykola Kondratiuk • Jun 19

curious how the AI actually navigates the puzzle - does WebMCP give it a structured state dump, or does it read the DOM like a human would? the move order would look really different depending on that

Daniel Balcarek • Jun 19

The game exposes a getGameState tool through WebMCP, and that returns a structured state with the objective, rules, legend, remaining moves, current resources, and visibleMap:

visibleMap: [
  "P . W ~",
  ". R . G"
]

So the AI sees the puzzle as structured data, not as DOM. I included the full state earlier in the article, probably the longest code example there. 😀

Mykola Kondratiuk • Jun 19

structured state makes the agent reasoning legible in a way raw DOM never could — you can actually trace a bad move back to the input. does the visible map ever mislead it when fog is only partial?

Daniel Balcarek • Jun 19

Ah, sorry for the confusion, there is no fog. visibleMap exposes the full map, so the failures are more about planning/counting mistakes than partial visibility.

Yep, bad variable naming on my side. 😅

Mykola Kondratiuk • Jun 19

got it, that's actually cleaner to debug - if the map's fully visible and it still miscounts, the failure is clearly in the planning layer. easier to isolate

Marina Eremina • Jun 18

Really cool game, I even reached level 5! The previous one was entertaining as well, great job! 🎉

Daniel Balcarek • Jun 18

Thanks! That was fast, you must be a good player! 😄

I made the first three levels easier on purpose so the AI would have a chance, but levels 4 and 5 were meant to be more challenging. Maybe I should add a few tougher ones. 😅

Marina Eremina • Jun 18

I just like this type of game. The only thing is they usually come packed with ads. What about yours? Should we expect ads to show up later? 😅

Daniel Balcarek • Jun 18

Never! 😄 Or at least until Cloudflare’s free tier is no longer enough to host the game. 😅

Just joking, I’d rather find another solution before adding ads.

𝓣𝓱𝓮𝓛𝓪𝔃𝔂 𝓰𝓲𝓻𝓵 ◕⁠‿⁠◕ • Jun 18

Bro, you're making so many awesome games these days, I wouldn't be surprised if GTA 6 turns out to be your next project!😅

Daniel Balcarek • Jun 18

That’s a great one! 😂 I’d love to say “Challenge accepted,” but I think GTA 6 might be slightly out of scope for the next DEV challenge. 🤣
