PacCam: Pacman controlled with your face

I made a game. It’s called PacCam. It’s Pacman, but you control it with your face.

Looking very cool

You can play it here (or see the code here).

You chomp your mouth to move and turn your face to steer. You look…pretty dumb while playing it. At the end it tries to give you the dumbest possible gif of yourself (it tracks when your mouth is the most open, and builds the gif around that moment).

I could fill a book with dumb gifs of me after all the testing I did

I’m really pleased with how the game turned out, but making it was way more work than I expected. Let’s look at why.

Wait, how does this work

Good question.

I use a couple of additional tools like Framer Motion, Radix, GIF.js, and styled-components. But nothing crazy. The game is largely Just Code.

So the high-level answer to “how does this work” is mostly: MediaPipe is an easy way to do face tracking in the browser and otherwise this is just a website on the internet. We live in the future!

That said, there were some challenges gluing all this tech together. The biggest early challenge was getting React to play nice with MediaPipe.

MediaPipe and React

Here is a simple model of how you use MediaPipe:

“Information” means the location of “landmarks” (e.g. the tip of your nose) as well as the value of “blend shapes” (e.g. how open your mouth is). But the point is, MediaPipe sprays lots of data at you (think hundreds of floats, tens of times a second).
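
For a sense of what that looks like in code, here is a rough sketch of the MediaPipe Tasks setup - the file paths are placeholders, and this isn't PacCam's exact configuration:

import { FaceLandmarker, FilesetResolver } from "@mediapipe/tasks-vision";

async function createFaceLandmarker() {
  // Placeholder paths: point these at wherever you host the MediaPipe
  // WASM bundle and the face landmarker model
  const vision = await FilesetResolver.forVisionTasks("/mediapipe/wasm");
  return FaceLandmarker.createFromOptions(vision, {
    baseOptions: { modelAssetPath: "/models/face_landmarker.task" },
    outputFaceBlendshapes: true,
    runningMode: "VIDEO",
  });
}

function onVideoFrame(faceLandmarker, video) {
  const results = faceLandmarker.detectForVideo(video, performance.now());
  // results.faceLandmarks: per-face arrays of hundreds of {x, y, z} points
  // results.faceBlendshapes: per-face categories like "jawOpen", scored 0..1
}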

Here is a simple model of how you use React:

So maybe you can see a potential problem here. Naively combining React and MediaPipe and stuffing all your MediaPipe state into React causes React to do lots of updates. This gets slow.

My first React / MediaPipe project solved this with a bunch of convoluted React code that was very careful about when it updated React state (which triggers an update), and relied heavily on what React calls “refs” (values that can be updated imperatively without triggering an update). It was a huge mess, very hard to think about, and still pretty slow.

To solve this issue for PacCam I took a different approach: I put all my MediaPipe logic in a plain JavaScript file and had the relevant React components subscribe to updates - for example, the Pacman component listens for updates about which direction you’re looking and whether your mouth is open.

I originally thought binding React code to non-React code would be a challenge, but it ends up being pretty simple. A toy implementation looks like this:

class MediaPipeEngine {
  constructor() {
    this.jawOpenSubscribers = [];
  }

  doFaceLandmarkThings() {
    // get face landmarks somehow
    const faceLandmarks = this.getFaceLandmarks();
    // This depends on your current jaw open state, because we
    // use different thresholds for opening and closing your mouth
    const jawOpen = this.calculateJawOpen(faceLandmarks);
    this.notifyJawOpenSubscribers(jawOpen);
  }

  subscribeToJawOpen({ id, callback }) {
    this.jawOpenSubscribers.push({ id, callback });
  }

  notifyJawOpenSubscribers(jawOpen) {
    this.jawOpenSubscribers.forEach(({ callback }) => callback(jawOpen));
  }

  unsubscribeFromJawOpen({ id }) {
    this.jawOpenSubscribers = this.jawOpenSubscribers.filter(
      (subscriber) => subscriber.id !== id
    );
  }
}

function Component({ mediaPipeEngine }) {
  const [jawOpen, setJawOpen] = React.useState(0);
  const id = React.useId();
  React.useEffect(() => {
    mediaPipeEngine.subscribeToJawOpen({ id, callback: setJawOpen });
    return () => {
      mediaPipeEngine.unsubscribeFromJawOpen({ id });
    };
  }, [mediaPipeEngine, id]);
  // do something based on jaw being open
}

I found this to be a much easier approach than forcing everything into React; it’s a very natural way to have inherently imperative logic update your UI.

Games and React

Here is a simple model of a game engine backend:

Here is a simple model of how you use React:

So maybe you can see a potential problem here. Maybe you can see that it’s very similar to the MediaPipe problem we just talked about.

So you might ask: can we solve the problem in the same way? Well - yes, we can! My game loop notifies a Pacman component of its position by doing something like this:

class Engine {
  constructor() {
    this.positions = [];
    this.playerPositionSubscribers = [];
  }

  startGameLoop() {
    const loop = () => {
      this.calculatePlayerPositions(); // updates this.positions
      this.notifyPlayerPositionSubscribers();
      requestAnimationFrame(loop);
    };

    requestAnimationFrame(loop);
  }

  subscribeToPlayerPosition({ id, playerNumber, callback }) {
    this.playerPositionSubscribers.push({ id, playerNumber, callback });
  }

  notifyPlayerPositionSubscribers() {
    this.positions.forEach(({ x, y, playerNumber }) => {
      this.playerPositionSubscribers
        .filter((subscriber) => subscriber.playerNumber === playerNumber)
        .forEach(({ callback }) => callback({ x, y }));
    });
  }

  unsubscribeFromPlayerPosition({ id }) {
    this.playerPositionSubscribers = this.playerPositionSubscribers.filter(
      (subscriber) => subscriber.id !== id
    );
  }
}

function Pacman({ engine, playerNumber }) {
  const [position, setPosition] = React.useState({ x: null, y: null });
  const id = React.useId();
  React.useEffect(() => {
    engine.subscribeToPlayerPosition({
      id,
      callback: setPosition,
      playerNumber,
    });
    return () => {
      engine.unsubscribeFromPlayerPosition({ id });
    };
  }, [engine, playerNumber, id]);
  // move based on position
}

In addition to making it easier to control the state updates that React gets, this is a much easier way to run a game loop at all in React. One of the rules of React is that a function is redefined whenever the data that it references changes. This makes lots of sense for UI updates, but it’s a huge pain for a game loop. My game loop needs to reference lots of data that it’s also updating - but I don’t want to redefine the game loop on every frame. With this pattern I don’t have to.
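
For contrast, here is a sketch of the approach this avoids - running the loop inside a component with useEffect. The initialPositions and step helpers are made up for illustration:

function GameLoopInReact({ players }) {
  const [positions, setPositions] = React.useState(() => initialPositions(players));
  React.useEffect(() => {
    let frame;
    const loop = () => {
      // `step` needs the current positions, so `positions` has to be a
      // dependency of this effect...
      setPositions(step(positions, players));
      frame = requestAnimationFrame(loop);
    };
    frame = requestAnimationFrame(loop);
    return () => cancelAnimationFrame(frame);
    // ...which means the whole loop gets torn down and recreated on
    // basically every frame.
  }, [positions, players]);
  // render based on positions
}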

What else was hard

Well, I spent a lot of time writing my little engine and making the controls feel good and making a tutorial and all of the other things that go into making a game.

But to be honest, “making a website” was a whole lot more challenging for me than the imperative backend-y game engine stuff. I’ve been a backend software engineer for something like 12 years. I learned how to use Flexbox like 8 months ago. The web is hard! But I’m not sure how to turn my webdev challenges into a good blog post - I’m not sure you want to read about me learning that backdrop-filter needs a vendor prefix to work on Safari.

But let’s talk about some things that I found interesting.

What did I find interesting

A lot! Too much! Too much to put in a blog post. But here are a few highlights that we can talk about:

Adding bots

I added bots to PacCam about a week before finishing it. I originally wasn’t planning to have bots at all.

Waiting to add bots was a mistake - and releasing without them would have been an awful idea.

I added bots to make it easier for some friends to test the game solo, and that gets at why having bots is so important: people often don’t have someone to play with! I was thinking of PacCam as a local multiplayer game (and that’s how I did all my early testing) but…local multiplayer games are not in a great place these days! There’s a reason games mostly do online play!

But in addition to improving the final product, bots made testing the game - looking for bugs and thinking about balance - easier. Before adding bots I did all my testing using a game mode that let me control multiple players with a single face (ok for finding bugs, awful for balance testing).

4 bots going at it. My money's on pink.

So anyway. Bots are great. How do they work?

My bot logic relies on composing a couple of handy primitives: a rate limiter (pastRandomNearFutureTimestamp, which only lets something happen at roughly a target frequency, with some jitter) and a weighted random chooser (weightedRandomChoiceFromList, which picks from a list of scored options). Both are shown in full below.

Every game tick, bots check whether they’re allowed to update their plan (bounded by the rate limiter) and potentially do so; for example, if a bot is now capable of eating other players, it will probably transition to the hunting plan.

When executing a plan, bots potentially update the target of their plan (e.g. the player to hunt or flee from, or the dot they want to eat), again bounded by the rate limiter. When they choose a target they do so using the weighted random chooser. For example, when a bot is hunting other players, it computes its distance to all other players and turns that into a score of 1/(distance**3) (higher is better). It then makes a weighted random choice from that list; if there are two players who are 1 and 2 units away from the bot, it will hunt the closer player 8 times out of 9.
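
As a quick sanity check on that “8 times out of 9” claim:

// 1/(distance**3) scores for players at distance 1 and 2
const scores = [1, 2].map((d) => 1 / d ** 3); // [1, 0.125]
const pCloser = scores[0] / (scores[0] + scores[1]); // 0.888... = 8/9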

In practice that looks something like this:

class BotStateMachine {
  maybeUpdatePlan({ now, superState }) {
    if (superState === "am-super" && this.plan !== PLAN.HUNTING) {
      // start hunting players after eating a power pellet
      // - but not immediately
      const shouldMoveToHunt = this.pastRandomNearFutureTimestamp({
        currentTime: now,
        stateKey: "movedToHunting",
        targetFrequency: 350,
      });
      if (shouldMoveToHunt) {
        this.plan = PLAN.HUNTING;
      }
    }
    // other update logic
  }

  maybeExecutePlan({ now, position, playerPositions }) {
    if (this.plan === PLAN.HUNTING) {
      const canPickNewTarget = this.pastRandomNearFutureTimestamp({
        currentTime: now,
        stateKey: "lastHuntTargetChange",
        targetFrequency: 2250,
      });
      if (canPickNewTarget) {
        // If another player is close, we should be allowed to
        // chase them instead. But only sometimes
        const distanceToOtherPlayers = this.distanceToOtherPlayers(
          position,
          playerPositions
        );
        this.target = this.weightedRandomChoiceFromList({
          list: distanceToOtherPlayers,
          logKey: "choose hunting target",
        });
      }

      const canReorient = this.pastRandomNearFutureTimestamp({
        currentTime: now,
        stateKey: "lastHuntReorient",
        targetFrequency: 500,
      });

      // By default, this moves in the current direction
      // until we overlap with the target in the relevant
      // axis (e.g. move up till we overlap horizontally).
      // `canReorient` lets us pick a new direction even
      // when that's not true, to keep things interesting
      this.orientTowardsTarget({
        position,
        target: this.target.position,
        distanceScaleFactor: 2.5,
        targetState: this.huntingState,
        pickNewEvenIfAlreadyChoseDirection: canReorient,
        targetDirection: this.target.direction,
      });
    }
    // other execution logic
  }
}
And here are the helper functions (in the real code these are methods on BotStateMachine, which is where `this.smoothRandomState` lives):

// Weighted choice from `list` respecting score
function weightedRandomChoiceFromList({ list, logKey, scoreScaleFactor = 1 }) {
  if (list.length === 0) {
    console.warn(
      `Asked to choose a direction but no valid choices (key: ${logKey})`
    );
    return null;
  }
  const scaled = list.map((item) => ({
    ...item,
    score: item.score ** scoreScaleFactor,
  }));
  const total = scaled.reduce((acc, item) => acc + item.score, 0);
  const rand = Math.random() * total;
  let runningTotal = 0;
  for (let i = 0; i < list.length; i++) {
    runningTotal += scaled[i].score;
    if (rand < runningTotal) {
      return list[i];
    }
  }
}

// Pick a time in the near future at which something can occur
function pastRandomNearFutureTimestamp({
  currentTime,
  stateKey,
  targetFrequency,
  jitterFactor = 0.25,
  runOnSuccess = () => {},
}) {
  if (!this.smoothRandomState[stateKey]) {
    this.smoothRandomState[stateKey] = {
      lastTimeSomethingHappened: currentTime,
      targetDelta: null,
    };
  }
  const lastTimeSomethingHappened =
    this.smoothRandomState[stateKey].lastTimeSomethingHappened;

  if (this.smoothRandomState[stateKey].targetDelta === null) {
    let threshold = targetFrequency;
    const smoothVariation = Math.sin(currentTime / 1000) * jitterFactor;
    const randomVariation = (Math.random() * 2 - 1) * jitterFactor;
    threshold *= 1 + (smoothVariation + randomVariation) / 2;
    this.smoothRandomState[stateKey].targetDelta = threshold;
  }

  const delta = currentTime - lastTimeSomethingHappened;
  if (delta > this.smoothRandomState[stateKey].targetDelta) {
    this.smoothRandomState[stateKey].lastTimeSomethingHappened = currentTime;
    this.smoothRandomState[stateKey].targetDelta = null;
    runOnSuccess();
    return true;
  }
  return false;
}

I really enjoyed thinking in these terms. Separating “what are we doing” from “how do we do that” kept the logic simple, as did having a single answer to “how does the bot make a decision and when does it change its mind.”

I think adding this randomness also makes the bots feel more human - they make reasonable decisions most of the time, but sometimes they chase a player that’s far away or try to run across the screen to grab a dot that they saw.

This logic is also fully separate from the game engine; all the bot state machine can do is tell the game which direction a bot is facing and whether its mouth is currently open. I never tried integrating the bot logic directly into the game engine, but I have to imagine it would have been a headache.
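
Concretely, the interface amounts to “a bot produces the same kind of input a tracked face would” - something like this sketch, where the names are mine rather than the game’s actual API:

// Each tick, a bot emits the same shape of input the face tracker does
const botInput = botStateMachine.tick({ now, position, playerPositions });
// => e.g. { direction: "left", jawOpen: true }
engine.setPlayerInput(playerNumber, botInput);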

Working on any display size

Getting sites to work across display sizes stresses me out. This makes sense - it’s a hard problem and I’m pretty new to webdev.

The problem was particularly challenging for me with PacCam because it’s more of an application than a site. I wanted the game to run full screen, never require scrolling, and be played on top of a webcam that filled the entire screen.

“Make a video element fill the screen” turned out to not be that bad here, relative to some of my other MediaPipe work - the video element lives in a wrapper div with width and height both set to 100%. The video element also has width and height set to 100%, and object-fit set to cover. This expands the video element to fill its container without distorting it.
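
In React-ish terms that looks roughly like this - a sketch of the layout just described, not PacCam’s actual component:

function FullscreenWebcam({ videoRef }) {
  return (
    <div style={{ width: "100%", height: "100%", overflow: "hidden" }}>
      <video
        ref={videoRef}
        autoPlay
        playsInline
        muted
        style={{ width: "100%", height: "100%", objectFit: "cover" }}
      />
    </div>
  );
}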

This approach cuts off some of the webcam if its dimensions don’t match the dimensions of the screen, but for PacCam I think that’s ok: your window is probably close-ish to your webcam dimensions, and the game can still track you if you’re in the webcam frame but not visible on screen.

the game doesn't handle adjusting the window well, but the webcam looks ok!

The bigger challenge here was determining how big the game elements should be (and how many pellets should be in a row and column). The approach I took there was a little messy: when the game loads, I measure the window, work out how many pellet "slots" fit across and down, and add spacing elements that absorb the leftover pixels so the grid lands on a round number of slots.

To make that concrete, let’s say that our window is 2100x1450.
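
Here is a rough version of that calculation - the 100px slot size is a number I made up for illustration, not the game’s actual value:

// Hypothetical numbers: a 2100x1450 window and a 100px pellet "slot"
const windowWidth = 2100;
const windowHeight = 1450;
const slotSize = 100;

const columns = Math.floor(windowWidth / slotSize); // 21 - divides evenly
const rows = Math.floor(windowHeight / slotSize); // 14, with 50px left over

// The leftover pixels go into spacing elements so the grid still
// fills the screen with a round number of slots
const leftoverWidth = windowWidth - columns * slotSize; // 0
const leftoverHeight = windowHeight - rows * slotSize; // 50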

I've made the spacing elements red here; they ensure our height has a round number of "slots"

Since these calculations are done at page load time, the game totally breaks if you resize the window while playing. Don’t do that! I wasn’t sure how to address this (since recalculating on window resize would change the number of pellets we should display).

This approach still feels a little awkward and messy to me, but it seems to work fine in practice and I really like having the game fill the screen regardless of dimensions.

Tweaking game mechanics

A big part of designing some games is picking good numbers. And a thing I was surprised to learn last year is that I hate picking good numbers.

This is surprising to me because I like numbers! And I like playing games with good numbers! But I don’t really like doing this, and I struggle to find the energy to pick my numbers in principled ways.

I mostly dodge this problem by building games that don’t need good numbers: I’m much happier thinking about how to nudge humans to not be jerks. But PacCam needed some numbers.

And I…did not pick my numbers in a principled way! Shocker! I found my numbers by picking something, playing a lot solo, adjusting, playing with friends, adjusting, and then playing some more. Numbers I picked this way include:

Sorry if you don’t like the numbers I picked. Feel free to send me a pull request or fork the code.

Teaching the controls

When I explain PacCam I ask people some simple questions - are they familiar with the videogame Pacman? Are they familiar with the human face?

They consistently say yes; most people are familiar with Pacman and know what the human face is. I explain that my game straightforwardly combines those things. They nod.

But it turns out that translating your knowledge of the human face into controlling Pacman is hard! That’s for a few reasons.

I quickly picked up on this during playtesting and learned how to give people the right intuitions for the controls. I’d stress that you only need to make small face movements and that you should try opening your mouth more but slower - and that worked pretty well.

But transmitting that information to strangers on the internet is harder. I tried to address this in a few ways:

well, the game's not meant to make you look *flattering*

Recording the video was pretty fun. I originally recorded it with a white background, but that put an ugly white blob on the how to play screen. The white wasn’t quite consistent enough to trivially edit out, so I reshot the tutorial with a super jank greenscreen:

This made it pretty easy to edit out the background to make a “transparent” video - until I realized that the MP4 I exported it as didn’t support transparency. So the video is actually played in a hidden video element and copied to a canvas that alphas out the relevant pixels.
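
That canvas step looks roughly like this - a sketch of the technique, with a made-up “is this pixel green enough” threshold rather than whatever the game actually uses:

function drawTransparentFrame(video, canvas) {
  const ctx = canvas.getContext("2d");
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
  const data = frame.data; // RGBA bytes, 4 per pixel
  for (let i = 0; i < data.length; i += 4) {
    const [r, g, b] = [data[i], data[i + 1], data[i + 2]];
    // Made-up threshold: treat "clearly more green than red or blue" as background
    if (g > 90 && g > r * 1.4 && g > b * 1.4) {
      data[i + 3] = 0; // zero out the alpha channel
    }
  }
  ctx.putImageData(frame, 0, 0);
  requestAnimationFrame(() => drawTransparentFrame(video, canvas));
}
// kick off with drawTransparentFrame(video, canvas) once the video is playing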

After doing all of this I realized I could have just changed the game to have a green background instead of showing the webcam in the background, removing the need for a physical greenscreen. But this was more fun anyway.

Wrapping up

So that’s it! That’s PacCam. It’s the most “normal” game I’ve made in like a year (that is, it’s a playable game that isn’t also a large scale experiment about human behavior). I hope you enjoy it!

To be honest, I’m a little nervous that I spent too long on the game relative to the number of people who will play it. I’m not sure that games like this are my strong suit. But I learned a whole lot about web development while making it and had fun along the way, so it’s hard to be too sad :)

If you’re interested in MediaPipe, I’ve got a bunch of undocumented examples of how to use it on Glitch - I’m planning to turn those examples into a proper library / blog post.

And I’ll be back with more stuff soon!