We Become What We Behold opens with a quote attributed to Marshall McLuhan: “We become what we behold. We shape our tools, and then our tools shape us.” Then it hands you a camera and says nothing else. There is a crowd of circle-headed and square-headed figures walking around a small screen. You can point the camera at any of them and take a photo. The photo appears on a small television in the corner. That is the entire interface. What happens next — the escalation, the violence, the single couple who survives with heart hats at the end — follows from each picture you choose to take. The game takes about five minutes and was designed to be uncomfortable.
We Become What We Behold starts in a state of ordinary noise. Circles and squares wander the screen. A circle is wearing a hat. A couple stands together with heart symbols above them. An angry square-headed figure paces at the edge of the crowd, sporadically yelling at passing circles. The game offers no instruction about who to photograph first. First-time players typically do what feels natural: they capture interesting moments. The hat is interesting. The couple is interesting. The angry square is interesting. Each photo broadcasts to the small television, and the crowd reacts to what it sees.
Photograph the hat and hat-wearing becomes a trend. Then, because trends saturate quickly, the crowd abandons hats and hat-wearing declines. Photograph the couple and the crowd scrutinizes them — sweetness on a news screen gets reframed, and the romantic moment becomes awkward under observation. Photograph the angry square yelling at a circle and something different happens: a circle-headed person watching the broadcast starts to be afraid of squares. That fear appears as a visual state change. The square who was yelled at begins to regard circles with contempt. The broadcast caption — “Crazed Square Attacks” — was accurate about the moment and false about what it implied. The crowd cannot tell the difference.
This is the feedback loop that We Become What We Behold is built around. The game’s tension ratchet only rolls in one direction. Photographing circles and squares getting along produces no audience engagement — the TV displays the image and the crowd ignores it. Photographing hostility generates reactions. Reactions generate more hostility. More hostility generates more photographable moments. The photographer is not recording events. The photographer is producing them.
We Become What We Behold has a small cast whose individual arcs are easy to miss on a first run because the crowd movement is busy and the camera is always demanding attention. The spiky-haired square is the figure whose anger initially triggers the circle-fear cascade. He is also the figure whose arc is the game’s most pointed detail: a couple wearing heart hats bumps into him mid-game and gives him one of their hats. He spends the rest of the game wearing that hat with a visible smile, no longer scaring anyone. The crowd never photographs his transformation. It does not go on the broadcast. It does not spread.
The dapper circle with the mustache and bowler hat is the figure the TV Tropes community has called the “gentlecircle” — charming, apparently harmless, barely noticed in the mid-game chaos. He is the one who draws a gun at the end and fires the shot that ends things. The crowd stops completely when this happens. Every figure on screen pauses. The hashtag on the broadcast, instead of the usual sensationalized caption, reads simply “…”. The game itself does not know what to say about this moment, and that silence is more precise than any headline could be.
The couple with the heart hats are the only figures who survive the ending intact. They appear briefly in the epilogue among the debris. Their presence is the game’s final statement, and it lands differently depending on whether you find it hopeful or whether you find it cruel — two people in a crowd where everyone else is dead, having spent the whole game being largely irrelevant to what was being photographed.
We Become What We Behold has one ending regardless of how you play. Players who discover this — usually mid-run, when they try photographing only peaceful moments and find that the game simply stops generating broadcast-worthy content and waits — understand that the game is not offering a choice. It is demonstrating a process. The ratchet advances when you photograph hostility. If you refuse to photograph hostility, the ratchet does not advance. But the game has nowhere else to go. Eventually players give the game what it wants, because the alternative is sitting in a static crowd watching nothing happen.
This is probably the most debated aspect of We Become What We Behold in the communities where it gets discussed — game design forums, media studies classrooms, Reddit threads under news and politics subreddits. The argument against the design is that making the violence inevitable removes player agency and therefore undercuts the message. The argument for it is that the inevitability is the message: the system does not require your malice, only your participation. The photographer who only wanted interesting images still ends up with a body count.
Nicky Case has been explicit in interviews that the feedback loop only rolling toward conflict is a deliberate simplification. The real world has counter-examples — media that de-escalates, stories that produce empathy rather than division. The game omits these not because they do not exist but because the game is demonstrating the mechanism of the loop, not modeling the full complexity of media systems. Whether that simplification is a strength of the design or a limitation of it is a debate the community has not resolved.
We Become What We Behold does not use text to explain its argument. There is no narration, no ending screen with bullet points, no lesson summary. The argument arrives through the mechanic: you made choices, choices had effects, effects compounded. The sense-making happens retrospectively, after the screen has gone quiet and you are sitting with what just happened. Players who describe the game in community discussions almost always describe their own experience of complicity — “I thought I was observing, but I was driving the chaos” is the sentence that appears in some variation across nearly every first-play account.
The fact that We Become What We Behold takes five minutes matters. It is long enough for the feedback loop to complete but short enough that the whole arc is visible as a single experience rather than something that accumulates across sessions. Players who return to replay it are usually testing whether different photograph choices change the pacing — they do, slightly — or whether the loop can be interrupted. It cannot, in any meaningful way. The lesson does not change with replay. What changes is how quickly you accept that it will not.
The game was made in 2016 and the community continues to share it with a regularity that suggests it has not dated. Every few months it resurfaces in discussions about media literacy, social media algorithms, and news cycle dynamics. The specific context changes. The mechanism the game is describing does not.
No. We Become What We Behold has a single ending regardless of which photographs you take or in what order. Attempting to photograph only peaceful interactions — the couple, the hat-wearing figure, the crowd milling together quietly — delays the escalation but does not prevent it. The game requires photographs of the hostility between circles and squares to advance, and if you withhold them long enough, the game simply waits until you comply. The inevitability of the ending is a deliberate design decision: the violence is not a punishment for bad choices, it is the demonstrated outcome of a feedback loop that the photographer participates in regardless of intent.
The couple wearing heart hats are the only figures visible in the epilogue among the aftermath. They appear briefly amid the debris as the ending resolves, and the game cuts to its final state with them still present. The spiky-haired square — who received a heart hat from the couple mid-game and spent the rest of the playthrough in a visibly calmer state — does not survive. The dapper mustached circle who fires the gun is not shown among the dead in the ending sequence, leaving his fate ambiguous. The couple’s survival is the game’s final image, and players consistently differ on whether it reads as hope or as irony.
The central argument is that media does not passively reflect events — it actively shapes them by selecting which moments become visible and therefore real to an audience. In We Become What We Behold, each photograph you take changes the behaviour of circles and squares based on what the broadcast shows them about each other. Conflict escalates because conflict gets photographed. Reconciliation goes unbroadcast because it generates no audience engagement. The game’s creator, Nicky Case, has described the feedback loop as intentionally simplified — the ratchet only turns toward hostility — to isolate and demonstrate the mechanism clearly, even though real media systems contain counter-forces the game does not model.
We Become What We Behold is five minutes long and does not leave you the same way it found you. The spiky-haired square’s unreported transformation — the hat, the smile, the moment the couple gave him something the broadcast would never show — sits alongside the gentlecircle’s gun and the “…” hashtag as the three images the game leaves you holding after the screen clears. Nicky Case built a tool that demonstrates how tools shape us, and in doing so, made the argument with the thing itself rather than about it.