Narrative Design Testing

A month ago I did a quick Twitter post about a couple of situations in which repeated testing with the same players is helpful. It got me thinking about what I look for when testing.

Some context: along with testing my own interactive projects (whether they be installations, Apps, live games, or card games), and consulting and mentoring at industry labs around the world, for the last four years or so I have been overseeing the conception and production of 100-200 interactive projects every year. These projects have varied from 2-hour jams, half-day jams, 2-day labs, and 1-week labs to 2-week, 6-week, 1-year, and 3-year projects. They’ve been school-setting, client projects, industry labs, commissions, crowdfunded, self-funded, no budget, small budget, medium-budget (what is this?), and multi-million dollar budget projects. They’ve been videogames, interactive fiction, VR, installations, Apps, live games, ARGs, tabletop games, transmedia (film, TV, books, theatre, docos & interactive), projections, websites, playful books and more. Technologies have included proprietary systems, industry tools, indie tools, CD-ROMs, actors, paper, chalk, Lego, lollies, and more.

So I want to share some of the patterns I’ve observed in what I look for regarding narrative design when I’m testing other people’s and my own projects.

Note, I will alternate in my use of the terms “tests” and “passes”, as I see them as interchangeable, but they don’t always suit each word combination. For instance, a “Complicity Pass” has a different meaning than intended! “Passes” helps though, as these are particular things we look for during testing but don’t necessarily create a separate test for. And they’re things we don’t necessarily see all at once, and so we have to look for them. I love “passes” too, because some people still see testing as something that happens at the end of a project and that is conducted purely with the “final form”. “Passes”, by contrast, makes it clear that it is something we’re doing along the way, especially earlier. It also helps unclamp the belief some people have that a professional creative work comes out fully-formed. They think in terms of false leads, dead ends, and wasted time, and don’t see how these actions are additive, not subtractive. Or they think if it works for them at a moment in time, it will work for everyone across time. Neither group realises that creation is an interactive process, not a static product. Jack Epps’ book Screenwriting is Rewriting provides a helpful intervention: “If there is one skill that separates the professional from the amateur, it is the ability to rewrite successfully.” It isn’t about being so good you don’t need to rewrite; you’re good because you rewrite with purpose. Epps uses the term “passes” throughout his book.

So, the following are some of the things I look for in narrative design testing or passes:

Avoiding Testing
Theme-Action Alignment Pass
Promise Test
Agency – Mental Model Test
Agency – Freedom Affect Test
Do, Don’t Tell
Narrative Design Debugging
Emotion Design Pass
Replayability Test
Bestseller Test
Complicity Tests

Avoiding Testing

The Avoiding-Testing Test! There are those who avoid testing in the first place. When I see this happening, I look at two factors: the person and the environment. On the latter: I have entered environments where there isn’t a testing culture, and so far I have found that this goes hand in hand with an antagonistic environment. In other words, people are scared of testing because they think they will be subject to scathing criticism. And they will. I have seen this in companies and universities alike. Cynics and critics do not make projects better; projects happen in spite of them.

Why do I do this? While I’ve been working professionally on creative projects for 30 years, it is only in the last decade that I’ve realised the beauty of testing. I avoided it too. My works are better now because I let the world breathe on them. My goal then is to facilitate this happening for others. But not force it on them. Some are happily making by themselves.

What do I do? Regarding environments, I create spaces of reciprocal vulnerability. I find whenever there are people issuing vile feedback and fear-based settings, they’re not personally putting themselves into positions of vulnerability. That is the point, isn’t it? So reciprocal vulnerability is important. If those people creating a critical environment have nothing to test (which is unsurprisingly often the case), then I try to make sure there are role models for vulnerability (including testing my own projects with them), and that people know how to interpret antagonistic feedback and attitudes. The “skin in the game” argument that Nassim Nicholas Taleb put forward is helpful: “Never trust anyone who doesn’t have skin in the game. Without it, fools and crooks will benefit, and their mistakes will never come back to haunt them.”

I have also actively made addressing problems fun and fulfilling, as opposed to terrifying to see. This helps associate pleasure and growth with testing, as opposed to associating testing with pain and shame. I try to make discussions about the obstacles we face in getting to testing part of the process. And I refer to Brené Brown’s super helpful insights into fear-based leadership in Dare to Lead, and how we can shift out of “perfectionism” (which is a common obstacle), “cynicism,” and “avoidance” to name a few. One of mine has been “numbing”!

Theme-Action Alignment Pass

This refers to checking that what the project is about aligns with the logics of the narrative and interaction systems employed. This includes dramatic structure choices, genre choices, mechanics, all interactions, even artforms employed, and whether they align with what the work is about at a thematic level: the core message, “universal” concern, argument or area of exploration.

Unlike literature, plays, and screenwriting, games are still relatively new to the idea of a theme (premise). Clint Hocking’s 2007 essay “Ludonarrative Dissonance in Bioshock” helped move the concept forward with the notion of a story clashing with the mechanics. But there are many designers, particularly in tabletop games (and VR), who look only as far as checking the logic of the setting and/or topic and align the actions from there. Mary Flanagan and Helen Nissenbaum’s 2005+ work on Values in games has helped many developers move deeper. But there aren’t necessarily checks against the dramatic structures and mechanical choices involved. So for me, a Theme-Action Alignment check covers these setting, story, and action elements and, importantly, the structural choices, all weighed against the deeper message of the work.

Why do I do this? I didn’t consciously do this myself, I don’t think, until I read filmmaker Sidney Lumet’s book Making Movies in the early 2000s. Lumet, who directed movies such as Network, 12 Angry Men, Dog Day Afternoon, and The Verdict, talks about how there are “no unimportant decisions.” The sets, costumes, lighting, and lens choices are all part of the meaning-making:

“Making a movie has always been about telling a story. Some movies tell a story and leave you with a feeling. Some tell a story and leave you with a feeling and give you an idea. Some tell a story, leave you with a feeling, give you an idea, and reveal something about yourself and others. And surely the way you tell that story should relate somehow to what that story is. [NP] Because that’s what style is: the way you tell a particular story. After the first critical decision (“What’s this story about?”) comes the second most important decision: “Now that I know what it’s about, how shall I tell it?” And this decision will affect every department involved in the movie that is about to be made.” (original emphasis)
Sidney Lumet (1995) Making Movies

As part of my PhD, I wrote about how the choices of artforms we employ are part of thematic meaning too. That is: even the choice of making it as a film as opposed to a play or interactive experience relates to what your work is about. In the last five years or so I have expanded this to include all the processes involved in making a work: the dramatic structure that is employed, the mechanics, team, collaborative design, and so on. It all affects the experience.

What do I do? When checking for theme-action alignment, I need to know what the theme is. I don’t need to know it before I test other people’s works, as it is super helpful to see what I and other players think the theme is, based on what you’ve shown us. If the designers don’t have a theme they’re exploring yet, then I talk about what the project is saying with the choices they’ve made so far. We go through ways to explore what they would like the theme to be, and how their expressive choices can align with it. But one thing I’ve found from analysing lots of works is that not every element has to be on theme. It isn’t about quantity, it is about insight and decisions.

Promise Test

What promises does the work make to the audience/players? These promises can be signalled with the title, description, other promotional materials, and opening experience. I find there is sometimes a big gap between what is being signalled and what is being offered. This can include promises about genre, topic, emotional journey, and experience goals.

Why do I do this? This is something that took me a long time to recognise. But it is a critical aspect of our relationship with our audience/players. If it is so important, then how is it that these promises can be missed? It is in large part due to a lack of consideration of the audience and player experience during the process of making: not thinking about what is going through their mind, what they think is happening, and what they expect and want to be happening.

Promises are also super helpful in providing guidance to the creators if we’re/they’re unsure what narrative structures to use. You’re signalling a crime story? Okay, which kind of crime story, and what are the existing dramatic structures and mechanics that audiences and players have come to expect from them? But, note, it isn’t about doing exactly what people want. It is about being aware of how your signals are part of a conversation that will be thought about and acted upon. That is part of the canvas you’re working with.

What do I do? Documentation, pitch decks, and promotional materials help creators clarify and communicate what their promises are. What genre? What format? What is the synopsis? What does the imagery (if applicable) depict? Then I check that against what is actually being provided. When I have the time, my favourite way to playtest a project, whether it is a script, design document, or low-fidelity or medium-fidelity prototype, is the Think Aloud method. That is, I either say or write down my thoughts as I go through the work. I find it is one of the most insightful methods for seeing what we’re thinking, feeling, and expecting as we move through a work on a moment-by-moment basis. I find it can provide more insight than a few questions at the end of the experience. Indeed, years ago a professional UX tester taught me the counter-intuitive lesson that interrupting a user to ask questions is often more informative than asking questions at the end. But not all people can do this and still be immersed, so use as appropriate.

https://youtu.be/Z5SURn7Kjd4

Ashley Brown, Assistant professor at EAE, University of Utah, shares Think-Alouds on Twitch.

Agency – Mental Model Test

This is where I check whether the mental model I have created of the creative world as a player correlates with the way the creative world works. In other words, do I have a strong idea of the logic of the world you’ve created, and do my interactions with it increasingly confirm this? This kind of “agency” is in the realm that Janet Murray and Daniel Cook talk about. For me, it doesn’t matter what degree of “interactivity” there is. An interactive project may just allow you to click through one pre-written hyperlink, but this does not mean the feeling of agency is non-existent. Where that hyperlink is placed, what that word is, what we think it means and what happens in response to that action is what makes up the pleasurable experience of mental model agency.

Agency results when the interactor’s expectations are aroused by the design of the environment, causing them to act in a way that results in an appropriate response by the well-designed computational system. This matching of the interactor’s participatory expectations and actions to the procedural scriptings of the machine creates the pleasurable experience of agency. Bad design frustrates the interactor by creating confusing or unsatisfiable expectations, or by failing to anticipate actions by scripting the machine with appropriate responses.
Janet Murray, Inventing the Medium

And Cook talks about the actual process as “loops”:

The player starts with a mental model that prompts them to…
Apply an action to…
The game system and in return…
Receives feedback that…
Updates their mental model and starts the loop all over again. Or kicks off a new loop.

Loops are very good at building ‘wisdom’, a holistic understanding of a complex system. The player ends up with a mental model that contains a thousand branches, successes, failures and nuances that lets them approach new situations with confidence.
Daniel Cook, “Loops and Arcs”
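
For readers who think in code, here is one rough way to picture the loop Cook describes and the kind of mismatch a Mental Model Test looks for. It is only an illustrative sketch in Python, not anything from Cook or Murray: the `MentalModel` class, the `run_loop` function, and the toy door-and-key world are all hypothetical names invented for this example. It simply records the moments where what the player expected differs from what the system did.

```python
from dataclasses import dataclass, field


@dataclass
class MentalModel:
    """What the player currently believes each action will do (hypothetical)."""
    beliefs: dict = field(default_factory=dict)

    def predict(self, action):
        # Actions the player has no belief about yet return None.
        return self.beliefs.get(action)

    def update(self, action, feedback):
        # Feedback from the system overwrites the old belief.
        self.beliefs[action] = feedback


def run_loop(model, system, actions):
    """Run the loop: mental model -> action -> system -> feedback -> update.

    Returns a list of (action, expected, actual) triples for every moment
    where the player's prediction did not match the system's response.
    """
    surprises = []
    for action in actions:
        expected = model.predict(action)
        actual = system(action)  # the system responds to the action
        if expected is not None and expected != actual:
            surprises.append((action, expected, actual))
        model.update(action, actual)  # the next loop starts from a revised model
    return surprises


# Toy world, assumed purely for illustration: the door only opens with a key.
def toy_system(action):
    responses = {"push door": "door stays shut", "use key": "door opens"}
    return responses.get(action, "nothing happens")


player = MentalModel(beliefs={"push door": "door opens"})
surprises = run_loop(player, toy_system, ["push door", "use key", "push door"])
print(surprises)
```

In a playtest, a Think Aloud transcript plays the role of `predict`. Each recorded surprise is a place where the design raised an expectation it did not meet; whether that is a flaw or an intended twist is still a design judgment.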

Why do I do this? The Mental Model Test is trying to address the same problems as the Promise Test. Those problems are exacerbated in the interactive context, and are a common issue to address. They are common in part because creators aren’t thinking about the audience/player thought processes in response to their work, and also because we can’t possibly know all the ways people experience our work. So we have to test for it.

A big one for me is games in which the mental model is constructed around the player playing a game, not the player relocating their imagination to the setting of that game. They have anti-deictic shifts. This thrusts me out of diegetic immersion. I personally see the positioning of the player as a powerful outsider as an Imperialist and Externalist impulse.

What do I do? All interactions and instructions need to be checked with different people. Especially “fresh eyes” and people who are quite different from the creator’s sensibilities. We want to find out how a word is interpreted by different people, and how a situation encourages a course of action that is not offered. Once again, the Think Aloud method works well (either verbal or written), to highlight what is happening with choices. This includes what I expect when I choose something, or when I select a dialogue option (this is compounded when there is a verb-based UI, as the possible meanings increase). There are a whole lot of choice design and dialogue checks you can do as well, such as Karen and Theresa Tanenbaum’s “commitment to meaning” approach to agency with speech acts (thanks Leena for the reminder about this paper!); and Tracy Fullerton’s “Decision Scale” in Game Design Workshop.

Agency – Freedom Affect Test

Am I affecting the world at all? In the previous kind of agency I spoke about a person having the feeling of agency when there is a pleasurable match between what we think is happening and it being confirmed as happening, between what we act on and that action making sense in the world. As I mentioned, this does not mean we can do anything we want in the world or can affect it. So this other kind of agency is emphasising the freedom affect. It is what game designer Tynan Sylvester describes in Designing Games as follows: “Agency is the ability to make decisions and take meaningful actions that affect the game world.”

Why do I do this? I don’t apply this element to all projects. Not all are designed to facilitate this. But I am interested in creative works that facilitate a co-creative mindset, a mindset that sees the actual world as able to be co-constructed, as optional, and changeable.

What do I do? I check to see whether the work actually facilitates an optional-thinking mindset, hope, the generation of ideas, and co-creation, rather than perpetuating power systems.

Do, Don’t Tell

“Show, Don’t Tell” is a common guideline for screenwriting. I don’t know where it originated yet, but in rereading early screenwriting instruction manuals there is an emphasis on using action to communicate. The idea is this: avoid explicitly explaining to your audience/players what is happening or what they need to do. Instead, communicate visually. In interactive projects, we take it another step: Do, Don’t Tell.

Why do I do this? It isn’t about making EVERYTHING about doing, but we certainly don’t want everything explained to us. We want the player to figure things out through their actions. This is linked to psychologist Albert Bandura’s “Social Learning Theory” and how we learn through social modelling and through direct experience. Doing is about learning through direct experience, as opposed to identification through social modelling.

For me, the issue is less about whether you’re telling, showing, or doing, and more about what we’re doing as an audience or players. We could be told something by a narrator, but that speech is written in such a way that we need to decode what is being said, and it encourages us to apply it to our own lives. So it is less about receiving instruction, interpreting, or figuring things out, and more about realising what the work is about. So it is more Realise, Don’t Tell. Being told what is happening/what it is about/what to do is incredibly boring, talks down to the audience/player, and leaves no space for change. It perpetuates domination and authoritarian thinking. Letting us realise for ourselves is key to our freedom of thought.

What do I do? I note the times I feel I’m being told, and the times I’m being trained to not think and instead just action the commands of the system. Then I work through ways we can change the learning device.

Narrative Design Debugging

Narrative Design debugging is needed when you make changes and don’t realise that they now break the story in some way. We all do it. We think something will make it better or solve a problem, and we end up unbalancing the rest of the experience. People are more familiar with “debugging” in software, where it can sometimes be easier to spot a problem because the system will glitch or crash. But in narrative there is not necessarily an obvious effect. So it has to be sought.

Why do I do this? Although I think through system changes, we don’t always see the consequences of our changes. And so we need to test with others to help reveal them. It often happens when we make a change just prior to testing, and haven’t thought through the consequences.

What do I do? As a tester, I usually notice it when I do a test even if I’ve experienced the work before. That is because I can put myself into a frame of mind where I respond to exactly what is in front of me as opposed to running on automatic from prior playthroughs (à la the peanut butter and jam sandwich rules test). I think it is best to test immediately after the change, with someone who is familiar with the work, as they’re perhaps more likely to see the effect. Or fresh eyes. Whatever works. The testers may not identify what is wrong of course, so you need to look for the clues in the way they’re behaving and interpreting.

Emotion Design Pass

Do I care about the characters, including my own player-character? Am I feeling the intended attitudes towards characters and other players? Are the emotion points hitting what they need to with enough audience members/players? How do I feel about myself when experiencing the work?

Why do I do this? It wasn’t until I was teaching games that I discovered the need to overtly look for this and give feedback on it. Designers often presume testers will care about characters. But the characters are hollow, they’re underdrawn, or they have traits that are meant to be appealing but which aren’t to me. For instance, nationalism does not inspire me to admire a character.

What do I do? I have found Karl Iglesias’ techniques for character appeal and empathy from his book Writing for Emotional Impact helpful in analysing and designing character connections. I have been expanding and rewriting them to include how we feel about player-characters, negative personality traits, and ourselves.

Replayability Test

Do my audience/players want to watch, read or play it again? This isn’t a goal for all works, but is important for some.

Why do I do this? I am interested in transformation design. And transformation isn’t a one-shot thing. It involves continuous commitment. This means creating works that can help with developing a new habit, new way of thinking. I like the creative challenge of creating something people want to experience again, especially with others.

What do I do? A little thing I ask afterwards is whether people would want to play again. This gives you some insight, but is not conclusive. So I test with various levels of interest: whether they want to sign up to be notified about it; whether they are interested in signing up to test again & whether they do when invited; and whether they’ll pre-buy. I also check to see what ideas they come up with for how they’d like to play it (or view it) next time, and who they think would like to be involved.

Importantly, how it makes people feel at the time is a big factor, along with how they feel about it later. Self-testing has been a big factor for me. I’m really honest with myself about whether I want to play/read my creations again. If I don’t, then I explore why and then I design to resolve that. My misgivings are signposts for me. I didn’t use to listen to them.

Bestseller Test

Does the work possibly have the qualities of a bestseller? Well that is the question! Formulas for potential success can include considering the people involved (actors, directors, writers, etc.), spectacle, popular genres, and marketing budget. But we don’t always have access to these elements (especially budget), these don’t guarantee “success”, and I personally am not interested in upholding the social context that makes these key factors in the first place.

So I look to the method Kathy Sierra talks about in her book Badass: Making Users Awesome. Sierra talks about how bestsellers are made from word-of-mouth, and that a particular kind of design facilitates the strongest word-of-mouth. That design isn’t about having your users/players/audience talking about how great the project is, or the people involved, or the company, it is about how the project makes them feel and be awesome. It is for instance players talking about how they’re assholes in Untitled Goose Game.

https://youtu.be/Gyv-l0MWRQI

Kathy Sierra presentation at XOXO Festival 2015

Why do I do this? I do this because I’m fascinated with what excites lots of people in authentic ways, not through manipulation.

What do I do? [EDITED] I look at the language people use when describing the experience, their physicality when watching or playing (when possible), and may ask questions that give an insight into how they see themselves, and of course see how testers engage long-term, and how they describe it to others. Likewise, Mirjam Eladhari has proposed that ‘player re-tellings’ are a good indicator of success and give insights for analysis.

Complicity Tests

How are the settings, characters, dialogue, events, mechanical & structural logic, processes, and business models of the work upholding oppressive and environmentally-destructive systems? This isn’t a question for everyone. It’s definitely for those of us with the most privilege. Some think that having issue-based storylines does the job, or diverse characters addresses these issues. But it goes further, to the dramatic structures and mechanical choices we make. How does the work uphold systems of power? How does it uphold patriarchy? How does it uphold domination? Imperialism? Capitalism? White supremacy? Fascism? Transphobia? Racism? Ableism? Toxic masculinity? Rape culture? Hierarchies? Anti-semitism? Islamophobia? Fat phobia? Homophobia? Biphobia? Extractivism? … And what I find is a key issue: Exteriority? You don’t need to make extreme hateful works for these to be in effect, they are actually the default output of our standard creative processes, methods and techniques.

Why do I do this? People are dying. People are dying, living in pain, fear, struggling, not thriving, and not being heard. Animals are being tortured and dying. Our planet is being irretrievably damaged. Complicity checks are rare, and need to be normalised so all testing (and the whole creative process) involves looking for complicity. Because the default is not neutral or apolitical works, but works that are complicit. And, speaking as someone who has identified as the black sheep, the underdog, as marginalised, I’m a white woman and there is so much damage I’ve already done. I want to keep looking for the ways I and others are complicit, and keep doing things to change this in a big way.

What do I do? I started with finding tools to reveal and action problems, such as the Bechdel Test (perhaps the earliest and best known), checking how nonhuman experiences are presented, checking the game accessibility scorecard, checking for misappropriation, checking who is telling whose story to whom and why, checking for negative collateral (!), checking if your fonts are racist, guidelines for inclusive surveys, checking the model of disability being employed, checking the principles of climate visualisation, checking leave-no-trace guidelines for locations, checking for ethical use of AI, checking for responsible portrayal of vaccination, and so on. These are helpful, but they’re just the beginning. Another approach is to hire sensitivity consultants, so they’re checking what you’re making on behalf of marginalised communities. But ultimately we also need to learn to SEE complicity for ourselves, and in ourselves. This takes lots of internal work, research, listening, and action. I won’t go into detail about my own processes now. See Sandrine Micossé-Aikins’ great summary. And then when we SEE complicity, we can act by employing processes that thwart these representations entering our work, such as nothing about us, without us.

And making new processes. Like I’m doing, and will be sharing in my book about how we make new processes. And testing with a range of testers. And telling others what you see in their work. I’ve noticed I haven’t been doing this with everyone. I’ve done it with students because I feel a moral obligation in a position of authority. I haven’t always been as diligent in industry environments with strangers. I have to if I want this normalised, to be part of everyday narrative design testing. It’s hard because often their entire project is soaked in complicity. They’d have to change things, they’d have to redesign, rewrite the whole thing. But that is the point. What is the tipping point for me? To get me to speak up in every situation? Firstly there is the moral obligation. I have more privilege, so I can speak and not be attacked with the same level of violence as others. I’ve also come to this: make it not about the act of informing others, or policing others. It is about presence. It is about me being myself in every situation, which for others may feel like policing, but which feels to me like peace.

“I have come to believe over and over again that what is most important to me must be spoken, made verbal and shared, even at the risk of having it bruised or misunderstood. That the speaking profits me, beyond any other effect.”
Audre Lorde, “The Transformation of Silence into Language and Action”

I’m delving further into adrienne maree brown’s “pleasure activism” to ensure the experience is regenerative and joyful: “a politics of healing and happiness that explodes the dour myth that changing the world is just another form of work.”

…And you?

These are not all the narrative design elements I look at when I test. But they are ones that are front-of-mind for me at the moment. What about you? Do you have stories or approaches to share about the tests mentioned here, or about others you’ve done?

Image Credit: “ACME Plot Avoidance Scrub” — commissioned illustration by Marigold (Goldie) Bartlett.