This article is the second of a two-part interview with User Behavioristics Founder & Principal, Heather Desurvire. In part one of our interview, we talked about Heather's background and some of her experiences in the field of user research and games user research. We also touched on some playtesting best practices, including when to begin playtesting, and how to integrate playtesting into the design process.

We pick up right where we left off with part two:

Who should run the playtests, and how should they evaluate them?

HD: It's different for different teams. In some companies, it's the producer who manages the design schedule and the design team. In other companies, it's the lead designer who can effect those changes, and still others organize it differently. It depends on the team, the company, and the studies they are running.

Ideally, it's the person who could effect changes and who understands what the message of the game is. It's crucial that they know the design priorities so that we can be the most effective.

It's also great if you can have junior designers watch the playtests and give their input too.

But ideally, it's the person who can affect the game, the person with the game's essence at heart. It's also important that the team understands that playtesting is about revealing the essence of the game. It's not a critique.

It's great to work with teams who do a lot of research, including those who work a lot with PlaytestCloud, as they tend to be fully invested in playtesting. They're the teams who prioritize the game being a success for the players. Those are the people who are the best partners because they love what we do and they are always eager for more.

DC: That makes sense. I recently spoke with the lead designer from the UK-based studio, Exient, and he was full of praise for PlaytestCloud and everything we're doing. He says our service helps him out a lot!

HD: People love it. I think that people who are not aware of its value are missing a vital ingredient to ensuring the game's success for its intended players.

For example, maybe a designer wants the game to be their art, and they design the game for themselves, and the people they want to attract are people like them. Which is fine, but for commercial and broader success, it's vital the design works for its intended audience. In actuality, even those who design for people like themselves still find barriers to player experience when utilizing playtesting.

How important are player demographics for getting accurate feedback from your target audience?

HD: For this kind of qualitative research, we find that game experience is the biggest difference when looking for a representative sample. Gender can be important, but not always. Age grouping tends to be significant: There's a big difference in cognitive ability between four- and five-year-olds and nine- and ten-year-olds, and there's also a big difference between nine- and ten-year-olds and twelve- and thirteen-year-olds.

The older the player is, the more the differences tend to be about lifestyle. We usually talk about groups based on experience with genres and experience with games in general. We typically classify players as "casual" or "experienced," especially within a certain genre. Maybe a player is experienced with "Match 3," or maybe they never play these games at all, but play other casual mobile games.

Sometimes gender matters because the game is targeted for, say, females thirty-five-to-forty-five. Even then, we'll probably still look for a male perspective too – and include some male players in the playtest, just to be sure we're gender inclusive. Typically, the target demographics are provided by the studios: they usually know what they're looking for.

If it's the second version of a game or an updated version, it's important to know if they've played the game before, so we can see how the changes might affect their experience. Unless we're looking at why players churn, we usually don't try to test players who have left the first version of a game.

If we are looking for players who have churned, we would want to know why they stopped playing. Then we'll look to see if the changes made in the updated version address the reasons they left, and if they do, we can test to see if the updates produce the optimal experience or not.

We want to study people who aren't opposed to the style of the game we're testing because if a player would usually never play your game, why recruit those people to evaluate it? Of course, there might be a specific reason for doing so, such as seeing if there may be other players interested in the game than the intended audience. But generally speaking, we want to study a representative sample of a game's potential players.

DC: Right, so the two most important questions to potential game testers are: "What are you playing?" and "What is your level of experience?"

HD: Well, ideally we would want to know how long they've played a certain genre of games on a particular platform, and what other games they've played, as well as their age and gender. From this, we would likely be able to determine their level of experience. A note about self-reporting level of experience: people tend not to be able to report their level of experience accurately.

For example, a player in a social group that doesn’t play games [non-gamers] might consider themselves to be an experienced player, whereas another player in a social group that consists of highly-invested players [gamers] might also consider themselves experienced. What’s clear here is that these definitions of "experienced" are apples and oranges – they are not equal. So, we prefer to offer more objective criteria. Having objective, standardized criteria allows us to benchmark more accurately, as one person's self-description of their level of experience may be different from someone else's.

In the end, the most important criteria to us are: What games they have played, what types of games they like to play, and what relevant experiences they have.

What's the biggest change you've seen or heard of that resulted from findings within a playtest?

HD: One of the most revealing ones, which happens more often than we would want, is that our results sometimes show whether the current design of a game is viable or nonviable for the intended players. We've identified in some games, safely, before the game is released, fundamental issues that get in the way of the gameplay. These fundamental issues would prevent the game from being immersive and accessible if released at that time. They would also prevent the game from having as many players as wanted.

We can see if a game is ramping up to be playable in the long-term. I know that in many games we're able to identify that, especially when the game hasn't been released yet.

One experience stands out for me because the problem became clear and evident to the heads of the publisher and the head of the studio, who were all present, observing the playtest in real time, and could see it for themselves. We didn't even need to report what happened, because they were all there, as there was a lot of investment in the game and this IP [intellectual property]. It was clear that the fundamental basic mechanic was not understood. That would have meant the game's failure in the marketplace had it been released without our intervention.

DC: Wow, that's a big deal.

HD: Yeah. A lot of money went into this IP, and it was clear there was a problem – and they were getting ready to release the game. They only playtested with us for validation purposes, to do their due diligence.

But it was clear from the playtesting: the core mechanic was not understood. This game was for a specific demographic, a specific age group, and the players completely didn't understand how to play.

We actually ran quite a few players through because we wanted to be absolutely sure.

Typically, the optimal number of players depends on what the demographic is, and with a very unified demographic, we don't need as many players. We do like to get at least a few in each kind of section, but because it's formative research and not summative research, we don't need to get statistical significance in the number of players. We find that, as long as the group is representative, there's a point of diminishing returns, because we start seeing the same issue over and over again.

Under normal circumstances, it doesn't make sense for design teams to test as many players as we did. But because this was so crucial, and there was so much riding on it, we asked to over-recruit so that we would be sure that we saw the same thing over and over, which indeed we did.

Once the players understood the core mechanic, however, the gameplay was fine. But what we were all seeing from initial playtesting would have been a showstopper, and that game would have failed.

But instead, because we had such good support from the studio and had enough playtesters, we were able to identify what was really happening, and why. The studio was then able to make some changes, after which we retested and were able to demonstrate that those changes made a difference. Then, with a few more tweaks, the game went on to be successful.

But it was clear to everyone that the game would not have succeeded if we hadn't found and identified those issues surrounding the core mechanic.

DC: When you find something that's not working in a game, is it the researchers who make suggestions on how to address the problem?

HD: Let's take playtesting through PlaytestCloud as an example. With PlaytestCloud, Christian [Christian Ress, PlaytestCloud Co-Founder] and I would speak to the design team about what they want before the playtesting begins. Some design teams don't want any suggestions from us at all. They just want the results.

One way we approach giving feedback is that we offer the developer information on what the problems are, how many players are having them, and why they are occurring. We're able to do this because we watch all of the playtests, and we bring our expertise about the underlying player experience principles that might have been broken. We can help to identify not only why something happened but also the underlying principle.

For example, maybe the background was misunderstood as a static background, but it's actually a playable interactive element, and the players didn't understand that. In this case, we'd ideally be able to offer why this happened, because we knew the graphics weren't appearing to the players as the interactive element, as they were intended. But any suggestions we give aren't meant to say, "This is how you should fix it." Any suggestions we give can, of course, be incorporated, but they can also serve as a way to stimulate your design ideas, which are likely to be better because we're not game designers, we're game user researchers.

It's the game designers who know the essence of the game and have the best sense of what to do with our feedback. We always put a caveat in all our reports and presentations to the game designers that they are the experts at their game, and we never mean to supplant any of their ideas. We only offer ideas to get their concept on the right track, and can sometimes give examples and suggestions for how to do that.

So that's the optimal situation. But there are also teams who absolutely don't want any of our suggestions, which is also fine. We find out from the outset what type of feedback teams want. Most teams want the suggestions, but there's no golden rule here.

DC: It sounds like a conversation you have before playtesting begins: Perhaps something like, "Hey, you know, we can do it this way where we just give you what we found works and doesn't work in your game, but if you'd like, we also can give you some suggestions too?"

HD: Yes, it's usually discussed beforehand. We say, "Okay, here's a sample of what we do, we can give you an analysis of what happened and why it happened, but also, we can give suggestions if you'd like."

Our repeat clients tend to know what works best for their team. When it's the first time with a client, that's when we really have to ask those questions. Ultimately, it just depends on what works best for them.

How many players would you recommend per playtest?

HD: It depends on how many different types of players are needed. It gets more complex when you have, let's say, kids between the ages of nine and twelve, then between thirteen and fifteen, sixteen and twenty, and then twenty-one through thirty-five, for example. It's important that we have enough in each of those groups because of the cognitive differences between those groups.

So the number can be quite different. Another example is when we want to compare players who are experienced with a genre against players who are inexperienced with a genre. In this example, we'd want to make sure we have enough of both types of players.

Because of the nature of qualitative research, we usually only need between eight and twelve players. But depending on the setup, it could be up to thirty. I hate to say it, but it just depends.

It can be even smaller, due to budgetary reasons. A smaller number doesn't limit us from getting some excellent insights. It's always better to get some data than no data.

There is, however, a minimum and a maximum. There is a point of diminishing returns and a point where you really don't have enough. That being said, the minimum can be really quite small: because of the nature of qualitative research, we don't need so many players.

DC: Speaking of diminishing returns – I guess that would depend on the playtest, right? Like, once you have a certain number of players, there's no reason to keep testing with more players than you need.

HD: Right, right. We just want to make sure we have enough of a representative sample of the types of players we expect to play the game. So in the experienced versus inexperienced comparison, we might set it up as six experienced players and six inexperienced players.

If we just have one group of players we want to test, like only players who are experienced with the genre, then maybe we just need around eight players total.

Overall, I think the sweet spot is between eight and twelve players. But if you have multiple groups you want to look at, sometimes it can go as high as up to thirty.

That's for qualitative research. Now, if we're doing a survey, then, of course, our numbers get bigger because with surveys we're looking for statistical significance.

But for some surveys, which are just informative or adjuncts to the observed studies like we do with PlaytestCloud, we get a pulse of the players' perception with a small number of players: the caveat being that they're not statistically significant.

How has playtesting and the user research landscape changed during your time in the industry?

HD: That's an interesting question because the fundamentals haven't changed much. Aside from the technology we can use now, the fundamental methods we started with are still very similar to what we use today. The biggest difference is that we have much more sophisticated tools and faster analytics. But the bottom line essential data we look for is the same. I think that's because some truths simply exist, and the methods work.

As for the changes in technology, services like PlaytestCloud, which offer us the ability to do user research remotely, have been revolutionary. These services bring in a much wider variety of players from different places, and it makes it possible to reach players in their natural environments, which is fantastic.

I think that's the biggest change. We used to have really only one option, which was to have the players come into a laboratory and put them in a very controlled setting, which takes them out of their natural playing environment.

We used to lose some control when we conducted research in a natural environment. So remote testing has been remarkable because it offers the best of both worlds.

It's particularly useful for mobile games because testing remotely allows for the stopping-and-starting that occurs in the real world. Maybe someone's playing while waiting for the subway, or maybe they get interrupted because their child comes in, or they have to cook dinner, or whatever it might be.

The other change is there's more sensitivity now to the benefit of user research and player research. More designers, developers, and producers are getting how important this is.

When I first started doing this for games, there weren't as many companies who saw the value. The ones who saw the value really got the value of doing the research. Others were not aware of this value. But once people try this research, and see the results, they see the value, and then they get fully onboard.

So this change of perception has been a huge and wonderful change, from my point of view.

What's the difference between playtesting for mobile versus playtesting for console or PC?

HD: Well, on the one hand there is a lot of difference, and on the other there is not as much as you would think. The division is the game's intended experience and what platform is being used.

I think there's more expectation on console or PC from the types of players who play on those platforms, as opposed to a mobile platform. There's a more enveloping experience expected, and a little more leeway regarding the new player experience ramp-up time.

In a mobile game, there's no real leeway; it's got to be easy to get into, as there are many options that are inexpensive and easy to access. But that gap is shrinking in the PC and console world as well. Games are getting much better at creating an easier entry experience, so there's more of an expectation of that.

Another big difference is that both PC and console games tend to be played over longer periods of time than mobile games, and tend to be more immersive. But again, those gaps are shrinking as mobile games are bringing more of the experience of what we have on console and PC to the mobile space. People can now play on bigger and better screens, with better graphics.

Probably the best way to answer this question is to say that it comes down to player expectation. What do the players value most, and how fast will those players be willing to leave a game if it's not delivering?

PC and console players tend to be more invested in their platforms than mobile players, which creates more cognitive dissonance. What I mean – and this isn't to advocate that you don't have to be as good with the new player experience on PC or console – is that there's a little more room for players to have a lower expectation of early play because they're more invested overall. They've already spent time and money, and they have more of an expectation of playing for longer; whereas on mobile, it's more, "Oh, I can't figure it out, I'm just gonna delete it, bye." You know, "Delete. I lost ninety-nine cents, so what."

What tips would you give to people who are just getting started with playtesting their games?

HD: I would say, have an open mind. Know that it's not an evaluation of the game, it's the ability to see how real players – real human beings – interact with the game, and that's why we do playtesting. We cannot predict human behavior.

That's the most exciting thing about this work: we can never predict. We can make the best assumptions we can, and designers can design the most optimal experience they can, based on what they’ve seen in the past, but people will always surprise us.

Be open-minded about how real players play your game and know that playtesting is just a part of the design process. Again, it's not an evaluation. You've designed the game without players. Now it's time to optimize for the human experience.

DC: I think that's a great answer. Keep an open mind, and don't take it as a form of criticism.

HD: Because it's not meant to be. It's part of the design cycle. It's the next stage. You can never predict the human experience. You can be close, but you can never predict. I think that's the message. It's why we test with real players, because you can never predict human behavior.

Where should people new to playtesting look for resources?

HD: There are a few. I would recommend checking out Games User Research SIG. There are a lot of resources on our website, User Behavioristics, and there are also some books I recommend on game usability and game methods. Those are also on the User Behavioristics website.

I want to mention Better Game Characters by Design: A Psychological Approach, by Katherine Isbister. Her work is amazing. Better Game Characters by Design focuses on designing characters and narratives in games in a way that connects us emotionally. Her work is so specific and actionable. I'm a big fan of this book and of her other work, which seems to share this common thread.

Last question. What is your favorite moment during or after a playtest?

HD: Oh, I love that question. I love it when I get to experience the design team really understanding the analysis. The motivating factor for me is to be of service to the design team. I mean, I think about the players too, of course, but ultimately my job is to help the designers and production team gain insight into their player experience. When they really get and understand the analysis, and have that aha moment… it's such a great feeling, because I feel like the work we've done has had an impact. They can go off with that knowledge and create a better player experience because of it. That's the payoff for me. That's why I love doing this work.

It's like being the translator. You know that moment when the translation is understood? That's what it feels like, that's the moment.