Disclaimer: Ideas in this work may or may not be the same as those of my current or previous employers and/or colleagues. Most of the ideas here are not my invention; they are either widely accepted pieces of knowledge or part of the literature referenced at the end. The major influence on this work is the book by Douglas Hubbard: How to Measure Anything. I just humbly synthesized these ideas into a more condensed form in the hope that it will help future game creators. I hope that you will have "fun" and if so, please write me your opinions on [email protected], or as a comment here.
I mean, I kinda care, otherwise I would not write about it. Nevertheless, there exist many theories and definitions of fun, and each of them fails on one front or another. The good news is that we don't have to know what fun is. How many of you actually know what a kilogram really is? It used to be the mass of one liter of water; those times are long gone. Nevertheless, we still use the kilogram to weigh objects around us. If I tell you that an object weighs 500 kg, you will probably imagine something car-sized. If I tell you that I lost 20 kilograms, you will see it as a big weight loss. We don't need to know exactly what something is to use it in measurements. We just need to know how a kilogram (or in our case "fun") manifests in the world. This is especially true of intangible things like opinion, religion, happiness, love, anger, health and fun. Intangibility never stopped us, so why should it in the case of fun?
Now, consider not only what fun is, but how it actually manifests in the world. Somebody can say "I have fun when I try something several times before I get it right; I feel challenged that way." Somebody else may frown and grip the controller very tightly while retrying the same passage for several minutes. Is any of this fun? I don't know, but I know that in my game part of the fun should be being challenged, and it seems that these players are being challenged. Things that happen in the world, like players' actions or players' expressions, that we can observe and therefore measure are called proxies. They are not directly the thing that we are interested in, but they show us the right way to it.
We don't need to know precisely what x is to measure it. We need to know how x manifests in the world so we have proxies to measure.
First, a quick detour to school. Why do we step on a scale in the first place? We have some internal (folk) theory of how our weight is connected to our physical condition. Kilograms within a specific range mean healthy; outside of it, unhealthy.
We as humans have tons of such theories about everything around us. We have internal theories (sometimes called mental models) describing why objects fall to the ground or why our friend behaves a certain way. Thanks to these theories we can predict what will happen if we throw a ball in the air and tell our friend to catch it.
Theories are clusters of hypotheses that should be coherent with each other. A hypothesis is an explanation of how x is connected to y. If x happens then y should always react the same to that event. Hypotheses in theories are interconnected; that means that if we know what is the relationship between x and y and we also know what is the relationship between y and z, then we should be able to deduce the relationship between x and z.
Cases in which we truly know the relationship between x and y are very rare. In practice, more often than not, we are not trying to find the true relationship but to get as close to it as possible. The closer we get, the better our hypothesis. By getting better hypotheses we get better theories. If a hypothesis is not good enough, we scrap it and come up with a new one. A hypothesis that does not improve our theory is not good enough.
Once our hypothesis is good enough (or better than the one we started with) we update our theory. Once we have an updated theory we can set up a new, better hypothesis based on it, and then the whole loop starts again. With internal theories, we do this automatically and unknowingly (this is a simplified basis of learning). Once we make it explicit, we can do it far more effectively and generally better. That is a very simplified core of the scientific method. When we inquire into the connection between our proxies and "fun", we are using the very same core principles. We as game creators do this all the time, so I think it is time to take the next step and make it explicit.
How can we take this next step? Well, first we have to set up our hypothesis and then we have to measure it! This is the end of the detour; now we can look at the proxies that help us with measuring.
When we stand on a scale, the information that we receive is a number. Do we really care about the number, or do we care about the idea that this number represents (in the case of a scale, our health)? Essentially, measurement is an approximation of one phenomenon by another, repeated until we get to an approximation that we understand intuitively (or rather something that is already part of our internal theory). That is the power of a proxy. A proxy is a phenomenon about which we have enough intuitive understanding. Yes, different people can have a different intuitive understanding of the same topic. Junior designers would argue about who is right; experienced designers would set up an experiment for their theories and then measure the outcomes.
A great theory with bad measurements will lead to worse results than a weak theory with great measurements. Bad measurements will improve a theory only a bit (if at all), while good measurements can move a theory by miles (this is iteration once again; you have probably heard of that before). How do we know which measurements to choose? Measurements have parameters: precision, accuracy and granularity.
For sure you can imagine that each measurement can have a different degree of each parameter. An imprecise but accurate scale will show a value within +/- 25% of the real value each time you step on it. A scale with rough granularity will show only whole kilograms, and a precise but inaccurate scale will show the same exact number each time, but it will be exactly 8 kilograms off. In each case you will not learn exactly what you wanted to know, but you will be closer to the truth than before.
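To make these three kinds of scale concrete, here is a minimal Python sketch that simulates them. The true weight of 80.4 kg, the function names and the number of readings are assumptions made purely for illustration.

```python
import random
import statistics

TRUE_WEIGHT_KG = 80.4  # hypothetical real value, chosen only for illustration


def imprecise_but_accurate(rng):
    """Readings scatter by up to +/- 25% but are centred on the true value."""
    return TRUE_WEIGHT_KG * (1 + rng.uniform(-0.25, 0.25))


def precise_but_inaccurate(rng):
    """Readings are identical every time, but always 8 kg off."""
    return TRUE_WEIGHT_KG + 8.0


def rough_granularity(rng):
    """Readings are on target, but the display only shows whole kilograms."""
    return float(round(TRUE_WEIGHT_KG))


def summarize(scale, readings=1000, seed=1):
    """Step on the scale many times; return (mean, spread) of the readings."""
    rng = random.Random(seed)
    values = [scale(rng) for _ in range(readings)]
    return statistics.mean(values), statistics.pstdev(values)


if __name__ == "__main__":
    for scale in (imprecise_but_accurate, precise_but_inaccurate, rough_granularity):
        mean, spread = summarize(scale)
        print(f"{scale.__name__}: mean={mean:.1f} kg, spread={spread:.1f} kg")
```

The first scale averages out close to the true value although any single reading can be far off, the second is perfectly repeatable but wrong by a fixed amount, and the third is limited by its display. Each one still tells you more than not stepping on a scale at all.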
So why not always have as precise, as accurate and as granular a measurement as possible? Well, because it is expensive in terms of money, time and actual know-how. Measuring salt for your home-cooked dinner is very different from measuring salt for the main course of a royal wedding prepared by a professional chef. Different situations require very different degrees of measurement. It all depends on the decision that you are making.
Design is a sequence of decisions about what to make and what not to make. To make these decisions we need a theory about how each part of our designed game works and what experience it induces. To have the best possible theory we have to improve it as much as possible, and to improve it we need as many good measurements as possible. The best designers are not the ones who think they know how things should be, but those who are able to update their theory as fast as possible (and this is basically iteration on design).
Every single decision is connected to a specific subset of hypotheses in the theory. These are usually hypotheses regarding the proxies we use, or things close to them. The closer the proxies are to our question, the stronger the connection they have. Therefore, to make a decision we should measure and explore the closest proxies. For example: what you eat and how much you move is closer to your health than the fuel consumption of a bus in your city. Nevertheless, both are part of a theory of energy transformation. Hence, while learning about the combustion system may help us with our diet, learning about calories will help us far more.
As you can see, the closer the connection, the more value we get from our investment. You should never invest more resources into a measurement than you will get out of it. What value can you get from investing in measurement? Well, the main candidates would be faster iterations, lower opportunity costs, avoiding retroactive fixes, avoiding out-of-scope or under-scoped features, and more effective testing.
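One textbook way to put a rough ceiling on what a measurement can be worth is the expected value of perfect information from decision analysis. The sketch below applies it to a simple build-or-skip decision. The function names, the person-day unit and all the numbers are assumptions for illustration, and real decisions rarely reduce to three inputs.

```python
def expected_value_of_perfect_information(p_success, gain, loss):
    """Upper bound on what a measurement can be worth for a build/skip decision.

    p_success: our current belief that the feature will work out (0..1)
    gain:      value we get if we build it and it works
    loss:      value we lose if we build it and it does not work
    """
    # Without measuring, we build only if the gamble looks positive.
    ev_build = p_success * gain - (1 - p_success) * loss
    ev_without_info = max(ev_build, 0.0)
    # With perfect information we build only when the feature would succeed.
    ev_with_info = p_success * gain
    return ev_with_info - ev_without_info


def worth_measuring(measurement_cost, p_success, gain, loss):
    """A measurement that costs more than it can possibly return is not worth it."""
    return measurement_cost < expected_value_of_perfect_information(p_success, gain, loss)


if __name__ == "__main__":
    # Hypothetical numbers in person-days: 60% belief that the feature works,
    # 100 gained if it does, 80 wasted if it does not.
    evpi = expected_value_of_perfect_information(p_success=0.6, gain=100, loss=80)
    print(f"A measurement can be worth at most about {evpi:.0f} person-days")
    print("Is a two-day playtest worth it?", worth_measuring(2, 0.6, 100, 80))
```

No real measurement is perfect, so the true value is lower than this bound. The point of the sketch is only that a cheap, rough measurement can easily pay for itself while an expensive one may not.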
One of the most dangerous parts of collecting data is over-analyzing and over-collecting it. I have seen many developers do this. Do you think you haven't? When was the last time you went through Reddit or Steam reviews and then came back to the office with a whole new idea about what you should change? That is a case of it too. Humans are pattern-making machines. We see faces in clouds, moods in yellow circles with dots, and patterns where there are none. Sometimes you can have too much data for your own good, especially if the data are not accurate or precise. To lower the chance of the noise doing the talking, you should always strive for refutability by looking for counterexamples, and corroborate by employing multiple different kinds of measurement.
...and you don't really need one (but they are extremely useful and I love all of them). It is again about properly setting the parameters of measurement. An analyst will offer you quite precise and accurate data. No measurement at all will offer you, precisely and accurately, no data. There is a vast space between these two points where you can operate quite cheaply and fast.
Here are some examples of measurements in various degrees. I will always add examples of an advanced and a minimal variant. I doubt that you will use the advanced variant anytime soon, but that is OK. Also note that this is just a simple summary of a few methods; it should serve more as inspiration for additional research than as an exhaustive list.
With this method we are looking for rules of thumb: some very simple rules that may lead to specific consequences. This is the domain of the "player type" of game creators and of seniors who have played it all already. A short heuristic analysis can reveal a lot even before you start the game. The good news is that you probably already do this, just not systematically. For example: Does this platformer have coyote time, and how much? Does the healing potion have the same colour as health (probably red)? It is a well-known approach in service and product design (try asking your closest UX designer).
Minimal variant: Just play the game. What are the things that you expected to happen and didn't? Show your game to somebody else who likes to play games and listen for what they notice first.
Advanced variant: Keep an extremely detailed journal consisting of introspections on various games and cultural artefacts. Improve your heuristics by continuously playing all possible games. Make a library of patterns (like those you can find in The Art of Game Design by Jesse Schell or Game Mechanics: Advanced Game Design by Ernest W. Adams and Joris Dormans).
Playing the game and seeing how it works and what happens in it. It is basically starting the whole thing and seeing it run. There are a lot of very good articles, Reddit posts and videos on how to playtest, and for good reason. Playtesting is the cornerstone of any game development. If you are not doing this already then I don't know what you are doing.
Minimal variant: Play the game on your own or with a few people, as soon as possible. Repeat all the time.
Advanced variant: oh boy, so many. Kleenex playtesting, focus group testing, company wide testing. Have a dedicated team of pro players or ex pro players and let them play every new iteration of the game...
Discursive analysis is the analysis of the language used in the context of a specific topic. People express most of their feedback through language (especially on the internet). Looking into this language may offer a very interesting picture.
For example: in the game Hunt: Showdown (2019) players created a new term, "instaburners". The term refers to people who start to burn enemy players once they are downed. Burned players cannot be revived by their teammates. An instaburner is somebody who burns an enemy as soon as possible. The fact that this new term was invented tells us a lot about how often this happens and what kind of connotations it can have in the game.
Minimal variant: Read feedback on games and think about the kind of language people use. For example: Do they say it is "stupid" or "dumb"? These may have very different meanings. Maybe try asking, when collecting feedback, "what kind of dumb is it?" Use word clouds to see which words dominate in feedback forums.
Advanced variant:
Machine learning collects feedback from all possible feedback sites and evaluates the patterns, semantics and pragmatics of the language.
Use discursive force (hello, surprise mechanics) and then measure its impact on the text.
An experienced language researcher analyzes the data with software such as ATLAS.ti.
Somebody (a community manager, for example) dives deep into the community and explains to the team what the players are talking about.
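The word-counting idea behind a word cloud fits in a few lines of Python. This is only a sketch: the feedback comments are invented, the stop word list is a tiny hand-picked assumption, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Tiny hand-picked stop word list; a real one would be much longer.
STOP_WORDS = {"a", "and", "i", "is", "it", "just", "me", "the", "this", "to", "too"}


def word_frequencies(comments):
    """Count how often each non-stop word appears across all comments."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts


def comments_mentioning(comments, word):
    """Return the comments that use a given word, so it can be read in context."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [comment for comment in comments if pattern.search(comment)]


# Invented feedback, for illustration only.
FEEDBACK = [
    "The boss fight is dumb, it just one-shots me.",
    "Dumb fun, I love the shotgun.",
    "This economy is stupid, everything is too expensive.",
    "Instaburners ruin every match.",
]

if __name__ == "__main__":
    print(word_frequencies(FEEDBACK).most_common(5))
    for comment in comments_mentioning(FEEDBACK, "dumb"):
        print("-", comment)
```

The counts show which words dominate; pulling up the comments around a word is the cheap version of asking "what kind of dumb?", since here one "dumb" is a complaint and the other is praise.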
Sometimes you don't have to run the whole game to see how it works. Sometimes it would actually be impractical because you only need to know a portion of the game (like the economy). Sometimes you need to cut through complexity, or you just don't have another 10k players to play the game two hours a day. Then you go for models. Don't forget that the map is not the territory; by creating a model you are omitting some parts of the whole system.
Minimal variant: Describe all the parts of the system and their relationships and then start imagining how they would work together. Simply ask "what if" questions, or build a small mathematical model #EveryDayIsSpreadsheetsDay.
Advanced variant: Stochastic machine learning mathematical model predicting most possible outcomes based on system set up. Ask your soon-to-be-very rich friend about this.
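As an illustration of the minimal variant, here is the kind of small mathematical model that would normally live in a spreadsheet, written as a Python sketch. The economy (one gold income, one proportional gold sink), the function name and every number are assumptions invented for this example.

```python
def simulate_gold(days, income_per_day, sink_rate, start_gold=0.0):
    """Track one player's gold: earn a fixed income, then lose a share to sinks."""
    gold = start_gold
    history = []
    for _ in range(days):
        gold += income_per_day    # quests, loot, selling items
        gold -= gold * sink_rate  # repairs, taxes, consumables
        history.append(gold)
    return history


if __name__ == "__main__":
    # "What if" question: what happens if we halve the gold sink?
    baseline = simulate_gold(days=60, income_per_day=100, sink_rate=0.2)
    what_if = simulate_gold(days=60, income_per_day=100, sink_rate=0.1)
    print(f"Day 60 with a 20% sink: {baseline[-1]:.0f} gold")
    print(f"Day 60 with a 10% sink: {what_if[-1]:.0f} gold")
```

In this toy model the gold a player holds levels off at income * (1 - sink_rate) / sink_rate, so halving the sink from 20% to 10% more than doubles the long-term balance (from about 400 to about 900). The model leaves out almost everything about a real economy, which is exactly the map-is-not-the-territory caveat.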
In domain analysis you go through competitors' games in a certain domain and look for commonalities and specifics. A domain can be anything from the approach to a 3rd person camera to a genre. At the start of any domain analysis there are questions that you would like to answer, things that you want to focus on. For example, if you did a domain analysis of battle royale games you might find out that all of them have a shrinking-level mechanism in one way or another.
Minimal variant: Play some competitors' games and look at how they do stuff. Watch videos of playthroughs. Write down how they handle specific cases.
Advanced variant: Make an in-depth analysis of specific mechanics, including data mining. Frame-by-frame description of actions. Look into interviews with creators. Play all the games in the genre and list all of the parts and relationships in them.
You are measuring numbers and then using statistical analysis to get new knowledge. In this approach, more is usually more. This is the world of KPI and the F2P market and there is a lot of great material on this even on Gamasutra. No need for me to go into detail.
Minimal variant: Play the game and note every time you die in a specific level; perform a simple Student's t-test on your tally of deaths per level; put a simple scale (1 - 5) in a feedback form and then look at the average.
Advanced variant: Full-blown analytics, buying data from big brothers (I am leaving legality and morality out of it for now). There are plenty of companies who make their living doing just this, just for games. Don't be afraid of them.
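The minimal variant needs nothing more than the Python standard library. The sketch below computes a classic two-sample Student's t statistic for death counts before and after a level change, plus the average of a 1 - 5 rating. All the numbers are invented, the checkpoint change is a hypothetical scenario, and the function name is an assumption.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Classic two-sample Student's t statistic (assumes similar variances).

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    dof = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, dof


# Invented numbers, for illustration only.
DEATHS_BEFORE_CHANGE = [5, 7, 6, 8, 6]  # deaths per playthrough of one level
DEATHS_AFTER_CHANGE = [3, 4, 2, 4, 3]   # same level after moving a checkpoint
RATINGS = [4, 5, 3, 4, 2]               # answers on a 1 - 5 feedback scale

if __name__ == "__main__":
    t, dof = student_t(DEATHS_BEFORE_CHANGE, DEATHS_AFTER_CHANGE)
    print(f"t = {t:.2f} with {dof} degrees of freedom")
    print(f"average rating = {statistics.mean(RATINGS):.1f}")
```

For these invented samples t comes out at about 5.06 with 8 degrees of freedom, well above the roughly 2.31 needed at the usual two-sided 5% level, so the drop in deaths is unlikely to be just noise. If SciPy is installed, `scipy.stats.ttest_ind` performs the same test and also reports a p-value.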
You can actually ask for people's opinions. There are a bunch of problems here that are common to all qualitative analyses. The form of the question, the tone of your voice, even the time since the playthrough can change the outcome. Nevertheless, you may find out very specific new information about your game.
Minimal variant: Let somebody play your game and ask them "what do you think about it?", then just let them talk. Don't comment on it, don't defend your game, just ask and let them talk.
Advanced variant: Run fully randomized research with a preset list of meticulously chosen questions and trained interviewers.
You don't need to know what x is, just how it manifests in the world.
Don't be afraid of measurement; it is simple, just follow the major steps:
a) Define what your decision is
b) Build your theory
c) Define your best possible hypothesis
d) Define what the best possible proxies are
e) Measure
f) Improve your theory, adjust your hypothesis
Repeat b) - f) until you have good enough information to make a decision.
Beware the noise! Corroborate and try to refute your hypothesis. Combining different types of measurements and repeated measurements will help a lot.
Low-quality measurement is better than none:
We don't need to know what something is, we just have to know more about it than before.
We don't need perfect measurement, just better than before.
We don't need perfect theory, just better than before.
Measure with why (decision) in mind to prevent wasting resources.
Hubbard, Douglas W. How to Measure Anything: Finding the Value of Intangibles in Business. 3rd edition. Wiley: 2014. (Main inspiration for this blog and major inspiration in my work)
Popper, Karl. The Logic of Scientific Discovery. Routledge: 2002. (oh yeah, I am going there. Only for the intellectually brave)
Kuhn, Thomas. The Structure of Scientific Revolutions. University of Chicago Press: 2012. (scientists are people too)
Wheelan, Charles. Naked Statistics: Stripping the Dread from the Data. W. W. Norton & Company: 2014.
Seidman, I. Interviewing as Qualitative Research: A Guide for Researchers in Education and the Social Sciences. Teachers College Press: 2006.
Thought experiments, https://plato.stanford.edu/entries/thought-experiment/
Woodward, Matt. Balancing the Economy for Albion Online. https://youtu.be/aX8f1lE09uY (GDC talk on the Albion Online economy, a great primer on game economies and how to model them.)
Ruskin, Elan. Three Statistical Tests Every Game Developer Should Know. https://youtu.be/fl9V0U2SGeI (quick primer into statistics for game dev)
Collins, Steven. A/B Testing for Game Design Iteration: A Bayesian Approach https://youtu.be/-OfmPhYXrxY (You probably didn't learn anything new from blog above, and that is ok. You still can leave comments about hating frequentism or smtg and then watch this video.)
Silver, Nate. The Signal and the Noise: Why So Many Predictions Fail - but Some Don't. Penguin Group: 2012.
Sasassovici, Alex and Miravete, Beatriz. How to Use Machine Learning, Live Telemetry Analysis, and Computer Vision to Manage Communities. https://youtu.be/pdJ-1i3cbng (oh yes, you can improve every facet of the game by measurement, even community management)
...and many many more! go and explore, don't forget to measure your progress ;-)