Models for Story Consistency and Interestingness in Single-Player RPGs

Petri Lankoski

Published in Academic MindTrek 2013

(c) Petri Lankoski 2013. This is the author’s version of the work. It is posted here for your own personal use. Not for redistribution. The definitive version was published in Academic MindTrek 2013. [LINK TO BE ADDED]


What are the elements that affect story interestingness or consistency in single-player videogames? The question is approached by comparing player evaluations (N=206) of 11 videogames against a set of features derived by formal (qualitative) analysis. Ordinal regression was used to analyze the collected data. The study posits that dialogue system, romance, moral choice, appearance customization, and support for different play styles relate to story evaluation. Females tend to judge game stories more favorably, and those with a doctoral degree less favorably, than players with other education.

Categories and Subject Descriptors: K.8.4 [Personal Computing]: General - Games

General Terms: Experimentation

Keywords: ordinal regression, games, storytelling, story consistency, story interestingness


Many modern high-budget computer games have a strong storytelling component. These games contain characters and story twists, and players become engaged with the fates of their player-characters as well as those of NPCs (nonplayer-characters). For example, Tavinor [22] considers these kinds of games to be interactive fictions and Juul [10] discusses games as fictive worlds.

Lankoski and Bjork [14] argue that narrative interpretation is likely to happen when a game has characters in it. They posit that narrative believability is crucial for a coherent playing experience. By narrative believability they mean that the game is believable when players are able to fit the game events and character actions to their current interpretation of the narrative without conscious effort. When a consistent interpretation is available, the game narrative is believable [14].

If narrative comprehension is an important part of playing experience, then what makes a game story interesting? If narrative believability is important for the narrative comprehension, what are the game features that maintain or break the consistency? The main research question in this paper is:

  1. What are the game features that affect a player’s evaluation of story interestingness and consistency?

In analyzing such questions, two analytical challenges need to be taken into account: 1) What is meant by such common descriptors? The concept of an “interesting” or a “consistent” story is likely to vary between players. In other words, story evaluations are subjective. 2) Players are not evaluating design choices, but play experiences that arise as a result of playing games in their own preferred ways. While game systems are designed to influence the playing experience, they do not determine the experience.

To summarize, an analytical approach needs to be able to connect game design features to player evaluations of game stories in a way that takes into account different play styles and preferences. In other words, an analytical framework should not assume equal scales between players, but allow different interpretations of the precise game qualities. However, it is assumed that players are internally consistent to some degree and that they hold the same interpretation of “interesting” and “consistent” from one game to another.

The analysis follows the wine-tasting example of Tutz and Hennevogl [23], in which nine judges taste eight bottles of wine to judge their bitterness. Each wine is exposed to different conditions: temperature (cold or warm) and contact with air (yes/no). Tutz and Hennevogl show how to analyze the effect of the conditions on the taste of the wines and how to take into account the influence of each judge’s taste using ordinal regression. The judges’ tastes can be treated as a random effect, and the conditions (temperature and air contact) as fixed effects. Here the fixed effects give an estimate of how each variable relates to the bitterness judgement of the wines, and the random effect estimates how much variance personal tastes bring to the results.

In this study, respondents’ judgements about story consistency and story interestingness are modeled in a similar fashion. Instead of physical conditions (temperature, air contact), game features (interactive dialogue, voice-acting, etc.) are used in the models in order to find out how each feature relates to players’ judgement of the story. Moreover, the approach enables evaluating how big an impact each player’s taste and specific title-related implementation details have on the judgement. The analysis method is described in more detail in the Method section.

The next section looks at theories relating to narrative comprehension and character playing and traces game features that can influence the interestingness and consistency of the perceived narrative in games. Subsequently, the quantitative analysis method is described. The findings are described in the Results section. In Discussion, the implications are discussed. Finally, Conclusions summarize the main results.


Jorgensen [9] argues that game characters, including player-characters, are used as narrative tools to shape the narrative interpretation of a game (cf. [14] [22]). Jorgensen highlights that the roles of the player-character in her two analysis cases, Mass Effect 2 [4] and Dragon Age: Origins [3], are different. In Mass Effect 2, the player plays the role of Commander Shepard, a well-defined character, whereas in Dragon Age: Origins, the player takes the role of the Warden, whose personality is less defined by game narration and gameplay. Despite these differences, both games use characters as narrative tools.

Lankoski and Bjork [14] argue that a game is believable when its players are able to maintain a coherent interpretation of the events in the game in terms of narration and gameplay. Building on the cognitive theory of narrative interpretation by Zwaan and Radvansky [25], they propose that believability is lost if there is too much disparity between different events in terms of characters, causality, space, time, and intentionality [14].

Lankoski and Bjork [15] argue that the social relations of the character have an important role in shaping the playing experience. Lankoski [13] proposes that empathy and sympathy with the player-character shape the story comprehension of a game. Waern [8] presents a study looking at how romances in games can create strong playing experiences and have great impact on the story of the game. Tavinor [22] argues that moral choices are crucial in interactive storytelling in games: moral choices can improve the stories of games, as in BioShock [8].

In many videogames, the players can choose to pursue different side quests that have different implications for story interpretation as well as for what kind of personality the player-character has. For example, in Fallout 3 [1], the player can choose to detonate an atom bomb in the middle of a settlement or disarm the bomb (The Power of Atom quest), as well as choose to become a slave trader (the Strictly Profitable side quest) or become a good guy who helps others. This implies that the players’ approach to playing can have an impact on how the story is experienced (cf. Lankoski and Järvelä [16]).

Based on the described research, we assume that the game features relating to how the game characters, especially player-characters, are presented in a game and players’ action possibilities in the game have an impact on story (we return to these features below).

The important features of the above-discussed theories can be summarized as the following features of games:

  1. Interactive dialogue: the game has interactive dialogue if the player can choose between different dialogue lines (as in Dragon Age: Origins or Mass Effect 2).
  2. Dialogue with personality options: the game has interactive dialogue where there are lines for different kinds of characters (as in Mass Effect 2, where the options are mostly to be kind, harsh, or sarcastic, or as in Fallout 3, where some lines are available only to, e.g., characters having a high intelligence or charisma score, or certain skills and perks). Interactive dialogue is a prerequisite for this, but interactive dialogue does not need to support different personalities.
  3. Moral choices: the game makes the player choose from options relating to morals (e.g., in Fallout 3 the player can choose to be a slaver).
  4. Support for different play styles: is there only one way to play the game, as in Uncharted 2, or does the game support different playing styles, as in Deus Ex: Human Revolution [7], where one can play the game as, for example, a sneaker or a shooter?
  5. Optional side quests: does the game contain quests that are not required to complete the game?
  6. Appearance customization: if the players can design the player-character’s appearance (e.g., how the face looks), the game has full customization (coded as yes). If the player is able to have some influence on how the player-character looks (e.g., by changing clothes or hair color), the game has limited appearance customization (coded as some).
  7. Character development: coded as yes if the game has (player-)planned character development (see [14]) and as some if the game has scripted character development (e.g., the character gains new skills after completing a specific quest).
  8. Voice-acting (PC): is the dialogue of the player-character voice-acted?
  9. Friendship modeling: relations between non-player-characters and the player-character are modeled, and the model influences the game progression or events (e.g., in Dragon Age: Origins a non-player-character betrays the player-character if he does not like the player-character enough).
  10. Romance modeling: romance modeling is similar to friendship modeling, but leads to a romance between the player-character and a non-player-character.
  11. Romances in cut-scenes: a romantic relation between the player-character and a non-player-character is presented in a cut-scene.

The aforementioned features are a working hypothesis for game features that might contribute to (or against) story interestingness and consistency. The next step is to evaluate the importance of each feature (we return to this in the Method section).

Figure 1: The features that the games have. The abbreviations of the games are Fallout 3 (FA3), Dragon Age: Origins (DAO), Dragon Age 2 (DA2), Deus Ex: Human Revolution (DEHR), Elder Scrolls V: Skyrim (ESV), Grand Theft Auto IV (GT4), Mass Effect 2 (ME2), Red Dead Redemption (RDR), and Uncharted 2: Among Thieves (U2).

The games selected for this study are newer popular AAA titles. To fit the models, there should be enough answers for each game. Moreover, it is assumed that the players have a better recollection of their playing experience when they are considering games that they have played somewhat recently.

All games were played and judged based on whether a certain feature is present or not. After the initial categorization, four experts commented on the analysis. Based on the comments, the categorization was revised.

How each feature is present in the games is summarized in Figure 1. As there is only one game (Uncharted 2 [17]) having no optional side quests, this category cannot be used in the analysis. The results are not likely to be reliable when there is only one game in one group and ten in another. The cut-scenes and scripted events category is not used because all these games use them. In addition, dialogue with personality options was dropped, as all the selected games that have interactive dialogue also have personality options. Another categorization issue is whether Fallout 3 has romances in cut-scenes. If one has the perk “Black Widow”, it is possible to flirt with non-player-characters, and after that, the fact that the character is in love with the player-character is told through scripted events (the player-character receives love letters later). This is marginal; hence, Fallout 3 is coded as not containing this feature.


Data were gathered using a questionnaire. There were questions on the background of respondents (age, sex, education, and nationality, as well as their habits of playing video games, board games, table-top role-playing games, and larps). The questionnaire is not validated or counter-balanced. The data were gathered with two questionnaires. As the initial analysis of the first set of games pointed towards interesting variation, a second questionnaire was used in order to add more games with different sets of features. The data are anonymous in both questionnaires.

The questionnaire has eight five-point Likert scale questions on role-playing and narrative and one question about the playing of the game. This paper uses only background questions and questions about the story: The story is consistent and The story is interesting. The scale for both questions is totally disagree (1) to totally agree (5).

The games in questionnaire 1 are Fallout 3 (FA3) [1], Dragon Age: Origins (DAO) [3], Red Dead Redemption (RDR) [20], Dragon Age II (DA2) [5], and Elder Scrolls V: Skyrim (ESV) [2]. Those in questionnaire 2 are Grand Theft Auto IV (GT4) [19], Deus Ex: Human Revolution (DEHR) [7], Mass Effect 2 (ME2) [4], Assassin’s Creed: Brotherhood (ACB) [24], Batman: Arkham Asylum (BAA) [21], and Uncharted 2: Among Thieves (U2) [17]. (In what follows, the games are referred to using the abbreviations in parentheses.)

A nonproportional quota sampling method was used. The target was to have at least 20 answers for each game. In addition, the quota for females (or males) was set at 25%. The respondents were recruited by advertising the study on Facebook, Twitter, Google+, and pelilauta, as well as in two forums dedicated to ACB and U2 when more answers were needed for those games. In addition, the study was advertised in a Finnish girl gamer forum to get more answers from females.

Data were analyzed using ordinal regression (see [23]). The software used to run the analysis is R (version 2.14.1) [18] and the ordinal package [6]. Gauss-Hermite approximation was used with 10 quadrature points. Both qualitative game data and questionnaire background data were used in the analysis as fixed effects. In the analysis, no specific structure for the ordinal data is assumed beyond order (i.e., 1<2<3), and the thresholds (from 1 to 2, 2 to 3, and so on) are estimated in the model.
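To illustrate what estimating thresholds means, the following minimal Python sketch computes the probability of each answer category in a cumulative logit model, where P(Y ≤ k) = logistic(θ_k − η) and η is the sum of the fixed and random effects. The logit link, the function name, and all numeric values are assumptions made for illustration; they are not the fitted estimates of this study.

```python
import math

def category_probabilities(thresholds, eta):
    """Return P(Y = k) for k = 1..K in a cumulative logit model.

    thresholds: increasing cut-points (1|2, 2|3, ...), one fewer than categories.
    eta: linear predictor (sum of fixed effects and any random effect).
    """
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical thresholds and effect size, for illustration only.
thresholds = [-3.0, -1.5, 0.0, 1.5]
baseline = category_probabilities(thresholds, eta=0.0)
with_feature = category_probabilities(thresholds, eta=0.8)

print([round(p, 3) for p in baseline])
print([round(p, 3) for p in with_feature])
```

With this sign convention, a positive effect moves probability mass toward the higher answer categories, which is how a positive coefficient is read in the results.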

Random effects are used to capture effects that are secondary to the study, such as heterogeneity of subjects. For example, the respondents are likely to have personal views about what a good story is. With random effects, it is possible to take different views into account and to estimate how much impact they have on the story quality evaluations, whether through person-to-person variation in general or through the different implementations of the games (see [23]).

Fixed effects are estimates of how strong an effect a variable has in the regression model. To simplify, the fixed effect describes the size of the effect. In addition, the model can be used in prediction, that is, to predict how the subjects would evaluate a game with a given set of features. As the coefficients of one model are not directly comparable to those of another model, the predicted probabilities are also presented. This also makes the models easier to interpret.

The strength of regression is that it is possible to test multiple variables in one model instead of conducting multiple A-B tests (with multiple tests, one always increases the chance of false positives). Moreover, instead of a single acceptance criterion (e.g., p<0.05), one gets confidence intervals that reveal more about the data.
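The point about false positives can be made concrete with a small sketch. Assuming independent tests that each use the same significance level (an idealization, not something measured in this study), the chance of at least one false positive is 1 − (1 − α)^m for m tests. The function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Illustrative test counts; 0.05 is the conventional per-test level.
for n_tests in (1, 5, 11):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

The rate grows quickly with the number of separate tests, which is the motivation for fitting the features jointly in one model.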

All features in figure 1 are used in the fitting, but those with low explanatory value were dropped. The models were selected using AIC¹: the model with the lowest AIC was preferred. In addition, game as a random effect was dropped because it did not improve the models (the variance was practically zero). Hence, the negligible variance of the game random effect indicates that the fixed effects are enough to describe the game-to-game differences.
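As a rough sketch of the selection rule, AIC can be computed as 2k − 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood; the candidate with the lowest value is preferred. The model names, log-likelihoods, and parameter counts below are hypothetical and do not come from the study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "all features": (-700.0, 16),
    "reduced model": (-702.0, 11),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(name, score)
print("preferred:", best)
```

In this made-up case the reduced model fits slightly worse but uses fewer parameters, so its AIC is lower and it is preferred.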


In this section, the final models with the lowest AIC are presented.

4.1 Respondents

The mean age of the respondents of the questionnaire is 28.49 (min=13.00, max=51.00, 1st Qu=24.00 and 3rd Qu=33.00). 68.9% of respondents are male and 31.1% female. Education, nationalities, and playing habits of the respondents are summarized in figure 3. The number of responses to each game is Assassin’s Creed: Brotherhood: 39 (23 males and 16 females), Batman: Arkham Asylum: 41 (26 males and 15 females), Dragon Age: Origins: 77 (52 males and 25 females), Dragon Age II: 40 (23 males and 16 females), Deus Ex: Human Revolution: 40 (31 males and 9 females), Elder Scrolls V: Skyrim: 69 (50 males and 19 females), Fallout 3: 77 (60 males and 17 females), Grand Theft Auto IV: 62 (43 males and 19 females), Mass Effect 2: 61 (37 males and 24 females), Red Dead Redemption: 35 (26 males and 9 females), Uncharted 2: Among Thieves: 21 (15 males and 6 females).

Figure 3: Summary of the background variables and answers.

4.2 Story Interestingness

The positive effects (the game features) that relate to story interestingness are character development: some, interactive dialogue, romance in cut-scenes, and supporting different play styles. The negative effects are character development: yes, appearance customization: yes,² and moral choices.

However, only the confidence intervals of the effects appearance customization, romance in cut-scenes, and supporting different play styles do not cross zero. Moreover, the confidence intervals are wide. Both observations indicate that the model might not be very accurate.

Females (CI90=0.1461–0.9107) are estimated to like stories more than males. Also, those who have high school or other education (CI90=-1.3784–2.291) judge stories to be more interesting than those who have a doctoral degree (CI90=-3.0803–-1.4235). Other differences in education are not significant at the 90% confidence level.³ The coefficients of effects and confidence intervals of the model are given in figure 2.
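The way these intervals are read can be sketched in a few lines of Python, using the 90% interval bounds reported above. The two helper functions are hypothetical names; the non-overlap rule follows the reasoning given in footnote 3 and is an approximation rather than a formal test.

```python
def excludes_zero(interval):
    """True if a confidence interval lies entirely above or below zero."""
    low, high = interval
    return low > 0 or high < 0

def intervals_overlap(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# 90% confidence intervals as reported in the text.
female = (0.1461, 0.9107)
high_school_or_other = (-1.3784, 2.291)
doctoral = (-3.0803, -1.4235)

print("female effect excludes zero:", excludes_zero(female))
print("education intervals overlap:",
      intervals_overlap(high_school_or_other, doctoral))
```

The female interval lies entirely above zero, and the two education intervals do not overlap, which matches the differences described in the text.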

Figure 4 shows higher predicted evaluations of the story when the game has character development: some, interactive dialogue, romances in cut-scenes, and support for different play styles. On the other hand, lower scores are predicted with the features character development: yes and appearance customization. Figure 4 also illustrates the net effects using U2, DAO, ESV, DA3, and BAA as examples.

4.3 Story Consistency

The positive effects of the model for story consistency are character development (both some and yes) and romances in cut-scenes. However, confidence in these effects is somewhat low. The negative effects are romance modeling: yes, romance modeling: some, and appearance customization. The confidence interval of the appearance customization is wide and it crosses zero. The coefficients and confidence intervals of the model are given in figure 2. In addition, the model shows that females evaluate the stories to be more consistent (CI90=0.0985–0.8084) than males.

Figure 5 illustrates the model for story consistency using predicted probabilities for each effect separately. The figure shows higher story consistency score predictions for the features character development (some and yes) and romances in cut-scenes. Lower scores are associated with the features appearance customization and romance modeling (some and yes). The figure also illustrates the net effects using U2, DAO, ESV, DA3, and BAA as examples.


When all the effects in the model are set to zero, scores above three have a high probability. This shows that all games in this study score high on how interesting or consistent the game story is (see figure 3). It also suggests that not all relevant variables are used in model building; hence, some important variables are missing. The missing variables may reflect the quality of game design and writing. However, connecting specific structural features to the quality of the game is hard. Including the quality of the game in the model is an interesting prospect, but it requires further studies that trace the features that can be connected to quality.

One should be careful when generalizing from these results. First, the sample is not representative. Second, the number of games in this study is small: hence, the implementations of the said features can have much impact on the results. In addition, the model for story interestingness contains many uncertainties because the confidence intervals are wide and some intervals cross zero. The model for story consistency is more reliable than that for story interestingness; the story consistency model shows high confidence in its threshold coefficients.

Figure 2: Estimates and confidence intervals of the best models. 1|2, 2|3, 3|4, and 4|5 are threshold coefficients.

Females liked the game stories more than males. The games have romances as a story element, and that might have contributed to this difference. Education might make players more critical toward game stories, but the data show a significant difference in story interestingness evaluation only between players having a doctoral degree and those having high school or vocational education.

Figure 4: Predicted probabilities for story interestingness (The game has an interesting story). The x-axis shows the score of the question (1: strongly disagree, 5: strongly agree) and the y-axis shows the predicted probability of that score. Predictions are shown per effect as well as for the net effects for U2, DAO, ESV, DA3, and BAA (high school educated male).

Interactive dialogue, as well as showing romances in cut-scenes, seems to relate to story interestingness. However, romance modeling relates to a negative story consistency evaluation. A likely explanation is that modeling (e.g., how the romance progresses) can lead to inconsistencies with prescripted cut-scenes.

Player-planned character development (character development: yes) can easily create tension between prewritten game progression and narrative structure (e.g., it is hard to create cut-scenes that take the prior game events into account). The limited character development in the games in this study couples character development more tightly to game and story progression, which can explain why character development: some is a positive effect in both models.

It is interesting that support for different play styles is a positive effect for story interestingness. Intuitively, different play styles can easily lead to inconsistencies in the story, but it seems that the players approach the game in a rather consistent manner. Based on Tavinor’s [22] argument, one would expect that moral choices would be important to make game stories interesting, but moral choices is only an effect in the model for story consistency. However, all the games in this study implement moral choices in a rather similar fashion; it would be interesting to include games having a different approach to moral choices to study the impact of the implementation.

Interestingly, the model connects appearance customization to a negative effect on story interestingness. Here, it is likely that the appearance customization effect catches some other features of the games that relate to the interestingness of the story. The role of the player-character is more tightly defined in the games in the study that do not allow customization (or allow only some) than in the games that give players more freedom to customize their characters. Hence, the stories in those games are more character-driven (cf. [12] [11]).


In this study, formal features of games and user evaluations of story are used together to understand which features in games are important in story comprehension. This is a pilot study and hopefully demonstrates the possibilities of a mixed model approach in game research. The best models imply that the features included in the final models are more important in terms of story than the ones that are not (cf. figure 1). To recap, the models highlight the importance of the following features:

  • Appearance customization⁴
  • Character development
  • Interactive dialogue
  • Romance modeling and romances in cut-scenes
  • Moral choices

The data may not include other features that are important in relation to story interestingness and consistency. However, the aforementioned features are good candidates when we continue to study how story comprehension in games works.


Figure 5: Predicted probabilities for story consistency (The game has a consistent story). The x-axis shows the score (1: strongly disagree, 5: strongly agree) of question 3 (see appendix) and the y-axis shows the predicted probability of that score. Predictions are shown per effect in the model as well as for DAO (high school educated male).


  1. Akaike Information Criteria (AIC) is used to measure the relative goodness of fit of a model within a set of models.
  2. The model combines the appearance customization levels yes and some because the design is column rank deficient if both levels are preserved.
  3. As the 90% confidence intervals do not overlap, these groups are significantly different (approximately at the level p<0.05). The effects of the other levels of education do not differ significantly from the high or low level of education, as their 90% confidence intervals overlap.
  4. As mentioned above, it is not clear if appearance customization as such is the feature that is important for story, especially if appearance customization incidentally catches another feature.


  1. Bethesda Game Studios. Fallout 3. Bethesda Softworks, 2008.
  2. Bethesda Game Studios. Elder scrolls v: Skyrim. Bethesda Softworks, 2011.
  3. Bioware. Dragon age: Origins. Electronic Arts, 2009.
  4. BioWare. Mass effect 2. Electronic Arts, 2010.
  5. Bioware. Dragon age II. Electronic Arts, 2012.
  6. R. H. B. Christensen. ordinal: Regression models for ordinal data. R package version 2012.01-19.
  7. Eidos Montreal. Deus ex: Human revolution. Square Enix, 2011.
  8. Irrational Games. Bioshock. 2K Games, 2007.
  9. K. Jorgensen. Game characters as narrative devices. a comparative analysis of Dragon age: Origins and Mass effect 2. Eludamos. Journal for Computer Game Culture, 4(2):315–331, 2010.
  10. J. Juul. Half-Real: Video Games Between Real Rules and Fictional Worlds. MIT Press, 2011.
  11. L. Egri. The Art of Dramatic Writing. Wildside Press LLC, Nov. 2009.
  12. P. Lankoski. Character-Driven Game Design: A Design Approach and Its Foundations in Character Engagement. School of Art and Design Publication series A 101. TAIK Books, Helsinki, 2010.
  13. P. Lankoski. Player character engagement in computer games. Games and Culture, 6:291–311, 2011.
  14. P. Lankoski and S. Bjork. Gameplay design patterns for believable non-player characters. In Situated Play: Proceedings of the 2007 Digital Games Research Association Conference, pages 416–423, Tokyo, 2007. The University of Tokyo.
  15. P. Lankoski and S. Bjork. Gameplay design patterns for social networks and conflicts. In GDTW Proceeding 2007, pages 76–85, Liverpool, UK, 2007. Liverpool
  16. P. Lankoski and S. Järvelä. An embodied cognition approach for understanding role-playing. International Journal of Role-playing, 2012.
  17. Naughty Dog. Uncharted 2: Among thieves. Sony Computer Entertainment, 2009.
  18. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2012. ISBN 3-900051-07-0.
  19. Rockstar North. Grand theft auto IV, 2008. Game.
  20. Rockstar San Diego. Red dead redemption. Rockstar Games, 2010.
  21. Rocksteady Studios. Batman: Arkham Asylum. Eidos Interactive, 2009.
  22. G. Tavinor. The art of videogames. Wiley-Blackwell, Malden MA, 2009.
  23. G. Tutz and W. Hennevogl. Random effects in ordinal regression models. Computational Statistics & Data Analysis, 22(5):537–557, Sept. 1996.
  24. Ubisoft Montreal Studios. Assassin’s creed: Brotherhood. Ubisoft, 2010.
  25. R. A. Zwaan and G. A. Radvansky. Situation models in language comprehension and memory. Psychological Bulletin, 123(2):162–185, 1998.
