Loss aversion, one more victim of the replication crisis? (No)

I saw somewhere a link to a working paper by Gal & Rucker entitled The Loss of Loss Aversion: Will It Loom Larger Than Its Gain?, with the comment that loss aversion is one more psychological phenomenon not replicating. I was surprised by this claim because the mechanisms behind loss aversion, or something like it, are very much related to affect psychology. The context of replicability implies that loss aversion was never a thing, that it was just a statistical fluke resulting from questionable research practices (QRPs), like social priming and Bem’s psi findings. But that’s not what the linked manuscript says. It’s a review that never actually questions whether such a phenomenon exists at all – rather, it discusses the scope and an alternative conceptualization of the phenomenon. I’m not familiar enough with this literature to assess whether the review really is impartial or whether it cherry-picks its findings (it’s quite obviously written with a particular conclusion in mind, and self-cites a lot), but clearly it should not be cited as evidence that loss aversion as a phenomenon is a result of QRPs. I’m a bit annoyed that the title and even the abstract play like clickbait and make it easy to link the normal theoretical discussion about the limits of a phenomenon to the replicability issue.

Besides the manuscript enabling a misreading of its position in the literature, I was slightly miffed that it’s at least partly based on a fundamental misunderstanding of how the mind works. The authors describe a strong and a weak form of loss aversion in order to compare them to evidence, both of them in terms of it being a general, universal principle that can be applied to any human behavior: the strong form as absolute (that “one should not observe cases where gains have a propensity to be weighted more than losses of similar magnitude”, p.9), and the weak form as relative (“on average, one expects the data would largely reveal a greater impact of losses than of gains”, p.10). It may be that this is a feature of decision making research in general rather than a view held only by the authors, but from the point of view of affect psychology, it makes little sense. It’s a strawman, because I don’t think there is anything in psychology that can be considered a universal law like this, at the level of observable outcomes. The human mind does not work on “principles” or “laws” like this, because it is an immensely complex system of reacting, predicting, and self-correcting processes. There is no single process reaching through the whole of the human mind, always (or even mostly, on average) producing the same results regardless of circumstances, because that would not be adaptive in the complex physical and social environment our mental machinery has to deal with. And even if we focus on a very high level, it’s a dubious notion to begin with that all decision making would be governed by a single process translating all kinds of decisions into simple losses and gains.

I admit that I think of some things as “principles” of the human mind, and negativity bias (related but not identical to loss aversion) sounds like a good candidate, but it does not mean that at the level of observable outcomes, regardless of circumstances, we should see (absolutely or on average) a particular pattern of behavior. Rather, it means that some parts of the system tend to process information in certain ways, and in specific circumstances – where we can somehow control that specifically these processes are the ones influencing the outcomes the most – we can indeed see patterns in behavior.

That said, the alternative conceptualizations – such as a propensity towards inaction or the status quo – are interesting, and worth considering (assuming the review is not horribly biased) for anyone working with loss aversion. It is very likely true that an intuitively appealing conceptualization tends to be overgeneralized and that scientists easily persist with it even in the face of evidence to the contrary.

 


Gal, D., & Rucker, D. (2017). The Loss of Loss Aversion: Will It Loom Larger Than Its Gain? (SSRN Scholarly Paper No. ID 3049660). Rochester, NY: Social Science Research Network. Retrieved on 12 Jul 2018 from https://papers.ssrn.com/abstract=3049660

Reading: 27 semantic categories for emotional experiences (Cowen & Keltner, 2017)

“Self-report captures 27 distinct categories of emotion bridged by continuous gradients” by Cowen and Keltner (2017) in PNAS has raised interest but also ridicule. Sure, if your preconception is that discrete theories are bs anyway, then “there are 27 categories instead of the traditional six!” may seem funny. But I have argued (Kivikangas, in review; see also Scarantino & Griffiths, 2011) that discrete emotion views (not the same as basic emotion theories) have their place in emotion theory, and I find a lot of good in the article – as long as it is kept in mind that it is about self-reports of emotional experiences.

To summarize, the participants were shown short video clips, and different ratings of emotional experiences resulted in a list of 27 semantic categories that overlap somewhat, implying that both discrete and dimensional views are right. The sample could have been bigger, the list they started with seems somewhat arbitrary, and the type of stimuli probably influences the results. But the article supports many of my own ideas, so my confirmation bias says it’s valid stuff.

In a bit more detail:

  • They had 853 participants (which, IMO, they should mention in the main text as well) from MTurk watching in total 2185 videos (5 s on average) and judging their feelings. The participants were divided into three groups [1]:
    • first group provided free responses to 30 randomly assigned videos (although the supporting information says this was not entirely free, but a predefined set of 600 terms which were autocompleted when the participant typed into a blank box);
    • second group rated 30 videos according to a predefined set of 34 discrete categories (had to choose at least one, but could choose more – apparently this choice was dichotomous) the authors had gathered from bits and pieces around the literature;
    • third group rated 12 videos according to a predefined set of 14 dimensional [2] scales (9-point Likert scale).
  • I don’t pretend to know the statistical methods so well I could vouch for their verity [3], but the authors report that 24 to 26 of the 34 discrete categories from the second group were independent enough [4] to rate the videos reliably. The “free” responses from the first group provided 27 independent [5] descriptions, which were then factored into the 34 categories to find out the independent categories. Apparently these three analyses are taken as evidence that categories beyond the 27 are redundant (e.g. used as synonyms; statistically not reliably independent).

[Figure: Cowen & Keltner (2017), Fig. 2C]

  • Their list above (Fig. 2C) dropped the following from the original 34 categories: contempt and disappointment (coloading on anger), envy and guilt (unreliable, but not loading on any other factor), pride [6] and triumph (coloading on admiration), and sympathy (coloading on empathic pain and sadness). I discuss the list and these dropped categories below.
  • These categories are not completely distinct, but smoothly transition to some (but not any!) of the neighboring categories, e.g. anxiety overlaps with fear, and fear overlaps with horror, but horror does not noticeably overlap with anxiety. When the 27 mathematical dimensions these categories are loading on are collapsed into two, we get this map:

[Figure: Cowen & Keltner (2017), Fig. 2B]

  • The map is also available in higher resolution here, and it’s interactive! You can point at any letter and see a gif of the rated video!
  • The authors compare these categories to the 14 dimensions from the third group, and report that while the affective dimensions explained at most 61 % of the variance in the categorical judgment dimensions, the categories explained 78 % of the variance in the affective dimension judgments. When factored into the categorical data, valence is the strongest factor, “unsafety+upswing” is the second (I read this as threat + excited arousal), and commitment (?) is the third.
  • A final claim is that the emotion reports are not dependent on demographics or personality or some other psychological traits (except perhaps religiosity).

The article begins with the remark that “experience is often considered the sine qua non of emotion”, and the general language in the article firmly places it on the experiential level: the study does not focus on the hypothesized psychological processes behind experiences, nor on the neural structures or evolutionary functions comprising the whole affect system. I mention this specifically, because IMO the inability to differentiate between these different levels is one of the main reasons the wide range of emotion theories seem so incompatible (which is one of the main points in Kivikangas, in review). The article recognizes the limits of the semantic approach admirably without making overextending claims, although in the discussion they do speculate (pretty lightly) about how the findings might relate to the neural level. However, although the authors avoid overextending and list relevant limitations (should be studied with other elicitors, other languages and cultures, etc.), the paper is probably still going to be read by many people as a suggestion that there are 27 strictly discrete categories (no, they are somewhat continuous) of emotions (no, these are only self-reports of emotional experiences – and even self-report “is not a direct readout of experience”, as the authors point out).

Furthermore, I like the position (similar to my own) of saying that – although Keltner is a known discrete theorist – both discrete and dimensional views have some crucial parts right, but that the strictest versions are not supported (“These findings converge with doubts that emotion categories “cut nature at its joints” (20), but fail to support the opposite view that reported emotional experiences are defined by entirely independent dimensions”). The authors also start from the astute observation that “the array of emotional states captured in past studies is too narrow to generalize, a priori, to the rich variety of emotional experiences that people deem distinct”. Another point I recently made (Kivikangas, in review) was that although the idea of early evolutionary response modules for recurrent threats and opportunities is plausible, that the number of these modules would be the traditional six, or even 15 (Ekman & Cordaro, 2011), is not. My view is that affects (not experiences, but the “affect channels” of accumulating neural activation; Kivikangas, 2016) are attractor states, produced by a much larger number (at least dozens, probably hundreds) of smaller processes interacting together. And definitely there is no reason to believe that the words of natural language – typically English in these studies – would describe them accurately (as pointed out by, among others, Russell, 2009, and Barrett, pretty much anything from 2006 onwards).

So there is a lot I like in the article. However, there are some obvious limitations that they do not explicitly state. First, they begin from a list of 34 terms from different parts of the literature, which is a wider array than normally used, but still rules out a lot of affective phenomena. From my own research history, media experiences involve other relevant feelings, like frustration or tension (anticipation of something happening, typically produced with tense music). Of course one can say that those are covered by anger and anxiety, for example, but I would have to point out the relatively small number of participants – the factors might be different with another sample. (A side point is that while this seems to be a fine study, for a more specific purpose, such as wider use in game research, one would probably want to conduct one’s own study with that particular population, because the usage would probably be different.)

A theoretically more interesting point is that they include categories like nostalgia and sexual desire, and even craving and entrancement, which many theorists would argue vehemently against in a list of emotions. Me, I am happy for their inclusion as I think that “emotion” is a somewhat arbitrary category anyway and if we are looking at the affective system as a whole, we note a lot of stuff that is certainly affective but is not thought of as emotions (one more point I made in Kivikangas, in review…; also mentioned by Russell, 2009). But it raises the question of why many other less traditional categories were not included. Schadenfreude, bittersweet, moral anger/disgust (interestingly, one of the dimensions was “fairness”)? What about thirst? Could be included in craving, but we don’t know, because it wasn’t in the study. I have stated (Kivikangas, 2016) that startle is affective, as is the kind of curiosity to which Panksepp refers as “seeking” (Panksepp & Biven, 2012). Would the youtube generation differentiate between amusement and lulz? Naturally, some decisions must be made about what to keep and what to ignore, but if they were going for the widest possible array (with things like nostalgia, craving, and entrancement), I think it could be still considerably wider. I have not looked at the list of 600 “free” responses, but apparently the authors checked which of the 34 categories were supported by the free responses, but did not check what other potentially relevant terms the responses might have included.

A second obvious limitation is the stimulus type: 5-sec (on average) video clips. The authors state that this should be studied with other kinds of elicitors, sure, but they don’t explicitly mention that maybe some of their results are due to that. Specifically, the reason for dropping quite common emotions – contempt, disappointment, envy, guilt, pride, triumph, and sympathy – from their list might be that they (at least some of them) need more context. Guilt, pride, and triumph are related to something the person does, not something they simply observe in third person. Contempt is related to a more comprehensive evaluation of the target’s personality, and envy relates to one’s own possessions or accomplishments. Actually, I was surprised that they found anger, which may also be difficult to elicit without context (as anger traditionally is thought to relate to a person’s own goals) – but indeed, in the supporting information it was the second next category to drop when they tested models with fewer (25) factors. I suspected that there might be clips with angry people and that participants had recognized anger instead of feeling it, but this seems not to be the case. Clips classified with E for anger in the interactive map are either unjustified and uncalled-for acts of violence, or Trump or Hillary Clinton – which probably are closer to moral anger than the traditional blocked-goals anger. Anyhow, the list of found factors would probably be even longer if the type of stimulus did not limit it.

As a conclusion, although I have been more interested in the affective system underlying the emotion experiences and haven’t seen much point in the arguments over whether the experience can be best described as discrete emotions or dimensions, the empirical map combining aspects of both is much more plausible to me than a strictly discrete list or a too tidy circumplex. And even though the reports of (a priori restricted) emotions are not the same as the affective system underlying them, I am hopeful that this paper helps the discussion that perhaps the different models are not incompatible, and that perhaps the models may be different on different levels of scrutiny (i.e. experience vs. psychological vs. neural vs. evolutionary).

 


Footnotes

[1] The numbers are a bit unclear. The authors flaunt: “these procedures yielded a total of 324,066 individual judgments (27,660 multiple choice categorical judgments, 19,710 free-response judgments, and 276,696 nine-point dimensional judgments)”.

They say that “Observers were allowed to complete as many of versions of the survey as desired, with different videos presented in each”, and that “Each of the 2,185 videos was judged by 9 to 17 observers in terms of the 34 categories”, but repetitions per participant or per video for other response types are unclear. Without prior knowledge then, 853 in 3 groups = 284 participants per response type, which is barely above what Lakens & Evers (2014) say is needed for finding a stable (see quotation below) small effect size (r = .1; required n = 252), but below what is required for 80 % power (n = 394) for that effect size. According to the within-subjects power calculations I remember, 9-17 repetitions per video hardly helps the power at all.

“With a small number of observations, effect size estimates have very wide CIs and are relatively unstable. An effect size estimate observed after collecting 20 observations can change dramatically if an additional 20 observations are added. An important question when designing an experiment is how many observations are needed to observe relatively stable effect size estimates, such that the effect size estimate will not change considerably when more participants are collected.” Lakens & Evers, 2014, p. 279
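As a rough sanity check on the counts quoted in this footnote, the arithmetic can be sketched as below. The interpretation of a "judgment" is my assumption and is not spelled out in the quoted text: one categorical or free-response judgment is taken to be one observer rating one video, and one dimensional judgment is taken to be one observer rating one video on one of the 14 scales.

```python
# Back-of-the-envelope check of the judgment counts quoted in footnote [1].
# ASSUMPTIONS (not stated in the quoted text): one categorical or
# free-response judgment = one observer rating one video; one dimensional
# judgment = one observer rating one video on one of the 14 scales.

N_VIDEOS = 2185
N_PARTICIPANTS = 853
N_SCALES = 14

categorical = 27_660
free_response = 19_710
dimensional = 276_696

# The three counts should add up to the reported grand total.
total = categorical + free_response + dimensional

# Average number of observers per video for each response type.
per_video = {
    "categorical": categorical / N_VIDEOS,
    "free response": free_response / N_VIDEOS,
    "dimensional": dimensional / N_SCALES / N_VIDEOS,
}

# Naive even split of participants over the three response types.
per_group = N_PARTICIPANTS / 3

print(f"total judgments: {total:,}")
for name, value in per_video.items():
    print(f"{name}: {value:.1f} observers per video")
print(f"participants per response type (even split): {per_group:.0f}")
```

Under these assumptions the three counts sum to the reported total, the categorical ratings average out within the quoted range of 9 to 17 observers per video, and the other two response types come out at roughly 9 observers per video. This says nothing about how many unique participants contributed to each response type, which remains unclear.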

 

[2] Mostly appraisal dimensions in addition to traditional dimensions: approach, arousal, attention, certainty, commitment, control, dominance, effort, fairness, identity, obstruction, safety, upswing, valence.

[3] One thing I found weird was the median-split correlation for demographics and other traits. They used it to show that traits do not explain differences in emotional responding, but a quick googling only shows recommendations that median splits should not be used because they lose a lot of information. I hope this is not a sign that the method has been used purposefully in order to find no differences.
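The general recommendation against median splits can be illustrated with a small simulation. This is only a sketch of the textbook argument with made-up data: the names `trait` and `response`, the true correlation of 0.5 and the sample size are all my assumptions, and the code does not reproduce the specific procedure used in the paper.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=10_000, rho=0.5, seed=1):
    """Correlate a response with a continuous trait and with its median split.

    All values are hypothetical: the trait and response are drawn from a
    bivariate normal distribution with true correlation `rho`.
    """
    rng = random.Random(seed)
    trait, response = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        trait.append(x)
        response.append(y)

    # Median split: replace the continuous trait with a high/low indicator.
    median = statistics.median(trait)
    split = [1.0 if x > median else 0.0 for x in trait]

    return pearson(trait, response), pearson(split, response)


r_full, r_split = simulate()
print(f"continuous trait: r = {r_full:.2f}")
print(f"median-split trait: r = {r_split:.2f}")
```

For a normally distributed trait, dichotomizing at the median is expected to shrink the correlation to about 0.80 of its original value (the square root of 2/π), which is the information loss the recommendations refer to. A weaker observed association makes a null result easier to obtain, which is why the choice of method matters here.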

[4] “Using SH-CCA we found that between 24 (P < 0.05) and 26 (P < 0.1) statistically significant semantic dimensions of reported emotional experience (i.e., 24–26 linear combinations of the categories) were required to explain the reliability of participants’ reports of emotional experience in response to the 2,185 videos.” I don’t immediately understand how this method works.

[5] “In other words, we determined how many distinct varieties of emotion captured by the categorical ratings (e.g., fear vs. horror) were also reliably associated with distinct terms in the free response task (e.g., “suspense” vs. “shock”). We did so using CCA, which finds linear combinations within each of two sets of variables that maximally correlate with each other. In this analysis, we found 27 significant linearly independent patterns of shared variance between the categorical and free response reports of emotional experience (P < 0.01), meaning people’s multiple choice and free-response interpretations identified 27 of the same distinct varieties of emotional experience.”

[6] Fig. 1 and its caption show pride not loading on its own factor and relief loading, but the text talks about these vice versa, and relief is in other figures, so most likely figure 1 is mistaken.


References

Barrett, L. F. (2006). Are Emotions Natural Kinds? Perspectives on Psychological Science, 1(1), 28–58. https://doi.org/10.1111/j.1745-6916.2006.00003.x

Cowen, A. S., & Keltner, D. (2017). Self-report captures 27 distinct categories of emotion bridged by continuous gradients. Proceedings of the National Academy of Sciences, 114(38), E7900–E7909. https://doi.org/10.1073/pnas.1702247114

Ekman, P., & Cordaro, D. (2011). What is meant by calling emotions basic. Emotion Review, 3(4), 364–370. https://doi.org/10.1177/1754073911410740

Kivikangas, J. M. (2016). Affect channel model of evaluation and the game experience. In K. Karpouzis & G. Yannakakis (Eds.), Emotion in games: theory and praxis (pp. 21–37). Cham, Switzerland: Springer International Publishing. https://doi.org/10.1007/978-3-319-41316-7_2

Kivikangas, J. M. (in review). Negotiating peace: On the (in)compatibility of discrete and constructionist emotion views. Manuscript in review.

Lakens, D., & Evers, E. R. (2014). Sailing from the seas of chaos into the corridor of stability: Practical recommendations to increase the informational value of studies. Perspectives on Psychological Science, 9(3), 278–292.

Panksepp, J., & Biven, L. (2012). The archaeology of mind: neuroevolutionary origins of human emotions. New York, NY: W.W. Norton & Company, Inc.

Russell, J. A. (2009). Emotion, core affect, and psychological construction. Cognition & Emotion, 23(7), 1259–1283. https://doi.org/10.1080/02699930902809375

Reading: “Moral outrage in the digital age”

“Moral outrage in the digital age” by M.J. Crockett is a short theoretical paper, drawing together from several lines of research a [model? hypothesis? theoretical framework? theory?] to explain how the operation of moral outrage is transformed by digital media.

I’m not particularly keen on the underlying view of moral outrage, which seems to be based on a basic emotion interpretation of anger and disgust (from Fig. 2: “For each immoral act, moral outrage was calculated by multiplying self-reported anger and disgust” – btw, why multiplying and not averaging or calculating a sum?), but otherwise it makes a nice and plausible case for the differences that digital media might make. I’m not familiar with most of the empirical research it refers to, so I can’t say much about how convincing the actual evidence is, but the overview fits my preconceptions.
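To make the multiplying-versus-averaging question concrete, here is a minimal sketch with made-up ratings. The 1 to 7 scale, the two rating profiles and the function names are hypothetical; the post does not say which scale the paper used or why the product was chosen.

```python
# Compare two ways of combining anger and disgust into an outrage score.
# The ratings below are invented for illustration (hypothetical 1-7 scale).

def outrage_product(anger, disgust):
    """Multiplicative index, as in the quoted Fig. 2 caption."""
    return anger * disgust


def outrage_mean(anger, disgust):
    """Additive alternative: the simple average of the two ratings."""
    return (anger + disgust) / 2


profiles = {
    "both moderate": (4, 4),
    "anger only": (7, 1),
}

for label, (anger, disgust) in profiles.items():
    print(
        f"{label}: product = {outrage_product(anger, disgust)}, "
        f"mean = {outrage_mean(anger, disgust)}"
    )
```

The two profiles have the same average, but the product is more than twice as high when both emotions are present. So a multiplicative index treats outrage as something that needs anger and disgust together, whereas a sum or an average lets one compensate for the other. Whether that is the reasoning behind the paper's choice is my guess only.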

The main points can be summarized (Fig. 1 is not immediately clear):

  • Humans have psychological processes to react with an emotional condemnation when they think a moral norm has been violated.
  • Digital media
    • gives us much greater access (removes physical constraints) to information about moral violations than traditional social communication (like gossip)
    • lowers the costs (effort; the article talks about the possibility of physical retribution, but I’d generalize that as the risk of potentially wasting the social capital) for expressing outrage
    • lowers the inhibitions (no face-to-face feedback means we don’t have to deal with the fact of causing emotional distress in others, which is a negative experience for most) of expressing outrage
    • increases the potential benefits (reputational rewards of moral quality and trustworthiness; successful regulation of group behavior).
  • These factors drive more moral outrage in digital media, which increases social polarization, dehumanizes the targets (and their groups?), and reduces societal trust.

The short paper does not suggest any interventions, but if these mechanisms hold, then it seems to me that potential ways to inhibit this process would be to increase the costs and inhibitions, as the access and potential benefits are more difficult to control (and the latter perhaps should not be controlled?). Especially effort, but perhaps costs of social capital as well, could be increased via technological solutions. These are testable predictions for cutting out the most low-effort outrage. It would be interesting to see what portion of the outrage this would influence. For instance:

  • Minimally increase the effort, by adding steps to sharing or by introducing a small waiting period before it.
  • Introduce a way to incur a minimal social cost for sharing, e.g. a downvote, perhaps limited to the friends of the sharer only, so a downvote would actually carry a meaning of “people I care about think somewhat less of me” and maybe would not be constantly abused like on anonymous platforms?

 


 

Reference:

Crockett, M. J. (2017). Moral outrage in the digital age. Nature Human Behaviour. Retrieved from https://www.nature.com/articles/s41562-017-0213-3