Lifestream, Comment on Tweetorial analysis – Where is Angela? by Renee Furner

Great reflection on what is missing in the data, Dirk. Thanks for sharing.

Regarding my mentions-per-tweet ranking, here’s some data from outside the ‘window’: I’m pretty sure the cause was simply my ‘early’ responses to tweets, which were influenced by ‘culture-based time zone factors’ – in my region Friday is a weekend day (meaning I could respond quickly to James’ tweets on Friday morning without work interrupting), and I’m GMT+3, so I wasn’t asleep until the later tweets.

‘Is any of the presented data and data analysis relevant at all? Does it say anything about quality?’

I’m wondering: do you have any ideas about the kind of analysis (and method) that would (or rather ‘might’) produce a relevant, meaningful interpretation? For example, if you were interviewed about the experience and asked what you found most useful or meaningful, which of the tweets (either questions or responses) prompted the most thought on your part, or what you felt or thought at the time, would we get closer to ‘relevant’? And if the interviews were repeated with all participants? It would be time-consuming, yes, but… would it reveal something worth uncovering?

What if (for a touch of the creepy) your web-camera had filmed you while tweeting, and captured signs of your mood, algorithmically interpreted? Or measurements of delay between reading a tweet and responding to it?

What in your mind is missing from the data that is required to make it meaningful?

Thanks again,

Renée

from Comments for Argonauts of the Western Pathetic http://ift.tt/2nDebPk
via IFTTT

Lifestream, Pocket, Society-in-the-Loop

Excerpt:

MIT Media Lab director Joi Ito recently published a thoughtful essay titled “Society-in-the-Loop Artificial Intelligence,” and has kindly credited me with coining the term.

via Pocket http://ift.tt/2b2VVH5


I came across this short blog post when I was still thinking about the need for some kind of collective agency or reflexivity in our interactions with algorithms, rather than just individualised agency and disconnected acts (in relation to Matias’ 2017 experiment with /r/worldnews – mentioned here and here in my Lifestream blog).

…“society in the loop” is a scaled up version of an old idea that puts the “human in the loop” (HITL) of automated systems…

What happens when an AI system does not serve a narrow, well-defined function, but a broad function with wide societal implications? Consider an AI algorithm that controls billions of self-driving cars; or a set of news filtering algorithms that influence the political beliefs and preferences of billions of citizens; or algorithms that mediate the allocation of resources and labor in an entire economy. What is the HITL equivalent of these governance algorithms? This is where we make the qualitative shift from HITL to society in the loop (SITL).

While HITL AI is about embedding the judgment of individual humans or groups in the optimization of narrowly defined AI systems, SITL is about embedding the judgment of society, as a whole, in the algorithmic governance of societal outcomes.

(Rahwan, 2016)

Putting society in the loop of algorithmic governance (Rahwan, 2016)

Rahwan alludes to the co-evolution of values and technology – an important point that we keep returning to in #mscedc: we are not simply done unto by technology, nor do we simply do unto it. Going forward (and this is a point Rahwan makes), it seems to me imperative that we develop ways of articulating human values that machines can understand, and systems for evaluating algorithmic behaviours against those articulated values. On a global scale this is clearly going to be tricky, though: to whom is an algorithmic contract accountable, and how is it to be enforced outside the boundaries of established governance (across countries, for example)? Or, when it comes to acting ethically on a smaller scale (for instance, within institutional adoption of learning analytics), is it simply the responsibility of those who employ algorithms to be accountable to the society they affect?

Lifestream, Tweets

In the article Cathy tweeted, ‘Predictive Analytics: Nudging, Shoving, and Smacking Behaviors in Higher Education’, there is the suggestion that HE institutions could use the wide net of data points they collect to ‘nudge’ students and improve student outcomes. Regardless of the intentions, I find it all a bit sickening, to be honest: ‘Nudging, used wisely, offers a promising opportunity to redirect students’ decisions’; ‘With predictive analytics, colleges and universities are able to “nudge” individuals toward making better decisions and exercising rational behavior to enhance their probabilities of success.’ How efficient everything will be once those pesky irrational decisions are eradicated… and we all behave in the same, well-policed manner. Heck, we won’t even need robots then…
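To make concrete just how blunt such a system can be, here is a minimal sketch of a threshold-style ‘nudge’ rule. The predicted-risk score, the thresholds and the wording are all invented for illustration; none of it is taken from the article.

```python
# A deliberately simple, hypothetical "nudge" rule of the kind predictive
# analytics vendors describe: a model's risk score triggers a templated message.
# The risk score, thresholds and wording are all invented for illustration.

def nudge_message(student_name, predicted_risk):
    """Return a nudge for 'at-risk' students, or None if no nudge is triggered."""
    if predicted_risk >= 0.7:
        return (f"Hi {student_name}, students with your engagement pattern "
                "often benefit from visiting the study skills centre this week.")
    if predicted_risk >= 0.4:
        return f"Hi {student_name}, don't forget this week's readings are now available."
    return None  # deemed to be behaving 'rationally' enough to be left alone

print(nudge_message("Renée", 0.82))
```

Even this toy version decides in advance, and on someone else’s behalf, what the ‘better’ decision is.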

The article also made me think of this video –

A focus on ‘rational’ decision-making doesn’t just make me (so. very. utterly.) mad because of the restrictions on free will that it could imply. Throughout history, people (women, people of colour, less-abled people) have been denied access to the political process and to justice on the grounds that they were not considered capable of ‘rational’ thought. Who decides what is rational? Based on what values? Any kind of system which encourages (or enforces) a particular way of thinking needs to be accountable, and we – society – need to be able to influence the values underpinning the system.

This, of course, goes full circle, linking back to Rahwan’s (2016) ideas about putting ‘society-in-the-loop’ of algorithmic governance, and to the ethical concerns that grew out of our cyber cultures block (discussed, for example, in an earlier blog post on the political beliefs of various transhumanist positions).

Lifestream, Diigo: Undermining ‘data’: A critical examination of a core term in scientific inquiry | Markham | First Monday

“The term ‘data’ functions as a powerful frame for discourse about how knowledge is derived and privileges certain ways of knowing over others. Through its ambiguity, the term can foster a self–perpetuating sensibility that ‘data’ is incontrovertible, something to question the meaning or the veracity of, but not the existence of. This article critically examines the concept of ‘data’ within larger questions of research method and frameworks for scientific inquiry. The current dominance of the term ‘data’ and ‘big data’ in discussions of scientific inquiry as well as everyday advertising focuses our attention on only certain aspects of the research process. The author suggests deliberately decentering the term, to explore nuanced frames for describing the materials, processes, and goals of inquiry.”
from Diigo http://ift.tt/2mOzpW3
via IFTTT


Another great read this week – Markham (2013) suggests that ‘data’ acts as a frame through which we interpret and make sense of our social world. However, she adds, “the interesting thing about frames, as social psychologist Goffman (1974) noted, is that they draw our attention to certain things and obscure other things.” Through persistent framings, particular ways of interpreting the world are naturalised, and the frame itself becomes invisible. Such is the case with ‘data’, a frame Markham views as having transformed our sense of what it means to be in the 21st century, when experience is digitalised and “collapsed into collectable data points”. These data points are, however, abstractions, which can be reductive, obscuring rather than revealing:

“From a qualitative perspective, ‘data’ poorly capture the sensation of a conversation or a moment in context.”

Certainly, this is reflected in my experience of the Tweet Archivist data analysis of our tweetorial last week. As such, I particularly enjoyed Markham’s call to embrace complexity, and to reframe the practice of inquiry as one of “sense–making rather than discovering or finding or attempting to classify in a reductionist sense.”

“the complexity of twenty–first century culture requires finding perspectives that challenge taken for granted methods for studying the social in a digital epoch. Contributing to an infrastructure of knowledge that does not reduce or simplify experience requires us to acknowledge and scrutinize, as part of our methods, the ways in which data is being generated (we are generating data) in ways we may not notice. Changing the frame from one that is overly–focused on ‘data’ can help us explore the ways our research exists as a continual, dialogic, messy, entangled, and inventive process when it occurs outside the walls of the academy, the covers of books, and the written word.” 

Markham also writes of another strategy for reframing research, which is as a generative process achieved through collaborative remix. Here, the focus is on interpretation and sense-making rather than on findings per se:

“Using remix as a lens for thinking about research is intended to destabilize both the process and products of inquiry, but not toward the end of chaos or “anything goes.” The idea of remix simply refocuses energy toward meaning versus method; engagement versus objectivity; interpretation versus findings; argument versus explanation. In all of this, data is certainly available, present, and important, but it takes a secondary role to sense–making.”

I thought it apt to include comment on that part of Markham’s paper here, owing to the place of remix in our last block on community cultures, but also because in a sense it speaks to ‘new’, more experimental forms of authorship, which have been a focus of the course.

Lifestream, Diigo: A critical reflection on Big Data: Considering APIs, researchers and tools as data makers | Vis | First Monday

“This paper looks at how data is ‘made’, by whom and how. Rather than assuming data already exists ‘out there’, waiting to simply be recovered and turned into findings, the paper examines how data is co–produced through dynamic research intersections. A particular focus is the intersections between the application programming interface (API), the researcher collecting the data as well as the tools used to process it. In light of this, this paper offers three new ways to define and think about Big Data and proposes a series of practical suggestions for making data.”
from Diigo http://ift.tt/2aFY3FC
via IFTTT


A few points from this paper seem relevant this week.

  1. The tools we use when researching ‘limit the possibilities of the data that can be seen and made. Tools then take on a kind of data-making agency.’ I wonder what the influence of the Tweet Archivist API is on my sense-making of our data (see the sketch after this list).
  2. Data are always selected in particular ways; some data are made more visible than others, and the most visible doesn’t necessarily align with, or take into account, what was most valued by and meaningful to users. ‘It is important to remember that what you see is framed by what you are able to see or indeed want to see from within a specific ideological framework.’ What did we value most in our tweetorial (obviously different things for different folks)? We still need to construct research questions that focus on the things most important to us, even if the data are less readily available.
  3. ‘Visibility can be instrumentalised in different ways, depending on the interests of those seeking to make something visible. Visibility can be useful as a means of control, it can be commercially exploited, or it can be sold to others who can exploit it in turn.’ How are we exploiting visibility in education?
  4. The monetisation – or making valuable in other ways – of data makes the data itself unreliable. Helen suggests this in her blog post, where she muses that perhaps if she’d known what aspects of our behaviour in the tweetorial were being analysed, she would have ‘gamed it’. 
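On point 1, here is a minimal sketch of what a tool’s ‘data-making agency’ looks like in practice. Everything below – the field names, the values, the dashboard function – is hypothetical; it is not Tweet Archivist’s actual workings, only an illustration of how a tool’s choice of what to keep already ‘makes’ the data before analysis begins.

```python
# A hypothetical, simplified illustration of a tool's "data-making agency":
# whichever fields the tool keeps become the only data we can "see".
# Field names and values are invented; this is not Tweet Archivist's actual API.

raw_tweet = {
    "id": "840000000000000001",
    "text": "@james Is learning analytics just marking by another name? #mscedc",
    "user": "renee_f",
    "created_at": "2017-03-10T06:42:00Z",
    "in_reply_to": "839999999999999999",  # conversational context
    "entities": {"mentions": ["james"], "hashtags": ["mscedc"]},
}

def dashboard_view(tweet):
    """What a counting-oriented analytics tool might retain: counts, not context."""
    return {
        "user": tweet["user"],
        "mention_count": len(tweet["entities"]["mentions"]),
        "hashtag_count": len(tweet["entities"]["hashtags"]),
    }

# Everything the dashboard drops (the reply chain, the wording, the time of day)
# is exactly what might have been most meaningful to participants.
print(dashboard_view(raw_tweet))
```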

Lifestream, Pocket, Persuading Algorithms With an AI Nudge

Excerpt:

Readers of /r/worldnews on reddit often report tabloid news to the volunteer moderators, asking them to ban tabloids for their sensationalized articles. Embellished stories catch people’s eyes, attract controversy, and get noticed by reddit’s ranking algorithms, which spread them even further.

via Pocket http://ift.tt/2k0DN3H

Full results


In this experiment, tabloid news articles on /r/worldnews were randomly assigned one of the following (a simple sketch of this kind of assignment follows the figures below):

  1. no sticky comment (control)
  2. sticky comment encouraging fact-checking
  3. sticky comment encouraging fact-checking and downvoting of unreliable articles.
Figure 1: sticky comment encouraging fact-checking. Matias (2017).
Figure 2: sticky comment encouraging fact-checking and downvoting. Matias (2017).
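Purely to illustrate the design, here is a minimal sketch of this kind of random assignment. The condition labels and submission IDs are mine, and the actual study’s randomisation procedure may well have been more careful than this; it is not Matias’s experiment code.

```python
import random

# Hypothetical labels for the three experimental arms.
CONDITIONS = ["no_sticky_control", "sticky_factcheck", "sticky_factcheck_downvote"]

def assign_condition(rng=random):
    """Randomly assign a newly detected tabloid submission to one arm."""
    return rng.choice(CONDITIONS)

# Example: assign a batch of (invented) submission IDs as they appear.
assignments = {sid: assign_condition() for sid in ["t3_aaa111", "t3_bbb222", "t3_ccc333"]}
print(assignments)
```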

Results

Changes in human behaviour

Both sticky comments increased the chance that a comment on a tabloid article would contain links (comments were 1.28% more likely to include at least one link under the sticky encouraging scepticism, and 1.47% more likely under the sticky encouraging scepticism and downvoting). These figures describe the effect on individual comments – the increase in evidence-bearing comments per post is much higher:

“Within discussions of tabloid submissions on r/worldnews, encouraging skeptical links increases the incidence rate of link-bearing comments by 201% on average, and the sticky encouraging skepticism and discerning downvotes increases the incidence rate by 203% on average.”
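To clarify what an ‘incidence rate’ of link-bearing comments means here, a naive sketch with invented data follows; Matias’s actual analysis is a proper statistical model, not this simple count.

```python
import re

# Invented comment threads: each submission maps to a list of comment texts.
threads = {
    "t3_aaa111": ["pure speculation", "source: http://example.org/report", "lol"],
    "t3_bbb222": ["this headline is clearly exaggerated", "agreed"],
}

URL_PATTERN = re.compile(r"https?://\S+")

def link_bearing_incidence(threads):
    """Average number of link-bearing (evidence-bearing) comments per submission."""
    counts = [sum(bool(URL_PATTERN.search(comment)) for comment in comments)
              for comments in threads.values()]
    return sum(counts) / len(counts)

print(link_bearing_incidence(threads))  # 0.5 link-bearing comments per submission here
```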

Changes in algorithmic behaviour

Reddit posts receive an algorithmic ‘score’, which influences whether the post is promoted or not.
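For context on what this ‘score’ might look like: Reddit’s ranking code was for a long time open source, and the often-cited ‘hot’ formula from that codebase combines net votes with submission time. The sketch below reproduces that classic formula as an illustration only; it is almost certainly not the exact version running during the experiment, since the algorithm changed partway through (as discussed below).

```python
from datetime import datetime, timezone
from math import log10

# Approximation of the classic "hot" formula from Reddit's formerly open-source
# codebase. The live algorithm has since changed (including midway through this
# experiment), so treat this purely as an illustration of how such a score works.
EPOCH = datetime(2005, 12, 8, 7, 46, 43, tzinfo=timezone.utc)

def hot_score(ups, downs, posted):
    score = ups - downs
    order = log10(max(abs(score), 1))           # vote totals count logarithmically
    sign = 1 if score > 0 else -1 if score < 0 else 0
    seconds = (posted - EPOCH).total_seconds()  # newer posts get a time bonus
    return round(sign * order + seconds / 45000, 7)

# A heavily downvoted tabloid post can rank below a newer, modestly upvoted one.
print(hot_score(300, 250, datetime(2017, 2, 1, tzinfo=timezone.utc)))
print(hot_score(40, 5, datetime(2017, 2, 2, tzinfo=timezone.utc)))
```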

“On average, sticky comments encouraging fact-checking caused tabloid submissions to receive [scores] 50.9% lower than submissions with no sticky comment, an effect that is statistically-significant. Where sticky comments include an added encouragement to downvote, I did not find a statistically-significant effect.”

Why does this matter? And what does it have to do with learning analytics?

The experiment illustrates a complex entanglement of human and material agency. The author of the study had predicted that the sticky encouraging fact-checking would increase the algorithmic score of the associated posts, on the expectation that Reddit’s score and HOT algorithm would respond to the changed commenting activity, or that other behaviours which feed into the score and HOT algorithm would themselves be changed by the changes in commenting behaviour. It was also predicted that including the encouragement to downvote would limit this effect on algorithmic scoring. However, mid-experiment, Reddit updated its algorithm.

“Before the algorithm change, the effect of our sticky comments was exactly as we initially expected: encouraging fact-checking caused a 1111.6% increase in the score of a tabloid submission compared to no sticky comment. Furthermore, encouraging downvoting did dampen that effect, with the second sticky causing only a 453.26% increase in the score of a comment after 13,000 minutes.”

The observed outcomes show the difficulty of predicting both human and algorithmic responses, the dramatic impact that changes to an algorithm can have on outcomes, and the need to monitor those outcomes to ensure that desired effects are maintained.

“Overall, this finding reminds us that in complex socio-technical systems like platforms, algorithms and behavior can change in ways that completely overturn patterns of behavior that have been established experimentally.”

Connecting this to learning analytics rather than to algorithms more generally: when we use algorithms to ‘enhance’ education, particularly through ‘nudges’ aimed at improving student success, we need to be cognisant that behaviours don’t always change in the ways expected, and that the outcomes of behavioural changes can be ‘overwritten’, or cancelled out, by algorithmic design.