Much of our work on Code Acts in Education over the past few years has focused on the work that algorithms do (and what they are made to do and by who) in relation to learning, policy and practice. But the work of algorithms extends far beyond education of course.
Ben Williamson, while acknowledging the influence of algorithms on his own search results, performed inurl: searches for algorithms within major UK news websites. The short form results (all quoted):
The Guardian‘s editorial line is to treat the algorithm as a governor;
The Telegraph treats the algorithm as a useful scientist whose expertise is helping society;
The Sun is largely disinterested in algorithms in terms of newsworthiness;
the editorial line of The Mirror is to treat algorithms in terms of brainy expertise;
Algorithms as problem-solvers might be one way of categorizing its [The Daily Mail‘s] editorial line*
*Based on an initial search. An hour later Williamson repeated the search, and received different results. “The Daily Mail is certainly not disinterested in algorithms–the result returns are pretty high compared to the tabloids, and the Mail does frequently re-post scientific content from sources like The Conversation–but by no means does it adopt the kind of critical line found in The Guardian.”
My concerns about algorithms are related to governance, and I read The Guardian. Do I read The Guardian because it (more than the other publications given) matches my worldview, or do I think the way I do because of the publications (like The Guardian) that I read? Or, was I initially attracted to The Guardian because of its similarity to my worldview, but now my worldview is influenced by the fact that I read The Guardian, and its initial similarity to my worldview perhaps allows some things to slip beneath the questioning of my ‘truth’ radar?
Fascinating work – makes me wonder, is there a website that presents diverse viewpoints on topics and events using inurl: searches? i.e. monitors news sites, feeding content from diverse sources, organised by topic or event, using humans to add new topics/events as events occur? And with an editorial team to summarise the editorial positions of the publications represented on specific topics? Would such a site help combat political polarisation and divisiveness?
Also.. how can we teach ‘algorithmic literacy’? Can we? When do we start? Would a site which unpacked what this could look like, and offered teaching ideas and a place for discussion be of use? [Assignment ideas..]
This is a talk I presented at the Nordic Educational Research Association conference at Aalborg University, Copenhagen, on 23 March 2017.
Education is currently being reimagined for the future. In 2016, the online educational technology magazine Bright featured a series of artistic visions of the future of education. One of them, by the artist Tim Beckhardt, imagined a vast new ‘Ocunet’ system.
I found this post after reading Knox’s (2014) post on interpreting analytics in the same blog space. What would we call that? Searching laterally? Something which was, at the time, really frustrating in DEGC was that we were always given links to Journal home pages rather than to the specific article we were reading. While I seem to recall this being connected to copyright and appropriate practice, it was frustrating because none of the links were set to open in a new window/tab by default, so unless one right clicked and opened a new window/tab, one then had to go back to the original page to find out which issue one was looking for.. but I’ve subsequently reflected (repeatedly!) on how it made me much more aware of the types of ‘publications’ and their respective content, and perhaps resultantly, I think my ‘lateral searching’ has increased. It’s not a new practice, of course, but an addictive one nonetheless, and it’s always good to find a ‘treasure trove’ of good reads.
I’m getting tangential, though – what caught my eye about this post, in particular, was the focus on ‘imaginaries’, and the ways in which such ‘imaginaries’, or fictions, play a role in the creation of future reality. Williamson writes,
..what I’m trying to suggest here is that new ways of imagining education through big data appear to mean that such practices of algorithmic governance could emerge, with various actions of schools, teachers and students all subjected to data-based forms of surveillance acted upon via computer systems.
Importantly too, imaginaries don’t always remain imaginary. Sheila Jasanoff has described ‘sociotechnical imaginaries’ as models of the social and technical future that might be realized and materialized through technical invention. Imaginaries can originate in the visions of single individuals or small groups, she argues, but gather momentum through exercises of power to enter into the material conditions and practices of social life. So in this sense, sociotechnical imaginaries can be understood as catalysts for the material conditions in which we may live and learn.
The post has a lot more in it, focusing on how the imaginaries of ‘education data science’ combined with affective computing and cognitive computing are leading to a new kind of ‘algorithmic governance’ within education. Frightening stuff, to be frank.
What I’m really interested in is the role of these ‘imaginaries’ though: how do fictions, and, frequently, corporate fictions, work their influence? Which previous imaginaries, captured in science fiction, can we trace – along with their reception over time – to present day materialities?
And, why are ‘the people’ so passive? Why isn’t there shouting about imaginaries being presented as inevitable? Why isn’t there protest? A rant: “Uh – you want to put a camera on my kid’s head, to tell me how she’s feeling? Have you thought about asking her? You want to produce data for parents? How about as a society ‘just’ recognising the value of non-working lives and giving people enough time to spend with their kids while they’re trying to pay rent or a mortgage?”
It would make an interesting study – perhaps too large for our EDC final assignment, but I’m wondering how it could be scaled back.
Renee, thanks for this – and for the alert to the very-well-hidden hyperlink. I wouldn’t have found it without your second comment!
The graphs risk masking something acknowledged in the accompanying text, namely that “the annual number of single-author, non-review papers themselves, as tracked since 1981, has remained largely consistent in the course of the three decades”. The declining percentage share reflects the increase in multi-author pieces, not so much a decline in the single-authored pieces per se. Clearly a complex picture is in view.
Also, I’m curious that there is no category for ‘humanities’: presumably it’s incorporated within ‘social sciences’. I’d imagine, within that category, there are lots of sub-sectors, each with their own practices, circulations and markets. Different assemblages, reacting to and with digital cultures in differing ways. Great to have some data-led insight on it, and inviting of more. Many thanks!
from Comments for Matthew’s EDC blog http://ift.tt/2obZwrK
When I spoke at the London School of Economics a couple years ago, part of my talk was an extended criticism of the use of models in learning design and analysis. “The real issue isn’t algorithms, it’s models. Models are what you get when you feed data to an algorithm and ask it to make predictions. As (Cathy) O’Neil puts it, ‘Models are opinions embedded in mathematics.’” This article is an extended discussion of the problem stated much more cogently than my presentation. “It’s E Pluribus Unum reversed: models make many out of one, pigeonholing each of us as members of groups about whom generalizations — often punitive ones (such as variable pricing) can be made.”
My additions (i.e. from my reading of the article):
What are ‘weapons of math destruction’?
Statistical models that:
are opaque to their subjects
are harmful to subjects’ interests
grow exponentially to run at large scale
What’s wrong with these models that leads to them being so destructive?
1. lack of feedback and tuning
2. the training data is biased. For example,
The picture of a future successful Ivy League student or loan repayer is painted using data-points from the admittedly biased history of the institutions
3. “the bias gets the credibility of seeming objectivity”
Why does it matter?
It’s a grim picture of the future: WMD makers and SEO experts locked in an endless arms-race to tweak their models to game one another, and all the rest of us being subjected to automated caprice or paying ransom to escape it (for now). In that future, we’re all the product, not the customer (much less the citizen).
Inside this picture, the cost of ‘cleaning up’ the negative externalities that result from sloppy statistical models is greater than the savings that companies make through maintaining the models. Yet we pay for the cleaning up (individually, collectively), while those pushing the weak statistical models save.
The other loss is, of course, the potential: algorithms could, with good statistical modelling, serve societal needs, and those in need within society.
The line of argument is hard to argue with – but one does have to ask, is ‘sloppy’ the right term? Is it just sloppiness? At what point does such ‘sloppiness’ become culpable? Or, malicious disregard?
In the final tributary, I investigated the entanglement of human and technical agency, driven by wider concerns about the governance of society and how ‘citizens’ can maintain a voice in that governance when so much influence is exerted through commercial and technical agency. Divisions in (and the co-evolution of) agency were explored through discussion of Matias’ (2017) research into algorithmic nudges with /r/worldnews (and in these notes on a Tweet), and developed based on a blog post in which Rahwan (2016) writes of “embedding judgement of society, as a whole, in the algorithmic governance of outcomes.” A peer (Cathy) helped me to connect this with predictive analysis ‘nudges’ in education, where I similarly see a need for collective agency to be used to integrate human values and ensure accountability. This line of thinking also links to ethical concerns about new technologies raised in our cybercultures block.
Warning – my analysis became somewhat unwieldy in length – for the fast read skip to the summary.
In week 9 of #mscedc our course leaders hosted a two-day ‘tweetorial’. Our activity during this period (and indeed, all our activity on Twitter, including at other times) was analysed by underlying algorithms – but what do the visualisations and numbers really reveal? How useful are they? And to whom?
How has the Twitter archive represented our Tweetorial?
Tweet Archivist is a freemium Twitter analytics service. On the website, the commercial usefulness of the type of analysis offered is outlined (my emphasis):
We also analyze the archive for you, bubbling up information like top users, words, urls, hashtags and more. This allows you to find the influencers, measure campaign effectiveness, determine sentiment and view the most popular images associated with the tweets for this term. In addition, we do a language breakdown and a volume analysis based on number of tweets per day.
However, the usefulness of the analysis is also asserted more generally, with the promise of ‘valuable insight into trends and behaviours’. Are these ‘behaviours’ relevant to education?
The narrative of Tweet Archivist is one of objectivity and the ability to ‘make visible the invisible’, which is common to the narrative frequently encountered with regard to learning analytics (Knox, 2014).
The data is presented via drop-down menus/links surrounding the most recent activity for the hashtag followed. The free version of the site does not enable isolation of date ranges, and seems to be limited with regard to the time frame for which it can retrieve particular days’ tweets (which may be related to Twitter’s own policies). As such, for the purpose of this interpretation, I will be using analytics related to the second day of the Tweetorial (Friday, 17 March) as well as the ongoing archive data, which runs from 5 March – present.
What do these visualisations, summaries and snapshots say about what happened during our Tweetorial?
A basic presentation of the number of tweets and impressions, which refers to ‘the total number of times tweets were delivered to timelines with this search or hashtag in this archive’.
These figures are presented without ‘judgement’; at this point there is no indication of how the numbers compare with other numbers. However, the underlying sense seems to be that more activity equals more value.
Here those people who used the hashtag #mscedc are ranked according to the number of tweets they sent containing the hashtag during each period. The labelling as ‘top’ users could be seen to imply value, equivalent to ‘best’ when in fact it refers only to ‘most’. There is no link between the analysis and what might have been valued in the tweetorial – either explicitly (through ‘likes’ or through content comments, for example) or implicitly, but without a public signal.
It may be worth noting that those who used the hashtag most during the tweetorial are also, primarily, those that have used the hashtag most on Twitter over the longer period.
Colin is an exception – and in his own reflection on the event noted that he did not expect to be amongst the highest tweeters during the tweetorial.
The word clouds produced by Tweet Archivist are indicative of the highest frequency words during each time period. As the number of tweets recorded on the second day of the tweetorial is equivalent to 13.19% of the tweets overall, it is not surprising that ‘algorithms’, ‘data’ and ‘analytics’ feature in both word clouds. ‘Learning’ was only used four times during the second day of the tweetorial. Is this significant? Why isn’t ‘LA’ there? Without contextual information, little more can be understood than the principal topics discussed, and even this is misleading, as jokes and asides containing ‘cheese’ appear to be more significant to the conversation than ‘students’. The topics are further obfuscated by the inclusion of words which don’t convey much meaning independently (I’m, their, yes, you’re, got).
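To make the point about word clouds concrete, here is a minimal Python sketch of the kind of frequency count a word cloud is presumably built on. The sample tweets and the stop-word list are invented for illustration, and I do not know how Tweet Archivist actually tokenises or filters text, so this is an assumption rather than a description of the service.

```python
import re
from collections import Counter

# Hypothetical sample tweets, invented for illustration only.
SAMPLE_TWEETS = [
    "I'm not sure algorithms can tell cheese from data #mscedc",
    "Yes, cheese again! Their data says we love cheese #mscedc",
    "Do students own the data that analytics collect? #mscedc",
]

# A small, assumed stop-word list; a real service may use none, or a longer one.
STOP_WORDS = frozenset({
    "i'm", "their", "yes", "the", "we", "do", "that",
    "not", "can", "from", "says", "sure", "tell", "again",
})


def word_counts(tweets, stop_words=frozenset()):
    """Count lower-cased words, skipping hashtags and any stop words."""
    counts = Counter()
    for tweet in tweets:
        # Words may contain apostrophes; a leading '#' marks a hashtag token.
        for token in re.findall(r"#?[a-z']+", tweet.lower()):
            if token.startswith("#") or token in stop_words:
                continue
            counts[token] += 1
    return counts


raw = word_counts(SAMPLE_TWEETS)                   # filler words included
filtered = word_counts(SAMPLE_TWEETS, STOP_WORDS)  # filler words removed
```

Even with the filler words removed, the count still ranks a running joke above ‘students’, because frequency says nothing about the role a word played in the conversation.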
The ‘top URLs’ illustrates how many times sites were linked to. Since the lowest number is 1 (once) for each URL provided for both the short, one-day data and the longer, 22 days’ data, it is difficult to know how Tweet Archivist decides what to include. Many more links have been shared ‘once’ that could have been included.
Many of the URLs linked to are simply other tweets (4 of 12 for 17/03/17 and 2 of 25 for 5-27/3/17). Four of the links for the longer period are to the EDC course blog or individual course blogs. Other URLs are obscured (at a glance – they are still hyperlinked) as they use Tiny URLs or Google shortcodes. One would think that these URLs would be indicative of what is valued enough by students/tutors to share. However, as noted, the inclusion of those shared just once, when there were many more links shared just once, makes it unclear what Tweet Archivist included in the measure.
Source of tweet
Students were advised to use TweetDeck for the tweetorial. It appears (though cannot be confirmed) that some students who would ordinarily have used Twitter Web Client switched to using TweetDeck as instructed. However, the actual figures represent not users, but Tweets sent from particular sources/platforms. All we really know is that, on the second day of the tweetorial, a larger proportion of Tweets were sent from TweetDeck. Is this because more students used TweetDeck, or because those students that used TweetDeck were more prolific in their tweeting than those using other platforms/devices?
Similarly, in the (slightly) longer term, more tweets have been sent from TweetDeck than other sources. Do students who are inclined to Tweet more have a preference for TweetDeck, does TweetDeck enable increased participation through Twitter, or do more students use TweetDeck than other sources? If we are more inclined to use TweetDeck and Twitter Web Client than iPhone, Android and Hootsuite, is anything suggested about our mobility or lack thereof?
More students Tweet from iPhones than Android. Does this suggest more students have and use iPhones, or that iPhone users are more prolific tweeters than Android users? Who is this information useful to? App developers? iPhone marketing? It is unclear.
As far as I know, all conversations for the tweetorial were in English. However, at one point it was suggested Eli was speaking Swedish, despite Twitter’s translation tool not being able to translate it. These metrics don’t reveal much, as we’re communicating monolingually and contrary indications are erroneous. The limitations of the tools are revealed, and data suggests English is the main language spoken by participants. However, this last point is a generalisation – we can only surmise that English is the lingua franca based on the given ‘evidence’, not that it is students’ preferred language.
Volume over time
The longer term view shows that, as would be expected, the volume of Tweets spiked during the tweetorial.
To gain more detailed information about Friday 17 March, I used Mozdeh to extract a timeline. The information could not be obtained using the freemium version of Tweet Archivist.
The graph shows that a higher volume of Tweets were sent at 10am (GMT), with smaller peaks at 12:00, 3pm, 5pm and 8pm. From this style of presentation, it is unclear whether these peaks were created by more prolific tweeting by individuals, or if collectively more people were online at these times. It is also unclear whether participation was influenced by geographical location and time zone. Were particular users active over the entire day, others for short bursts, or is there a different explanation? The patterns behind active periods are not revealed in any detail here. Looking at specific users could reveal something more about the conditions the data is produced in, but such analytics are not provided by Tweet Archivist.
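The ambiguity in the volume graph can be illustrated with a small Python sketch that reports, for each hour, both the number of tweets and the number of distinct people tweeting. The usernames and timestamps are hypothetical, and this is not how Mozdeh or Tweet Archivist compute their timelines; it only shows that a peak in one measure need not be a peak in the other.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (user, ISO timestamp) pairs, invented for illustration only.
TWEETS = [
    ("user_a", "2017-03-17T10:02:00"),
    ("user_a", "2017-03-17T10:15:00"),
    ("user_a", "2017-03-17T10:40:00"),
    ("user_b", "2017-03-17T10:41:00"),
    ("user_b", "2017-03-17T15:05:00"),
    ("user_c", "2017-03-17T15:20:00"),
    ("user_d", "2017-03-17T15:45:00"),
]


def hourly_profile(tweets):
    """Return {hour: (tweet_count, distinct_user_count)} in hour order."""
    users_by_hour = defaultdict(list)
    for user, stamp in tweets:
        hour = datetime.fromisoformat(stamp).hour
        users_by_hour[hour].append(user)
    return {
        hour: (len(users), len(set(users)))
        for hour, users in sorted(users_by_hour.items())
    }


profile = hourly_profile(TWEETS)
# In this invented data, 10:00 has the most tweets but 15:00 has the most people.
```

A volume-only graph would show 10:00 as the busiest hour here, even though more participants were present at 15:00.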
Tweet Archivist counts the number of mentions users get on tweets containing the #mscedc hashtag. What does this reveal? Presumably, receiving mentions, for example through replies, or because more people choose to involve you in their conversations, is viewed as a positive measurement – commercially it might be indicative of how engaged you are with clients, for instance. However, since question numbers were not used within our tweetorial, mentioning other users (through reply) was the only way to keep messages organised. Because of this, more mentions may simply be related to starting conversation threads/posing questions (i.e. James during the Friday of the Tweetorial), and responding to questions earlier, which could result in gaining mentions in subsequent replies.
There does not seem to be a correlation between the number of tweets sent and the number of mentions received. Without completing a content analysis, it is impossible to know if mentions were received in response to questions posed (as they presumably were in the tutorial by James), whether the mentions added information to points made or challenged/complimented/complained about them. Further, since Tweet Archivist reports on it but not on the reverse, there seems to be a perceived value in receiving mentions, but not in giving them. Or, is the data just more readily available for the former? I would have thought such ‘data’ to be equally accessible.
Tweet Archivist does not provide network mapping to see the tweets between individual users. I openly admit to not knowing how to control the parameters within Mozdeh, but this is the network map produced for 17/03/17:
This network map suggests (I think – I am not experienced at reading such diagrams) that the most communication on Friday 17 March connected James, Philip, Eli, Nigel, Daniel, and Colin. It suggests my interactions were greatest with Colin, Eli and Nigel. This is based on being mentioned in these participants’ tweets, but it is deceptive as just two more mentions results in a considerably thicker connection line (connecting me to Colin versus me to Daniel, based on their mentions of me in Tweets).
It’s also important to look at how the choice of who to include changes the visualisation. In the presented map, ‘top’ users are those who had the most tweets to them. How would changing the criteria for ‘top’ influence who is included and excluded from the visualisation? For instance, ‘top’ could be determined by tweets from or tweets from and to, or be based on content analysis. If used for assessment purposes, or to influence algorithmic ‘nudges’ in education, using basic quantitative measures may encourage students to ‘game’ the system and interfere with actual engagement while simultaneously producing a picture which is perceived to be of it.
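As a rough illustration of how much the definition of ‘top’ matters, the Python sketch below ranks the same set of mentions three ways: by tweets to a user, by tweets from a user, and by both combined. The names and mention pairs are invented, and the `top_users` function is my own hypothetical helper, not a feature of Mozdeh or Tweet Archivist.

```python
from collections import Counter

# Hypothetical (sender, mentioned_user) pairs, invented for illustration only.
MENTIONS = [
    ("ann", "ben"), ("cat", "ben"), ("dan", "ben"),
    ("ann", "cat"), ("ann", "dan"), ("ann", "eve"), ("ann", "ben"),
    ("eve", "cat"),
]


def top_users(mentions, criterion="to", n=2):
    """Rank users by mentions received ('to'), sent ('from') or 'both'."""
    if criterion not in ("to", "from", "both"):
        raise ValueError("criterion must be 'to', 'from' or 'both'")
    scores = Counter()
    for sender, recipient in mentions:
        if criterion in ("from", "both"):
            scores[sender] += 1
        if criterion in ("to", "both"):
            scores[recipient] += 1
    # Sort by score (descending), then by name, so ties resolve predictably.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]


by_received = top_users(MENTIONS, "to")
by_sent = top_users(MENTIONS, "from")
by_both = top_users(MENTIONS, "both")
```

In this invented data the most active sender does not appear at all when ‘top’ means tweets received, so who is drawn on the map depends on a choice the visualisation does not show.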
Analysis of hashtags primarily reveals that as a group we are not very inclined to use hashtags, and that when we do, the hashtag is frequently not directly connected to the topic we are discussing. Course tags seem to be most used.
Note that 11a needs cross-checking as it does not even include #mscedc. The included hashtags seem to be faulty. If it is deliberately excluded from the set, why is it included in figure 11b?
Just one image was shared on 17/03/17, and pertinently, it was an image of text. Could this be construed as a subversive act against the ‘infrastructure’ hosting our tweetorial? Pushing back against the 140 character limit in an attempt to engage more profoundly with the subject matter?
Or is it a subversive act against algorithmic surveillance? In using an image, does Anne’s tweet elude automatic analysis and typical data mining approaches?
The follower counts reveal that we have attracted some people who happen to have a lot of followers to our #mscedc discussions. They do not indicate how involved these people are in our discussions. Nor do they indicate what the value is in having people with more (or fewer) followers involved in your discussion. For some students, an increased potential of being seen when participating online, with one’s emergent identity, could actually be perceived as a threat. In this case, it might be desirable to have people with fewer followers involved in a conversation. Alternatively, students may see those with many followers as gateways to more resources, and more people who could potentially answer their questions or connect them with ideas. How does this work in reality, though? For many of our conversations we replied to previous tweets on the same thread or question. As a result (as I understand it), only people who follow both the tweeter and the tweeter(s) they replied to would see the message. The visibility of any tweet would further be affected by Twitter’s algorithmic filtering. Yet, the number of followers, and the associated scale and reach that having a large number of followers is perceived to entail is valued, hence this metric is included in the analysis.
Do these visualisations, summaries and snapshots accurately represent the ways you perceived the Tweetorial to unfold, as well as your own contributions?
Eynon (who is also referred to in Anne’s image tweet above) reminds us that while we can count all kinds of things, the numbers alone are not a measurement of the value people place on them (2013). The same is true of these pictures of our tweetorial. For me, on the second day of the tweetorial the significant discussions (or comments that triggered ‘significant thinking’) were around:
ownership of data, and rights and responsibilities related to how that data is used (and by whom);
the impact (and interference) of algorithms on the research process;
ways in which perceptions and values are algorithmically shaped.
Neither these topics, nor my/our thinking about them are conveyed clearly by the focus on most used words. The word cloud is reductionist, and to me, does not seem reflective of the thought that went into the discussions.
In addition to ‘academic conversations’, I enjoyed the supportive, sometimes amusing banter of my peers. The record of cheeses, rollerskates or tales of spam may make it into the archive as decontextualised words, but without their context they are meaningless, perhaps even making us all look a little deranged!
Positives of the experience aside, there was a point at which ‘life interrupted’ and I had to disconnect. This also happened on the first day, where I had back to back teaching and could not engage with questions/ideas. This experience remains invisible within algorithmic reporting.
It is perhaps most useful to look at the visualisations to see the world in which they are produced, rather than the events they report on. In this sense we would be following Knox’s (2014) advice, in seeking not to see “the reality ‘behind’ the image, but how and why the image itself was produced”. In this case, it seems that many of the measurements are produced with a commercial/advertising model of getting maximum eyes, through the scale and reach of participants. For these purposes, it is important how many times a key (or brand) word is used, and how many people might see it being used, to establish brand identity/identification with brand. Is this a model we can apply to learning and more broadly, to education?
What might be the educational value or limitations of these kinds of visualisations and summaries, and how do they relate to the ‘learning’ that might have taken place during the ‘Tweetorial’?
Previously I’ve blogged about how network mapping could be used to help teachers monitor online group work and ensure all participants are ‘involved’. However, the automated version of this seems too rudimentary. Counting instances of interaction or reference to other learners not only fails to identify whether interaction is meaningful or not, but it is easily ‘gamed’, and has the potential to alter behaviours based not on engagement but on giving the illusion of engagement.
Similarly, reporting of most used words does not capture the complexities of learning, nor advise teachers about whether students have used terms appropriately and meaningfully to talk about the concepts behind them. This is not to say that there is no use for such visualisations, but, I feel, it is necessary to use such summaries in consultation with students or in concert with ethnographic approaches so that whatever is captured can be interpreted meaningfully.
Measurements of periods in which students are most active could help with the provision of support, within reason. For instance, my students tend to use our LMS most in the very early hours of the morning. It’s practical to work with this information by ensuring IT capacity during their active times; less so to provide teaching support.
I am less able to find a usefulness for measurements such as the influencer index. Within education, I fear such measures would be more likely to reinforce existing inequalities than improve education.