In the final tributary, I investigated the entanglement of human and technical agency, driven by wider concerns about the governance of society and how ‘citizens’ can maintain a voice in that governance when so much influence is exerted through commercial and technical agency. Divisions in (and the co-evolution of) agency were explored through discussion of Matias’ (2017) research into algorithmic nudges with /r/worldnews (and in these notes on a Tweet), and developed based on a blog post in which Rahwan (2016) writes of “embedding judgement of society, as a whole, in the algorithmic governance of outcomes.” A peer (Cathy) helped me to connect this with predictive analysis ‘nudges’ in education, where I similarly see a need for collective agency to be used to integrate human values and ensure accountability. This line of thinking also links to ethical concerns about new technologies raised in our cybercultures block.
Warning – my analysis became somewhat unwieldy in length – for the fast read skip to the summary.
In week 9 of #mscedc our course leaders hosted a two-day ‘tweetorial’. Our activity during this period (and indeed, all our activity on Twitter, including at other times) was analysed by underlying algorithms – but what do the visualisations and numbers really reveal? How useful are they? And to whom?
How has the Twitter archive represented our Tweetorial?
Tweet Archivist is a freemium Twitter analytics service. On the website, the commercial usefulness of the type of analysis offered is outlined (my emphasis):
We also analyze the archive for you, bubbling up information like top users, words, urls, hashtags and more. This allows you to find the influencers, measure campaign effectiveness, determine sentiment and view the most popular images associated with the tweets for this term. In addition, we do a language breakdown and a volume analysis based on number of tweets per day.
However, the usefulness of the analysis is also assured more generally, with the promise of ‘valuable insight into trends and behaviours’. Are these ‘behaviours’ relevant to education?
The narrative of Tweet Archivist is one of objectivity and the ability to ‘make visible the invisible’, a narrative frequently encountered in discussions of learning analytics (Knox, 2014).
The data is presented via drop-down menus and links surrounding the most recent activity for the hashtag followed. The free version of the site does not enable isolation of date ranges, and seems to be limited with regard to the time frame for which it can retrieve particular days’ tweets (which may be related to Twitter’s own policies). As such, for the purpose of this interpretation, I will be using analytics related to the second day of the Tweetorial (Friday, 17 March) as well as the ongoing archive data, which runs from 5 March – present.
What do these visualisations, summaries and snapshots say about what happened during our Tweetorial?
A basic presentation of the number of tweets and impressions, which refers to ‘the total number of times tweets were delivered to timelines with this search or hashtag in this archive’.
These figures are presented without ‘judgement’; at this point there is no indication of how the numbers compare with other numbers. However, the underlying sense seems to be that more activity equals more value.
Here those people who used the hashtag #mscedc are ranked according to the number of tweets they sent containing the hashtag during each period. The labelling as ‘top’ users could be seen to imply value, equivalent to ‘best’ when in fact it refers only to ‘most’. There is no link between the analysis and what might have been valued in the tweetorial – either explicitly (through ‘likes’ or through content comments, for example) or implicitly, but without a public signal.
It may be worth noting that those who used the hashtag most during the tweetorial are also, primarily, those that have used the hashtag most on Twitter over the longer period.
Colin is an exception – and in his own reflection on the event noted that he did not expect to be amongst the highest tweeters during the tweetorial.
The word clouds produced by Tweet Archivist are indicative of the highest frequency words during each time period. As the number of tweets recorded on the second day of the tweetorial is equivalent to 13.19% of the tweets overall, it is not surprising that ‘algorithms’, ‘data’ and ‘analytics’ feature in both word clouds. ‘Learning’ was only used four times during the second day of the tweetorial. Is this significant? Why isn’t ‘LA’ there? Without contextual information, little more can be understood than the principal topics discussed, and even this is misleading, as jokes and asides containing ‘cheese’ appear to be more significant to the conversation than ‘students’. The topics are further obfuscated by the inclusion of words which don’t convey much meaning independently (I’m, their, yes, you’re, got).
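To illustrate how much a word cloud depends on decisions made before any counting happens, here is a minimal sketch of a word-frequency count that filters out low-meaning words. It is not how Tweet Archivist works (its method is not documented here); the stop-word list, the function name and the sample tweets are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption); a real analysis would use a fuller one.
STOP_WORDS = {"i'm", "their", "yes", "you're", "got", "the", "a", "and", "of", "to", "is"}

def word_frequencies(tweets, stop_words=STOP_WORDS):
    """Count lower-cased words across tweets, skipping stop words."""
    counts = Counter()
    for text in tweets:
        # Normalise curly apostrophes so "I’m" and "I'm" are treated alike.
        text = text.lower().replace("’", "'")
        for word in re.findall(r"[a-z']+", text):
            word = word.strip("'")
            if word and word not in stop_words:
                counts[word] += 1
    return counts

# Invented sample tweets, for illustration only.
sample = [
    "I'm not sure the data is neutral",
    "Yes, algorithms shape the data we see",
    "Got cheese? Their algorithms might know",
]
freq = word_frequencies(sample)
print(freq.most_common(3))
```

Even in this toy version, the choice of what counts as a stop word is a judgement that shapes the resulting picture, and ‘cheese’ still survives the filter with no indication that it was a joke.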
The ‘top URLs’ list shows how many times sites were linked to. Since the lowest count shown is 1 in both the one-day data and the longer, 22-day data, it is difficult to know how Tweet Archivist decides what to include. Many more links that were shared only once could have been included.
Many of the URLs linked to are simply other tweets (4 of 12 for 17/03/17 and 2 of 25 for 5-27/3/17). Four of the links for the longer period are to the EDC course blog or individual course blogs. Other URLs are obscured (at a glance – they are still hyperlinked) as they use Tiny URLs or Google shortcodes. One would think that these URLs would be indicative of what is valued enough by students/tutors to share. However, as noted, the inclusion of those shared just once, when there were many more links shared just once, makes it unclear what Tweet Archivist included in the measure.
Source of tweet
Students were advised to use TweetDeck for the tweetorial. It appears (though cannot be confirmed) that some students who would ordinarily have used Twitter Web Client switched to using TweetDeck as instructed. However, the actual figures represent not users, but Tweets sent from particular sources/platforms. All we really know is that, on the second day of the tweetorial, a larger proportion of Tweets were sent from TweetDeck. Is this because more students used TweetDeck, or because those students that used TweetDeck were more prolific in their tweeting than those using other platforms/devices?
Similarly, in the (slightly) longer term, more tweets have been sent from TweetDeck than other sources. Do students who are inclined to Tweet more have a preference for TweetDeck, does TweetDeck enable increased participation through Twitter, or do more students use TweetDeck than other sources? If we are more inclined to use TweetDeck and Twitter Web Client than iPhone, Android and Hootsuite, is anything suggested about our mobility or lack thereof?
More Tweets are sent from iPhones than from Android devices. Does this suggest more students have and use iPhones, or that iPhone users are more prolific tweeters than Android users? Who is this information useful to? App developers? iPhone marketing? It is unclear.
As far as I know, all conversations for the tweetorial were in English. However, at one point it was suggested Eli was speaking Swedish, despite Twitter’s translation tool not being able to translate it. These metrics don’t reveal much, as we’re communicating monolingually and indications to the contrary are erroneous. The limitations of the tools are revealed, and data suggests English is the main language spoken by participants. However, this last point is a generalisation – we can only surmise that English is the lingua franca based on the given ‘evidence’, not that it is students’ preferred language.
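The ambiguity between ‘more users’ and ‘more prolific users’ could be resolved if the tool reported distinct users alongside tweet counts for each source. The following is a small sketch of that idea, assuming access to (user, source) pairs; the function name and the sample data are hypothetical and are not drawn from our archive.

```python
from collections import defaultdict

def source_summary(tweets):
    """For each source, return (tweet count, distinct user count)."""
    tweet_counts = defaultdict(int)
    users = defaultdict(set)
    for user, source in tweets:
        tweet_counts[source] += 1
        users[source].add(user)
    return {s: (tweet_counts[s], len(users[s])) for s in tweet_counts}

# Invented (user, source) pairs, for illustration only.
sample = [
    ("user_a", "TweetDeck"),
    ("user_a", "TweetDeck"),
    ("user_a", "TweetDeck"),
    ("user_b", "Twitter Web Client"),
    ("user_c", "Twitter Web Client"),
]
summary = source_summary(sample)
for source, (n_tweets, n_users) in summary.items():
    print(f"{source}: {n_tweets} tweets from {n_users} user(s)")
```

In this invented sample TweetDeck ‘wins’ on tweet volume even though only one person used it, which is exactly the distinction a count of tweets per source cannot show.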
Volume over time
The longer term view shows that, as would be expected, the volume of Tweets spiked during the tweetorial.
To gain more detailed information about Friday 17 March, I used Mozdeh to extract a timeline. The information could not be obtained using the freemium version of Tweet Archivist.
The graph shows that the highest volume of Tweets was sent at 10am (GMT), with smaller peaks at 12pm, 3pm, 5pm and 8pm. From this style of presentation, it is unclear whether these peaks were created by more prolific tweeting by individuals, or if collectively more people were online at these times. It is also unclear whether participation was influenced by geographical location and time zone. Were particular users active over the entire day, others for short bursts, or is there a different explanation? The patterns behind active periods are not revealed in any detail here. Looking at specific users could reveal something more about the conditions in which the data is produced, but such analytics are not provided by Tweet Archivist.
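One way of looking at specific users would be to list the hours in which each person tweeted, so that all-day participation can be told apart from short bursts. This is only a sketch under assumptions: it presumes (user, ISO timestamp) pairs are available, and the function name and sample data are invented.

```python
from collections import defaultdict
from datetime import datetime

def active_hours_by_user(tweets):
    """Map each user to the sorted list of hours (0-23) in which they tweeted."""
    hours = defaultdict(set)
    for user, timestamp in tweets:
        hours[user].add(datetime.fromisoformat(timestamp).hour)
    return {user: sorted(h) for user, h in hours.items()}

# Invented (user, timestamp) pairs, for illustration only.
sample = [
    ("user_a", "2017-03-17T10:05:00"),
    ("user_a", "2017-03-17T10:20:00"),
    ("user_a", "2017-03-17T10:45:00"),
    ("user_b", "2017-03-17T10:30:00"),
    ("user_b", "2017-03-17T15:10:00"),
    ("user_b", "2017-03-17T20:00:00"),
]
activity = active_hours_by_user(sample)
for user, hours in activity.items():
    print(user, hours)
```

Both invented users sent three tweets, but one was present for a single burst and the other returned across the day. Even this says nothing about why, or about time zones, unless that context is gathered separately.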
Tweet Archivist counts the number of mentions users get on tweets containing the #mscedc hashtag. What does this reveal? Presumably, receiving mentions, for example through replies, or because more people choose to involve you in their conversations, is viewed as a positive measurement – commercially it might be indicative of how engaged you are with clients, for instance. However, since question numbers were not used within our tweetorial, mentioning other users (through reply) was the only way to keep messages organised. Because of this, more mentions may simply be related to starting conversation threads/posing questions (e.g. James during the Friday of the Tweetorial), and responding to questions earlier, which could result in gaining mentions in subsequent replies.
There does not seem to be a correlation between the number of tweets sent and the number of mentions received. Without completing a content analysis, it is impossible to know if mentions were received in response to questions posed (as they presumably were in the tweetorial by James), or whether the mentions added information to points made or challenged/complimented/complained about them. Further, since Tweet Archivist reports on mentions received but not on mentions given, there seems to be a perceived value in receiving mentions, but not in giving them. Or, is the data just more readily available for the former? I would have thought such ‘data’ to be equally accessible.
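To show why I would expect both measures to be equally accessible, here is a minimal sketch that counts mentions received and mentions given in a single pass over (author, text) pairs. It assumes mentions can be recognised by a simple @username pattern, which is a simplification; the names and sample tweets are invented.

```python
import re
from collections import Counter

# Simplified pattern for @mentions (an assumption, not Twitter's own rule).
MENTION = re.compile(r"@(\w+)")

def mention_counts(tweets):
    """Return (received, given) counters from (author, text) pairs."""
    received = Counter()
    given = Counter()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            received[mentioned.lower()] += 1
            given[author.lower()] += 1
    return received, given

# Invented sample tweets, for illustration only.
sample = [
    ("user_a", "Q1: who owns the data? #mscedc"),
    ("user_b", "@user_a the platform, mostly #mscedc"),
    ("user_c", "@user_a @user_b or the institution? #mscedc"),
]
received, given = mention_counts(sample)
print("received:", dict(received))
print("given:", dict(given))
```

In the invented sample, the person who posed the question receives the most mentions without giving any, which mirrors the pattern described above: the count reflects how the thread was organised rather than the quality of anyone's contribution.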
Tweet Archivist does not provide network mapping to see the tweets between individual users. I openly admit to not knowing how to control the parameters within Mozdeh, but this is the network map produced for 17/03/17:
This network map suggests (I think – I am not experienced at reading such diagrams) that the most communication on Friday 17 March connected James, Philip, Eli, Nigel, Daniel, and Colin. It suggests my interactions were greatest with Colin, Eli and Nigel. This is based on being mentioned in these participants’ tweets, but it is deceptive as just two more mentions result in a considerably thicker connection line (connecting me to Colin versus me to Daniel, based on their mentions of me in Tweets).
It’s also important to look at how the choice of who to include changes the visualisation. In the presented map, ‘top’ users are those who had the most tweets to them. How would changing the criteria for ‘top’ influence who is included and excluded from the visualisation? For instance, ‘top’ could be determined by tweets from or tweets from and to, or be based on content analysis. If used for assessment purposes, or to influence algorithmic ‘nudges’ in education, using basic quantitative measures may encourage students to ‘game’ the system and interfere with actual engagement while simultaneously producing a picture which is perceived to be of it.
Analysis of hashtags primarily reveals that as a group we are not very inclined to use hashtags, and that when we do, the hashtag is frequently not directly connected to the topic we are discussing. Course tags seem to be most used.
Note that figure 11a needs cross-checking, as it does not even include #mscedc; the set of included hashtags seems to be faulty. If #mscedc is deliberately excluded from the set, why is it included in figure 11b?
Just one image was shared on 17/03/17, and pertinently, it was an image of text. Could this be construed as a subversive act against the ‘infrastructure’ hosting our tweetorial? Pushing back against the 140 character limit in an attempt to engage more profoundly with the subject matter?
Or is it a subversive act against algorithmic surveillance? In using an image, does Anne’s tweet elude automatic analysis and typical data mining approaches?
The follower counts reveal that we have attracted some people who happen to have a lot of followers to our #mscedc discussions. They do not indicate how involved these people are in our discussions. Nor do they indicate what the value is in having people with more (or fewer) followers involved in your discussion. For some students, an increased potential of being seen when participating online, with one’s emergent identity, could actually be perceived as a threat. In this case, it might be desirable to have people with fewer followers involved in a conversation. Alternatively, students may see those with many followers as gateways to more resources, and more people who could potentially answer their questions or connect them with ideas. How does this work in reality, though? For many of our conversations we replied to previous tweets on the same thread or question. As a result (as I understand it), only people who follow both the tweeter and the tweeter(s) they replied to would see the message. The visibility of any tweet would further be affected by Twitter’s algorithmic filtering. Yet the number of followers, and the associated scale and reach that having a large number of followers is perceived to entail, is valued, hence this metric is included in the analysis.
Do these visualisations, summaries and snapshots accurately represent the ways you perceived the Tweetorial to unfold, as well as your own contributions?
Eynon (2013), who is also referred to in Anne’s image tweet above, reminds us that while we can count all kinds of things, the numbers alone are not a measurement of the value people place on them. The same is true of these pictures of our tweetorial. For me, on the second day of the tweetorial the significant discussions (or comments that triggered ‘significant thinking’) were around:
ownership of data, and rights and responsibilities related to how that data is used (and by whom);
the impact (and interference) of algorithms on the research process;
ways in which perceptions and values are algorithmically shaped.
Neither these topics, nor my/our thinking about them are conveyed clearly by the focus on most used words. The word cloud is reductionist, and to me, does not seem reflective of the thought that went into the discussions.
In addition to ‘academic conversations’, I enjoyed the supportive, sometimes amusing banter of my peers. The record of cheeses, rollerskates or tales of spam may make it into the archive as decontextualised words, but without their context they are meaningless, perhaps even making us all look a little deranged!
Positives of the experience aside, there was a point at which ‘life interrupted’ and I had to disconnect. This also happened on the first day, where I had back to back teaching and could not engage with questions/ideas. This experience remains invisible within algorithmic reporting.
It is perhaps most useful to look at the visualisations to see the world in which they are produced, rather than the events they report on. In this sense we would be following Knox’s (2014) advice, in seeking not to see “the reality ‘behind’ the image, but how and why the image itself was produced”. In this case, it seems that many of the measurements are produced with a commercial/advertising model of getting maximum eyes, through the scale and reach of participants. For these purposes, it is important how many times a key (or brand) word is used, and how many people might see it being used, to establish brand identity/identification with brand. Is this a model we can apply to learning and, more broadly, to education?
What might be the educational value or limitations of these kinds of visualisations and summaries, and how do they relate to the ‘learning’ that might have taken place during the ‘Tweetorial’?
Previously I’ve blogged about how network mapping could be used to help teachers monitor online group work and ensure all participants are ‘involved’. However, the automated version of this seems too rudimentary. Counting instances of interaction or reference to other learners not only fails to identify whether interaction is meaningful or not, but it is easily ‘gamed’, and has the potential to alter behaviours based not on engagement but on giving the illusion of engagement.
Similarly, reporting of most used words does not capture the complexities of learning, nor advise teachers about whether students have used terms appropriately and meaningfully to talk about the concepts behind them. This is not to say that there is no use for such visualisations but, I feel, it is necessary to use such summaries in consultation with students or in concert with ethnographic approaches so that whatever is captured can be interpreted meaningfully.
Measurements of periods in which students are most active could help with the provision of support, within reason. For instance, my students tend to use our LMS most in the very early hours of the morning. It’s practical to work with this information by ensuring IT capacity during their active times; less so to provide teaching support.
I am less able to find a usefulness for measurements such as the influencer index. Within education, I fear such measures would be more likely to reinforce existing inequalities than improve education.