Tweet archive analysis

The tweetorial

The tweetorial was set as an activity using Twitter, the software itself dictating to a large extent the type of interaction possible, i.e. an asynchronous discussion in 140-character ‘bytes’. The tweet archive revealed that some of us had chosen to use Twitter-facilitating software such as TweetDeck or Hootsuite, which would have shaped our experience a little differently, perhaps making it easier to see conversation threads. The use of the mediating communication technology ‘performed’ the experience of the discussion, together with our knowledge that the tweetorial would be analysed afterwards, the whole entangled with our own affective and material context. These sociomaterial elements were not evident in the analytics, unless some detail could be construed from the tweets themselves.

Jeremy asked whether education changes when we use automated (more-than-human) means to understand it and I answered by saying that I thought it did because the way we frame and do things creates the world in which they are done:

I considered that the use of Twitter, the knowledge that the tweetorial would be automatically analysed, as well as our own individual situations enabling us to participate to a greater or lesser extent would all have contributed to the conditions in which it took place.

In the same way that Twitter and the conditions of the activity constituted the experience and the way we engaged with it, so the analytics privileged a particular perception of it.

How has the Twitter archive represented our Tweetorial?

As far as I can tell, the tweet archive has represented our Tweetorial by providing:

  • a list of the tweets in chronological order
  • a wordcloud and list of the most frequently used words
  • a list of top urls tweeted
  • breakdown of the source of tweets (platform or software used)
  • the language of the tweets
  • a graph showing the number of tweets over the days
  • user mentions
  • hashtags tweeted
  • tweeted images
  • ‘influencer’ ranking

We can see immediately from the archive’s panel of information what is deemed important to its architects. The summaries and visualisations were market-orientated, as their nomenclature (‘rank’, ‘top’, ‘volume’, ‘influencer’) suggests, and were primarily concerned with quantifying. This quantitative data might be useful over time, revealing emerging patterns when compared across regular tweetorials held under similar or slightly varying conditions, but was less useful for an analysis of a single learning encounter. As Nigel pointed out during this week’s hangout, information on which tweets provoked the most responses, for example, would have been more revealing.

It wasn’t clear from the archive which of the mscedc students were missing from the conversation, nor did it, of course, list any tweets which one of us might have sent during that time without the mscedc hashtag because we forgot to include it. In that sense, an appreciation of who or what might be ‘missing’ was more nuanced than the archive suggested.

Student absence would not have been very noticeable during a rapid conversation-style learning activity, nor deemed worthy of particular comment, with everyone assuming that absent students were unable to take part, if they considered it at all. A Learning Analytics dashboard which reported on such absence would, however, subtly alter the norms around such activities, making participation more prominent and therefore presenting it as important for learning (as is observable in many MOOCs). Such a dashboard-delivered conclusion would not properly account for lurkers or students able to learn without a strong participatory presence, and any action triggered by non-participation might be injudiciously applied. This sort of problem may be overcome by students explaining their absence, but the necessity of having to do so changes the educational landscape from open and unpoliced to monitored and check-pointed, in which itemised individuals are marshalled for market forces. It also changes the nature of the trust relationship between student and institution.

What do these visualisations, summaries and snapshots say about what happened during our Tweetorial, and do they accurately represent the ways you perceived the Tweetorial to unfold, as well as your own contributions?

Twitter is a very distinctive application: it is fast-flowing and often confusing, making it difficult to follow threads of conversation unless you are very organised and adept at using facilitating dashboards such as Hootsuite or TweetDeck to handle a chat. It is easy to feel behind the curve of a Twitter conversation, with a sense of needing to catch up. We didn’t use any question-numbering scheme to which answers could have been matched, so it was sometimes difficult to know whether you ‘should’ be tweeting a response to an earlier question if the conversation had moved on and, if so, whether to provide context. There is a wish to contribute politely without muscling in on an exchange, and to ensure, if you can, that your point hasn’t already been made. My own experience was of quickly scrolling back to see what I had missed while trying to pick up on interesting things and remember what I wanted to say – a high cognitive load! When I took part, I tried to answer some of the main questions because that was easy to do, although I also attempted to join in ongoing conversations.

The visualisations, summaries and snapshots provided by the archive didn’t reflect this feeling of sometimes being overwhelmed which I experienced during the course of the two days as I ‘part-time participated’, as others must have done, other than by the reproduction of the tweet stream itself. In an educational context for those not used to Twitter, it might be a difficult experience at first, a situation not reflected in the tweet archive, yet a crucial affective factor influencing learning.

Rather than reporting on the number of tweets, it would have been interesting to know which ones shaped the conversation, changed its course or provided diversions. The Twitter archive was oriented towards consumerism and implied reward for profusion and prolixity rather than ‘quality’ of tweet, which is more difficult to analyse by any method, including human judgement. This is why, perhaps, in a market economy such quantitative data is privileged over more expensive-to-acquire (?) information. Qualitative analysis would better match educational need, searching for keywords to determine changes in direction or emphasis, sentiment or tweet type (positive, negative, affirming, conciliatory, questioning, humorous etc). It is interesting to compare the tweet archive to some of the other tools discovered by students. For example, Nigel found a website called Keyhole which reported on ‘Sentiment’ amongst other things. Without knowing how these results were measured, it is difficult to draw any conclusions (a very telling indictment of such analytics in itself).

(Note that I signed up for a free short-term trial of this tool and used the mscedc hashtag, but the results above may not reflect the actual two days of the tweetorial.)

The tweet archive did report on the top urls which, as noted in my Thinglink, was useful because I had missed quite a lot of them during the conversation. The aggregation of this information, easily achievable by code, was helpful and underlined for me the necessity of being more purposeful and organised about capturing tweeted links. This sort of metacognitive reflection was something I was concerned would be contracted out to Learning Analytics rather than being developed in the student, but I ended up benefiting from it myself and realise that this could very usefully extend to other learning situations, spaces and platforms.
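As a rough illustration of how little code such link aggregation needs, here is a minimal Python sketch. It is not how the tweet archive itself works, which I have no way of knowing; the function name `top_urls`, the URL pattern and the sample tweets are all my own assumptions.

```python
import re
from collections import Counter

# Assumed, simplified pattern: anything starting with http:// or https://
# up to the next whitespace. Real tweets use shortened t.co links.
URL_PATTERN = re.compile(r"https?://\S+")


def top_urls(tweets, n=10):
    """Return the n most frequently tweeted URLs as (url, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        for url in URL_PATTERN.findall(tweet):
            # Drop punctuation that often trails a link at the end of a sentence.
            counts[url.rstrip(".,;:!?)")] += 1
    return counts.most_common(n)


# Hypothetical sample tweets, invented for illustration only.
sample = [
    "Worth reading https://example.org/a.",
    "Compare https://example.org/a and https://example.org/b #mscedc",
    "No link in this one",
]
print(top_urls(sample, n=1))
```

A list like this only tells me which links were shared most often, not which were most worth following up, so the purposeful capturing still has to be mine.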

The archive’s report of the top-ranking words was neither surprising nor particularly revealing, although the use of the word ‘perhaps’ was reflective of a tentative and questioning position entirely natural in a learning context. Evident on the word list were the attempts to play Twitter’s algorithms and have fun. That it was so easy to manipulate Twitter’s algorithms was enlightening and reinforced a human wish to resist and subvert constraint. It is important to keep this in mind in a world in which data surveillance is growing so rapidly, with much discursive effort expended to justify it as neutral or benevolent, even, or especially, when its polished presentation seems to repel contestation. Keyhole reported a different word cloud, underlining the clichéd but telling view that it is possible to demonstrate multiple viewpoints with statistics.

Jeremy and James ranked the highest for user mentions in the tweet archive, which was indicative of their place as experts in our community of practice and of our responses to their direct questions, but mscedc students and those outside the immediate group figured too, reflecting an extending connectivist learning community.

There were some anomalies in the reported data which provoked thoughts of buggy algorithms working behind the scenes. For example, the Language data reported ‘und’ among the two-letter language codes, and the Influencer list didn’t include me, although it seemed to rank by number of followers, of which I had enough, I think, to be on there. Evidently, something much more discerning than number of followers was factored into the calculations but remained opaque to the observer! In the main, however, I was surprised to feel that I was ‘faithfully represented by the code’ as far as it went. This would not, I’m sure, always be the case when we are defined by metrics, a situation leading to problems of being misrepresented, with the faulty interpretation subsequently following us in our digital life.

What might be the educational value or limitations of these kinds of visualisations and summaries, and how do they relate to the ‘learning’ that might have taken place during the ‘Tweetorial’?

The dashboard figures and visualisations didn’t relate anything of the learning that might have taken place, unless the unscientific inference is drawn that the volume of tweets correlates with engagement and amount of learning. However, the sheer number of tweets would suggest lively and compelling exchanges from which, it could be argued, it would be difficult not to absorb some new ideas.

The tweet archive’s word cloud was especially disappointing as it must have simply counted words and reported on the most used. It included such words as ‘I’m’ and ‘I’ve’, which other algorithms might have excluded or analysed in conjunction with adjoining words. For this activity, a close analysis of the discourse might better reveal ‘learning’, although it would have been a difficult task even before considering our use of abbreviation to cram our thoughts into 140 characters.
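To make the difference concrete, here is a minimal Python sketch of a simple word count with and without a ‘stop word’ filter. It is only my guess at the general approach, not the archive’s actual code; the function name `top_words`, the tiny stop-word list and the sample tweets are all assumptions of mine.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real tools use far longer ones.
STOP_WORDS = frozenset({"i'm", "i've", "the", "a", "and", "to", "of", "is", "it"})


def top_words(tweets, n=5, stop_words=frozenset()):
    """Return the n most common words as (word, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        # Lower-case and normalise curly apostrophes so "I’m" matches "i'm".
        text = tweet.lower().replace("’", "'")
        # Remove links, @mentions and #hashtags before counting.
        text = re.sub(r"https?://\S+", " ", text)
        text = re.sub(r"[@#]\w+", " ", text)
        for word in re.findall(r"[a-z]+(?:'[a-z]+)?", text):
            if word not in stop_words:
                counts[word] += 1
    return counts.most_common(n)


# Hypothetical sample tweets, invented for illustration only.
sample = [
    "I’m not sure, perhaps it is",
    "Perhaps I've missed it #mscedc",
    "perhaps the data is neutral?",
]
print(top_words(sample))                         # simple count
print(top_words(sample, stop_words=STOP_WORDS))  # with stop words removed
```

Even with the filter, this is still only counting: it says nothing about how a word such as ‘perhaps’ was being used, which is where closer analysis of the discourse would be needed.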

From my own experience of the activity, I am sure ‘learning’ did happen because I mused over questions, offered some answers and modified my own thinking as I gained new perspectives, followed others’ arguments and made connections. A close analysis of the tweets themselves by a human (or AI) would enable them to be classified and sorted, giving a more accurate picture perhaps of what occurred during the tweetorial.

Looking over the archive subsequently, I created my own top ten takeaway tweets, a list which would be different for everyone and which, even for the selector, would fast become out-of-date. Thinking about this prompts me to consider the value of likes and retweets and how their significance is not only context- but also time-dependent. Setting great analytic store by likes, for example, might give an accurate situated snapshot but would not constitute enduring fact. An analysis of liked tweets over time might well reveal development of thought and would allow the student to view a progression. This would characterise learning as an ongoing process and not a single temporal synaptic event, and emphasise that it is a process never finished and impossible to depict simplistically.
