It’s a wonderful lifestream (or is it?) – Week 8 summary

Value is the main theme of the lifestream this week, both in the sense of a principle which governs our behaviour and in the sense of something regarded as important or useful. The two definitions intersect in the development of algorithms, as well as in the ways in which their usefulness is communicated to us.

In a quite brilliant article about algorithms and personalising education, Watters asks the pertinent question:

What values and interests are reflected in its algorithm?

It’s a big and important question, but this TED talk suggests to me that it would be propitious to change it to:

Whose values and interests are reflected in its algorithm?

Joy Buolamwini explores how human biases and inequalities might be translated into, and thus perpetuated in, algorithms, a phenomenon she has called the ‘coded gaze’. Similar considerations are taken up in this article too, as well as in this week’s reading by Eynon on big data, summarised here. I also did a mini-experiment on Goodreads, which produced results that could potentially be construed as bias (though more evidence would definitely be required).

It isn’t just a question of the ways in which values are hidden or transparent, or how we might uncover them, though this is crucial too. My write-up of Bucher’s excellent article on EdgeRank and power, discipline and visibility touches on this, and I explored it briefly in the second half of this post on Goodreads. Rather, hiddenness and transparency are also negotiated in how these values are communicated, and in how they are marketed as offering ‘added value’ to the user’s experience of a site. The intersection of these issues convinces me further of the benefit of taking a socio-material approach to the expression of values in algorithms.

Goodreads and algorithms, part the definite last

Good recommendation algorithms are really (really!) difficult to do right. We built Goodreads so that you could find new books based on what your friends are reading, and now we want to take the next step to make that process even more fruitful.

This quotation is from the Goodreads blog, from a post written by Otis Chandler, the Goodreads CEO. The “next step” to which he refers is Goodreads’ acquisition of the small start-up Discovereads, which was developing algorithms for book recommendations. Discovereads used multiple algorithms, drawing on book ratings from millions of users and tracking patterns in how people read, how they rate, the choices they make, and what might influence them.

It’s roughly based on the sorts of algorithms that drive Netflix, though there’s an obvious difference between the two platforms, and it’s not the type of content. Goodreads is neither a publisher nor a producer of its own content; it isn’t promoting its own creations, but it can influence the user to spend money in a way that Netflix, which works to a different economic model, may not. Chandler acknowledges this: one of the goals in adopting the Discovereads algorithm is to improve marketing strategies, ensuring that sponsored content (books promoted to users) will be more up their street.

Given this, then, it’s possible to say that the way recommendations work in Goodreads is based on at least three things:

  1. The ratings provided by an individual at the point they sign up – part of the process of getting a Goodreads account is adding genres you’re interested in and “rating” a (computer-generated) series of books
  2. The algorithms monitoring human patterns of reading and rating and, presumably, the analytics and big data collected on what might encourage a person to add a recommended book to their lists (and perhaps, too, to their shopping basket)
  3. The Amazon connection: the fact that Goodreads isn’t providing its own content, and that it’s owned by Amazon, makes for a particular sort of economic link. Not only does it incentivise Goodreads to promote specific commercial content, but it means that Goodreads can influence how and where consumers’ money is spent. Presumably analytics on how often Goodreads’ recommendations lead to a purchase are fed back into the recommendation system to improve it. (A rough sketch of how these three signals might combine follows below.)
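
How these signals actually combine is hidden from us, but purely as an illustration (in Python; every name, number and weight below is invented, and none of this is Goodreads’ or Discovereads’ actual code), a recommendation score built from those three things might look something like this:

```python
# Illustrative sketch only: a hypothetical way of combining the three signals
# above into a single recommendation score. All data, names and weights are
# invented; this is not Goodreads' (or Discovereads') actual system.

# Toy data: pairwise book similarity (0-1), the user's own ratings (1-5),
# how often readers with similar habits add each candidate (signal 2), and
# how often recommending the candidate has led to a purchase (signal 3).
similarity = {("The Romanovs", "Stalin: A Biography"): 0.8,
              ("This Changes Everything", "Stalin: A Biography"): 0.1}
user_ratings = {"The Romanovs": 5, "This Changes Everything": 4}
add_rate = {"Stalin: A Biography": 0.30}
conversion_rate = {"Stalin: A Biography": 0.12}

def score(candidate, weights=(0.5, 0.3, 0.2)):
    w_taste, w_behaviour, w_purchase = weights
    # 1. Taste match: similarity to books the user has already rated,
    #    weighted by how highly they rated them.
    taste = max(similarity.get((rated, candidate), 0.0) * rating / 5
                for rated, rating in user_ratings.items())
    # 2. Behavioural signal: patterns of reading, rating and adding.
    behaviour = add_rate.get(candidate, 0.0)
    # 3. Economic signal: purchase conversions fed back via Amazon.
    purchase = conversion_rate.get(candidate, 0.0)
    return w_taste * taste + w_behaviour * behaviour + w_purchase * purchase

print(score("Stalin: A Biography"))  # 0.5*0.8 + 0.3*0.3 + 0.2*0.12 ≈ 0.514
```

The point of the sketch isn’t the arithmetic but the third term: if purchase conversions really are fed back in, then ‘what you might like to read’ and ‘what you’re likely to buy’ become part of the same calculation.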

Knox (2015) suggests that actor-network theory might account for the “layers of activity involved” in the complex, often hidden, and often automated ways in which humans and non-humans interact in the development and deployment of algorithms. One of the principal benefits of this approach (and there are many) is that it inherently assumes that the human and non-human are working together. This is not always self-evident, and the quotation at the top of this post suggests that the two are seen to be in opposition. The incorporation of the Discovereads algorithm, it is implied, will lead to a fundamentally different way of generating recommendations. It signals a move from human-generated recommendations (what your friends are reading) to computer-generated ones, based on this algorithm.

The responses to the blog post written by Chandler suggest that this binary is presupposed by Goodreads users as well. The posts below, for example, clearly espouse the benefits of both ‘routes’ to recommendations. But they suggest that recommendations are either human- or computer-generated: there’s no indication of non-human interference in the existing friend-generated recommendations, nor of any human influence in the computer-generated ones. It’s a code-based version of the binary we’ve encountered repeatedly in the past eight weeks: the perception that technological instrumentalism and technological determinism are the only options.

The reality, of course, is that it’s a false binary. It’s not a choice between human and non-human: as Knox outlines, both are present. The difference, then, to which Chandler refers, the change heralded by the acquisition of Discovereads, isn’t necessarily in the source of the content, but in the perception of that source. It’s in the perceived transparency or hiddenness of the algorithm.

References

Chandler, O. (2011). Recommendations And Discovering Good Reads. Retrieved 11 March 2017, from http://www.goodreads.com/blog/show/271-recommendations-and-discovering-good-reads
Knox, J. (2015). Critical Education and Digital Cultures. In M. Peters (Ed.), Encyclopedia of Educational Philosophy and Theory (pp. 1–6). Singapore: Springer Singapore. https://doi.org/10.1007/978-981-287-532-7_124-1

Goodreads and algorithms, part the fourth

In this (probably) last instalment of experimenting with the Goodreads algorithm, I’m playing with specific biases. Joy Buolamwini, in the TED talk I just watched (and posted), says this:

Algorithmic bias, like human bias, results in unfairness.

It would be hard, I think, to really test the biases in Goodreads, and it would certainly be insufficient to draw conclusions from just one experiment, but let’s see what happens. I’ve removed from my ‘to-read’ shelf all books written by men. I’ve added, instead, 70 new books, mostly but not exclusively from Goodreads lists of ‘feminist’ books or ‘glbt’ books [their version of the acronym, not mine]. Every single book now on my ‘to-read’ shelf is written by someone who self-identifies as female.

And after a little while (processing time again), my recommendations were updated:

Of the top five recommendations, one is written by a man (20%); of the 50 recommendations in total, 13 are written by men (26%).

I then reversed the experiment. I cleared out the whole of the ‘to-read’ shelf, and instead added 70 books, nearly exclusively fiction, and all written by people who identify as male.

And again, a slight pause for processing, and the recommendations update. Here are my top five:

Two of the top five recommended books are written by women (40%), and of the 50 in total, 7 are by women (14%).

So when the parameters are roughly the same, and with the very big caveat that this may be a one-off, it seems that Goodreads recommends more books by men than by women. Is this bias? Or just coincidence? Probably quite difficult to tell with just one experiment, but it may be worth repeating to learn more.
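
As a very rough check on the coincidence question, here’s a quick back-of-the-envelope calculation (a two-proportion z-test in Python, for illustration only) of how surprising 13 out of 50 versus 7 out of 50 would be if the recommendations were actually even-handed:

```python
# Rough two-proportion z-test on the single run above: 13/50 'opposite-gender'
# recommendations from the all-women shelf vs 7/50 from the all-men shelf.
# It treats each recommendation as independent, which is generous, and one
# run of each condition is far too little data to claim bias either way.
from math import sqrt, erf

x1, n1 = 13, 50   # books by men recommended from the all-women shelf
x2, n2 = 7, 50    # books by women recommended from the all-men shelf

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))   # two-sided

print(f"{p1:.0%} vs {p2:.0%}, z = {z:.2f}, p ≈ {p_value:.2f}")
# 26% vs 14%, z = 1.50, p ≈ 0.13
```

A p-value of roughly 0.13 means a gap this size could quite plausibly be chance, which rather supports the caveat above: a single experiment can’t distinguish bias from coincidence, though a few repeats might.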

Finally, one weird thing: two books appeared on the full recommendations list in both experiments. One is Anthony Powell’s A Dance to the Music of Time which, given the general gravitas of the books I added in both experiments, is fairly understandable. The other, though, is this:


Bill Cosby’s ‘easy-to-read’ story, aimed at children, is included because I added John Steinbeck’s East of Eden? Unfortunately I have no idea why it was in the women-only list, because I didn’t check at the time, but that feels like a really, really peculiar addition.

Goodreads and algorithms, part trois

So far, the Goodreads recommendations based on my ‘to-read’ pile haven’t been that great, so I’ve done a few more experiments.

First, I removed from my ‘to-read’ list anything that didn’t strictly fall into the category of literary fiction or reasonably highbrow non-fiction, and I added to it six books along similar lines: Ulysses by James Joyce, Finnegans Wake (also by Joyce), Infinite Jest by David Foster Wallace, The Trial by Kafka, À la recherche du temps perdu by Proust (the French version, no less), and The Brothers Karamazov by Dostoevsky.

And not much changed, mainly because it doesn’t update automatically – again I’m noticing a delay before the algorithm runs. But I noticed something else when deleting things from the list. Goodreads automatically ranks the books you add to the list in the order that you’ve added them. This makes complete sense – I expect many people choose their reading in a far less haphazard way than I do. And in any case, this explains why books about climate change were so prominent in the recommendations – This Changes Everything was first on my list.

Goodreads also allows you to edit the ranking, so I’ve moved the two James Joyce books I added to positions #1 and #2, and I’ve moved the climate change book to #20.
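
If list position really does feed into the recommendations, one plausible (and entirely speculative) mechanism is that books near the top of the list simply carry more weight. A minimal sketch of that idea, with an invented decay rate:

```python
# Entirely speculative: if Goodreads weights 'to-read' books by their position
# on the list, the earliest-ranked items would dominate the recommendations.
# The decay rate and weighting scheme here are invented for illustration.

to_read = ["Ulysses", "Finnegans Wake", "Infinite Jest", "The Trial",
           "A la recherche du temps perdu", "The Brothers Karamazov"]

def rank_weight(position, decay=0.7):
    """Weight for the book at a given position (1 = top of the list)."""
    return decay ** (position - 1)

weights = {book: round(rank_weight(i), 2)
           for i, book in enumerate(to_read, start=1)}
print(weights)
# {'Ulysses': 1.0, 'Finnegans Wake': 0.7, 'Infinite Jest': 0.49, ...}
```

Under a scheme like this, moving the Joyce books to #1 and #2 should visibly shift the recommendations, once the algorithm actually re-runs.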

Again, nothing happened. The recommendations were still based on books that I had now removed from the list. I refreshed the page, logged in and out, and saw no change. So I went back and added a seventh book: Robin Stevens’ Murder Most Unladylike, which is aimed at ten-year-olds. And new recommendations appeared.

ALL of them are based on the items I added earlier (not on the most recent addition) – you can see the first two are about Proust, and yet NONE of them are based on the James Joyce books I moved to the top of the list.

Goodreads and algorithms, part 2

Earlier today I went through all fifty recommendations based on my ‘to-read’ list and tidied them up: things that genuinely suited my interests I added (seven books in total), and things that didn’t suit, I deleted.

Since then – and it’s been about four hours – my ‘to-read’ recommendations have vanished.

I’m guessing I fall into the last category here, and they’re ‘in the process of generating recommendations’. I would have expected the algorithm to work instantaneously, desperate to populate, but clearly it’s a slower process than that.

So, anyway, I then went and added three more items to the list, bringing the total up to twenty. And immediately the recommendations came back.
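
One (wholly speculative) way of reading that behaviour: the recommendations aren’t rebuilt on every page load, but are invalidated when the shelf changes and then regenerated by a slower background job. A toy sketch of that pattern, with all names and the placeholder books invented:

```python
# Wholly speculative sketch of the behaviour above: editing the shelf seems to
# make the old recommendations vanish and queue a slow rebuild, rather than
# recomputing them instantly. Class, method and placeholder names are invented.

class ToReadShelf:
    def __init__(self, books):
        self.books = list(books)
        self.recommendations = None      # None = "generating recommendations"

    def edit(self, add=(), remove=()):
        self.books = [b for b in self.books if b not in set(remove)]
        self.books.extend(add)
        self.recommendations = None      # old list vanishes immediately

    def run_batch_job(self):
        # Stand-in for the slow part; in practice this seemed to take hours.
        self.recommendations = [f"Because you added {b}: ..." for b in self.books[:5]]

shelf = ToReadShelf(["Under the Udala Trees"])
shelf.edit(add=["placeholder book A", "placeholder book B", "placeholder book C"])
print(shelf.recommendations)   # None, i.e. "in the process of generating"
shelf.run_batch_job()
print(shelf.recommendations)   # the new list appears once the job has run
```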

None of the titles above are there because of the three new books I just added, and three of them are the result of the same book, Under the Udala Trees by Chinelo Okparanta.

Goodreads and algorithms

I’ve been using Goodreads to track what I’m reading for the past three years, and I thought I’d investigate the algorithm that drives its recommendations. I’ve added almost 300 books to Goodreads since I started using it, nearly all of which I have read and rated, so there’s a lot of data there about my reading habits. However, I don’t use Goodreads to plan what I’m going to read next – I don’t use it as a wishlist but as a way to record things. Consequently, I currently have just ten items on my ‘to-read’ list:

It’s not the most eclectic list of literature – two non-fiction titles (on Russian history and climate change) and eight novels which would probably just about fall into the genre of ‘literary fiction’ (as meaningless as that is). But I feel, at least, that this list roughly reflects my reading habits.

The recommendations right now are based on three of the ten books listed: This Changes Everything by Naomi Klein accounts for three of the recommendations, The Romanovs by Simon Sebag Montefiore is the reason why there’s now a picture of Stalin on this blog, and Amy Tan’s The Joy Luck Club is responsible for the fifth one. Interestingly (maybe), I’ve only heard of one of these books. I’m an English Literature librarian, my partner is an English Literature teacher, and my idea of a fun day out is a trip to a bookshop.

Goodreads lets you know why the recommendation is included, which is pretty helpful.

And it also gives some guidance on how to improve your recommendations.


The trouble is that I’m not really that interested in reading any of these books. My goal, using the guidance above, is to get my top five recommendations to actually be helpful, to suggest books that I want to read right now (I can think of at least a dozen off the top of my head). I’m going to try to fix it so that the algorithm reflects what I want, rather than the other way around.

With my librarian hat on, it might also be useful to compare how Goodreads recommends books to how a discovery layer (also known as a library catalogue) can recommend articles and other titles – if I have time I’ll look at that too.