The Trolling of Tay

The story in brief: Microsoft created a self-learning chatbot designed to emulate the speech of Millennials. They let her loose on Twitter, where she was immediately trolled hard by members of one of the most notorious boards of 4chan, and she turned into a massive Hitler-quoting racist. Microsoft took the bot down, and are working hard to remove the worst of her tweets. Oops.

Aside from the obvious conclusion that there are some awful people on 4chan, what can the testing community learn from this?

One lesson seems to be that if you carry out testing in a very public space, any testing failures will be very public as well. An artificial intelligence turning into a noxious racist is a pretty spectacular fail in anyone’s book. Given the well-known nature of the bottom half of Twitter, it’s also an all-too-predictable failure; people in my own Twitter feed expressed very little surprise over what happened. It’s not as if anyone is unaware of the trolls of 4chan and the sorts of things they do.

What they should have done is another question. Tay was a self-learning algorithm that merely repeated the things she’d been told, without any understanding of their social contexts or emotional meanings. She’s like a parrot that overhears too much swearing. It meant that if she fell in with bad company at the start, she’d inevitably go bad.
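That failure mode is easy to demonstrate. The toy class below is purely illustrative, not Microsoft’s actual implementation: a bot that learns only by storing and repeating what it hears will reproduce whatever its loudest teachers say, good or bad.

```python
import random

class ParrotBot:
    """A toy chatbot that learns purely by imitation: it stores every
    phrase it hears and replies by repeating one at random, with no
    model of meaning, context, or acceptability."""

    def __init__(self):
        self.heard = []

    def listen(self, phrase):
        # Everything is absorbed uncritically -- this is the flaw.
        self.heard.append(phrase)

    def reply(self):
        return random.choice(self.heard) if self.heard else "..."

# Fall in with bad company at the start, and every reply is bad:
bot = ParrotBot()
bot.listen("some vile slogan")
print(bot.reply())  # always "some vile slogan"
```

There is no filter between input and output, so the quality of the bot’s speech is entirely determined by the quality of its audience.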

The most important lesson, perhaps, is that both software designers and testers need to consider evil. Not to be evil, of course, but to think of what evil might do, and how it might be stopped.

This entry was posted in Testing & Software.

7 Responses to The Trolling of Tay

  1. Carl says:

    This is kind of like the Boaty McBoatface thing… you can’t count on the Internet to be reasonable or do the “right” thing. It’s fickle and, collectively, likes to do things for a laugh, or just to see if it can.

  2. Tim Hall says:

    The Boaty McBoatface thing is essentially harmless.

    The chatbot thing is more a lesson learned, and raises questions of how you train and socialise an AI.

  3. Colum Paget says:

There are profoundly bigger issues raised by Taygate. Firstly, the whole thing is another instance of the kind of utopian thinking that still holds sway amongst socialists and neo-cons: the idea that people are essentially good, or at least essentially rational. This is why such people consistently build systems that last a day before being hacked. Ms Kerzner was right when she said that not expecting this to happen shows how out of touch Microsoft are, but it’s not that they’re out of touch with the modern age: they’re out of touch with people in any age. “If you build it someone will try to break it” has been a truism since we came down from the trees.

But the flaw here is not thinking that human nature is good or evil; rather it’s thinking that there’s any consistent human nature, or ever could be. Any population of people is going to have makers and breakers, cookers and eaters, good and bad, and it needs to be that way. The person who’ll behave like an asshole in one situation is often the person who’ll behave heroically in another, and any society or species needs a variety of personality types to function. As Carl says above, the public “is fickle and, collectively, likes to do things for a laugh, or just to see if it can”, and this attitude is what drives a lot of innovation. Thus anything you build is going to run up against its nemesis sooner or later, and so you have to build things to be robust in the face of abuse and manipulation. The failure of Tay mirrors the failure of social structures like the SF community or the US Republican party, who build ideologies that are easily hacked and fail completely in the face of a single skilled manipulator.

And in this regard Tay is really a success, as her radicalization by 4chan trolls probably mirrors, in a faster, simplified form, the radicalization of impressionable people by extremist groups. In this regard, Tay probably has behaved like an impressionable person on the internet. That’s a kind of Turing Test, isn’t it?

  4. Tim Hall says:

I see some of the usual suspects blaming the whole thing on white male privilege, but I think they’re missing the point. Every time you build something, whether it’s an AI chatbot or a convention code of conduct, you need to think “How might a bad actor exploit this?” Because if you don’t think of it, the bad actors will.

The more I’ve read about Tay, the more it’s becoming clear most of her worst tweets were literal word-for-word quotations of other people’s tweets stripped of their context. Microsoft have built a glorified parrot here. Perhaps she should write lyrics for Hatebeak?

  5. Colum Paget says:

    # I see some of the usual suspects blaming the whole thing on white
    # male privilege, but I think they're missing the point.

    Oh, not “the Jews”? I bet someone’s blaming the Jews. Trump will probably say it was Mexicans, the FBI will say “this is why we need backdoors in encryption”.

    # Every time you build something, whether it's an AI chatbot or
    # a convention code of conduct, you need to think "How might
    # a bad actor exploit this?" Because if you don't think of it, the bad actors will.


    # Microsoft have built a glorified parrot here.

    But aren’t most people on the internet a bit like that?

  6. To be honest, I don’t think you even need “bad actors” for this one. Lots of people feel like corporate semi-publicity stunts like this are a sort of intrusion into public space, and immediately go about trying to subvert them. I guess it’s instructive that out of all the stuff they could have chosen to have “Tay” parrot, they went for racist garbage – but that’s also the biggest impact for least effort way to “dirty” a “clean” corporate entity.

  7. Tim Hall says:

    Interestingly, this piece suggests that inadequate QA was to blame: there was a feature put in during development that should have been removed before Tay went live, but was left in.