Friday, 24 October 2014

How a stag do demonstrated a unique psychological experiment

Academic psychologists must have great fun designing experiments, particularly ones involving alcohol. There must be occasions, though, when practical and ethical considerations make experiments impossible. I wonder whether the scenario in this anecdote could ever have been run under experimental conditions?

A good friend of mine was best man at a stag do a few months ago. As is traditional, he racked his brains trying to think of ways to embarrass the stag, whilst getting him as drunk as possible in the shortest possible time. His idea was a work of genius. The best man later confided that he had no idea at the time whether it would work - if it didn't, the joke would have been on him - but as it turned out, it's a piece of experimental psychology worthy of any journal.

The lads had hired a house and, all being heavy drinkers, decided to do something familiar from most of our student days: a "centurion". For the uninitiated, a centurion consists of drinking 100 shots of beer in 100 minutes. The actual volume of alcohol consumed is substantial without being outrageous; it's about 6 pints, but what gets you hammered isn't so much the quantity consumed as the mechanical regularity of the drinking. One must drink a shot on the minute, every minute, for over an hour and a half. Inevitably casualties occur and vomiting after 60ish shots is perfectly respectable (I've known people to be sick on less than half that). An ideal icebreaker for the start of the stag do: relive old student days and get the whole party battered at the same time.

This was a centurion with a simple twist: whilst everyone else would plough on as normal, unbeknownst to him, the stag was fed non-alcoholic lager.

The implied reverse psychology is brilliant. Normally at a stag do, the main man would expect to get screwed over. This guy can handle his drink and fully expected to be given "dirty pints" and other horrors over the course of the weekend. His expectation was to be the drunkest member of the party. Not the other way around.

100 minutes later, a dozen battered lads staggered around the room. Several had hurled into the toilet. Handshakes, backslaps, and proclamations of undying man-love crossed the room. At the centre of it was the stag - still standing and beating his chest as one of the few to make it to the end without being sick. His hair dishevelled, he staggered around the room, a maniacal glazed look in his eye and slurring his speech. Like everyone else in the room, he was absolutely, utterly battered.

Without having had a drop of booze.

I'm told that when the bad news was given to him, the stag went very, very quiet. And took a few minutes to sober up. Of course the only observers in the room were extremely drunk themselves, but the stag himself happily admitted to feeling smashed and never once thought to question the alcohol. He remembers after about 40 shots thinking to himself "I'm not feeling too bad here" but as everyone else around him descended into chaos, he went with the flow and even now swears that he felt hugely intoxicated at the time.

For me this indicates an important aspect of group dynamics: not only do we behave how those around us do, but our own mental state is affected by the expectations of those around us. We do as others do but even our inner feelings are affected. Clearly the stag's motor functions were theoretically capable of standing up straight and speaking properly; but the behaviour of those around him, combined with what he was primed to believe, affected his body's functions significantly. Think of it in reverse - have you ever had a few drinks and been drunk on adrenaline, only to witness a nasty car accident or to come home and find yourself burgled, and manage to sober up extremely quickly?

(PS - I say this was "unique" but I wonder if it has been demonstrated somewhere before? I'd love to know)

Monday, 11 August 2014

A few questions for the Barbican surrounding Hamlet ticketing policy

The much-hyped production of Hamlet directed by Lyndsey Turner and featuring Benedict Cumberbatch in the title role went on general sale this morning. Sadly the process appears to be mismanaged at best, with hints of a cartel at worst.

That there are many more hopeful people than tickets is inevitable, and venues, festivals and promoters have struggled for years with the issue of how to disappoint people in the fairest way possible. The Barbican appear to have failed spectacularly.

This morning, before going on general sale, the website claimed that stalls seats were sold out for the entire run, with circle seats "nearly" sold out. This would seem to imply that large numbers had been sold to patrons, members, friends, and all the other various levels of membership for which punters pay a premium in order to enjoy benefits like early booking. This is entirely fair and absolutely standard across the industry. Presumably some tickets have also gone to sponsors and other partners. That isn't pleasant to think about, but a certain amount of back-scratching and palm-greasing (back-greasing?) needs to be done with sponsors in order to keep venues and productions viable. As long as the proportion of tickets going to sponsors isn't huge, this is also acceptable.

So far, so good, and when people logged on to find themselves in a queue with upwards of 20,000 people ahead of them, they will have been disappointed but not necessarily surprised. The online booking system assigned places randomly in the queue to all those who were logged on before the booking window opened, which is completely fair. The queueing system was then torturously slow; I moved up 1300 places - a third of the way up the queue - in an hour and a half. Messy and frustrating, but nothing worse.

Then - perhaps inevitably - rumours started swirling around Twitter of alternative locations to purchase tickets. ATG (the Ambassador Theatre Group, of which the Barbican is not a member) was often cited. Sure enough, with a few seconds wait, I was offered 4 tickets for a total of £269 including a "booking fee" of £4 per ticket and a single "transaction fee" of £3 (quite what the difference between a booking and a transaction is escapes me, but I'll let it pass).

Fact time: booking via the Barbican website, tickets for Hamlet cost "£30-£62.50 plus £3 online booking fee". They also mention that "a limited number of Premium Seats are available" (their capitalisation).

Something else to mention: the Barbican advised punters that the "best" place to buy tickets was its own website.


The ATG tickets available were all "Band A" - and stated explicitly that this was the top price £62.50 + £4 "booking fee" per ticket. They all appeared to be stalls tickets; let's not forget that the Barbican claimed that stalls were "sold out" before tickets even went on general sale. There was no way of choosing individual seats via ATG but anecdotally people on Twitter seemed to be getting hold of some very good tickets.

But the Barbican isn't a member of the Ambassador Theatre Group. It's owned by the City of London Corporation. So presumably the Barbican have simply sold a load of tickets for a show for which they knew there would be extremely high demand, so that ATG could sell them on at a premium.

Worse is to come.

At around 1030 the reputable theatre website WhatsOnStage.com - always a good source for listings, reviews and debate - tweeted that they had some tickets for sale. I followed the link and sure enough, they had tickets for sale for all nights. Once again there was no facility for punters to choose seats. The price: "from £78" on weekdays, and "from £119" at weekends (no mention was made at this stage of booking or transaction fees).

£119 is a 90% increase on the Barbican's top ticket price.

Fact time again: the Barbican have introduced special anti-touting measures for this production - the lead ticket booker needs to show photo ID.

To reiterate: £119 is a 90% increase on the Barbican's top advertised ticket price.

I tweeted WOS about this and got the following reply:

So presumably WOS are selling these "Premium Seats" with a 25% markup on...well, the Barbican don't mention prices so let's assume £95.50 (incidentally, ATG were selling "Premium Seats" for £99.50 which would make sense if their markup is £4 again).

An aside: "STAR" mentioned by WOS are the Society of Ticket Agents and Retailers. They do indeed mention 25% as the maximum generally acceptable markup.

To be clear, I don't have (much of) a problem with WOS or ATG; it would appear that they're playing within the rules of the system, even if a 25% markup on top of "Premium Seats" is pretty outrageous. They're businesses trying to make money. However, I do have a very big problem with the way the Barbican are dealing with this and the way they are allocating tickets for one of the most in-demand productions in recent years. To that end I have some questions for the Barbican:
  • Did all Barbican members who attempted to buy stalls tickets get them successfully? [**update - see comments below - if I was a member I would be livid]
  • Does the Barbican think it is hypocritical to introduce anti-touting measures whilst at the same time allowing tickets to be sold for 90% above the advertised top ticket price?
  • Can the Barbican, and indeed other venues such as the Old Vic who operate a system of "Premium Seats" (their capitalisation) admit that this is nothing but a ruse to inflate prices, as there is nothing "premium" or special about them - they are simply standard top price tickets to which a substantial additional sum has been added, presumably to encourage punters that those "standard" top price tickets are better value than they would otherwise appear? (This is classic behavioural economics).
  • What is the reciprocal arrangement between the Barbican and ATG [edit - see comments below]? Was the Barbican contractually obliged to sell what appears to be a substantial proportion of stalls seats to ATG, even though they could easily sell them out - probably several times over - themselves?
  • Why did the Barbican advise that the "best" place to obtain tickets was from their own website, while it actually appears that ATG was a far quicker and more reliable method? When all stalls and most circle seats have been given to agents, does that mean the Barbican site is "best"?
  • What does the Barbican stand to gain from selling off tickets to third party agents versus selling them via their own website? What is the point of going to the effort of a "fair" online ticketing system when agents can sell them however they want?
  • Does the Barbican feel that the process has been well managed overall?
** A couple of updates: it seems that ATG and theatrepeople.com are "official ticketing partners". Presumably they bought their tickets from the Barbican more cheaply than at the full retail price. Meanwhile, WOS have an article which summarises the popularity of the show whilst tactfully not plugging their own £119 tickets.

***Update 2: after a 3-and-a-half hour wait in the queue I did get tickets, and for a Saturday to boot. The stalls and circle are indeed being sold via agents only, but there is still decent availability at time of writing (1330 on Monday 11th) - it's a big venue with a long run! The queueing system provided by queue-it.net works fine, even if the wait is extremely long.

***Update 3: some very interesting comments below.

Wednesday, 23 July 2014

Waltz for Debby

This vocal version tops even the original. Listen to Mark Murphy's take on it (on the LP Satisfaction Guaranteed)...wonderful

In her own sweet world
Populated by dolls, and clowns, and a prince, and a big purple bear
Lives my favourite girl...
Unaware of the worried frowns that we weary grown-ups all wear

In the sun she dances to silent music
Songs that are spun of gold somewhere in her own little head.
One day - all too soon,
She'll grow up and she'll leave her dolls, and her prince, and her silly old bear...
When she goes they will cry as they whisper good-bye
They will miss her I fear but then so will I

Thursday, 15 May 2014

Social media listening vendors ALL have a responsibility to push for higher quality

A few months ago I happened to see a short social media insight report, written by a large, highly respected global research agency, for one of the world’s most iconic brands. It was very brief (5 slides in total) and formed part of a wider research report.

I was embarrassed for the vendor (not my own company, I hasten to add!). In those five slides were several claims so patently wrong that you wonder if anyone had their head screwed on when the report was written. They started by claiming that 99.9% of comments made on Facebook originated in the US – and that global mentions had a very heavy bias towards America as well.

They went on to paste in some automated sentiment charts which claimed that in some markets, social media reaction to the client’s highly entertaining, engaging promotional campaign was >97% neutral.

They also claimed that a sudden spike in online mentions of this major, engaging, global consumer campaign was due to coverage in a minor B2B magazine discussing a particular aspect of the production.

All of this – along with some other rather spurious claims – in five slides, lest we forget.

Let’s forget about the actual numbers for a minute. What concerns me is that the exec who wrote the report clearly never bothered to think about what the metrics meant – or to run a simple common sense test. Nor did the person who signed off the report. (It doesn’t reflect well on the client, either; did they not think to push back and ask what these numbers meant?) By all means report the numbers in good faith as provided by the tool you are using…but for goodness’ sake provide a footnote or caveat explaining the limitations. If reported “as fact”, anyone with an ounce of sense can rebut your findings.

Some basic understanding of how social media monitoring tools work can help explain those anomalies. These tools do their best with location detection – but it’s complex and far from easy to get right, and also platform specific. Facebook barely give away any metadata – so in most cases monitoring tools simply pick up the fact that Facebook.com is registered in the US and run with that. Similarly, automated sentiment tools tend to dump data in the “neutral” bucket if they aren’t sure – which depending on the dataset and language can often mean that almost everything is marked up as being neutral. As for the claim about the B2B magazine…I can’t explain that without seeing the raw data, but I’d imagine it’s due to duplicate mentions in the data.
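To make the "neutral bucket" point concrete, here is a minimal Python sketch of how a cautious classifier could end up marking almost everything as neutral. It is not how any particular tool works: the polarity scores, the `threshold` value and the function name are all invented for illustration.

```python
from collections import Counter


def label_sentiment(score: float, threshold: float = 0.6) -> str:
    """Turn a polarity score in [-1, 1] into a label.

    Hypothetical rule: anything the scorer is not confident about
    (absolute score below `threshold`) falls into "neutral".
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Invented scores for ten mentions; most are mildly positive or negative,
# but only two clear the confidence threshold.
scores = [0.9, 0.4, 0.3, -0.2, 0.1, 0.5, -0.4, 0.2, -0.7, 0.0]
labels = Counter(label_sentiment(s) for s in scores)
neutral_share = labels["neutral"] / len(scores)

print(labels)
print(f"neutral share: {neutral_share:.0%}")
```

With a threshold this strict, eight of the ten made-up mentions land in "neutral" even though most of them lean one way or the other, which is the sort of thing a footnote in the report should explain.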

I cite this specific example because I was frankly appalled at what a shoddy job this highly respected agency had done. But it’s representative of an endemic problem with poor-quality social media insights and monitoring – rubbish being peddled by technology suppliers and agencies is being met with client-side ignorance, resulting in an acceptance of poor findings…until somebody more senior does a review, realises the findings from social media are weak and/or unreliable, and blames the approach in general rather than specific failings. All this leads to a widespread mistrust in social media listening/insights. The damage doesn’t need to be done; it does need a little common sense, a willingness to go further than merely pasting charts directly from a tool without some sort of sense checking and interrogation of the data where appropriate, and some basic caveating and management of expectations. Most anomalies can be explained.

Social media research is a crowded space, and competes with many other emerging techniques for a share of limited client budgets. It is incumbent on all suppliers to push for better standards – as otherwise the mistrust can only grow and buyers will take their money elsewhere.

Thursday, 27 March 2014

4 social media research challenges to overcome when tackling live debates

As we approach the final furlong of the race for the Scottish Independence referendum, with another General Election close behind, much excitable talk bubbles up once again about using social media as an election predictor. With the current fashion for presidential-style election debates, those are under the social media analysis spotlight too, with Twitter and other platforms providing a source of instant feedback and soundbites - cheaply or for free. Media organisations, research companies, political parties and casual observers alike all feast on instant statistics about who has "won". Needless to say, live debates provide a snapshot of how social media can give large-scale instant feedback - something which tickles the fancy of insight departments in companies and organisations the world over.

Last night's EU debate on LBC between Nigel Farage and Nick Clegg was a good canvas to show how there are significant challenges to such an approach. To demonstrate why, I set up a quick search for the hashtags #NickvNigel and #LBCdebate, using social media monitoring tool Brandwatch. Incidentally, this isn't a tirade against such tools, which do exactly what they're supposed to. Instead, it's a call to arms: to make this data meaningful, we need to think very carefully about the context of such data, to clean it appropriately, and to treat it with extreme caution. If we take necessary steps, which may involve cutting out substantial proportions of the data, we may be able to get meaningful results.

The Blurrt "worm"

The LBC website has a "worm", courtesy of Blurrt. Sadly at time of writing the LBC website was creaking and the worm wasn't visible at all during the debate itself. All that was visible was the phrase "The requested URL /graphs/sentiment/ was not found on this server." The bolded word leaves me sad, but not as sad as the "how it works" page, which gives no information whatsoever on the methodology and a lot of explanation of some basic sampling theory - dressed up in such a way as to make it look intimidating to a non-technical audience whilst still explaining nothing useful. There is certainly a place for real-time analysis (although as Francesco D'Orazio points out succinctly, "If you can’t make decisions in real time there is no point in using real-time intelligence"); that real-time analysis must inevitably depend largely (or solely) on technology. As an advertisement for robust social media analysis, however, this is flawed, flawed, flawed.

There are several challenges which we need to consider.

1. Using hashtags as search terms

As this was a casual exercise, I kept my search term simple, starting with #NickvNigel (simply because this was the one appearing on my own Twitter feed) and later adding #LBCdebate, which I only spotted once it was mentioned by Nick Ferrari 10 minutes into the debate itself - a good thing I did, as #LBCdebate turned out to be the dominant hashtag.

This brings up one potential issue - retrospective data, which may not always be complete depending on how it's coming from Twitter. 

But there's a more fundamental problem. Almost by definition, the use of a hashtag implies prior knowledge of its existence, and generally also implies an affinity for the topic, and possibly good connections with others close to the topic. The casual LBC listener stumbling across the debate who chose to comment - very likely the unpartisan "floating voter" who we are so anxious to identify - will be unlikely to be found here. There are parallels in commercial social media research, too; do real people use hashtags like #danceponydance, or do they just talk about "the T-mobile ad"? (Hint: that's actually not a good example, as it's a rare occurrence of a campaign that has really taken off in social media. Much to my advertising research colleagues' frustration, not to mention that of my clients, the reality is that most campaigns barely get talked about at all.)

Should we go with the easy option, or try to look at all tweets from the period referring to Clegg or Farage? Had I done the latter, the results might have been very different.

2. Coding: far from trivial

I dived in and manually coded 198 tweets. Simple, right? Not at all. There are myriad ways of doing this. This was a quick-and-dirty exercise on my part, but it's worth jotting down some of my assumptions, because even a quick-and-dirty bit of coding can rapidly prove a head-scratcher. I'm not claiming this is the "right" way to go about things! On the contrary, there are probably approaches which are far better, and some of my assumptions are probably way off the mark. For example, I could have focussed purely on tweets which made reference to the debate performance itself ("Farage is winning", "Clegg sounds nervous", etc).

I started by taking a sample of tweets using either hashtag, between 1900 (the start of the debate) and 2100 (an hour after the finish). The time period is arbitrary. My code frame was very simple: "Clegg", "Farage" or "neither". Broadly speaking, I defined "Clegg" as any tweet saying either something good about Clegg or something bad about Farage, and "Farage" vice versa; "neither" was any comment which gave nothing away. Any retweet of an official party account I automatically set to being "for" that party (mercifully both Labour and Tory HQ seemed to be very quiet); retweets of mainstream news accounts, without added comment, I set to "neither" unless the tweet reported something obviously critical. This approach was pretty self-explanatory to begin with, but there were snags aplenty.
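For anyone wanting to replicate the mechanical parts of that set-up, a rough Python sketch follows. It only covers the sample frame (hashtag plus time window) and the automatic retweet rules; everything else still needs a human reading the tweet. The account handles, the `Tweet` fields and the function names are hypothetical, and the date is simply inferred from "last night" relative to this post.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

HASHTAGS = {"#nickvnigel", "#lbcdebate"}
# Debate evening inferred from the post date; the window itself is arbitrary.
WINDOW_START = datetime(2014, 3, 26, 19, 0)
WINDOW_END = datetime(2014, 3, 26, 21, 0)

# Hypothetical handles, standing in for official party and news accounts.
PARTY_ACCOUNTS = {"@example_libdems": "Clegg", "@example_ukip": "Farage"}
NEWS_ACCOUNTS = {"@example_news"}


@dataclass
class Tweet:
    text: str
    posted_at: datetime
    retweet_of: Optional[str] = None  # handle of the retweeted account, if any
    added_comment: bool = False       # True if the retweeter added their own words


def in_sample_frame(tweet: Tweet) -> bool:
    """True if the tweet uses either hashtag and falls inside the time window."""
    has_tag = any(tag in tweet.text.lower() for tag in HASHTAGS)
    return has_tag and WINDOW_START <= tweet.posted_at <= WINDOW_END


def default_code(tweet: Tweet) -> Optional[str]:
    """Apply the automatic retweet rules; return None when a human must decide.

    The "obviously critical news report" exception is not modelled here, so a
    "neither" from this function is provisional and still worth a glance.
    """
    if tweet.retweet_of in PARTY_ACCOUNTS:
        return PARTY_ACCOUNTS[tweet.retweet_of]
    if tweet.retweet_of in NEWS_ACCOUNTS and not tweet.added_comment:
        return "neither"
    return None
```

Keeping `None` as a distinct outcome makes it obvious how much of the sample was coded by rule and how much by judgement.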

This tweet is clearly making a political point, but for which side?
How about this?
Or this?

(For reference, I coded those as "neither", "Clegg" and "Clegg" respectively, but I wouldn't quibble with anyone who coded them differently).

Other tweets, meanwhile, needed a good look at the context and/or embedded media/links to make an educated decision - this one is clearly pro-Farage:

3. Are opinions representative of Twitter? Of the wider population? Even of the tweeters talking about the issue?

Coding social media verbatim is tricky at the best of times and whether a manual, automated or machine-learning approach is taken, clearly needs a lot of thought. However, even if we assume an optimal coding strategy, there's a deeper-seated problem, and this comes back to the question which old-school market researchers always ask about social media data: But is it representative?

When asked that question, I generally fall back on a standard response: "Probably not...but does it matter?" There are so many unknowns, but survey respondents aren't exactly representative either ("yes, of course I'll spend 45 minutes for little or no reward answering questions about my mortgage provider").

The problem is not a question of demographic representativeness, but more "to what extent do the views expressed on tweets represent the views on Twitter?" The first and most obvious point is that people only tweet about stuff they care about. Hence we'll have to stick with surveys for our mortgage provider research. Do the tweets represent the underlying opinions? Probably not - it's only the things that delight/outrage people the most that actually get posted. People don't necessarily offer up unprompted opinions unless they feel the need to broadcast them.

But studying political tweets is even more problematic.

4. Activists dominate proceedings

Of the 198 tweets I analysed, 153 gave some sort of opinion one way or another. I looked at the profiles of these 153 tweeters to see if I could find anything out about them. A Twitter profile gives you 160 characters to define yourself. After going through a few, it seemed to me that they could be divided into four categories:
  • Activist
  • Politician
  • Journalist
  • Other
I decided to code anyone as an "activist" whose profile showed an obvious leaning towards a particular political party or ideology. My reasoning was that anyone who uses up some or all of their 160 character bio to state their political leanings would be likely to be pretty dyed-in-the-wool. Some were a grey area: there were plenty who were self-described as "interested in politics" who I coded as "other", while anyone who said things like "socially liberal" or "Europhile" I placed in the "activist" bucket. "Politician" means anyone whose bio states that they are an MP, MEP, Councillor and so on; prospective candidates were problematic, although anyone who was borderline would end up in the "activist" category anyhow. "Journalists" were mostly self explanatory.
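I did this by eye, but the same rules could be roughed out as a keyword pass over the bios, with anything ambiguous still checked by hand. The sketch below is only that: the keyword lists are partly taken from the examples above and partly invented, and the order in which categories are tested is my own assumption.

```python
import re

# Terms from the description above; a real list would need to be much longer.
POLITICIAN_TERMS = [r"\bmp\b", r"\bmep\b", r"\bcouncillor\b"]
# "activist" is an invented extra term; the other two come from the examples above.
ACTIVIST_TERMS = [r"\bsocially liberal\b", r"\beurophile\b", r"\bactivist\b"]
# Entirely hypothetical terms for spotting journalists.
JOURNALIST_TERMS = [r"\bjournalist\b", r"\breporter\b"]


def categorise_bio(bio: str) -> str:
    """Assign a Twitter bio to one of the four categories by keyword match."""
    text = bio.lower()

    def matches(patterns):
        return any(re.search(pattern, text) for pattern in patterns)

    # Assumed precedence: politician, then journalist, then activist.
    if matches(POLITICIAN_TERMS):
        return "Politician"
    if matches(JOURNALIST_TERMS):
        return "Journalist"
    if matches(ACTIVIST_TERMS):
        return "Activist"
    # "Interested in politics" and everything else lands here.
    return "Other"


print(categorise_bio("MEP for an example region"))
print(categorise_bio("Europhile, dad, runner"))
print(categorise_bio("Interested in politics and cricket"))
```

A pass like this would miss plenty (party logos, slogans, sarcasm), so it is better treated as a first sort than a replacement for reading the profiles.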

The breakdown of "opinionated" tweeters is as follows:
No less than 36% of the tweets were written (or retweeted) by tweeters who were self-described as being politically polarised*, with another 3% being journalists.

Does that skew our sample? Of course it does - massively. There is a substantial minority of politically savvy, active cyberwarriors sticking up for their man. It's true of the #IndyRef debate as well. Never mind the demographic breakdown of Twitter - it's the propensity of people to tweet about what matters to them that is more important. The sample is biased away from casual listeners and floating voters, and towards a polarised, politically charged audience. Shortly before the debate began, Lib Dem Digital Communications lead Bess Mayhew sent out an email to supporters which said "LBC are running a “Twitter worm” which tracks who is winning the twitter battle. Nick needs your help to come out on top, so lets get tweeting!" In a world increasingly judged in this way, groups will always look for ways to game the system.

There's one further consideration to take into account which I've also not dealt with here - multiple tweets by the same person. As an example, Peter Chalinar (@TaleahPrince) tweeted nearly 200 times yesterday about the debate (mostly retweets of others) - mostly strongly in favour of Farage, whilst Lib Dem MEP Rebecca Taylor notched up nearly 150 tweets. While neither of them turned up in my sample of 198, there were several people whose tweets appeared twice. De-duplicating authors is another step in social media analysis which might want to be taken, depending on the objectives.
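De-duplicating by author is simple enough to sketch in Python. This version keeps the first tweet seen from each account and drops the rest; that is one possible rule among several (one could equally keep the last tweet, or weight each author's tweets so they sum to one). The handles and tweet texts are made up.

```python
def one_tweet_per_author(tweets):
    """Keep the first tweet from each author, preserving the original order.

    `tweets` is a list of (author_handle, text) pairs; handles are compared
    case-insensitively.
    """
    seen = set()
    kept = []
    for author, text in tweets:
        key = author.lower()
        if key not in seen:
            seen.add(key)
            kept.append((author, text))
    return kept


# Invented sample: one account appears three times.
sample = [
    ("@example_a", "Farage on form tonight #LBCdebate"),
    ("@example_b", "Clegg has the facts #NickvNigel"),
    ("@Example_A", "RT: another point to Farage #LBCdebate"),
    ("@example_a", "And again #LBCdebate"),
]

deduped = one_tweet_per_author(sample)
print(len(sample), "tweets from", len(deduped), "authors")
```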

* Of course it could be argued that anyone tuning into an hour-long programme on a political issue that isn't even considered to be in the top 10 issues facing Britain today according to the Ipsos MORI issues index would be likely to be a bit of a politics nut anyhow. 

So what about the results?

What about them? Hopefully I've demonstrated that without some careful methodological thought, the results are pretty meaningless, and my own system was not thought through in detail - I simply wanted to point out some issues. For the record, the Blurrt worm seems to have done reasonably well at picking up sentiment expressed towards particular issues as the debate went on, and called it overall in favour of Farage, mirroring the snap YouGov poll taken immediately after the debate. My own results were rather different:

Topline figures
Clegg 44%
Farage 33%
Neither 23%

Ignoring the "neithers", this boils down to
Clegg 58%
Farage 42%
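
The re-basing step is just a change of denominator, sketched here in Python. I haven't published the raw counts, so the ones below are invented: they are merely one set of numbers that adds up to 198 tweets (153 of them opinionated) and rounds to the percentages above.

```python
def shares(counts, exclude=()):
    """Percentage share of each code, optionally dropping some codes first."""
    kept = {code: n for code, n in counts.items() if code not in exclude}
    total = sum(kept.values())
    return {code: round(100 * n / total) for code, n in kept.items()}


# Hypothetical raw counts, chosen only to be consistent with the rounded figures.
counts = {"Clegg": 88, "Farage": 65, "Neither": 45}

topline = shares(counts)
rebased = shares(counts, exclude={"Neither"})
print(topline)
print(rebased)
```

Note that re-basing from the already rounded 44% and 33% would give roughly 57/43 rather than 58/42, which is why it is safer to work from counts than from rounded percentages.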

What about if we exclude politicians and activists from our sample? This reduces the sample of opinionated views from unpolarised people down to a rather meagre 94 (less than half of our original sample size).

As it turns out, and somewhat to my surprise, there was actually very little effect, with the results now amended to

Clegg 55%
Farage 42%

Perhaps implying that the cyberwhipping on both sides was equally effective.

How do I explain the discrepancy between my own results and the worm (and indeed the poll)? It's hard to say. There were a few hashtag "hijacks" - people talking about issues which came up in the debate which were not directly related to the EU; notable examples included Scottish independence and gay marriage, where there were several tweets critical of Farage - by my own rules I coded these as "wins" for Clegg but perhaps these could have been excluded from the sample or coded as "neither". There were several tweets reporting the YouGov poll result which I categorised as neutral as they were merely reporting the mainstream media outlet - I could have coded these as being for Farage, which would have boosted his score a few points. Other than that, there are so many variables that I find it difficult to pinpoint.

Perhaps Sky's primitive method was best?

Sky News opted for a simple approach - they posted a couple of tweets, one in favour of Farage, one for Clegg, and asked for retweets to endorse. This direct approach - closer to a traditional market research technique - might work better in such circumstances, and indeed this was in line with the poll (and the worm):

Where does this leave political social media analysis?

Overall, then, I believe there are multiple issues with political social media samples, although with appropriately thoughtful handling I do think these issues can be overcome. There is certainly a place for fast-turnaround or real-time analysis which presents significant challenges, although once again these are not insurmountable. Watch out for the next debate on the BBC, for which no doubt there will be more furious analysis and debate.