Fundamentals Come First

After a surprise fifth-place finish, Rand Paul dropped out of the presidential race. My scrapings suggested that Paul had a strong Twitter presence, which could actually mean something if it connected with the result in Iowa. That is a big ‘could.’

Money definitely matters. Paul had little of it, with his campaign running on just over one million and the super PAC on just over four. This is not a well-capitalized operation. Paul never reached his father’s level of support, and no wild-eyed ground game was behind him.

On the ground in Kentucky, he faces a primary and then a strong Democratic challenger from Lexington. Kentucky Republicans delivered on their promise to cut “Obamacare” by ending the state’s extremely popular Kynect program. Combined with growth in Lexington and Louisville and the polarization of the Evangelical vote, it seems that the safe place for a Southern Libertarian may have become quite tight.

Paul had a fairly robust social media impact for a candidate with his fundamentals. It didn’t translate into a win. Trump has an outsized social media impact compared to Cruz; this translated into a near tie for third. Neither polls nor tweets effectively predicted the outcome. All measures are suspect.

Super Tuesday is less than a month away. There is no chance that any of these candidates will be effective in building a ground game in each state. The question is, will Paul voters head for any of the remaining candidates, or will they head toward Sanders as Penn and Teller recommend?

Iowa Caucus Predictions: Twitter Loves Rand Paul

The Iowa Caucuses are tonight. Here are my predictions, based on Twitter data from a time-bound (during the town hall) and space-bound (just Iowa) dataset.

Clinton and Sanders are running relatively even. There is a lot of Twitter heat for Sanders, although much of it comes through retweets. Methodologically, this is suspect given the use of Twitter as a transmit medium during media events. Much of the Twitter activity is at least topically about Clinton. The most popular Tweets seem to turn on reclaiming the name ‘socialist.’

The volume of Sanders retweets is difficult to process. Thousands of lines of my sample (100,000) are taken up by the same tweets. The near lack of meaningful user content is shocking; over these six months of research, it seems like Twitter is less democratic by the week.

The Republican ticket is more difficult to predict. In this dataset there is something striking – between the mass Sanders and Clinton retweets, there is a Republican: Rand Paul. Not Trump. Paul.

Why 4 Deep?

The use of the expression 4Deep in my data refers to the number of layers of interaction I have built into my social network of the hashtag. Because of limitations in my computing power, it is difficult to get the data as clean as I would like. I will be revisiting this dataset (and approach) in July to render the entire election on Twitter.
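
One way to read "layers of interaction" is as hops outward from the accounts that used the hashtag. The following is a minimal sketch under that assumption, not the pipeline actually used for this dataset: `expand_network`, the `edges` list, and the seed handle are all hypothetical names made up for illustration.

```python
from collections import deque

def expand_network(edges, seeds, max_depth=4):
    """Return {handle: layer} for every handle within max_depth hops of the seeds."""
    # Build an undirected adjacency map from (source, target) interaction pairs.
    neighbors = {}
    for source, target in edges:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand past the last layer
        for other in neighbors.get(node, ()):
            if other not in depth:
                depth[other] = depth[node] + 1
                queue.append(other)
    return depth

# Hypothetical interaction pairs (mentions or retweets), purely for illustration.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
layers = expand_network(edges, seeds=["a"], max_depth=4)
```

Capping the depth is what keeps the graph tractable on limited hardware: each extra layer can multiply the number of handles that have to be stored and drawn.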

Who won the #DemDebate on Twitter?

The final Democratic debate before the Iowa caucuses was held on Sunday night. Post-debate questions almost always include: who won? Aggregated polling methods have proved to be a good tool for shifting journalism toward more productive questions. Unfortunately, aggregated polling is more difficult in these elections. We can try to get some leverage on the question of “who is winning Twitter,” although if you have followed my research, you know that judging the disposition of Twitter as a whole is a dubious enterprise.

Twitter conversation related to the debate was underwhelming. The balance of retweets to tweets was two-to-one, which is not surprising given that Twitter tends to become a broadcasting medium during crises, rather than a dialogic medium. So, how do we get a sense of the temperature of Twitter when it is so erratic?
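
For anyone who wants to check a retweet balance on their own collection, here is a rough sketch. It assumes that retweets can be spotted by a leading "RT @" in the tweet text, which is a simplification; the function name and the sample texts are hypothetical.

```python
def retweet_balance(texts):
    """Return (retweets, originals), treating a leading 'RT @' as a retweet."""
    retweets = sum(1 for text in texts if text.startswith("RT @"))
    originals = len(texts) - retweets
    return retweets, originals

# Made-up sample texts, for illustration only.
sample = [
    "RT @someone: watching the debate",
    "RT @someone: watching the debate",
    "RT @another: follow our live tweets",
    "RT @another: follow our live tweets",
    "Who is winning this thing? #DemDebate",
    "Not much policy tonight #DemDebate",
]
retweets, originals = retweet_balance(sample)
ratio = retweets / originals  # retweets per original tweet
```

A text prefix misses retweets that were edited or quoted, so a count like this is a floor rather than an exact figure.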

Here is the first dendrogram:

debate dendro

Here is the problem: most of these posts just seem to include the names of candidates. Not policy positions. As a general overview, Sanders was the subject of roughly one-thousand more Tweets than Clinton.

The retweet leader was from Donald Trump, “Notice that illegal immigrants will be given ObamaCare and free college tuition nothing has been mentioned about our VETRANS”

Followed by Sanders: I got into politics not to figure out how to become President. I got into politics because I give a damn.

Then a Sanders quote: “I believe in a society where all people do well, not just a handful of billionaires.” – Bernie

Aside from Huckabee attempting a racist joke and the Trump veterans argument, the top of the debate retweet stack was Sanders-heavy.

Much of the retweet activity came in repeated calls to follow a live-tweet or sign-up for Clinton’s text message update plan. If we are judging by which selection of retweets named someone the most, a sort of emotional expressive politics, Sanders won. Although, parsing robots and retweets could just as easily mean that a server somewhere won.

Even when using Twitter’s metadata to sort retweets, most of the original content appears to be unoriginal: many uncoded retweets and other such noise. Even in the ostensibly original content, a third mentions Sanders, and roughly a quarter mentions Clinton. Using immigration as a proxy for issue engagement, only 41 of the 11,000 original tweets contain that string. ISIS appeared 70 times. It seems that the bulk of the material in this original content section consists of declarations of support for one candidate or the other.
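
String counts like these are easy to reproduce. The sketch below is an assumption about how such a count might be done, not the exact procedure behind the numbers above: `term_counts` and the sample texts are hypothetical, and matching is a plain case-insensitive substring test.

```python
def term_counts(texts, terms):
    """Count how many texts contain each term as a case-insensitive substring."""
    lowered = [text.lower() for text in texts]
    return {term: sum(term.lower() in text for text in lowered) for term in terms}

# Made-up original tweets, for illustration only.
originals = [
    "Sanders is right about immigration reform",
    "Clinton on ISIS tonight",
    "Go Bernie! #DemDebate",
    "Immigration barely came up #DemDebate",
]
counts = term_counts(originals, ["immigration", "isis", "sanders"])
shares = {term: n / len(originals) for term, n in counts.items()}
```

At the scale reported above, 41 matches in 11,000 tweets works out to well under half a percent. Note that a bare substring test will also count "crisis" as a hit for "isis", so short terms deserve a manual check.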

demdebate4corners

When structured as a network, there are a few clear cores of activity. The most powerful individual nodes are the conservative jjauthor and former Fox personality Steven Crowder. Nearly tied with Crowder was People4Bernie, followed by a cluster of Hillary Clinton, The Democrats, SandraALTX, PoliticalMiller, Hillary4Florida, GlennHeiser, YouTube, RandPaul, and ZaidJilani. BernieSanders and HillaryClinton originals (rather than retweets) appear in this region as well. After this group there is a rapid fall in the centrality of any given node.

twocommunities

The green and red modularity classes are only roughly an eighth of all those the computer detected in this network. In short, there are tens of thousands of people talking to each other, with very little meaningful network control. Unless you are a Conservative author, in which case you have something.

People4Bernie clearly was the strongest handle in the network flow, although there were also strong Clinton handles. This high eigenvector score suggests that Sanders could more effectively seed information into the conversation by way of his supporters.
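
For readers unfamiliar with eigenvector scores: a node scores highly when it is connected to other nodes that also score highly. The post does not say which tool produced the scores here, so the following is only an illustrative sketch using power iteration on an undirected graph; the function name and the toy edge list are hypothetical.

```python
def eigenvector_centrality(edges, iterations=200):
    """Approximate eigenvector centrality of an undirected graph by power iteration."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    score = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        # Each node's new score is its own score plus its neighbors' scores.
        # Keeping the node's own score stops the iteration from oscillating
        # on bipartite graphs and does not change the final ranking.
        updated = {
            node: score[node] + sum(score[other] for other in neighbors[node])
            for node in neighbors
        }
        top = max(updated.values())
        score = {node: value / top for node, value in updated.items()}
    return score

# Toy network: one hub, three direct contacts, and one second-hand contact.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
score = eigenvector_centrality(edges)
ranking = sorted(score, key=score.get, reverse=True)
```

In the toy network the hub ranks first, and "c" outranks "a" and "b" even though all three touch the hub, because "c" has an additional connection of its own.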

So, who won?

The Sanders organization had great strength in controlling message flow, although Clinton also had a good bit of traction. It is fascinating that from this perspective the most retweeted (Trump) has almost no meaningful centrality to this network. In short, Sanders was running Twitter, but that may not mean much.

Don’t try to start a Twitter wire service

Part of my research includes the scraping and visualization of bulk Tweets. I end up seeing a lot of sentences this way, far more than I will ever actually read. I use Mallet to text mine my Twitter collections.

I have noticed a number of groups attempting to create what appear to be wire services either with regional handles or candidate specific Twitter streams, such as the candidate_news_network and POLS. These are messing up my analysis as they pump huge volumes of uniform text with little relevance.

Here is my line on these enterprises: they are a false start at best and generally junk.

This is what Wikipedia calls a junk pile.

Does anyone really care what a brand new wire service writes in a microblog format? I can’t believe that any real fan of a candidate needs a curated Tweet stream when they can have the real thing.

This is Twitter’s problem in a nutshell: the news services are a poor substitute for a poor substitute for real news and analysis. Are these services intended for “novice” users? Newsflash: there aren’t many of those. Twitter is slowly burning out, not building new. My best guess is that these news services are intended to accumulate followers and then sell out. Sort of like Twitter should have long before that IPO thing.

What does this mean for me? Finding ways to clean this stuff out of my dataset, blerg.

Person of the Hour?

The basic cluster dendrogram over the last three weeks reveals three distinct sub-topics.

There are three distinct clusters.

The first category is a widely re-tweeted message:

#WakeUpAmerica✅DEMAND✅VOTER✅IDENTITY✅ INTEGRITY#TCOT#YCOT#PJNET#COSProject#Election2016

There are slight variations that include some other references such as “@truethevote,” but the general purpose of this category seems to be a claim related to voter fraud. The poster is a part of an active Twitter network that circulates Tea Party related messages. This network is very small and highly active; if I re-modeled to exclude retweets, it would likely fall away entirely. There is little doubt that this network is quite artificial. This particular network seems to like Trump and Carson, and strongly dislike Clinton, Rubio, and Bush. They don’t seem to have a lot of activity related to Fiorina or Sanders.

The second topic seems to be a cluster on other candidates, with one side twisting toward Walker and Sanders. Fiorina has an entire sub-section, with no reference outside of that sub-section.

The last cluster of the dendrogram includes references to Bush, Trump and Carson.

Clearly there is a strong anti-Clinton sentiment and a strong Tea Party resonance, but the real flow of Tweets seems to be related to the idea of a pure populist rage that occasionally includes references to particular candidates. This is not candidate advocacy.

I may need to find a new approach to cleaning the data…

Raw Twitter Activity is NOT a Poll

I have been actively tracking election activity on Twitter for two weeks now. The impact of strategic communication on Twitter is very clear. There are sockpuppets and shills everywhere. They eclipse organic use of the network. Here is a chart:

Dendrogram with no cleaning

Well, I guess this is what you see when hundreds of users retweet the same content basically simultaneously and the entire network seems designed around communicating one perspective. That chart is garbage. Inferences made based on that chart are highly suspect.

It’s cleaning time.

Some of the first users I removed were clearly sockpuppets. One user sends dozens of posts per hour including pictures from Victoria’s Secret, Bob Marley, popular image macros (such as keep calm memes), and retweets of popular figures like Joel Osteen, along with some libertarian content. This seems like it could be the work of a robot as it maps what would be popular content.
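
A first pass at this kind of cleaning can be automated by flagging accounts that post at an implausible rate. This is a sketch of the idea rather than the exact filter I applied: the threshold of 24 posts in a clock hour, the function name, the handles, and the timestamps are all hypothetical.

```python
from collections import Counter
from datetime import datetime

def flag_high_volume(posts, max_per_hour=24):
    """Return handles that exceed max_per_hour posts within any single clock hour."""
    per_hour = Counter(
        (handle, when.replace(minute=0, second=0, microsecond=0))
        for handle, when in posts
    )
    return {handle for (handle, _), count in per_hour.items() if count > max_per_hour}

# Made-up (handle, timestamp) records: one account posting every minute for
# half an hour, and one account posting twice in an afternoon.
posts = [("spammy", datetime(2015, 8, 20, 14, minute)) for minute in range(30)]
posts += [
    ("casual", datetime(2015, 8, 20, 14, 5)),
    ("casual", datetime(2015, 8, 20, 16, 40)),
]
flagged = flag_high_volume(posts)
cleaned = [post for post in posts if post[0] not in flagged]
```

A rate threshold only catches the noisiest accounts. Anything it flags still deserves a look by hand, since a human live-tweeting an event can also post dozens of times in an hour.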

Sorting out what it means for such a large percentage of links to include rich media is also difficult. Could half of people want to share an image with every conversational tweet, or is this just another indicator that this is an artifact of autotelic speech?

Here is a cleaner chart:

Cleaner

Suddenly Scott Walker is everywhere because of an extreme position on immigration, Trump is in the pack, and the polls don’t seem to matter. The top two clusters are still a bot-retweeted tweet and a counter-effort to that tweet. This is not an organic conversation network.

In short, Twitter is not a poll. Twitter is a microbroadcasting network utilized by a variety of groups to create the illusion that they enjoy broad popular support. Although this will not curtail research on Twitter, or the coverage of Twitter by the legacy media, it is important to remember that it isn’t a transparent window into the social now, but a contested rhetorical field.

The alternative to considering Twitter as a rhetorical struggle is to suppose that a spammer retweeting pictures from Victoria’s Secret and cool pictures of guitars is an expression of genuine politics.

What is the purpose of this project?

Twitter is important. Facebook is super-duper important.

Or at least it is important to people working in the popular press. The characterization of Twitter activity in campaigns and elections is haphazard at best. Often, Twitter provides an unlimited reserve of backing for different characterizations of the public mood. The volume of Tweets, their interaction, or their representativeness is less important than their capacity to prove basically any argument right.

Perhaps most annoying, and telling, is the common journo phrase, “they even coined a hashtag,” used as a marker of the importance of a social moment or movement. Creating a hashtag is a joke. It may be the easiest thing in the world.

Aggregative research, similar to that of Sam Wang or Nate Silver, is not really an option in understanding the flow of information across social networks. To gain access one would need an “internal champion,” and likely a research agenda that would benefit a large social network site. I have neither of these; neither connections nor extensive resources. Using multiple, redundant approaches, I will be mapping social network activity related to the 2016 election and putting regular updates, graphics, and working theories on this site. Resources built as a part of this project will be available for collaboration with other academics and popular press writers. In addition to providing data, I will be building graphics both using traditional design software (Sketch is my go-to drawing program) and d3.js resources.

In short, this page is a hub for information related to empirical analysis of actually existing social network information flows related to the 2016 election.