This post is in response to a post by Mark Liberman that appeared on the blog Language Log. You can see the original post by clicking here.
Well, I can't say I've been called a "flack" before, but I guess readers can decide if it's deserved.
Given that the top company value at SpinSpotter is transparency, I'll be as open as I can possibly be in responding to Prof. Liberman's post. It's probably best to start with a detailed description of what SpinSpotter actually does.
SpinSpotter consists of three key parts:
1) An advisory board of prominent journalists from across the political spectrum who set objective rules for what constitutes “spin” in journalism.
2) A guided form of crowd-sourcing that steers users to operate strictly within the rules handed down by the Journalism Advisory Board.
3) A computer algorithm on the back end that aggregates and analyzes users’ input to isolate the words or phrases that, at any particular point in time, constitute the most consistently egregious instances of spin. The algorithm then leverages this knowledge across the Web by pre-marking those words or phrases wherever they appear, and asking users to rate the spin as significant or not significant within the context of the article they're reading. So, for example, if a phrase such as "third Bush term" begins to appear repeatedly, is consistently rated as "High" spin, and consistently falls under the same rule violation, the algorithm will begin marking it as spin and inviting users to determine whether, within each particular context, it is in fact being used as spin.
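To make the third part concrete, here is a minimal sketch of the kind of promotion heuristic described above. The threshold values and function name are my own illustrative assumptions, not SpinSpotter's actual parameters:

```python
from collections import Counter

# Hypothetical thresholds -- not SpinSpotter's real values.
MIN_OCCURRENCES = 5   # phrase must appear repeatedly
MIN_CONSISTENCY = 0.8 # ratings and rule violations must be consistent

def should_premark(ratings, rule_violations):
    """Decide whether a phrase should be pre-marked as likely spin.

    ratings: user ratings for the phrase, e.g. ["High", "Low", ...]
    rule_violations: the rule ID each user attached to their marker
    """
    if len(ratings) < MIN_OCCURRENCES:
        return False
    # Consistently rated "High" spin...
    high_ratio = ratings.count("High") / len(ratings)
    # ...and consistently falling under the same rule violation.
    _, top_count = Counter(rule_violations).most_common(1)[0]
    same_rule_ratio = top_count / len(rule_violations)
    return high_ratio >= MIN_CONSISTENCY and same_rule_ratio >= MIN_CONSISTENCY

# Example: "third Bush term" rated High by 9 of 10 users, mostly under one rule
print(should_premark(["High"] * 9 + ["Low"],
                     ["opinion-as-fact"] * 9 + ["disregarded-context"]))
```

Once a phrase crosses these thresholds, the system would pre-mark it everywhere and let users confirm or reject it per article, as described above.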
Though there are some complexities to what the algorithm does, it's not what you would call high science. We do have hopes for expanding the role and sophistication of algorithms in our system down the road, to include at least some limited ability to infer spin on their own. But we realized early on that the state of the technology in this area, even using the latest in NLP (Natural Language Processing) or LSA (Latent Semantic Analysis), simply isn't yet up to the task.
So the primary focus of our development efforts has been on developing a very guided form of crowd-sourcing - a system that, once all the kinks have been worked out, rewards objectivity, encourages participation, and makes it as difficult as possible to "game" the system. We recruited a world-class mathematician, Dr. C. Andrew Neff, to help us with this. He has been focused on the development of our trust engine, which is the set of calculations that determine the extent to which any particular user's comments will be seen by other users and influence the back-end inference algorithm.
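The trust-engine idea described above can be illustrated with a simple weighted average, where a user's trust score scales how much their rating counts toward the aggregate. This is purely my sketch of the general concept; Dr. Neff's actual calculations are not public:

```python
# A minimal sketch of trust-weighted aggregation. Trust values and the
# weighting scheme here are illustrative assumptions only.

def weighted_spin_score(markers):
    """markers: list of (user_trust, rating) pairs, each in [0, 1].

    Users with higher trust scores have proportionally more influence
    on the aggregate score fed to the back-end inference algorithm."""
    total_weight = sum(trust for trust, _ in markers)
    if total_weight == 0:
        return 0.0
    return sum(trust * rating for trust, rating in markers) / total_weight

# Two trusted users rate a phrase as spin; one low-trust user disagrees.
score = weighted_spin_score([(0.9, 1.0), (0.8, 1.0), (0.1, 0.0)])
print(round(score, 2))  # the low-trust dissent barely moves the score
```

The design goal such a scheme serves is the one stated above: rewarding objectivity over time while making it expensive to game the system with throwaway accounts, since new or unreliable users carry little weight.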
Since our system is fundamentally user-driven, we knew - and were concerned - that there would be no spin markers in our system on day one (just as there were no friends in Facebook on the day it launched in 2004). We tried to address this by hiring a small group of journalism students to serve as pre-beta users. Our hope was that they could begin to populate the system with spin markers and feed the back-end algorithm, so that it could start to find likely instances of spin even before we released the beta version of our software to the public. Unfortunately, this plan proved disappointing for two reasons. First, we overestimated the number of hours the students we enlisted would actually put in over the last few weeks of their summer vacations and first few weeks of classes. Consequently, the spin markers that actually appeared, being spread across a range of publications, were barely noticeable. Second, because of the limited number of spin markers being created, we were not able to generate the level of input necessary to make the back-end algorithm produce reliable, high-quality results. It could generate a good number of spin markers if we set it to its lowest trust rating, but the quality of those markers - in terms of the percentage of markers that reflected significant instances of spin - was limited. We ultimately decided it was better to launch without algorithmically-generated spin markers than to launch with suspect algorithmically-generated spin markers, because we didn't feel doing otherwise was fair to journalists, or to the reputation of our system. Of course, we could alternatively have decided to delay our launch until we felt comfortable about the breadth and depth of spin markers populating the system, but - rightly or wrongly - we felt it was more important to get our beta software out there in advance of the U.S. presidential election. Given the heat we've been taking from Prof. 
Liberman and others over the fact that there are so few spin markers in the system, you could argue we made the wrong choice! But we're encouraged by the early activity on the system, and are hopeful there'll be a critical mass of spin markers in the system by the time election day arrives.
No matter how you slice it, we certainly didn't do ourselves any favors by making some key mistakes in our launch communications. Of particular note was a graphic that appeared on our website, since replaced, that was cited by Prof. Liberman for implying that we had in fact developed an inference algorithm capable of identifying spin all on its own. The offending graphic, which appeared on our What We Do page, showed a web page containing spin markers with three call-out boxes around it. The first box said, "SpinSpotter looks for areas of news which appear to present editorial opinion as fact or other instances of 'spin' from a published list of rules of spin." While this sentence accurately describes SpinSpotter's general intent as a company, the way the sentence was positioned in the graphic strongly implied that SpinSpotter had an algorithm to do this, independent of any user involvement. In fact, this sentence should have appeared as the heading of the graphic, and the other two call-out boxes, which describe user input and the way in which SpinSpotter learns from user input to technologically create additional spin markers, should have appeared below it to describe how the system actually works. The fact that this graphic made it into the initial release of our website without being corrected is wholly my responsibility. The graphic was a first draft sent to me shortly before launch, and in reviewing all the copy going into the site - and in the rush of needing to create some crucial copy that was missing - I failed to review it properly before forwarding it to the engineers for posting on our site. To make matters worse, in the rush of launching the service at the DEMOfall08 Conference, I did not proof that page of the site until I got back in the office several days later.
As for the press coverage Prof. Liberman cites, I wasn't there when our founder, Todd Herman, was being interviewed by The New York Times. What I do know is that on the same day Todd spoke with Claire Cain Miller of The New York Times, he also spoke with Elinor Mills of CNET, and the article Ms. Mills produced struck me as a generally accurate description of what we do. I also know that in every interview I've personally been involved in, I've bent over backwards to explain how the rule set is used to guide user input, and how the user input is then used to feed the algorithms. As an example, you can read the article by Jake Swearingen of VentureBeat, who, interestingly enough, chose to include within his article the offending graphic I describe above, and he still got the story straight. I also feel confident that any reporter who stopped by our booth at the DEMOfall08 Conference pavilion - a group that included the venerable Walt Mossberg of The Wall Street Journal - clearly understood that the system is fundamentally user-driven.
Two technical notes pertaining to Prof. Liberman's post. First, he is absolutely correct that the browser plug-in does not do any analysis within the user's browser, but rather "calls home" to the SpinSpotter server to see if any spin markers exist for the page the user is viewing. The reason for this is simple: Doing any calculations within the browser itself would slow the browser down, which could become really annoying for users. So we do any processing that needs to be done back at the SpinSpotter server, then send the results down to the toolbar when the user arrives at the respective web page.
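The call-home pattern described above can be sketched in a few lines. The server URL and the JSON response shape here are hypothetical stand-ins, not SpinSpotter's real API:

```python
# A minimal sketch of a toolbar's "call home" lookup. The endpoint and
# response format are assumptions for illustration only.
import json
from urllib.parse import quote

def markers_url(page_url, server="https://markers.example.com/lookup"):
    """Build the lookup request the toolbar would send for a page."""
    return f"{server}?page={quote(page_url, safe='')}"

def parse_markers(response_body):
    """Parse a (hypothetical) JSON list of markers. The browser only
    renders these -- all the analysis already happened server-side."""
    return [(m["xpath"], m["phrase"]) for m in json.loads(response_body)]

print(markers_url("http://example.com/story?id=1"))
print(parse_markers('[{"xpath": "/html/body/p[2]", "phrase": "third Bush term"}]'))
```

Because the client does nothing but fetch and render pre-computed markers, page load stays fast regardless of how heavy the server-side aggregation gets.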
Second, Prof. Liberman is also correct that spin markers are unique to the page on which they were created - i.e., you can't mark up a page and then see the markers on a different incarnation of the same story (e.g., in print format, or when reading an AP story that appears in multiple publications). This is because, to place a spin marker at the correct location within a web page, we have to work with the XPath of that marker in relation to that page. XPaths are like street addresses for the HTML elements on a page, so they vary from page to page, and while we can interpolate between pages in some instances, we haven't been able to find a reliable way to interpolate between all of them. So we keep the spin markers specific to the page on which they were created. Maybe someday we'll find a trick to interpolating between all similar pages, but we aren't there yet.
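A small, self-contained illustration of the street-address point above: the same sentence lives at different XPaths in two different layouts of the same story, so a marker anchored to one layout's path misses (or mis-hits) on the other. The markup here is invented for the example:

```python
# Why XPath anchors are page-specific: two layouts of the "same" story.
import xml.etree.ElementTree as ET

original = ET.fromstring(
    "<html><body><p>Intro paragraph.</p>"
    "<p>third Bush term</p></body></html>")
syndicated = ET.fromstring(
    "<html><body><div><p>third Bush term</p></div></body></html>")

# A marker stored against the original layout's path works there...
print(original.find("./body/p[2]").text)        # the marked phrase

# ...but that same path finds nothing on the syndicated layout,
# where the phrase sits at a different address entirely.
print(original.find("./body/p[2]") is not None)
print(syndicated.find("./body/p[2]") is None)
print(syndicated.find("./body/div/p[1]").text)  # the phrase's new address
```

Interpolating between pages would mean mapping one page's addresses onto another's, which works only when the surrounding structure is similar enough, matching the caveat above.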
The bottom line in all this is that we are proud of what we've created; but we also recognize that the beta version of the system is not yet everything we'd like it to be, and we recognize we've made some mistakes (and one key mistake in particular) in how we've presented what we do to the world. At the moment, we're in the process of fixing several early bugs, including a bug in the back-end user input algorithm which has caused us to take it offline for 2-3 days. We'd like to get the Internet Explorer version of the toolbar up and running (hopefully, within a month). And we're still working on getting enough early spin markers into the system to make it interesting for people on first use, at least across some of the top online news sites.
But, in general, we're pleased with how this first-pass beta version of the system is working, and we're excited about the possibilities. We're also incredibly open to suggestions and criticisms. We don't by any stretch claim to have gotten everything right the first time out, and we welcome the input of Prof. Liberman and anyone else willing to take the time to scrutinize what we do with a critical eye.
One final note: We've taken to heart the feedback regarding our semantic sloppiness around our use of the term "Passive Voice," and will be enlisting the help of an English professor to clean up our act.