Nate Silver: “The Signal and the Noise” | Talks at Google

HAL VARIAN: OK. Well, I guess we should
get started. My task is to introduce
our speaker. Now I suppose it’s theoretically
possible that everybody just wandered into
this room by chance, expecting that something interesting
might be happening. I don’t know. What do you think, Nate? Is that plausible? [LAUGHTER] NATE SILVER: You could
have herding, right? A few people actually know
what they’re doing, and everything else kind
of follows along. HAL VARIAN: Yeah. It could be, but I’m guessing
that event has a relatively low probability. So I’m guessing that most of
the people know who our speaker is, and you
know his story. So I’m not going to
introduce him. I’m just going to turn
you over to Nate. [APPLAUSE] NATE SILVER: Well, thank you. I’ll spare you the kind of
wedding toast speech, and we’ll get to the questions
pretty soon. But it’s a real honor to be
here at Google, in my view probably the smartest company
in the world right now about working with data and
using it to actually transform people’s lives. And I hope that some of what we
were able to accomplish– and I tend to use “we” when I
mean “I” sometimes, right? But I do see it as kind of
a collective project that FiveThirtyEight is a part of, to
make journalism smarter, to make political coverage better,
to just have people be a little bit more data literate
and data savvy. And when that kind of hits into
the punditry, interesting things, I guess, can
happen sometimes. But we have a lot of great
questions, it looks like, I think we’ll leave time–
there’ll be time for questions at the end from people
in the room as well. So let’s get started. HAL VARIAN: Absolutely. So we produced all these
questions using [INAUDIBLE], so these are questions
submitted by members of the audience. The techniques in
FiveThirtyEight’s methodology section were invented
ages ago. Someone could have done what you
were doing by at least the 1990s or earlier. Why did it take the world so
long to produce and appreciate Nate Silver? [LAUGHTER] NATE SILVER: I was busy
playing poker. HAL VARIAN: Besides your mom. NATE SILVER: My mom, right? No, if the Congress hadn’t
basically banned internet poker, then maybe I’d still
be doing that, right? [LAUGHTER] NATE SILVER: So in all
seriousness, though, it’s only been a few election cycles,
maybe the last four or five, where you had as many state
polls as we do right now. And it’s only been more recently
than that we can really collect them pretty
easily on the internet, where you do a Google News search or
you steal from Real Clear Politics, or whatever else. But collecting all that data
and have it be essentially free isn’t something that’s been
able to be true for all that long, really. But also, I think people
are taking more of a do-it-yourself mentality toward
aggregation and toward producing content. When a news organization– and the Times is
no exception– produces a poll, they want
you to think it’s the only poll in the world. They don’t provide the context of
well, this poll says that Romney’s ahead, but the other 12
polls say he’s not, which is what a reporter probably
should do. So you have, I think, some
perverse incentives in news organizations who produce their
own polls– which are expensive, and I’m
a fan of good, expensive telephone polling– to pretend that no one else
is doing the same thing. And so I think that’s
part of it. It kind of cuts against the
newsroom’s first instinct. And part of it is that this is
actually a relatively new thing, where you have so much
data available for free. HAL VARIAN: So right before
the election, the Romney campaign claimed to be confident
that they would win. What was wrong with their
internal polling? Why were the predictions
so far off? Or were they just keeping a
straight face when they knew defeat was coming? NATE SILVER: Yeah, I think maybe
more– well, there are different layers of the Romney
organization, right? And one question is, what did
Mitt Romney himself believe? And I’m not sure about that. There’s a question of, what did
Mitt Romney’s pollsters really believe? And it’s probably not
what they told the press, exactly, right? So I think there’s some
culpability here where the press is fed spin and bullshit,
basically, and then they act surprised later on
when the bullshit in fact turns out to be bullshit. And they blame the messenger,
when they should be blaming themselves. Look, when campaign pollsters
talk to the press, they’re not trying to produce accurate
information. They’re trying to put
things in the most favorable light they can. I had very few conversations
with the campaigns. I try and avoid that,
almost go out of my way to avoid that. But when I talked to the Romney
folks, they tended to have a pretty realistic
view of what the landscape looked like. Maybe it’s because when they’re
talking to me, then I’m not going to go actually
report that. It’s just kind of a friendly
on-background conversation, so maybe you get less
spin and a little bit more actual insight. But I think it’s not
hard to tell– it’s hard for a campaign to
operate under the assumption that it’s probably
going to lose. Then it seems like it’s
wasting its time. And so there are all kinds of
different biases that can creep in, from confirmation bias
in the way you kind of construct your samples, to
what news actually gets reported to the candidate. I think any organization,
whether it’s a political campaign or anything else, if
you’re an organization that can do a reasonably good,
a reasonably honest self-assessment, that’s a big edge. That’s why, by the way,
consultants make millions of dollars per year. McKinsey or whatever else,
they’re not really adding a lot of value, necessarily. I’m sure they’re very
smart people. Not with the actual work
product so much as they provide mediation services to
resolve internal conflicts in companies– we can do politically incorrect
things if you hire McKinsey to go in and say
something obvious, basically. [LAUGHTER] NATE SILVER: So it’s kind of the
same agency problem you have, I think, in a political
campaign, where– and by the way, one thing the
Obama campaign did in ’08– I’m not sure about
this year– they had four or five different
pollsters that worked independently from
one another, at least to some degree. So they would have maybe the
same, two different pollsters in-house survey Ohio. And then if you have a
difference of opinion in one candidate– one pollster might show a tie
in Ohio, and one might have Obama up five– then you can start to iron
that out, and those discussions are often
productive. But within a campaign,
it can become something quite insular. I think in general, people
overrate the value of the campaign’s internal polls. Even if you actually saw the
real internal polls– which you don’t– even then, I think, even though
some of the pollsters are very smart, they’re
operating under worse incentives than you have
for a public pollster. It’s hard to add value relative
to simple methods with the extra things you might
do, and so if you have bad incentives, then often that
leads to worse overall results, even if you’re a more
capable person in some abstract sense. HAL VARIAN: Your remark reminds
me of the definition of a consultant– somebody who borrows your
watch to tell you what time it is. NATE SILVER: Exactly. [LAUGHTER] HAL VARIAN: So probability
is hard to grasp for the layperson. So what do you think the best
way is to visualize the prediction data? NATE SILVER: I think if you have
something which is like geographical data– so the
National Weather Service actually has worked on this
problem a lot with its hurricane forecast. If you can actually show
people the “cone of uncertainty,” as they call it,
then they gradually come to understand what that means. They’ve also kind of worked
on problems like how much uncertainty do you want
to show in that cone. If you show the 90% confidence
interval, in essence, it would still be very, very wide. If you show the 50% confidence
interval, well, it’s kind of useful, but you’re going
to be outside that range half the time. They kind of settled by trial
and error on showing 2/3 of the possible outcomes. But probability, it is something
where you can say all the time that well, this
kind of has an 80% chance of winning, and it’s a
straightforward, kind of objective statement. But people don’t really grasp
what that means, exactly. That’s why I say, having played
poker, it comes more naturally to me. Because you know that if your
opponent has to catch a flush with one card to come, he’ll
make his hand 20% of the time. You’ll win 80% of the time. But believe me, you’ve played
plenty of hands, and you know, you take those bad beats
20% of the time. You know what it feels like. You busted out of tournaments
or lost a lot of money on those hands. So you’re very aware of
what 80-20 really kind of feels like. I think that’s harder for
people when they’re not encountering probability
on a day-to-day basis. I think in general, the news
media has a tough time, and maybe people in general
have a tough time– everything is either
50-50 or 100-0. It’s either, oh, too close to
call, or it’s like, we’re sure what’s going to happen. It’d be a shock if anything
else transpired. When usually we’re operating
on the margins here. And if you guys are
engineers– I know some of you
probably are– you know when you’re building
a program, you’re making little incremental improvements
most of the time. Not often do you have
this a-ha moment. You’re debugging things
and you’re making these marginal gains. And that kind of process doesn’t
necessarily make for great stories, necessarily. And so that also clashes with
the way that the campaigns tend to be covered. HAL VARIAN: I just read a
review of a new textbook called “Introduction to
Probability and Statistics Using Texas Hold ’em.” NATE SILVER: Yeah, yeah, yeah. HAL VARIAN: True story. True story. So maybe literacy will
increase, if we get the right materials. NATE SILVER: Well, I know some
investment banks and hedge funds are now training their
people by having them play Texas Hold ’em. Somehow that attitude, and
the instinct to take new information that you
perceive and have a better first judgment– I met a guy who says, oh, don’t
do the Malcolm Gladwell “Blink” thing. You want to think through
problems, think a little bit more slowly. At the same time, to have a
better first cut when you encounter a new piece of data,
a better first instinct, whether that’s meaningful or
not, whether it should change your summation of a problem or
not, that’s helpful as well. And poker is a very good game
for developing that skill, I think. HAL VARIAN: So as you and your
model gain renown and credibility, how will your
forecast affect the very events they aim to predict? And how can you adjust for
this in your model? NATE SILVER: Well, I was talking
with some people over lunch, and it’s going to be a
problem that you guys might encounter as well, where because
Google itself is so influential on what people
identify and maybe what trends go viral, then it becomes a
very challenging problem, potentially. I would hope that we
didn’t influence people all that much. It worries me maybe a little bit
more in something like a primary campaign, where voters
are being more tactical, if you have to pick between
five or six options. There’s some evidence, by the
way, this worked toward Rick Santorum’s benefit in the Iowa
caucus, where you basically had three interchangeable
candidates, more or less, Bachmann, Perry, and Santorum,
who were all polling at about 10%. That’s kind of an inefficient
outcome, where people who would want Bachmann to win the
Iowa caucus for some reason would probably be happy
with Santorum as well. So once one candidate had any
edge, you saw a lot of tactical shifting. And in this case, there was
some CNN poll that was published that had
Santorum gaining. It could’ve been real. It could have been a
[INAUDIBLE] fluke. But that gave him momentum,
and then people joined the bandwagon, and the surge became
a self-fulfilling prophecy after a while. In general elections, where you
have just two parties, you have less of that tactical
behavior. So I’m not as worried about
it, I don’t think. There can be things, though,
like when we had US House Forecast in 2010– we’ll do it again in 2014– there are scarce resources. So if you give a candidate only
a 5% chance of winning, then maybe her congressional
committee won’t fund her as much as another candidate. And so that can be
complicated. Or when we do things like say,
well, here’s the optimal allocation of resources between
different states. We had a return on
investment index. Implicitly, that kind of assumes
that the campaigns are behaving the way they
have in the past. It’s kind of like, sort of
like the Lucas critique, almost, in economics, where
you’re modeling things based on past data, but you’re
implicitly accounting for decision-making of past
policymakers. So the extent to which certain
states might move relative to other states implicitly includes
the assumption that campaigns were making before. And if they change the way
they behave, maybe target particular states differently
than they did in the past, then you have some structural
uncertainty in the problem. So these things are really
complicated, and they kind of keep you up late at night. I think they’re one reason, in
general, when you are making choices, as far as
specifying a confidence interval, to err on the side of
including more uncertainty rather than less. Although you can have the
reverse case, where if you have a self-fulfilling prophecy,
then you can be accurate, just because that’s
what you predict is going to happen, and lo and
behold, it does. I did some work for a movie
studio a couple of years ago. And they had a belief in their
culture that some particular random day– say, the first
weekend in October– was a very good time
to release a movie. I think they had had some movie
that just had a good script one year, was a
well-directed movie that did well in that slot. So every year thereafter, they’d
put some breakout film in that October 7 slot, let’s
call it, and because they believed it was going to do
well, they’d pump a lot of marketing budget behind it. And lo and behold,
it did do well. [LAUGHTER] So this became more and more
entrenched, and they seemed very, very smart, because they
predicted that every year this film would be successful. But of course if you pick one
of your better films and put lots and lots of marketing
muscle behind it, then it’ll work. It has nothing to do with the
date on which it’s released. HAL VARIAN: Yes, we see
this at Google. I talked to an advertiser
once who said, I know my advertising works because I
increase my expenditure every December, and what do you
know, my sales go up. [LAUGHTER] HAL VARIAN: So what do you
think is the best way to evaluate the performance of
forecasters who are computing probability estimates
for discrete events? And in particular, what if
these events are rare? NATE SILVER: If they’re
rare, it’s tough. If they’re common, then
calibration, in my view, is the best test, meaning of the
things you say will happen 80% of the time, do they happen
80% of the time? And it’s problematic if they
happen half the time, but also if they happen 100%
of the time. People don’t realize that if
you make these 80-20 calls, you’re supposed to get that
wrong 20% of the time, or you’re doing something wrong. You’re being underconfident,
I guess. And sometimes we’ve been
accused of being too conservative in the confidence
intervals that we specify. But there are also
cases where– the Senate candidate in North
Dakota, Heidi Heitkamp, we had her with only an 8% chance
of winning, and she did. I think we had Rick Santorum
with a 3% chance of winning the Iowa caucus at one point
before his surge, or a 2% chance, and he did. So we look at that over
the long run. But over the short run,
then you have to be– I don’t know. I mean, if you don’t have a
large enough sample size to test things truly out of sample,
then yeah, there aren’t any great substitutes for it. And I tended to fall more
to, I guess I call them Bayesian priors. But to think, does the structure
of this model make a lot of sense? You can make inferences. One thing I found helpful in
writing my book is that you talk about forecasting practices
in different fields, and so you can learn implicitly
from that. Well, here are some good
principles that generally produce more reliable
forecasts. So by following those
guidelines– and if you are, then having one
additional case might not tell you all that much,
necessarily. If you have 100 additional cases
or 1,000, then sure, it tells you an awful lot. And one thing that Google gets
to do, is because you guys can kind of collect data on demand,
basically, from billions of users, here, you
can solve a lot of things through trial and error. And if you have that much data,
then that is probably the better way to do
it, more kind of a brute force approach. But I think people aren’t
sensitive enough to how much your whole attitude and approach
should change when you’re in an environment
that isn’t data-rich. You almost have to do some of
the opposite things, really, I think, and not be overly
fixated on how any one result goes. But it’s tricky for people. HAL VARIAN: So tell us about
your tech set-up and workflow. So what programming and stats
languages do you use? And one question I know this
audience wants to know is, do you work in a cubicle with
three other people? [LAUGHTER] HAL VARIAN: Do you get
free massages? NATE SILVER: I do not get
free massages. The food at “The New York Times”
cafeteria is not as good as the food at the
Google cafeteria. So I don’t really have as much
of a routine as a lot of people do, in part because I
go through different phases where I’m traveling a lot or
writing the book or promoting the book or doing media or not
doing media, or working on a model, or going to the office. So there’s not really a
routine all that much. I do have a desk at the “Times”
which is shared with Micah Cohen, also
of the 538 blog. But I go in probably two or
three times a week if I have something else to do in Midtown,
otherwise not. I find it hard, actually,
to do writing when you’re in a newsroom. To write, I need total quiet
or music playing. But I think writing is the
thing that requires more concentration than
anything else. Although programming, also– I’m not a very good programmer,
so when I actually have to program the Stata code,
that requires a lot of effort. And for that stuff, I
think you need some silence and some quiet. Where offices are a
great place to go to socialize, right? [LAUGHTER] But I think for some types of
tasks, you need to think on your own as well. But my set-up’s not
very advanced. I have an ASUS laptop, which I
need to replace because it’s basically falling
apart right now. I use Stata for most stuff
and Microsoft Excel. I used to try to do everything
in Excel, and that kind of was a bad idea. I mean, Excel’s not
a bad program. It’s a pretty good program. You can kind of think of it as a
visual programming language. But in 2008, it took like two
hours to actually run the model on Excel, because
it was so slow. Now you can do it in about five
minutes, and I’m sure you could get it down to a minute
and a half, if I were a more competent Stata programmer. But to have some kind
of code is helpful. HAL VARIAN: What’s the next
thing you want to learn in this area? The tools area. NATE SILVER: I don’t know. I mean, maybe R. But it’s
hard, you know– HAL VARIAN: That’s
a good answer. [LAUGHTER] NATE SILVER: The switching
cost is pretty high. When you’re used to having the 15
Stata commands that I know, I know how to make them do
anything in ways that are probably really inefficient
in some sense. But it’s hard to switch. But it might be good to
go to R or something. I think that would be the one
that would make the most sense, I would think. HAL VARIAN: So what’s the most
statistically unsound tactic in professional sports? NATE SILVER: Well, I think in
terms of being something which is routinely done badly, and
which costs teams quite a bit of win expectation, is the way
that NFL teams operate on fourth down. They punt way too much, and to
a lesser extent, go for field goals instead of going for it
more often than they should. I’ve always found this ironic,
by the way, because the good, masculine thing to do would be
to keep your offense on the field and go for
it more, right? You’d think the bias would
almost work the other way. But it doesn’t. Teams punt way too often. They’re much too risk-averse. The announcers tend to be really
punitive to a guy when he decides to go for it. Basically on fourth and two or
fourth and one, you should be going for it almost all the
time, unless you’re deep in your own territory and you
have a big lead already. But you know, but your
incentives, again, become distorted, where if you go for
it and the quarterback gets sacked or something, you’ll
come in for a lot of heat, when the normal thing to do is
to punt, and so you don’t risk as much there. But there are some teams– the Patriots are very
good about how they manage that stuff. And those decisions are worth
about half a win per year. Which doesn’t sound like much,
but over a 16-game schedule, that’s worth quite a bit. It’s worth basically millions
and millions of dollars, if you look for the price
that an NFL team pays for a marginal win. And just based on reading this
literature and keeping your offense on the field more,
it should be fun. But teams have screwed that
up for years now. HAL VARIAN: This is the
David Romer finding. NATE SILVER: The David
Romer paper. Yeah, yeah. He was the one who found that
it’s actually half a win a year, just based on this one
type of decision, which will come up three or four
times in a game. But it’s odd that so
few teams have done that reading, I suppose. HAL VARIAN: They don’t read
the “Journal of Political Economy,” for some reason. NATE SILVER: No, they– yeah. [LAUGHTER] HAL VARIAN: So in the North
Dakota and Montana Senate races, which your algorithm got
wrong, you excluded polls that turned out to
be accurate. So what went wrong, and do you
plan to make any changes in the future? NATE SILVER: So I guess in the
North Dakota case, the Heidi Heitkamp campaign– she was the
Democrat in that race– released polls that
had her ahead. And usually, internal polls
exaggerate a little bit, although the rule is kind of
like, if you’re actually winning, you don’t need to
lie in the internal poll. If you’re losing,
though, you do. So you kind of have
these asymmetries. Some people said the fact that
she was releasing polls and her opponent wasn’t
was interesting. But in general, we
don’t use polls released by the campaigns. We think in the long run– for
example, in the Wisconsin recall election earlier this
year, you had a number of polls released by Democratic
groups that had the race tied, whereas the public polls had
Scott Walker up by six. And Scott Walker won by
six, pretty much. And so that’s the case
more often than not. But North Dakota was a case
where the model, if it doesn’t have polls, it defaults
to what we call state fundamentals, which means that
hey, it’s North Dakota. The Republican will
probably win. [LAUGHTER] NATE SILVER: And so there wasn’t
very much polling data at all, so it gave the
fundamentals a fairly heavy weight. And I don’t know, I think it’s
a well-designed, sensible model, basically. Although that was a case where
people who were using the Heitkamp internal polls
often got the right call, and we did not. But I’m willing to
live with that. I mean, one other thing, too,
was that we modeled that, we calibrated it based on how well
polls have done since 1990, looking back on every
Senate race since 1990. And it’s not like the
fundamentals calculations are perfect, but the polls are
also often pretty bad in Senate races, so it helps
to hedge your bets a little bit more. If it’s true that the polls are
on a secular trend toward becoming more accurate,
then you could trust the polls even more. I’m still a little bit skeptical
that polls are really on a long-term upward
trend, when it’s becoming harder and harder to reach
people on the phone. Internet polls, including
Google’s, had a pretty good year this year. But so, I don’t know. Look, we’re going to have some
year where the polls do bad, probably, and someone will look
really smart for saying, don’t trust the polls. I mean, frankly, the polls in
the GOP primaries this year were a pretty wild ride. But look, if the polls are
right, then just trust the polls, and you’ll
do pretty well. It’s hard to add, necessarily, a
ton of value on top of that. HAL VARIAN: So 538 is
exceedingly transparent about its processes and inputs,
but it’s not 100% open. Is there any path to being able
to open up the source or allow others to run simulations
as well, or is that not in the cards? NATE SILVER: So there
are kind of three different levels here. One is the level of transparency
that we have in practice, where things are
described very fully, and I think there’s room– so sometimes you’ll have, say
the academic standard is maybe you release your source code and
all your data, and there’s a lot to be said for that. I think it’s also important,
though, to explain not just what you’re doing, but
why you’re doing it. You might have a fully-specified
model that’s publicly available but that’s
also a terribly designed model, an overfit regression
model or something that doesn’t make any sense, or
makes very questionable assumptions. So the fact that we’re kind of
walking and talking through the model and explaining why
we made the choices that we did, I think, is helpful. With that said, I would ideally
like to have fuller– not complete, but more
transparency, and more formal language describing
what we’re doing. Part of what happens is that
writing these articles that are methodologically based
takes a lot of time, right? And so I have an aspiration to
have a lot of this basically, a 10,000 word article that
describes the FiveThirtyEight model more fully, and then
there’s some breaking news story, Sanders makes some joke
about rape or something, and you wind up covering that and
not catching up with the methodological stuff. But I would like to move toward
a place where we have more disclosure and transparency
and it’s fairly explicit, and yes, people
can reverse-engineer it. I don’t think we’re literally
going to release source code. It is a proprietary product. One thing I have, is because I
license it to “The New York Times,” if you release that
source code, then I’m paying out of pocket, in essence, in
terms of when I next have the deal come up. But the idea is not really to
hide anything, and I’ve always answered all questions
about the model. I just don’t want to quite make
it literally so easy so someone can run the source
code and permute it themselves. You also have this perverse
thing, and maybe you guys have experienced this, different
teams here, where sometimes when you disclose more,
it won’t shut up the trolls, basically. It won’t. They’ll just find other things,
other little details to nitpick. So I think it just means,
what’s your own kind of ethical practice? As far as, just so
you’re clear– if you disclose things more
fully, you’re like, well, you guys know how this is working,
you guys know that I’m not manipulating these numbers. Like some people thought that we
actually go in and manually give a weight to every poll,
where I’d go, here’s a “We Ask America” poll in Nebraska: 1.23653. Or just making up these long
numbers arbitrarily. Whereas no, it’s all done
algorithmically. But I think things
could have been a little bit better explained. I would strive to do more
of that going forward. But you do have to kind
of carve out time, I think, for that. And one thing also– sorry, I’m
kind of giving long-winded answers here. But we’re also hoping that by
the next election, I’ll have a bigger team around
FiveThirtyEight. I mean it really is, we have
great interactive graphics people at “The Times” and great
editors and so forth, but in terms of content
producers, it really is just me and Micah. There was not a big
FiveThirtyEight team. And the “Times” understands that
the 538 team needs to grow, or if I go somewhere
else, then probably the same thing. But so that’ll be different,
I think, next year. But a lot of it just is, like,
well, I ran out of time to do things I would otherwise
want to do. HAL VARIAN: So on this
methodological point, what do you think about market
approaches, political stock markets, like the Iowa
Electronic Market or Intrade, or these other cases where
you’re looking, basically, at crowdsourcing the forecasts? NATE SILVER: So look, in
general, I think that markets are the least worst solution,
a lot of the time. Very often. It doesn’t mean they’re
perfect, though. I think there is a question
of, how informed is the average trader at Intrade,
for example? And by the way, the government
decided, the CFTC decided to sue Intrade today. HAL VARIAN: Oh. NATE SILVER: I’m not sure that’s
really the best use of taxpayer dollars. So that could be an interesting
development. But I think that sophisticated
political observers could probably beat Intrade. I’m not sure they could, though,
if they were more real money on the line,
I guess I’d say. I don’t mean to sound totally
pejorative or anything. But look, if you really had
a good play, if you really thought you had a good model
of the election, then you wouldn’t be betting for a few
thousand bucks on Intrade. You’d be instead, rather, working
at a hedge fund, developing a portfolio. It’s pretty clear that, for
example, coal stocks were correlated with Romney’s chance
of winning at Intrade. Or green energy stocks were
correlated, conversely, with Obama’s chance of winning,
or different types of health care stocks. There are enough companies where
they have predictable enough economic effects on the
election, where even if you have some fuzziness in building
your portfolio, you can make a lot more that
way, potentially. But the other question is, you
can always have herding in markets, where you have the
blind leading the blind, potentially. I think there were times at the
end of the election where there was some odd activity
on Intrade, where Romney’s price would shift a lot with
no real reason in the fundamentals. There were some theories that
you had wealthy Republicans or whatever, conservatives, kind
of buying these shares to influence the way the race
was covered by the media. And you had divergences where
in Intrade, Romney might be priced with a 35% chance of
winning, but at other markets, he’d have a 15% chance or 20%
chance, really big spreads that are not consistent with
highly rational behaviors. So you have this kind of
abstract question of, what if you really had a highly liquid
stock market or prediction market, how well
would that do? Then you have the question of,
well, how well does Intrade do in practice, given the
constraints it has on people depositing money and
the lack of volume? And I think the answer
is that it’s useful. It’s a lot better than the
pundits, for example. But I think once people
start to invest it with godlike power– like, oh, it’s the market, it
must be right– that’s exactly when markets fail. It’s when you think markets are
infallible that markets become much less useful tools. HAL VARIAN: So can you put
historical polls in your model and simulate the predictions,
from 2000, 2004, et cetera? And it would be nice to know if
the elections are ever as close as the pundits say. NATE SILVER: Well, I think at
the end of the campaign, both the 2000 and 2004 elections
would have been closer. But remember, in 2008, on the
weekend before the election, the McLaughlin Group, they asked
their panel, who’s going to win the election
next Tuesday? And three of the four panelists
said it was too close to call. And in 2008, this wasn’t
even debatable. Obama was up by 7 points in
every single poll in every single swing state. I mean, just no excuse for that
at all, if you’re not just constantly trafficking
in bullshit, basically. [LAUGHTER] NATE SILVER: So you know, the
fact that in 2012, you met some resistance, it shouldn’t
have been surprising, exactly. I’m not sure. You know, in 1984, there’d
probably be some columnists saying, well, Walter Mondale,
don’t count Walter Mondale out. [LAUGHTER] NATE SILVER: People
haven’t voted yet. Anything could happen. We’re hearing reports of more
yard signs in Roanoke. [LAUGHTER] NATE SILVER: But someone’s always
going to make the contrarian argument, and it’s
hard for people to kind of accept, I think, reality,
in some sense, for the end of a campaign. HAL VARIAN: So let’s switch
to a sports question. Coaches and managers are
generally judged by how many games are won and whether or
not they win championships. Do you believe there will be
a reliable way to measure coaching performance on
a game-to-game basis? NATE SILVER: It’s tricky. I mean, in the NFL, maybe you
can do a little bit more with those play-calling things
I talked about. But I do think that one bias
that stat-heads sometimes have is to assume that if you can’t
measure something easily, that it isn’t important. Hypothetically, if you had a
baseball manager who got his players to perform 5% better
than they otherwise would– and how do you define 5%? I’m not sure. Either it was because they’re
better rested or you have a better clubhouse environment
or anything else. Then that would be extremely
valuable, and that would outweigh the cost of laying down
too many sacrifice bunts or whatever else. Now can you measure that? It would probably
be pretty hard. You have too many degrees of
freedom in the problem to really pin anything down
in a more reliable way. But you know, both baseball
teams and basketball teams, I think, have learned, for
example, that as it’s become easier to measure defensive
contributions from individual players, that they were probably
underrating defense, if anything. Maybe stat-heads
in particular– Billy Beane’s earliest kind
of mid-’90s A’s teams had terrible defensive outfields,
because you couldn’t measure it, and so it was kind
of assumed that oh, it doesn’t matter. But when you had Matt Stairs
playing center field or something, that created
big problems. And so they kind of learned the
hard way that you had to try and measure this
stuff and gain some advantage from that. But I think there could be
coaches that make more of a difference. It seems certainly, if you look
at, for example, college basketball programs, that a
coach can be hugely valuable. That gets tied into recruiting
and everything else. So it’s hard to say. But I don’t know. So are we ever going to be able
to have a coach metric? Probably not, unfortunately. It’s a hard problem
to address. But that doesn’t mean their
value should be dismissed out-of-hand, I think. HAL VARIAN: So which past
elections in the modern era would you most want to apply
your methods to, and why? Range over time and
space here. NATE SILVER: I think it would be
fun to apply your method to an election that had three
viable candidates. The 1992 election, that close
to the end would have been a lot of fun, potentially. Although also, the variance
is much higher in a three-candidate race. A lot more [INAUDIBLE] things
can happen, so you could wind up with more egg on your face. But yeah, that would be the
fantasy scenario, is to deal with an election where you had
three viable candidates and you had states that could shift
between blue and red and green, and that would be fun. HAL VARIAN: Have you thought
about looking at parliamentary elections? NATE SILVER: Well, I tried this
for the UK, and our model kind of sucked, actually. [LAUGHTER] NATE SILVER: Look, Cleggmania
wore off, and the model was really bullish on Nick Clegg. But that’s also a case where
in the UK, you don’t have a lot of district-by-district
polls. So you basically have
national polls– imagine if we only had national
polls in the US, and then you had to infer from
that what the electoral college would look like. Any assumption you make is going
to be potentially crude, or it’s going to be crude
and potentially flawed. And that’s kind of what you
had to do in the UK. So that, plus the fact that you
did have three parties, and you did have strategic
behavior at the end, where the Liberal Democrat voters were
basically, more of them were Labour voters than Conservative
voters. And so when Cleggmania, when
that bubble burst, then Labour overperformed our predictions
there. But it is interesting. Look, anything can be modeled,
as long as you can accurately predict the confidence
intervals, basically. In the primaries, for example,
we would often say, well, we have no idea what’s going to
happen, that Rick Santorum could get anywhere from 10% of
the vote to 40% of the vote, because that’s what a realistic
confidence level looks like, based on how
accurate the polls have been in the past. Although, I would like
to do better. I think our model for working
on the primaries this year– I was kind of in a
minimalist phase. So I’m like, well, I know the
polls stink, but let’s just take the polls and see how well
they can do and have them be well-calibrated. And I don’t know, I think I’d
like to add a few more elements into the primary. So for the next cycle, we might
look at things like, for example, endorsements are a
really good predictor in primaries– not because they influence
voters, but because they show who the party supports, and the
party can tilt the playing field in all sorts of ways. Basically, there was a week at
which Newt Gingrich looked like he was actually a threat
to win the Republican nomination, so he basically
became a pinata, where anyone who had ever voted Republican
in their life and went on television, it was their
job that week to bash Newt Gingrich. [LAUGHTER] NATE SILVER: And so that was
how a party operates strategically to make
sure it gets the nominee that it wants. And believe me, Romney might
not have been the best candidate, but Obama versus
Gingrich would not have been a better result for Republicans,
I don’t think. [LAUGHTER] NATE SILVER: So looking at
those other factors in primaries might be
more interesting. And we’ll have, I guess, two
opportunities next cycle. Journalists should have been
rooting for Obama to win, because that way you get two
open primaries next year, instead of just one. Those years are a lot more
interesting to cover. HAL VARIAN: I think with the
success that we saw from internet polling in the last
election, polling’s going to be a lot cheaper. England, for example, has very
high internet penetration, and you could get much better
data in the future than you have in the past. NATE SILVER: Well, yeah. And it was good that the Google
poll, and YouGov, and a couple of the other internet
polls did well this year, because for better or for worse,
this is going to be the future, I think, inevitably. People don’t use their phones,
certainly their land lines, in the same way that
they used to. I have a land line because it
came free with my cable package, and I use it if I have
to do a radio interview or during hurricanes and
stuff, basically. But you’re not going to reach
me, the Quinnipiac Poll’s not going to reach me on my land
line, where people screen their calls. Even with cell phones now,
increasingly, a lot of younger people are using mostly text
messages to communicate, and you kind of use the phone
as a backup, almost. So that’s becoming a challenge,
and it’s kind of more natural, now, to encounter
people as they’re browsing online. Now the Google Consumer
Surveys project almost approaches being kind of
a random internet poll. That’s what you do, ideally. It’s just someone’s browsing,
and then every X number of seconds, you have a random
chance of a pop-up window saying, well, who are you
going to vote for? Then it’ll let you
keep searching. That’s the equivalent
of the random-digit dial method, in essence. And Google is not doing
that exactly. But the fact that internet
polling did well is encouraging. And the fact that you have
companies that know how to use data doing this– I mean, a lot of pollsters
are fairly primitive. They’re using likely voter
models that were designed 50 years ago, and they’re very
defensive and stubborn when something goes wrong. I mean, for example, the Gallup
organization, their polls have been screwy
for years. People focus on the
end result– which also haven’t been that
good for Gallup– but the fact that they’ll have these wild
gyrations from day to day, I mean, that’s usually a sign
of a badly designed model. When you have wild fluctuations
that aren’t dictated by the fundamentals,
you’re probably doing something wrong with the way
they’re taking their information. And by the way, because you are
only reaching a fraction of voters now, all this really
is is a modeling question. It’s not a pure polling
question anymore. Even Pew and the best pollsters
only get about 10% of people on the phone. The data has to be massaged, and
that means that you need statisticians to do it in ways
that are smart and accurate. And if you’re doing the same
things you were 50 years ago, then you probably won’t
be doing very well. HAL VARIAN: OK. I’m going to ask the last
question on my list here, and then we’ll open up
to the floor. Both campaigns used tech for
their “get out the vote” efforts, but the Democrats
seemed to have been substantially more effective
than the Republicans in this dimension. So that wouldn’t show
up in the polls. What would be a good
way to measure the effect of those efforts? NATE SILVER: Well, I mean,
overperforming the polls could be one potential indication
of that. But I think if you looked at
local voting patterns– did cities, for example, in Ohio
where Obama had a field office, did he get more of the
vote there than in comparable cities where he didn’t? And those could be hard to
design, because maybe there are reasons why Obama picks
certain cities, and so you have selection bias problems. But I think you’d want to look
at more macro-level data. When campaigns are testing the
effectiveness of advertising, that’s what they do. You run markets in Raleigh
but not Durham, or whatever, North Carolina. Although you probably
do that in practice. You can set up natural
experiments and see how well ads work. But it’s hard to say. I mean, people assume this Obama
ground game made a lot of difference. And I’m sure it made
some difference. But also, if you go back now
and look at states like California, well, Obama beat
his polls in California by several points on
Election Day. And he beat his polls in
Connecticut, states like that. So it wasn’t purely just in the
swing states where he overperformed his polls. It was kind of everywhere. So the question is, to what
extent is the ground game edge priced into the polling
already? In theory, if you have voters
who are more enthusiastic or more engaged by the campaigns,
then they’re going to meet a likely voter screen to begin
with, and so they’ll actually be in the poll already. So I don’t know. They’re difficult problems
to figure out. I do think that, look, Bush
and Rove were very data-literate in 2000, way ahead
of Gore that year, and they had a swing state advantage
then, where they lost the popular vote, won the
electoral college. Why Republicans didn’t
keep that up– McCain really abandoned the
ground game in 2008, in part because of lack of funding. Romney had plenty of money, but
I think he was just maybe a little bit stubborn. I think maybe there’s a
belief that well, our message is so powerful. Probably there’s some start-up
companies that think, like, our product is so good that
we don’t have to market ourselves, kind of thing. That’s usually a sign
of trouble. And maybe the Romney campaign
thought, you know, once people see how terrible the economy
is, well, then they’ll vote for us no matter what. When you really have to work on
the marginal voter pretty hard to get them to turn out
or to change their vote. HAL VARIAN: Sounds like you’re
saying that Romney should have hired a consultant. [LAUGHTER] NATE SILVER: So the other
question is– HAL VARIAN: Or two. NATE SILVER: Does the Obama
campaign have a systematic advantage because people who are
data-literate tend to be liberal or Democratic-leaning,
right? [LAUGHTER] NATE SILVER: And you know, I
mean, how many people voted for Romney in the Bay Area? Like maybe 20% or something? So you potentially have some of
these issues as well, where if you have a party that has
a reputation for being anti-science or anti-empiricism,
then it’s maybe not going to necessarily
get the best and brightest. And look, there are some bright
people working for– [LAUGHTER] NATE SILVER: For Romney, but I
think it actually is like a little bit of a challenge,
potentially. HAL VARIAN: One of my colleagues
at Berkeley once said, I’ve heard there’s such
a thing as a Republican, but I’ve never actually met one. [LAUGHTER] HAL VARIAN: So let’s
open it up. Now, the acoustics are
bad, and we have to repeat your question. So the floor. Back there. Loudly, please. So the question was,
what’s wrong with the polling companies? Are they not effective? Are they politically biased? What do you see the
problems there? NATE SILVER: I mean, I tend to
think for the most part, polling firms aren’t
deliberately weighting the data to match their political bias. There are incentives
to be accurate. I think you might have some
exceptions where, for example– or not for example. So Rasmussen Reports, for
instance, their business model seems to be to provide good
news for Republicans, basically, within a certain
distance of being realistic. Well, I’m just saying, that
seems to be almost a conscious decision on their behalf. Because they have polls, and
every year their polls have had a Republican lean, relative
to the consensus, so when Republicans do well, then
their polls are fine, but when they have a bad year,
they’re really far off. But that’d be something that
I think a company who was prizing accuracy would go in and
look at, and say, why are we always Republican-leaning and
often wrong, relative to everyone else? And you would fix it. But they seem to have a
different idea about the way that they can make the most
income and get their name in the news the most. But I don’t think Gallup
has a Republican partisan bias or anything. I think they’re just a stubborn
old company that wasn’t willing to open up the
hood enough, and wasn’t willing to admit that
they had a problem. And so we kind of get back to
this whole question of honest self-assessment. But people assume– when you had the Unskewed Polls
guy, he was convinced that the Fox News poll was part
of the conspiracy to rig the election for Obama. And believe me, when the “New
York Times,” I think, had some poll, where last year, Obama
had like a 39% approval rating, just an outlier that
kind of happens sometimes, people were like, well, you
know, the “New York Times,” they’ve become a conservative
paper now. This proves it. I mean, the polling wing of
a newspaper is usually the least-biased part of a news
organization, including, often, the reporters
themselves. There’s some great reporters,
but there’s a lot of subjectivity in how the
news is reported, even on the front page. And polling, at least if you
do the right things, if you kind of follow the recommended
practices, then you come closer to objectivity,
at least potentially. HAL VARIAN: OK. So the question was, you’ve
shown how statistics can help dramatically in sports
and in politics, so what’s the next area? NATE SILVER: If I knew,
I’d be taking meetings with VCs right now. [LAUGHTER] NATE SILVER: I don’t know. I think there are
a lot of areas– education is an area that
fascinates me, because you’re seeing more data used. And I don’t know a lot about how
it’s being used, but I bet it’s not being used all that
well, necessarily. There are a lot of things. I did a project for “New York
Magazine” where we looked at different neighborhoods in New
York and tried to rate them. We found, by the way, actually
that New York rental apartment prices were very efficiently
calibrated on a neighborhood-to-neighborhood
basis, where basically if you look at distance to Midtown
Manhattan plus parkland plus schools, then you could explain,
you get an r-squared of like 93% in explaining
prices per square foot. But things like that
might be fun. But look, I mean, for every
problem where statistics have made a lot of progress possible,
you also have areas where they’ve been probably
misapplied in different ways. In the book, I talk about
how earthquake predictions have not gotten a lot
better, for example. I think in economics– although some of the stuff
that Hal is doing. I mean, how often do you,
do you talk with the White House and stuff? Or do you talk with– well, I can’t– [LAUGHTER] NATE SILVER: I don’t mean the
White House in particular, but I mean, do you talk with the
Bureau of Labor Statistics? HAL VARIAN: Oh, yeah. NATE SILVER: Yeah. Yeah. I mean, this is a rare space
where you actually have new economic data being produced
that might be underutilized. But in general, most economic
people have been trying to do economic forecasts for a very
long time, and not really gotten anywhere. So it might be a question
of saying, OK, well, we’re crying uncle. We’re giving up, but we’re
going to specify more accurately how much uncertainty
there really is in the forecast. So there are kind of two
ends that you can berm. We can say, well, if we want to
bring our predictions more in line with reality, sometimes
it’s a matter of applying better techniques, and
sometimes it’s a matter of admitting what we don’t know,
and problems that are intractable, at least as far
as the near-term goes. HAL VARIAN: Yeah, I think
there’s a whole area economists sometimes call
nowcasting, where the objective isn’t so much to
forecast, but to try to get an accurate measure of what’s going
on now, rather than wait for the government statistics
to come out six weeks or a quarter later. And with all the real-time data
that’s available in the private sector, I think we can
do a much, much better job of nowcasting the economy, tracking
what’s going on a moment-by-moment basis. But the one thing I did want
to suggest as a very attractive area, at
least for us here at Google, is marketing. Because marketing is a subject
that’s full of mythology and shamans and all this stuff. But if you do it right, as we
try to do here at Google, I think you could really
make some progress. NATE SILVER: Yeah. And probably industries like TV
and film are cases where– the studios are very
smart and very scientific about some things. But in terms of, what’s the
right portfolio of different film genres to build, or
questions like that. Or if you have a particular
product, to whom should you show that advertisement? That’s where Google probably is
working on analogous types of problems. But I think there’s probably a
lot of low-hanging fruit in
when you put economic variables in your model, for
example, you just put them in with equal weight. And the explanation was you
wanted to avoid overfitting. But how do you know that didn’t
distort the model in some other way? Why was that an attractive
choice in that circumstance? NATE SILVER: There’s some
literature, kind of in the forecasting literature, that
says that as a default, if you don’t have enough empirical
data, if the data’s not rich enough to develop weights from
your sample, then just tending to weight factors equally
is not a bad habit. And at least that way, you avoid
convoluted schemes where you’re weighting something at
80%, something else at 0.02%. Look, we use seven economic
variables and we’re trying to fit them based on 10
past elections. And they’re highly correlated
with one another. So we’re nowhere in the ballpark
of having enough data to develop reliable weights
based on the in-sample result. The [INAUDIBLE] philosophy too
is just that we didn’t want to develop an election-specific
measure of the economy. We just wanted an overall
economic index. Ideally, you would have some
economic index that someone else developed. Some banks will have
these, for example. Like Goldman Sachs, their
forecasting team has a version of GDP that’s like better than
GDP, because it includes more data series and reduces some
of the noise you get in the GDP estimate. So if we had something
like that, then that would be ideal. But there’s not a public
product that quite fit the bill. So we just wanted an index that
basically captured kind of what’s going on in the
economy in one number, to the extent you can. Which of course, is inherently a
little bit foolish, but it’s probably better to have
something that accounts for different major aspects– so jobs, income, industrial
production, inflation, different major categories. And we didn’t think you had to
get a lot fancier than that in terms of weighting them, because
it’s not clear what you’re weighting to. We wanted to weight to
correspond to how important is something to the economy,
basically. And then you’ve got to answer
the question of, well, how does the economy predict
elections? Instead of kind of saying,
well, this one obscure variable that we found
on FRED perfectly correlates with elections. I mean, those models have
not done all that well. People get too cutesy. But the principle that you want
to aggregate information, have more than one indicator,
is pretty helpful. And our model, just the economic
component, actually would have had Obama winning by
2 and 1/2 points, which is pretty close to the
actual outcome. And other models that people
used that took indices or that aggregated different economic
data points together also did pretty well. Whereas some of the ones that
just fixated on one data series were pretty
far off the mark. One had Romney winning by six
points or something, which was not a very good forecast
this year. HAL VARIAN: So again, I want to
repeat, the question was, what’s the craziest thing
or funniest thing a fan as said to you? Pretty tough question here. NATE SILVER: I’m not sure. I was on the plane
the other day. I was coach in a middle seat
out to San Francisco. And some woman in first class
said, I guess they don’t have any polls in coach, or something
weird like that. But there are these weird– so I found out that– this’ll wear off, I’m sure. But if I’m going out to dinner
with my friends, I’m booking a reservation in my name. So there’s like a 30% chance
that you get like a free round of champagne or something,
at some point. I shouldn’t reveal that
secret, right? HAL VARIAN: Yeah, I think
there’s going to be a lot of people trying that– [LAUGHTER] HAL VARIAN: Trying that
plan in San Francisco. NATE SILVER: So I went out to
some restaurant, it was called Local’s Corner– which was
really good, by the way, in the Mission District. Very good. But they said they had Googled
me to make sure, or Twitter-followed me,
to show that I was actually in the Bay Area. Because they’ve had this scam
before, where someone’s like, I’m Peter Fonda. And maybe they weren’t quite
sure what Peter Fonda looked like, and so they try and
play that game to get a better seat. So they actually followed
the paper trail and did their research. HAL VARIAN: What kind of seat
do you suppose Mitt Romney would get? NATE SILVER: In San Francisco? Well, I was thinking
what would happen– [LAUGHTER] NATE SILVER: Sorry,
we’re full. HAL VARIAN: Exactly. NATE SILVER: Because look, in
2016, if the Republican’s likely to win, then hopefully,
we’ll have the Republican winning in our forecasting
model. So I’d have to go to,
I don’t know, like Dallas, or something. We’ll give you a free plate of
barbecue ribs for correctly– [INAUDIBLE]. I like ribs. I like Texas food. But yeah. HAL VARIAN: All right. Last question. How can we improve the teaching
of statistics? NATE SILVER: That’s
a big question, an important question. HAL VARIAN: Important. NATE SILVER: And I’m not sure
why statistics education or math eduction in general is so
abstract in this country. I think you should just kind of
give kids data and let them play around with it,
at a certain level. And I don’t know exactly what
type of data you’d give to a high-school class exactly, but
to have kids conduct a survey or to have kids just start
counting things to come to some conclusions about them. I mean, the fact that people
are phobic of math is odd. I get the comment sometimes from
readers on my blog, like, oh, I thought I hated math, or
I’m terrible at math, but I like your blog. You’ll hear that. And probably, that person
is not terrible at math. Probably they had some teacher
who made them dislike math for some reason along the way,
because it wasn’t taught in a way that it had practical,
real-world applications. And I hope that will change. I don’t know enough about what
different curriculum teachers are using. I think probably the fact that
you have more teaching to the tests now is harmful, of
course, to any type of innovation in math and
science education. But I mean, that could be a
subject that I want to write a book on or something, down the
line, is how math is taught, and how that could
be improved. HAL VARIAN: OK. Nate, thanks for coming. NATE SILVER: Thank you guys. [APPLAUSE]
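
[Editor's sketch, not part of the talk.] The equal-weighting idea Silver describes for the economic index (several correlated indicators, too few past elections to fit reliable weights, so each indicator simply counts the same) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not FiveThirtyEight's actual model: the function names, the indicator names, and every number below are hypothetical, and standardizing each series to z-scores before averaging is an assumption added here so that indicators on different scales are comparable.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a series to mean 0 and population standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series carries no signal; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def equal_weight_index(series_by_name):
    """Average the standardized indicators with equal weights, per period."""
    standardized = [zscores(vals) for vals in series_by_name.values()]
    n_periods = len(standardized[0])
    if any(len(s) != n_periods for s in standardized):
        raise ValueError("all indicator series must have the same length")
    return [mean(s[t] for s in standardized) for t in range(n_periods)]


# Made-up numbers for illustration only (three periods, three indicators).
indicators = {
    "jobs_growth": [1.0, 2.0, 3.0],
    "income_growth": [2.0, 4.0, 6.0],
    # Inflation is sign-flipped here so that "higher is better" for every series.
    "inflation_inverted": [-3.0, -2.0, -1.0],
}
index = equal_weight_index(indicators)
print(index)
```

The design choice this is meant to show is only the one named in the answer: with no fitted weights there is nothing to overfit, at the cost of ignoring whatever real differences in importance the indicators might have.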


