8 September 2020 #BetweenTheLinesDotVote Analysis A Theory On Dornsife's Changes Emerges WARNING - CHEFS ONLY ALLOWED! This is NOT analysis for those who only want to eat the delicious meal. This is a VERY technical analysis on a possible method of cooking the data.
2) But for those who love to work in the kitchen, as I do, and come to understand polling and its function in this 2020 election, let me tell you, we'll have some sparks and flashes in today's work. Oh, this may come in parts over time, as work and life intrude...
3) To get to the heart of the matter, let's zoom in a bit. We'll zoom back out again, later. There's this interesting gray area I've highlighted. If you look down to the lower left you'll see its definition. It may take a bit of thinking to follow, believe me I struggled!
4) Oh, I failed to give you the live link. The images above are screenshots I just snapped. Before I go any deeper, and assuming you're a #DataChef like me, head over here and bop around. Get comfortable! election.usc.edu
5) Now, what about that gray area? Here's the exact phrase from the graph: "Difference not statistically significant if lines are in the gray area." I broke down what this means back in 2016. In a word, it charts the "too close to call" area, statistically.
6) Another way to say this: it is a graphed image of their "margin for error zone." If the data for either team falls into that area, then the prediction loses some or all of its meaning. To be predictive, the numbers MUST be OUTSIDE the gray area. Got that?
7) As a point of bragging, when I was watching their numbers so tightly, I found an error they made, contacted them, and they corrected it! Now, note the date in the zoomed in image above: 8/21. And note all four applicable numbers.
8) Trump's number, as in every day of the poll so far, comes in beneath the gray area, at 41.37. Then, low end of gray area = 42.89, just about a point and a half above. Then gray zone high = 48.81. And last, Biden, as always so far, is ABOVE the gray zone at 50.33.
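For fellow chefs who like to check the arithmetic, here is a minimal sketch of the four 8/21 numbers quoted above. The function name `gaps` and the dictionary keys are my own labels for illustration, not anything Dornsife publishes.

```python
# The four 8/21 readings, exactly as quoted in the thread.
TRUMP = 41.37
GRAY_LOW = 42.89
GRAY_HIGH = 48.81
BIDEN = 50.33


def gaps(trump, gray_low, gray_high, biden):
    """Return how far each candidate sits outside the gray zone,
    plus the zone's width and the overall lead (all in points)."""
    return {
        "trump_below": round(gray_low - trump, 2),
        "biden_above": round(biden - gray_high, 2),
        "gray_width": round(gray_high - gray_low, 2),
        "lead": round(biden - trump, 2),
    }


result = gaps(TRUMP, GRAY_LOW, GRAY_HIGH, BIDEN)
print(result)
```

On this one date both candidates sit the same distance outside the zone, and the lead is wider than the zone itself. Whether that symmetry holds on other dates is not something these four numbers alone can tell us.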
9) Maybe you can catch my theory for yourself. Here's a clue. Note the same four numbers - they'll always be in the same relation - as they flow. And look especially at these three dates: * 8/21 * 9/02 * 9/05
10) Here's the trend I see. The numbers so far, at first glance, just appear to be steady. Biden always above 50, Trump always beneath 43, and minor fluctuations, too boring to draw lines on. There's just no change, no volatility worth watching. See it?
11) But focusing on the gray zone, a different story begins to emerge. As the two campaigns diverge most, seen from 8/24 - 8/29, they then draw back closer and the separation shrinks. Thing is, as I see it, it's not shrinking toward each other so much as towards the gray area.
12) If, as I fear, the data is in any way being cooked, it might show up just like this. Set a rule: the data we select will show Biden as always above the gray area, Trump always below it. This would take some amazing programming and extraordinary algorithm control.
13) Please understand, I am NOT qualified to assess, nor do I have access to, their guiding statistical magic. I can only monitor the output. Also, I have NOT confirmed this as an actual theory, it remains only plausible, given the data set so far, to my eye.
14) In Dow theory, we're told to test any line, support or resistance, by seeing the numbers bounce off it - the more dramatically the better - before we propose any sort of non-random movement, otherwise known as a trend. My 3 confirmations do NOT assure my theory.
15) But, these three confirmations are obviously enough for me to be telling you about my brand new little baby almost theory. Let's keep a close watch on this, friends, in coming days. I'll be watching every day, for sure. Join me!
16) I am a very disciplined man. I don't know if I'll have the discipline to try to go back to 2016 to assess all the changes here in 2020's Dornsife method. Probably not, and certainly not today. Rather, we'll keep our focus on 2020's methods, and I'll use my memory.
16) Diving in, we turn now to their self-explanation: "The 2020 USC Dornsife Presidential Election Poll is tracking changes in Americans' opinions throughout a campaign for the White House.
17) "Around 6000 respondents in our representative panel are asked questions every other week on what they care about most in the election, and on their attitudes toward their preferred candidates.
18) "The "Daybreak poll" is updated just after midnight every day of the week. Each day’s data point represents the estimate among voters over the previous seven days, representing approximately half of the poll’s full sample."
19) As a reminder, or for those of you who don't know: this is a very personal and important analysis for me. In advising the Trump Campaign in 2016, this was the only poll I used. And I picked the best of the best. That I fear I won't be able to trust this poll now is painful.
20) I have covered some of this in an analysis posted on 20 August, already. If you're really interested, you can check that one out too. We'll dive deeper now, by taking each point above and unpacking it.
21) The first point above states what the poll does: "tracking changes in Americans' opinions throughout a campaign for the White House." This is genius. It was the poll's original innovation. Instead of taking snapshots of where people are right now, it tracks changes over time.
22) There is no way to overstate the significance of this innovation. The method was first tested in 2012, and was one of the few to strongly predict Obama's win, when so many others wrongly favored Romney. Then, 2016 was simply amazing.
23) As we'll see in a moment, it's more than just the fact that they track changes, their predictive model was incredible as well, all the way down to their gray zone margin of error area. But the key thing was to track the SAME group of people longitudinally over time.
24) Watching the shifts in opinion from this one group - around 3,100 or so in 2016 - they were able to track the pulse, the heartbeat of America's shifting leanings. It is an absolutely brilliant method.
25) Here's 2020's sample size: "Around 6000 respondents in our representative panel are asked questions every other week..." That was not my first Warning Will Robinson moment, but when I saw that my stomach dropped. Why? For what reason? 3,100 had been perfect. Why change?
26) You might be interested to know that in 2016, Dornsife paid its participants $100/month to answer their 3 and ONLY 3 questions, once/week. Easy peasy. Kind of fun, and enough for a decent dinner for 2 with wine and dessert. Cool.
27) You engage your people so they're paying attention, but you don't bug or overwhelm them at all. And, as I've said, it worked like gangbusters. So again, why the change to 6,000 now, where you only ask them questions twice/month? We'll find our answer below.
28) "...are asked questions every other week on what they care about most in the election, and on their attitudes toward their preferred candidates." If you bop around at their site you can learn about all the many other areas they're now asking about. This matters big time!
29) Essentially, this is simply NOT 2016's Dornsife. Consider those 3 and ONLY 3 questions they asked once per week: 1) Will you vote? 2) Who for? 3) Who will win? Those questions were GENIUS. They defined the poll itself.
30) So the first easy guess we can make about the change from 3,100 to 6,000 is question fatigue. When you start asking too many questions, you simply cannot repeat the process nearly as often. This means the poll itself now has a totally different flavor and feeling.
31) As to the cash, they're still paying out roughly the same amount in total, as they now pay poll takers only $50/month. Double the volume of people polled, halve the number of times polled, and halve the amount paid per month, cashwise, you end up about even.
32) But consider how far we've already come from the stark, cool simplicity before, and the fun deal for the participants, encouraging real participation. Now, you can have a decent meal with wine and dessert...for 1 person. Ouch. Where's the motivation there?
33) And then, when you throw in fancy-pantsy stuff like "care about most in the election," and "attitudes toward their preferred candidates," well, credibility simply plummets.
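The "about even" claim is easy to check with the thread's own round numbers. This is only a sketch: the helper name `monthly_payout` is mine, and the panel sizes and payments are the approximate figures quoted in this thread, not audited budget data.

```python
def monthly_payout(panel_size, pay_per_person):
    """Total paid to the panel per month, in dollars."""
    return panel_size * pay_per_person


# Approximate figures as stated in the thread.
payout_2016 = monthly_payout(3100, 100)  # about 3,100 people at $100/month
payout_2020 = monthly_payout(6000, 50)   # about 6,000 people at $50/month

print(payout_2016, payout_2020, payout_2016 - payout_2020)
```

On those figures the totals come to $310,000 and $300,000 per month, so the difference is small relative to the whole.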
34) Here's a last technical point. I won't quote, I'll just explain. In 2016, they employed a rolling average of the entire polled group, weighted over each week. The rough numbers are simple. 3,100 / 7 days = about 443 email polls per day. The weighting formula was proprietary.
35) As I recall, the current day's votes counted more, and previous days' counted less until the 8th day fell out of the average. This gave the poll phenomenal sensitivity to every change, living up to its mission.
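To show why weighting recent days more heavily makes an average more sensitive, here is a rough sketch. Big assumption: the real 2016 weighting formula was proprietary, so the linear weights 1 through 7 below are purely hypothetical, and the seven daily values are made-up illustration numbers, not Dornsife data.

```python
def weighted_average(values, weights):
    """Weighted mean of values; the two lists must be the same length."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Rough daily volume from the thread: about 3,100 panelists over 7 days.
polls_per_day = round(3100 / 7)

# ILLUSTRATIVE ONLY: six flat days, then a 2-point jump on the newest day.
daily = [45.0, 45.0, 45.0, 45.0, 45.0, 45.0, 47.0]

# HYPOTHETICAL weights: oldest day counts 1, newest day counts 7.
linear_weights = [1, 2, 3, 4, 5, 6, 7]
flat_weights = [1] * 7

weighted = weighted_average(daily, linear_weights)
flat = weighted_average(daily, flat_weights)
print(polls_per_day, weighted, round(flat, 4))
```

With these invented inputs, the recency-weighted average moves half a point on the jump while the flat average moves under three tenths. That is the kind of sensitivity I mean, though the actual size of the effect depends on weights I never saw.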
36) Now... "Each day’s data point represents the estimate among voters over the previous seven days, representing approximately half of the poll’s full sample." My gut tells me this reduces the poll's sensitivity to changes by half, halving the poll's value and integrity.
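Here is the arithmetic behind that "approximately half" phrase, as a minimal sketch. It assumes responses are spread evenly across the days of each polling cycle, which is my simplification; the function name `window_share` is also mine.

```python
def window_share(panel_size, cycle_days, window_days=7):
    """Return (responses per day, responses inside the window,
    fraction of the full panel inside the window), assuming
    responses are spread evenly over the cycle."""
    per_day = panel_size / cycle_days
    in_window = per_day * window_days
    return per_day, in_window, in_window / panel_size


# 2016: about 3,100 panelists asked once a week (7-day cycle).
per_day_2016, in_window_2016, share_2016 = window_share(3100, 7)

# 2020: about 6,000 panelists asked every other week (14-day cycle).
per_day_2020, in_window_2020, share_2020 = window_share(6000, 14)

print(round(per_day_2016), round(in_window_2016), round(share_2016, 2))
print(round(per_day_2020), round(in_window_2020), round(share_2020, 2))
```

Under that even-spread assumption, a seven-day window holds roughly the whole 2016 panel but only about 3,000 of the 2020 panel, which matches their "approximately half." Note the raw count of responses in the window is nearly the same in both years; what changes is how often each person is heard from.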
37) Integrity. Yeah. Again, why mess with perfection? Why not hit it again, and nail this coming election, proving the method's dominance and glory? Why not? Enter politics stage left. Pun fully intended.
38) Remember, the Dornsife Poll is funded by the LA Times. My theory is that in 2016, the Times was so certain of an HRC victory, they didn't really care what the egghead pollers at USC came up with, and paid it little attention. And then came Trump. And THEIR poll called it?
39) So, how hard is it to imagine the top brass at the Times calling in the minions over at USC and laying down the new, 2020 law? This is the way we do it now, I can hear them say. Okay, sure, yes sir, yes ma'am, will do, we're on it, you got it. Integrity gone.
40) Imagine with me this. This time round the Democrat Powers know how dangerous Trump is, and how dangerous the MAGA movement. Unlike last time, they're taking nothing for granted. So, what's been their playbook this time? Fake polls is strategy number one.
41) I will continue to watch Dornsife daily, and with a fadingly, foolishly, still hopeful eye. I respect them so much. And, maybe I'm reading these early signs wrong. I know exactly what I'm looking for. In my next installment here, I'll draw up the graphs to show you.
42) For those who follow simple Dow Theory, as I said above, my current theory is that the top of the gray area is their chosen - as in pre-chosen, before the data comes in - line of support for Biden; the bottom of the gray area the pre-chosen line of resistance for Trump.
43) And as you can tell from today's installment, my theory is bolstered by all the changes they've made to their method, weakening if not obliterating their ability to live up to their mission, to the spirit of their own innovation, to honestly track changes.
44) If I can see this theory broken, I will be the first to say so. I would truly love to be wrong. Right now, happily, it remains ONLY a theory. Honest, let's hope I'm wrong.
45) And in the meantime, please do head over to my own attempt to live up to the Dornsife 2016 spirit, with some important innovations of our own, at...
Thread temporarily ends at #45.
We conclude our analysis today, picking up at #46 with my own graph of Dornsife's data, shown here as well.
46) It's such a strange term Dornsife chose, in representing their margin for error, graphically: Insignificance Area. As much as I've studied it, and to the degree I understand it, I still feel, well, some confusion over what it really means. Drawing it helps, but still...
47) Their simple statement: "Difference not statistically significant if the lines are in the gray area" rather has to mean something like "too close to call," don't you think? Sadly, I'm not exactly sure. I call such statistical things "magic." I often really don't understand.
48) Allow a digression. I got an A+ in my one statistics course in college, and loved every minute of it. It was the 101 Intro. Alas. I so wish I'd taken more courses. Still, when I met my future wife @KateScopelliti, she was taking the same course the following year.
49) Coaching Kate through Statistics 101 was one of the funnest things I've ever done. Isn't young love wonderful? Digression complete.
50) So, why put so much effort into visually graphing an "insignificance area?" It seems a contradiction in terms. Here's this thing that's NOT significant, and we REALLY want you to pay careful attention to it! You have to admit, that's pretty damned confusing. Right?
51) I really am not able to enter into a properly trained statistician's mind. It makes me wonder. Do all polls, when they publish their "margin of error" number, feel the same way? Long as we're outside that black hole of insignificance, our work is...significant? Don't know.
52) So, what are we looking at here that we can be clear about? The first thing to do is pretend it's true. I know, that's a big stretch. Especially for me, as I've already judged this data to be false. It's called the hypothetical method. A hypothesis is allowed to be wrong.
53) Pretending it's true, the first obvious conclusion is that Biden's got this election, cold. Trump doesn't stand a chance. And note, really look, we're NOWHERE NEAR THE Insignificance Area. Not for a moment. This means that we know. We're not guessing. We know. Biden!
54) Follow me on the graph. Biden's support is ALWAYS ABOVE the Insignificance area. That means his support is both significant, and the salient fact. His support is simply NOT in question. When it drops, it instantly rises again. Biden all the way!
55) So Trump. Resistance to him is significant, and salient. It is Trump's resistance that matters, his support is not meaningful. Trump's resistance is ALWAYS beneath the Insignificance Area's bottom. It never threatens to become insignificant, at all. Trump resistance is real.
56) You do understand that I sadly reject Dornsife's data. Not 100%. That is, I will still watch and monitor their data assiduously. It remains possible I might reverse my judgement, and if so, I'd be the happiest guy. I don't want to be right.
57) But if my current theory is right, it's that damned Insignificance Area that tells the tale. I'll say it again. My theory is: Biden will always be above, Trump always below Dornsife's Insignificance Area. Let's call this the Slush Theory.
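A theory is only worth something if it can be broken, so here is a minimal sketch of how a fellow chef could test the Slush Theory against each new day's numbers. The names `Reading`, `slush_rule_holds`, and `first_break` are hypothetical helpers of mine. The only reading filled in is 8/21, the one date whose four numbers appear in this thread; later dates would have to be copied by hand from the live graph.

```python
from typing import NamedTuple, Optional, Sequence


class Reading(NamedTuple):
    date: str
    trump: float
    gray_low: float
    gray_high: float
    biden: float


def slush_rule_holds(r: Reading) -> bool:
    """True if Trump is below the gray area AND Biden is above it."""
    return r.trump < r.gray_low and r.biden > r.gray_high


def first_break(readings: Sequence[Reading]) -> Optional[Reading]:
    """Return the first reading that violates the rule, or None if none does."""
    for r in readings:
        if not slush_rule_holds(r):
            return r
    return None


# The only fully quoted reading in this thread; append more as they are published.
readings = [Reading("8/21", 41.37, 42.89, 48.81, 50.33)]

broken = first_break(readings)
print("theory intact so far" if broken is None else f"theory broken on {broken.date}")
```

One reading proves nothing, of course. The point of the sketch is that a single day with either line inside the gray area would be enough to break the theory.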
58) Review... In 2016, only three questions: 1) Will you vote? 2) Who for? 3) Who will win? No slush. 2020: "what they care about," and "their attitudes." Slush.
59) Here's a fantasy. I got myself a PhD in statistics, and rose up through the ranks in spite of my weakness for and obsession over truth. In this fantasy, I could tell you precisely, and simply, what makes up that damned Insignificance Area.
60) Here in this real world, I can read, and even draw, a data graph. Ha! And this one tells me, baloney. I call a rat. Something smells in Southern California. I'm calling...agenda. The data is chosen to support the predetermined desired outcome. It stinks.
61) Brilliant as my attack may be, so what? What am I going to do about it? Anyone can attack, complain. What am I going to do? The answer is simple. I have already picked up the torch of honest polling. I will not fail.
62) I will ALWAYS give credit to Dornsife for my inspiration. 3 of my 5 questions come straight from them. I have added: Who should win?, and How do you identify? I have also constructed my own proprietary prediction formula.
63) Please note, NOT ONE of my five questions is, in any way, slushy: 1) Will you vote? 2) Who for? 3) Who should win? 4) Who will win? 5) How do you identify: D, I, or R? No slush.
64) I have additionally rejected, outright, the very idea of a MARGIN FOR ERROR. Come on. Polling is always and inescapably error ridden. We don't know if we're doing it right. Presenting a MARGIN FOR ERROR, as if we knew, is a lie. Let's just admit we're guessing.
65) The goal in polling, as in all of science, is to make guesses better, to make our guesses less bad. We do our best. We follow the scientific method. And we NEVER believe we know when all we can do is guess. Good guesses are better than bad guesses. That's what we get.
66) Let's finish up with politics and media. Even when they were lying, in the past the media at least knew how to tell a story with: Who, What, When, Where, Why, and How. All of that was supposed to be in the first paragraph. And the headline couldn't distort the story.
67) We know that for decades, the politicians and their media allies whittled away at that format. We get political messaging as headlines now, and gross generalizations as the story. What we learned in 2016 was that polling, political polling, had gone the same route.
68) There is an answer. In polling, we must publish our prejudices, boldly. I am a bold Trump supporter. I will therefore test every decision I make by DISTRUST of myself, due to my Trump leaning. I will rigorously test all my data against the fear of my own prejudice.
69) But what I won't do is change the questions I ask, or the order in which I ask them. There is zero bias in my questions. And, there will be zero bias in my data. My interpretations are at risk of bias, which I will rigorously seek and destroy to the best of my ability.
70) But if you're interested in honest polling, done honestly, by an amateur who proclaims no priestly powers, then do head over to our poll. You will see. We will NEVER break our oath. Honest Questions. Honest Answers. Honest Data. BetweenTheLines.Vote
Analysis ends at #70.