PredictWise is a digital-first audience technology company for the progressive ecosystem. At the heart of our operation is ToolBox, a massive database that covers every adult American and lets us match attitudinal with behavioral data for all records, keyed to digital identifiers and PII. In this post, we’ll detail some of the many ways we helped Democrats and progressive causes up and down the ballot this last election cycle. Overall, our efforts reached over 20 Million people directly on their cellphones and provided the DNC and key swing states with fast alternatives that helped them avoid the fallout from the Facebook political ad shutdown. Our unique, verified, high-precision data on 50 Million unregistered individuals powered countless conversations. In just one example, it was used to motivate over 90,000 unregistered women to use same-day registration in key Rust Belt swing states, while avoiding two issues that plague such databases: they are filled with registered people who are simply mismatched in the voter file, and they skew Republican because the commercial data most frequently used (credit card files) tends to include wealthier, older people.
Below, we’ll describe our efforts in more detail:
(1) Machine learning-based targeting, built on continuous, always-on collection of attitudinal and behavioral data
(2) Platform-agnostic targeting technology, crucial in a year in which the Facebook political ad shutdown dominated the news
(3) The impact we had in the field
Better targeting through machine learning and always-on data collection
At PredictWise, our bread and butter is providing custom, digital-first audiences — powered by state-of-the-art Bayesian machine learning algorithms leveraging unprecedented amounts of data, including years of surveys and individual-level GPS and phone telemetry data (i.e., anything that is passively tracked on cell phones).
The science: personalized surveys and telemetry data
Survey-based machine learning support scores: We scored every individual on our voter file (over 260 Million Americans) on over 30 different characteristics, including which party and candidates they support as well as issue-based scores such as their support for environmental regulations, health care rights, and criminal justice reforms. This process combined state-of-the-art Bayesian machine learning models (built on Stan and run on cloud-based graphics processors) with data: hundreds of thousands of survey responses to multiple questions per issue over the last four years, individual characteristics (such as demographics, educational attainment, and voting history), and characteristics of where people live (such as historical election outcomes, demographics, and density).
Of course, these scores are only useful if they’re accurate. The figure above shows how each of our scores relates to the others and to an individual’s (modeled) party for almost 250 Million Americans. Our scores have high face validity and provide information beyond an individual’s predicted party membership.
Cell-phone location data to analyze movement: We used individual-level GPS data to identify individuals who stayed at home during the COVID-19 pandemic. In particular, we hypothesized that Republicans who were staying at home more often were open to messaging about the risk of COVID-19 and the mismanagement of the crisis by the Trump administration. Leveraging 3 months of (read: terabytes of) real-time mobile GPS data for tens of millions of Americans matched to the PredictWise ToolBox, we calculated a score for each individual measuring how much they reduced (or didn’t reduce) their movement during the pandemic compared to other people. We then asked Republicans who scored high and low on this behavior-based COVID-concern score about their self-reported concerns. The difference in self-reported concern about COVID-19 between low-scoring and high-scoring Republicans was almost 30 percentage points, a clear and first-of-its-kind demonstration of the power of such behavioral data.
We used this data to target almost 380k COVID-concerned Republicans in Arizona, Ohio, and South Carolina with COVID-based advertising.
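To make the idea of a relative movement score concrete, here is a minimal sketch of one way such a score could be computed. It is an illustration only, not the production pipeline: the daily-distance inputs, the function names `movement_reduction` and `percentile_scores`, and the choice of a simple percentile rank are all assumptions made for this example.

```python
from statistics import mean

def movement_reduction(baseline_km, pandemic_km):
    """Fractional drop in average daily distance (1.0 means movement stopped)."""
    base = mean(baseline_km)
    if base == 0:
        return 0.0  # no baseline movement, so no measurable reduction
    return (base - mean(pandemic_km)) / base

def percentile_scores(reductions):
    """Score each person by the share of other people who reduced movement less."""
    ids = list(reductions)
    n = len(ids)
    scores = {}
    for pid in ids:
        below = sum(1 for other in ids if reductions[other] < reductions[pid])
        scores[pid] = below / (n - 1) if n > 1 else 0.0
    return scores

# Hypothetical daily distances in km: (before the pandemic, during the pandemic).
people = {
    "a": ([20, 22, 18], [2, 3, 1]),     # large reduction
    "b": ([10, 12, 11], [9, 10, 11]),   # small reduction
    "c": ([15, 15, 15], [16, 15, 17]),  # slight increase
}
reductions = {pid: movement_reduction(b, p) for pid, (b, p) in people.items()}
scores = percentile_scores(reductions)
```

Ranking people against each other, rather than using the raw reduction, mirrors the "compared to other people" framing above and keeps the score on a fixed 0 to 1 scale.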
We first use our massive database to identify exactly the right individuals to target, and then we provide the best way to reach them online. A core asset we’ve built over the past few years is a digital-first voter file that keys each individual on the voter file to their Mobile Ad ID (MAID) instead of their personal information. These MAIDs can be used to target individuals on an advertising exchange, even outside the Facebook and Google walled gardens (for example, by distributing ads inside mobile phone applications), without the need for unreliable data clearinghouses as middlemen. In other words, we brought the heavily analog voter file online.

If you followed the advertising news ahead of this year’s election, you know that Facebook banned the upload of new political advertisements in the week before November 3, hurting Democrats in particular. In advance of this ban, we provided the DNC with MAIDs tied to their own support model scores for over 79M individuals across every state. This data provided a fast, direct alternative for launching paid content campaigns and helped avoid potentially disastrous consequences.
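As a rough sketch of what keying scores to MAIDs instead of personal information can look like, the snippet below builds an audience list from model scores and an ID crosswalk. The data layout, the `build_maid_audience` helper, the threshold, and the assumption that one voter record can map to several device IDs are all hypothetical and chosen only for illustration.

```python
def build_maid_audience(scores, crosswalk, threshold):
    """Return the sorted MAIDs of everyone whose score meets the threshold.

    scores:    voter record ID -> support score (hypothetical 0..1 scale)
    crosswalk: voter record ID -> list of MAIDs linked to that record
    """
    audience = set()
    for voter_id, score in scores.items():
        if score >= threshold:
            # Records without a linked device simply contribute nothing.
            audience.update(crosswalk.get(voter_id, []))
    return sorted(audience)

# Made-up records; the output contains device IDs only, no names or addresses.
example_scores = {"v1": 0.9, "v2": 0.4, "v3": 0.8}
example_crosswalk = {
    "v1": ["maid-aaa"],
    "v2": ["maid-bbb"],
    "v3": ["maid-ccc", "maid-ddd"],
}
audience = build_maid_audience(example_scores, example_crosswalk, threshold=0.7)
```

The resulting list is what would be handed to an ad exchange in this sketch: a set of device identifiers with the personal information left behind in the voter file.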
Overall, we used this technology to help progressive organizations target 7,329,946 individuals, ranging from unregistered minority voters to frequent voters who are union supporters and everything in between, and we served over 42M impressions to 16.8M unique individuals ourselves. Three campaigns stand out in terms of impact:
- Future Majority, for which we ran a campaign targeting 400k unregistered women with progressive views in three key states: Minnesota, Wisconsin, and Michigan. PredictWise curated targets and built a mobile-first static creative, serving over 5M impressions. Out of these 400k unregistered women we targeted, over 92k ultimately registered: 19k in Wisconsin, 52k in Michigan, and 21k in Minnesota.
- Florida Democratic Party, for which we served 15 Million impressions in less than 4 days, reaching 5,341,897 individuals, when last-minute issues with Facebook blocked the FDP from running new campaigns and creatives.
- Ohio Democratic Party, for which we built specialized custom audiences for women and BIPOC communities, for turnout, registration, and persuasion. We successfully helped Ohio reach larger audiences through programmatic channels, increased video views, and hyper-targeted messages, serving over 20M impressions. In total, ads performed much better than ad spend on Facebook based on behavioral metrics/CPM, as detailed below:
We believe that measuring impact and validating effectiveness is a key part of the process. We’ve made our methods primer available online, and will briefly overview some additional experiments we ran this cycle. These efforts together demonstrate the effectiveness of every part of our technical pipeline, from data curation to targeting to delivery.
We take a multi-tiered approach to validate our methods and measure impact. In off-cycle years, we build and validate our core methodologies. This cycle, that meant building out the MAID-first voter file and survey infrastructure and then testing it in midterm elections. During the cycle, we closely monitor ad performance and iterate on spending across ad campaigns. Finally, for a select few of our ad campaigns, we run true randomized controlled trials.
In our primary randomized controlled trial, we partnered with TruthNotLies to measure both their campaign ad effectiveness and our targeting and delivery performance. Full details are forthcoming, but for now: ads targeted to a treatment group of 400k 2016 Trump voters in Pennsylvania significantly decreased Trump support relative to the control group, as measured by surveys sent to each group after the election.
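For readers unfamiliar with how a treatment-versus-control comparison like this is typically evaluated, here is a minimal sketch using a standard two-proportion z-test. The survey counts are invented for illustration and are not results from the trial described above; the function name `two_proportion_test` is also just a label for this example, and the actual analysis may use a different method.

```python
import math

def two_proportion_test(support_t, n_t, support_c, n_c):
    """Compare support rates between treatment and control survey respondents.

    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p_t = support_t / n_t
    p_c = support_c / n_c
    pooled = (support_t + support_c) / (n_t + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_t - p_c, z, p_value

# Invented survey counts: 820 of 1,000 treated respondents and 860 of 1,000
# control respondents report supporting the candidate.
diff, z, p_value = two_proportion_test(820, 1000, 860, 1000)
```

A negative `diff` with a small `p_value` is the pattern one would expect if the ads lowered support in the treatment group.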
In a second experiment, we tested whether our support model scores truly capture opinion above and beyond partisanship, as well as our ability to deliver targeted ads through MAIDs and mobile applications. We created a specialized audience of people who score high on our Environmental support score, as well as an audience of otherwise equally progressive individuals who score low on the Environmental score. We then randomly assigned members of each audience to be shown either a generic ad or an ad for an environmental protection group. The high-Environmental audience indeed performed best, relative to both the other ad and the other audience, when shown the pro-environment ad.
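The design above is a two-by-two comparison: two audiences crossed with two ads. A minimal sketch of the random assignment and of a difference-in-differences readout follows. The engagement rates, the audience labels, and the helper names `assign_ads` and `diff_in_diff` are assumptions made for this example, not the experiment's actual data or code.

```python
import random

def assign_ads(audience_ids, seed=0):
    """Randomly split one audience in half between the two ads."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(audience_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        pid: ("environmental" if i < half else "generic")
        for i, pid in enumerate(shuffled)
    }

def diff_in_diff(rates):
    """Extra lift of the environmental ad among high scorers versus low scorers.

    rates maps (audience, ad) to an engagement rate such as click-through.
    """
    lift_high = rates[("high_env", "environmental")] - rates[("high_env", "generic")]
    lift_low = rates[("low_env", "environmental")] - rates[("low_env", "generic")]
    return lift_high - lift_low

assignment = assign_ads(range(10))

# Invented engagement rates for the four cells of the design.
example_rates = {
    ("high_env", "environmental"): 0.030,
    ("high_env", "generic"): 0.020,
    ("low_env", "environmental"): 0.018,
    ("low_env", "generic"): 0.019,
}
interaction = diff_in_diff(example_rates)
```

A positive `interaction` means the environmental ad helps the high-scoring audience more than the low-scoring one, which is the signal that the score carries information beyond general progressiveness.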
PredictWise has always believed that maximum impact is not about frantically spending resources close to election day. It is achieved through continuous dialogue with voters: targeted, policy-driven messaging, aligned with each individual’s values and narrative frames, delivered through both organic and paid content. We believe that a single backend, streamlining and processing streams of both survey and alternative/telemetry data, can drive this effort. Our goal has always been to feed into such a structure or build this structure ourselves. The end goal for PredictWise is sharing better, more accurate, deeper data in service of building an always-on messaging campaign, leading to more meaningful interactions with everybody from habitual voters to unlisted individuals. We believe that every progressive organization, from national to school board campaigns, should be able to access such a repository without any hurdle.
As PredictWise is repositioning its offerings, we are exploring ways to make this proposition a reality. More to come!