UK exam debacle, US pollsters’ inaccuracy show algorithms bereft of knowledge
Plato once sagely observed, “A good decision is based on knowledge and not on numbers.”
That advice resonates today, even as we deepen our reliance on number crunching in the form of the unceasing machine learning algorithms whirring away in the background of our lives, setting in motion many of the routine decisions each of us makes daily.
However, as Plato seemingly foresaw, the underlying algorithms we’ve come to rely on are only as good as the human knowledge they spring from. And sometimes the knowledge transfer from humans to math formulas falls well short.
Last August, an attempt by the UK government to use machine learning to conjure and dispense final exam grades to quarantined high-schoolers proved to be a disastrous failure. Instead of keeping things operable in the midst of a global pandemic, the UK officials ended up exposing the deep systemic bias of the UK’s education systems, in a glaring way.
Then, in November, the algorithms pollsters invoked to predict the outcome of the 2020 U.S. presidential election proved drastically wrong — again, even after the pollsters had poured their knowledge into improving their predictive algorithms after the 2016 elections.
These two high-profile knowledge-transfer failures are instructive. They come at a time when companies are grappling with something marketing experts refer to as the “personalization vs. privacy paradox.”
This refers to the difficulty of crunching data to influence human behaviors, without crossing a very thin line into privacy invasion. Data gathering and number crunching have long been at the core of marketers’ efforts to shape and steer consumer demand.
And as we advance deeper into the digital age, these familiar advertising tactics are increasingly being adopted for other purposes, such as helping us navigate global pandemics and dispense political power. An example of the former unfolded in the UK when school officials set out to leverage algorithms to deal with having to cancel nationwide A-level and GCSE exams for high school students. This is the testing that dictates college education options for some 300,000 students in England, Wales and Northern Ireland.
To replace the exams, the officials conjured an algorithm designed to correlate teachers’ predicted grades to the historical pattern of grades issued at each school. However, English private schools have smaller classes and use grading systems that are incompatible with the public system models.
So the algorithm ended up categorically spitting out much lower grades for public school students, especially from schools in disadvantaged neighborhoods.
Exam grades for some 40% of English students came in significantly lower than the grades predicted by their teachers. This put in jeopardy the planned futures for more than 200,000 students, many of whom lost out on places at their chosen universities because of low exam grades issued by the algorithm. Meanwhile, private school students from affluent families received higher grades than their teachers predicted.
Several days of vociferous protests erupted. Students convened in London's Parliament Square and burned printouts of their algorithm-derived exam scores. Education officials were forced into retreat; they ultimately scrapped the algorithm and accepted the teachers’ estimated scores across the board. Score one for humans vs. machines.
Poorly informed algorithms also came into play in a high-profile way with respect to polling anticipating the outcome of the 2020 U.S. presidential election. Pollsters spent four years improving the algorithms that were so off the mark in the 2016 elections. Yet, somehow, they missed the mark again in 2020.
Pollsters anticipated that a record number of mail-in votes would take an extended period to tally; at least they got that part right. Even so, pollsters now admit they downplayed Donald Trump’s support and overplayed Joe Biden’s support. They did so even after refining their algorithms to better account for the education level of voters and other variables.
Still, predictions that Biden could win Florida and possibly pull an upset in Texas as well were dead wrong. While Biden won comfortably and is now president-elect, the race for the White House and the races for congressional seats were much tighter than the majority of polls predicted.
The pollsters had overlooked Trump’s support among Hispanic voters in Florida and Texas, especially expatriate Cubans and Venezuelans. And they again failed to get an accurate reading of not just the number of die-hard Trump supporters but also the leanings of newer converts attracted by his grievance-based politics.
The long-run implications of pollsters getting it wrong in the 2020 elections, despite access to leading-edge machine learning systems and algorithms, remain to be seen.
Every waking moment, each of us generates mountains of data as part of our encounters with endless internet-connected sensors, embedded everywhere we turn. We also leave a fresh trail of digital footprints when we open email, use search engines, consume streamed content, use collaboration tools and post on social media.
All of these numbers get hoovered up by machine learning and artificial intelligence systems, which run the algorithms that monitor our transactions and control the distribution of content that gets sent our way. Algorithms influence everything we do, from binge watching movies on Netflix to assimilating both facts — and falsehoods — flowing through news outlets and social media feeds.
Covid-19 and the US presidential elections gave us examples of what can go wrong when imperfect knowledge gets factored into high-profile algorithms. These scenarios are going to turn up more and more as we move forward. Employers are increasingly using algorithms to monitor and influence employee behavior, for instance. And marketers and political operatives are just getting started with profiling and influencing, using algorithms steeped in their biases.
As we move forward in the decade of the 20s, it will be vital to do as Plato advised and instill the appropriate measure of human knowledge into the algorithms. It’s not going to work to leave it up to the marketers and propagandists to design these algorithms without any meaningful input from individual consumers.
Transparency is needed; consumer control ought to be the default. In these vexing times, it’s crucial for the algorithms that dictate our daily lives to come a whole lot closer to reflecting core societal values, such as fairness and equality.