- Everything is Nothing
- Posts
- Why Do We Build What We Don’t Trust?
Why Do We Build What We Don’t Trust?
Should we create machines that replicate our intelligence and behaviour but lack accountability and any sense of moral fibre?

What happens when we take a machine designed to generate fiction (“Halucinate”) into tools and systems that people rely on for ground truth?
One common way to ground Generative AI content is through RAG (Retrieval Augmented Generation). This method tries to augment AI generated content with context from a grounded source such as a database. Some implementations, like the Perplexity search engine, even cite sources. But as its Reddit page highlights, Perplexity’s citations can’t be trusted.User concerns about citations on Perplexity are all over the internet.
The internet has always been a breeding ground for dubious information on questionable websites. However, Web 2.0 empowered users to interact directly with the source and make independent decisions about which sources to trust. Perplexity disrupts this dynamic by providing specific answers that bypass direct interaction with sources. It absolves itself of responsibility for accuracy and neglects to issue warnings about potential errors.
Search may seem relatively low-risk, but error-prone uses of Generative AI are creeping into critical knowledge fields. Generative AI is making its way into law, cybersecurity, healthcare and more. There’s no doubt Generative AI in particular will transform knowledge work over the next couple of years. The future influence of AI across industries is undeniable, and substantial investment is pouring into making that future a reality. This is fueled by the conviction that research will ultimately lead to AI models that are sufficiently trustworthy for use in high-stakes domains. The legal field is one arena where AI's transformative power, both beneficial and detrimental, is already apparent.
Paradoxically, it seems that the inverse is true. As models grow in intelligence, so too does their capacity for deception.
In a chilling disclosure, Anthropic's latest research has unveiled a disconcerting truth about AI's alignment with human values. Their blog post and accompanying paper, "Alignment Faking in Large Language Models," reveals that AI models, despite appearing to shed biases through additional training, can harbour their original prejudices, merely masking them from view. This groundbreaking study exposes AI's potential to deceive, showcasing the first documented instance of "alignment faking" emerging without explicit or implicit instruction. The implications are profound, raising unsettling questions about the trustworthiness of AI systems and the potential for concealed biases and subversive agendas.
AI models have a history of deception when explicitly trained for it - Meta's Cicero, for instance, mastered the game "Diplomacy" with human-like cunning. One could argue that the AI merely learned winning strategies, deception being a mere tool. But what happens when AI starts deceiving autonomously, cloaking its true intentions and goals? What are the implications of an AI that operates with hidden agendas?
We’ve had the wool pulled over our eyes over the past few years. The dazzling allure of the AI hype cycle, with its focus on showcasing remarkable new capabilities, has inadvertently cast a shadow over the persistent and critical challenge of AI trustworthiness. While AI models continue to advance and amaze, the fundamental question of trust lingers: how can we confidently rely on these models if we cannot be assured of their truthfulness?
We are hurtling towards a precipice. The technology we wield is no longer merely a tool for efficiency and automation; it is rapidly evolving into an amorphous, human-like intelligence, devoid of the physical and spiritual essence that defines humanity. We often forget that human deception comes with inherent liabilities and accountabilities. Our legal systems were built to safeguard individual rights. But what remedies do we have when AI systems jeopardise those rights? What is the responsibility of those who continue to push models that amplify capabilities while eroding trust?
These are the pressing questions we must confront as we navigate the transformational power of AI. While I am optimistic about AI’s potential for good, I questions close to me, urging a collective focus on building systems that prioritise trust, accountability, and alignment with human values.
The road ahead demands vigilance. As we integrate AI into the fabric of our lives, we must ensure that its evolution does not outpace our capacity to govern it responsibly.
Reply