
There remain many unknowns about the capabilities of large language models (LLMs), but their limitations are beginning to reveal some interesting boundaries. More specifically, by accelerating or automating certain functions, these models gradually expose the areas where they are irrelevant. In the shadow of all those over-hyped stories about how LLMs are going to “change everything,” there remains an array of human interactions that are ‘untouched.’ By examining what’s both in and out of the purview of these models, this post considers how ‘untouchable’ practices might gradually garner more attention, as well as how their value may shift.

Let’s start with three broad use cases for LLMs that have drawn a lot of media attention: companionship, creativity, and productivity.

For the first of these, a whole crop of tools has emerged that are designed to serve as LLM-driven ‘companions.’ From conversations with historical figures (hellohistory.ai, character.ai) to virtual boyfriends/girlfriends (replika.ai, candy.ai) to remarkably personalized advisors (pi.ai), these models have been trained to mimic language patterns that convey familiarity to their human users. Of course, this comes with risk. Among the most newsworthy instances where these models fall short is at least one suicide linked to a user’s immersion in an LLM-driven ‘companion.’ This is clearly tragic and unacceptable. Still, over time it seems possible that the most common patterns of communication within close human relationships, along with boundaries for safety, could eventually be captured by these models to the point where they offer familiar and trusted interactions that trigger human responses resembling ‘companionship’ — if we want them.

Next, let’s consider models that generate content (images, audio, video, text, etc.), a category of use cases we might label ‘creativity.’ The generative capabilities of tools like Midjourney or DALL-E, which can produce images in a vast range of styles, are familiar to many by now. As these models are trained, their capabilities are becoming increasingly fine-grained and ‘realistic.’ Those following the industry will remember quite vividly how early models had trouble generating images of hands, or inadvertently added extra limbs to figures. But regardless of how much more ‘accomplished’ these models become, they are ultimately incapable of being truly original. They’re locked within the datasets on which they were trained. While they may be able to masterfully iterate on a theme at superhuman rates, their outputs are inevitably derivative.

Finally, another set of broad use cases for which these models are touted could be classified as ‘productivity,’ which includes LLM capabilities such as summarizing, reformatting, translating, classifying, and automating. This is where we’re starting to see increased attention in the workplace, including a great deal of curiosity among corporate leaders. It’s not difficult to imagine accelerated workplace productivity with these capability-enhancers at our fingertips. However, we also now know that LLMs are prone to hallucinate, that is, to generate responses that sound plausible but are factually incorrect. This is because the predictive modeling they employ prioritizes the most likely next piece of content based on the initial prompt and the data on which they were trained — and then iterates. Anyone who has played with these models and tried to ‘correct’ the inaccuracies they produce will quickly realize that all versions of the ‘reality’ they generate are treated as equally valid by the model, even when contradictory. You might get wildly different responses from the same prompt, or even within a string of prompts, yet all are presented as equally ‘factual.’
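To make that mechanism concrete, here is a deliberately toy sketch in Python of the autoregressive loop just described. It is not a real LLM or any vendor’s API; the hand-made probability table is invented purely for illustration. The model scores candidate continuations by likelihood, samples one, appends it, and repeats — factual truth never enters the loop.

```python
import random

# Toy illustration (not a real LLM): an autoregressive model scores
# candidate next tokens by likelihood given the text so far, samples one,
# appends it, and repeats. Truth is never consulted, only likelihood.

# Hand-made, purely hypothetical next-token probabilities.
NEXT_TOKEN_PROBS = {
    "The capital of": {"France": 0.5, "Atlantis": 0.5},
    "The capital of France": {"is": 1.0},
    "The capital of France is": {"Paris.": 0.7, "Lyon.": 0.3},
    "The capital of Atlantis": {"is": 1.0},
    "The capital of Atlantis is": {"Poseidonis.": 1.0},
}

def generate(prompt: str, max_steps: int = 4) -> str:
    text = prompt
    for _ in range(max_steps):
        probs = NEXT_TOKEN_PROBS.get(text)
        if not probs:
            break  # our toy table has no continuation; stop here
        tokens, weights = zip(*probs.items())
        # Sample in proportion to likelihood: the same prompt can yield
        # different, mutually contradictory continuations, each of which
        # the model treats as equally 'valid.'
        text += " " + random.choices(tokens, weights=weights)[0]
    return text

for _ in range(3):
    print(generate("The capital of"))
```

Run it a few times and it will sometimes assert “The capital of Atlantis is Poseidonis.” with exactly the same mechanical confidence as the Paris answer — which is the behavior the paragraph above describes.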

There are many people working on ‘correctives’ (and ‘alignment’) for model hallucination. The jury’s still out on whether they can completely solve this challenge, especially in instances where accuracy is critical. Still, with improvement, it’s not hard to imagine a future in which LLMs retrieve information, process it (e.g., summarization, translation, etc.), and (re)format it in ways that are reliable enough to make them commonplace for non-critical tasks. Use cases like learning, planning, and shopping come to mind.
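As a hedged sketch of what such a retrieve-process-reformat pipeline might look like, consider the minimal example below. The `search_corpus` and `call_llm` names are hypothetical placeholder stubs, not a real library’s API, and any production version would still need the verification layer the ‘correctives’ work above is pursuing.

```python
# A minimal, hypothetical retrieve -> process -> reformat pipeline.
# `search_corpus` and `call_llm` are placeholder stubs, not a real API;
# the point is the shape of the flow, not a working integration.

def search_corpus(query: str) -> list[str]:
    """Stub retrieval step: return source passages relevant to the query."""
    return ["(passage 1 about the query)", "(passage 2 about the query)"]

def call_llm(prompt: str) -> str:
    """Stub for a call to some LLM; a real client would be swapped in here."""
    return "(model output)"

def answer(query: str) -> str:
    passages = search_corpus(query)      # retrieve
    summary = call_llm(                  # process (summarize)
        "Summarize the following passages, citing each one:\n"
        + "\n".join(passages)
    )
    return call_llm(                     # (re)format
        f"Rewrite as three plain-language bullet points:\n{summary}"
    )

print(answer("What should I know before buying an e-bike?"))
```

Grounding the summarization step in retrieved passages, rather than the model’s own recall, is one common corrective of this kind; it narrows, but does not eliminate, the room for hallucination.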

Where does all this lead? The scale and speed at which LLMs can iterate means they can outstrip us when we need to classify, personalize, reformat, translate, summarize, converse, recommend, generate, or automate digital content. Yet, as LLMs and other machine learning models proliferate over time, their output will become increasingly common. While compute costs are high now, it will be interesting to see whether tech companies can continue to charge premiums for tools that are inherently designed to endlessly churn out more X and iterate on it at a faster pace — hardly a formula for market scarcity or price stability (unless we start to see demonstrably valuable specialization). Many industry observers have already described the AI ‘gold rush’ as a race toward mediocrity. They clearly recognize that iteration is not the same thing as innovation.

However, when we drill down on what these models CAN’T do, things get more interesting. The question then becomes, ‘where ISN’T the spotlight shining?’ Here, I’m drawn toward recognizing that ‘analog anything,’ by virtue of its inability to scale or iterate at the pace of LLMs, seems destined to increase in value. So, instead of ‘replacing’ traditional arts, the capabilities of these models may very well drive up the value of things like original paintings or live performances. This also extends beyond the cluster of ‘creativity’ use cases. Let’s go back to the companionship category. In a future with increasing availability of virtual companionship, ‘real’ companions (especially the exchange of stimulating original thoughts) will only become more valuable — cue the renaissance of salons. Even in use cases focusing on productivity, the risk of hallucination will place increasing value on human interpretation when the stakes are high.

I would argue that this is good news for ethnography. After the novelty of these models wears off, and the dust settles from their disruption, the limitations listed above (and likely others) will become increasingly apparent. In the process, understanding and interpreting the consequences of human-to-human interactions and characteristics like intent, morality, motivation, emotion, inspiration, frustration, implication, etc. will become increasingly valuable. While advances in tech may shift this (I’m looking at you, metaverse), they may also contribute even further to increasing the value of non-digitally-mediated human-to-human interactions. Surprise! — these are exactly the realms in which ethnographers thrive.

Of course, ethnographers may find LLMs useful for things like summarization, research planning, or pattern recognition within a data set, but ultimately our focus is on human experiences and interactions, and our HUMAN interpretations of them (original insight). These attributes should increase in value precisely because their unique and non-predictive qualities lie outside the purview of LLMs. None of this precludes the value ethnographers can extend to interpretations of human-AI interactions, including the ways less predictable human characteristics intersect with the ‘logic’ of LLMs, but I’m focusing more specifically on where we might see unexpected increases in value.

What types of organizations are likely to benefit most from ethnographers’ unique offerings in this shifting landscape? If digitally-driven experiences and products churned out by LLMs remain trapped in iterative cycles of mediocrity, demand for live, original, interactive experiences with other humans may increase. These offerings may include LLM-driven components that shape aspects of the experience (crowd management, interest-matching, adaptive pricing, etc.), but the draw itself would remain human-to-human interaction. In contrast to passive experiences, we could witness significant growth in amusement-park-like offerings, where mutual experience and human interaction are privileged, fostered, and facilitated. (For a somewhat more dystopian view, see Daniel Miessler’s take on how AI might evolve, or watch the clip from A.I., below.)

The skills ethnographers could bring to these settings are those we’ve been offering for more than 100 years. I’ll pull from Ethnographic Thinking here, as a means of summarizing some of those methods and their continued value:

Many ethnographers have spent countless hours in the homes, workplaces, and communities of people who are initially strangers to them. Among all the stimuli they encounter in these settings, there is no prescribed set of observations that are always key to forming an understanding of a culture. Instead, ethnographers are continually on the lookout for cues that will help them paint a fuller picture of the culture they’re exploring. While observing, the ethnographer’s aim is to look beyond the obvious and discover the key components that collectively make up an “ecosystem” of observations.

These ecosystems are always complex and are made up of many different cues. To demonstrate the wide range and level of their complexity, here’s a sampling of some of the most common observations ethnographers consider: body language, interpersonal interactions, behavioral triggers, contradictions, unspoken priorities, normalized practices, sequences of events, affinities, attachments, repellants, workarounds, social transgressions, implicit hierarchies, priorities, neglected people/places/things, honored people/places/things, displays of comfort (or discomfort), unconscious habits and practices, and interactions with material goods.

Each of these finds its way into the ethnographic mind as ethnographers examine the sights, sounds, scents, touches, or tastes of the culture that surrounds them. A core part of this examination of cues is the ability to continually sort and prioritize levels of relevance in situ. This skill is sometimes described as context-awareness, but it also includes visual literacy, layered listening, and the ability to identify and home in on relevant details in order to explore them in more depth.

Perhaps someday LLM-driven android ethnographers will take on these tasks — infusing themselves into the very last corners of non-digitally-mediated human experiences, and ushering in a whole new set of moral and existential challenges. Who knows what the human response might be.

Photo: 2018, Soo-Young Chin and ‘friend’ in the Changi airport, Singapore