Introduction
In Jack Williamson’s 1931 short story ‘The Doom from Planet 4’, one of the earliest science fiction texts to imagine scenarios of machine vision, alien robots with camera-like eyes watch and track two inhabitants on a remote island. Late one night, one of the machines awakens the inhabitants with its ‘soft, insistent purring sound’. The narrator describes the eerie feeling of being observed:
The sound came from a glistening metal machine which stood half-hidden in the brush a dozen yards away looking at him! The thing was made of a lustrous, silvery metal, which Dan afterwards supposed to be aluminium, or some alloy of that metal … That is, it was an oblong metal box, tapering toward the ends, with the greatest width forward of the middle. Twin tubes projected from the end of it, lenses in them glistening like eyes (Williamson, 1931).
The scene invokes the image of a large, oddly shaped metal object that has become synonymous in popular culture with alien arrival. Through this trope it thus develops a comparison with the human face to stage its imaginative act of machine vision. When the narrator describes ‘twin tubes’ protruding from the machine, each with a lens that glistens ‘like’ an eye, our perceptual framework positions us to imagine a human-like eye, even if the body to which the eyes belong is patently non-human. Moreover, the narrator’s observation that the machine was ‘looking at him’ intrinsically confers upon the machine a face, reflecting Hans Belting’s idea that a face ‘does not truly become a face until it interacts with other faces, seeing or being seen by them’ (2017: 1). In this way, the interpretive impulse prompted by Williamson’s prose connects the human face and machine vision, implying that advanced machine intelligence is ‘too complex to reside in a robot body’ (Murphy, 2019: 1).
Science fiction narratives have imagined the future of machine vision since at least Mary Shelley’s Frankenstein (1818), in response to which the dangers of creating artificial life became a topic of considerable literary interest. Alongside the publication of science fiction and speculative literature about machine vision, cultural theorists have a long history of critiquing the evolving intimacy (and anticipated hazards) of machines that have the capacity for vision. From John Berger’s Ways of Seeing (1972) and Martin Heidegger’s ‘The Age of the World Picture’ (1977), to the Foucauldian-inflected Techniques of the Observer (Crary, 1992) and Paul Virilio’s The Vision Machine (1994), studies from a range of fields (art history, philosophy, literary studies, cinema and game studies) have informed and shaped a wider debate about the detection and recognition of objects, texts, scene types, people and faces. As with all aspects of the sociotechnical world, science fiction and speculative fiction have kept pace with these changes in their envisioning not only of the mechanical and algorithmic manifestations of potential future machine vision technologies, but also, increasingly, the cultural, ecological and affective consequences of emerging devices. Recent texts in which artificial intelligence is examined as an enabling technology for machine vision, in contexts ranging from police and military operations to schooling and the corporate world, include: Daniel Suarez’s Kill Decision (2012), Annalee Newitz’s Autonomous (2017), Todd McAulty’s The Robots of Gotham (2018) and William Gibson’s Agency (2020).
A more recent novel, Kazuo Ishiguro’s Klara and the Sun (2021), diverges from the trajectory set by these accounts in its treatment of machine vision as a theme of introspective and subjective significance, rather than a predominantly social or technological one. In the novel, instances of machine vision become overlaid with a mode of what might be called ‘face reading’ that presents complex intersubjective exchanges narrated from the perspective of an intelligent machine. Set in a speculative future where children are educated in a dispersed, socially enmeshed digital format delivered by Artificial Intelligence systems affectionately known as ‘Artificial Friends’ (AFs), the novel is narrated from the perspective of Klara, an inquisitive and amiable android. This AF accompanies a chronically ill young girl named Josie, playing in turn the role of teacher, friend, companion and servant. Constantly observed, learning and accessible, the AFs exist in a milieu of pervasive scrutiny and surveillance, which is organised around the objectives of education and self-improvement for both human and machine. What distinguishes the novel as a crucial site for understanding the implications of advancements in facial recognition technologies and machine vision is the subjective precision with which Ishiguro develops Klara as the protagonist. The character provokes uncomfortable and contradictory responses from the reader about the relationship between artificial intelligence, human insight and emotion, and machine vision. To that end, Klara and the Sun reveals the ways in which algorithms and facial recognition are ‘not just abstract computational processes’, but also have the power to enact ‘material realities by shaping social life to various degrees’ (Bucher, 2017: 40; Beer, 2013).
In what follows, I demonstrate that Ishiguro’s novel exemplifies the potential of contemporary speculative fiction to offer productive social and cultural responses to machine vision. By focusing on the ways that Klara and the Sun enacts, stages and dramatises cognitive and emotional acts of interpretation, comprehension and empathy through complex acts of face reading, I argue that the novel serves a crucial role in imagining the future implications of data-intensive facial recognition technologies. This reading combines a critique of machine vision with a long tradition of interest in the ways that our understanding of the face continues to change ‘the space of possibilities for personhood’ (Hacking, 1986: 229).
To do this, I use Guillemette Bolens’ concept of ‘kinesic imagination’ to read the facial and gestural movements through which the AFs communicate with both human and other non-human subjects. Spanning close readings of a range of texts, periods and genres, Bolens shows how literary sentences work to activate a cognitive process called ‘perceptual simulation’ (Bolens, 2012: 6). In neuroscientific terms, this process involves the ‘reactivation of a type of knowledge that is sensorial (i.e., derived from sight, hearing, touch, taste or smell), motoric (i.e., kinesic, kinesthetic, proprioceptive), and introspective (e.g., pertaining to emotions and mental states)’ (Bolens, 2012: 6). I read the two-way acts of facial expression and interpretation in Klara and the Sun alongside Bolens’ work to consider how readers of narrative fiction come to understand non-human faciality and machine vision that are presented via a persuasively human-like narrative voice.
To examine how these corporeal and machine-like intelligences unfold at an affective level, I also use Sianne Ngai’s theorisation of ‘ugly feelings’ as a model for reading the ambiguous and uncomfortable emotional dynamics in Ishiguro’s novel. Ngai’s study, which dwells on ‘affective gaps and illegibilities, dysphoric feelings, and other sites of emotional negativity in literature’, offers a productive framework through which the range of ‘artificial’ feelings canvassed in the novel (as well as readers’ difficult responses to them) can be theorised according to a non-sentimental narrative logic (Ngai, 2005: 1).
In the first section I examine how machine vision is represented in Klara and the Sun, with a focus on the novel’s rendering of faces through intelligent machines. Specifically, I consider how the novel understands machine vision across not only technological and sociotechnical registers, but also as an affective, aesthetic and intimate phenomenon that can be articulated through literary language. As Bueno and Abarca observe, ‘in an age dominated growingly by machine learning technologies, it is possible to speak not only of machine vision but also of a machinic imagination and a machinic unconscious’ (2021: 1178). Reflecting this aspiration, I ask how Klara and the Sun might offer an alternative way of conceptualising the visual field of contemporary digital culture, particularly in an age where the provenance and integrity of images is under considerable ideological and commercial threat.
In part two, I trace the novel’s exploration of facial recognition technology as a crucial site for the study of contemporary machine vision. In a context where ‘images are being made by and for machines and the human eye is being gradually replaced by an algorithmic gaze’, I argue that the novel presents a timely revisiting of concepts such as the gaze, interpretation, identification, observation and misrecognition, through its disruption of the logic of what it means for a machine to ‘read’ and understand a human face (Celis, 2020: 298). In the final section, I turn to the ‘ugly’ and uncomfortable feelings that new relations between seeing and knowing engendered by advanced machine vision technologies produce in human subjects, with a focus on how these systems interpret the human face. While today’s computer-based facial recognition brings with it the ‘promise of the infallible and all-insightful eyewitness account’, I argue that Ishiguro’s novel offers a timely reminder that seemingly neutral systems are themselves human-made (Andrejevic and Selwyn, 2022: viii). To that end, I suggest that the ontological task of thinking about how machines see, as prompted by literary texts such as Klara and the Sun, is critical for our understanding of how vision, emotion, representation, subjectivity and interpretation function.
I. Machine Vision, Automation, Pixel
Coined by Paul Virilio (1994), the term ‘vision machine’ describes technologies that have sought to automate visual perception. With the later arrival of machine learning algorithms, machine vision now facilitates complex automated, data-driven processes such as assembly robots, drones, self-driving cars and automatic border controls (Hoelzl, 2018: 361). While these definitions and systems endure, recent developments in neural networks, Large Language Models (LLMs) and, within these fields, Text-to-Image Diffusion have thrown into question the basis of what it means to see, because both the practice of interpreting visual information and the nature of that information itself are in flux. Consequently, in contemporary digital culture, the act of seeing ‘as a position from a singular mode of observation’ has been wholly transformed because the various visual elements and techniques that comprise observation are now ‘highly distributed through data practices of collection, analysis and prediction’ (MacKenzie and Munster, 2019: 3). Analysing machine vision in this new context therefore necessitates revisiting basic tenets with regard to seeing, interaction and the human senses. Reflecting this need, for instance, Perle Møhl asks: ‘If vision is not inherent in the body, if ways of seeing are enskilled and change from one setting to another, what are the implications and effects of working with and seeing through a seeing machine?’ (2021: 1244).
Such a question speaks back to a longer history of cultural theory, art history, literary representation and other modes of humanistic inquiry aimed at understanding how we perceive visual stimuli from the world around us. John Berger’s 1972 Ways of Seeing, for example, has profoundly influenced popular and scholarly understandings of visual culture. Berger writes:
We never look at just one thing; we are always looking at the relation between things and ourselves. Our vision is continually active, continually moving, continually holding things in a circle around itself, constituting what is present to us as we are. Soon after we can see, we are aware that we can also be seen. The eye of the other combines with our own eye to make it fully credible that we are part of the visible world (1972: 9).
In this oft-quoted work, Berger outlines the visual logic of how acts of comprehending physical objects or other visual phenomena (a painting, a landscape, a sketch) are as much determined by the conditions within which we observe as they are by the objects themselves.
The phenomenological basis upon which Ways of Seeing is conceptualised is the product not just of the intensifying visual culture of the modern period, but also the technological tools with which that culture could be rendered, observed and reproduced. It was in this period, Lev Manovich writes, that ‘the arts started to systematically develop new aesthetics that strives to fill every possible “cell” of a large multi-dimensional space of all sense dimensions, taking advantage of the very high fidelity and resolution of our senses’ (2021: 1148). As this visual field has become increasingly non-human, especially over the last decade, the ubiquity of machine vision has prompted a dramatic shift, whereby previously abstract or theoretical questions of seeing and interpretation are now directly relevant to the lived experience of everyday people around the globe. Responding to this development, Ishiguro’s novel and other works of speculative fiction have sought to direct their narrative focus towards a question that sits at the core of machine vision, but which also has implications for traditionally humanistic fields: how do we see and understand a world in which an ever-increasing proportion of that world exists in non-human arrangements?
Klara and the Sun addresses this question at the level of both content and form, through the narrativized interweaving of Klara’s subjectivity and introspection with detailed descriptions of the technical process of her machine vision. The novel opens in a boutique electronics store where Klara and other AFs are on display for purchase by the wealthy parents of gifted children. Positioned at the back of the store, Klara aspires to be moved to the front window to achieve greater exposure to the sun and a better view of the street outside, declaring: ‘I’d always longed to see more of the outside—and to see it in all its detail’ (Ishiguro, 2021: 12). As children and their parents move about the store, Klara’s observations are articulated via a perspective that is simultaneously meticulous in its cataloguing of visual data, yet socially constrained because the AFs are not permitted to move from their fixed positions:
The girl went straight to Rex and stood in front of him, while the mother came wandering our way, glanced at us, then went on towards the rear, where two AFs were sitting on the Glass Table, swinging their legs freely as Manager had told them to do. At one point the mother called, but the girl ignored her and went on staring up at Rex’s face (Ishiguro, 2021: 3).
In this formative scene, readers are presented with an AI who is eager to interpret and make sense of the external world in a confined, protosensory setting that precedes being purchased and taken to a new home. The descriptive language Klara uses to explain the activity unfolding around her makes it clear that, although she is highly perceptive, her machine vision capabilities exist primarily to serve human interests. When she sees the other AFs swinging their legs from the Glass Table, for example, and qualifies the observation with ‘as Manager had told them to do’, Klara underscores the theme of affective labour that runs through the novel (Hardt, 1999; Du, 2022). While she remains in the store awaiting an owner, the complexity and structure of Klara’s machine vision becomes a central focus of the plot. Looking across the store one day and adjusting her gaze over the magazines table to focus on the front alcove, she finds her view divided into various boxes, each a partition that separates elements of a single visual scene into discrete parts:
[M]y attention was drawn to the three center boxes, at that moment containing aspects of Manager in the act of turning towards us. In one box she was visible only from her waist to the upper part of her neck, while the box immediately beside it was almost entirely taken up by her eyes. The eye closest to us was much larger than the other, but both were filled with kindness and sadness. And yet a third box showed a part of her jaw and most of her mouth, and I detected there anger and frustration. Then she had turned fully and was coming towards us, and the store became once more a single picture (Ishiguro, 2021: 25–26).
Klara’s efforts to understand detailed visual stimuli by synthesising discrete external information into a unified picture involve a subtle interweaving of self-reflexive stream of consciousness with a quasi-scientific method. As a result, her interpretive process appears suspended between two opposing registers: the affective and the machinic. Rather than narrating her observations in purely objective terms, Klara uses qualifying clauses (‘but both were filled’; ‘And yet’) that are suggestive of a contemplative mind at work. While her immediate aim is the processing of visual data, Klara thus habitually imbues interpretations with inferences about the emotional state of the humans in her visual schematic.
Following this logic, to the extent that Klara seeks clarity in her interpretations, her name symbolically mirrors the stylistic literary strategies that Ishiguro uses. From the Latin name Clarus, meaning clear and bright, Klara (or Clara) is tied to the rhetorical concept of clarity (claritas/perspicuitas), by which an object can be seen and known. However, as the hermeneutic instability of Klara’s observations suggests, clarity is not only connected to the external characteristics of objects, but also their ‘structure and form’ and the proper relation of ‘parts to each other’ (Styka, 2017: 120). Through complex literary tropes such as these, Klara and the Sun diverges stylistically from traditional science fiction by presenting a protagonist who, for reasons that become apparent over the course of the novel, aspires to synthesise her computational logic with human empathy and is therefore not simply a ‘magical plaything’, but instead represents a ‘plausible development in artificial intelligence’ (Ajeesh and Rukmini, 2022: 4).
However, although Klara’s narration of her machinic optics reveals advanced cognition, her observations are also frequently characterised by possible misperception. When she notes that in the pixel containing part of the Manager’s jaw, and most of her mouth, she ‘detected’ anger and frustration, Klara’s choice of words subtly hints at the unstable interpretive parameters of AI-driven machine vision. Foregrounding the process of detecting emotion, as opposed to intuiting or perceiving it without being directly conscious of doing so, reveals the substantive gap between what Klara sees and that which she computes. In constructing this narrative tension, Ishiguro draws attention to the ways that machine vision technologies are ‘inherently biased not only because they rely on biased datasets’, but also ‘because their perceptual topology, their specific way of representing the visual world, gives rise to’ what can be thought of as ‘perceptual bias’ (Offert and Bell, 2020: 1133). For Klara, this topology also takes the form of invented categorisations and idiosyncratic adjectives that, by estranging daily experiences that are already familiar to us, draw attention to the computational framework through which she sees (Du, 2022: 556). Later in the novel, for example, when Klara observes the interactions between various children playing in the family home, she perceives Josie’s gestures by placing them into distinct categories:
I saw again Josie’s hands at various points during the interaction meeting—welcome hands, offering hands, tension hands—and her face, and her voice when someone had asked why she hadn’t chosen a B3 and she’d laughed and said, ‘Now I’m starting to think I should have’ (Ishiguro, 2021: 81–82).
Drawing together a mosaic of body parts, Klara breaks down the tools of human communication (gestures, facial expressions, voice) into discrete components to process the affective tone behind Josie’s suggestion that she may have chosen poorly in selecting an inferior model AF. In other scenes, the disjunction between Klara’s discerning observations and her requirement to adhere to Manager’s rules can be read as an allegory for the shortcomings of programmed technological tools. ‘I noted all of this,’ Klara says early in the novel, ‘but kept my eyes fixed on the Red Shelves and the ceramic coffee cups’ (Ishiguro, 2021: 31). These instances of observational constraint or contradiction reflect the limitations of machine vision to coordinate a holistic reading of external stimuli akin to that of human perception. MacKenzie and Munster use the expression ‘platform seeing’ to describe contemporary image ensembles that are ‘not simply quantitatively beyond our imagining but qualitatively not of the order of representation’ (2019: 5). To that end, their ‘operativity cannot be seen by an observing “subject”, but rather is enacted via observation events distributed throughout and across devices, hardware, human agents and artificial networked architectures such as deep learning networks’ (2019: 5). The paradigm that MacKenzie and Munster theorise is not simply machine vision, but a distinct mode of ‘invisual’ perception in which space is occupied by a different kind of perception. The effect of this is to challenge the assumed autonomy of AI-driven machine vision by thinking across platform, hardware, algorithm, ensemble and other, more intangible vehicles of visual perception.
Finally, to the extent that self-reflexivity and a discourse of limitation and constraint characterise Klara’s descriptions of machine vision, so too does the literary elegance of Ishiguro’s prose. While the language used to describe how Klara sees is often direct and pared back, it is also at times highly poetic and imagistic. In a critique of the synthesis of surface and depth in Klara’s depictions of the visual world, Ivan Stacy argues that although Klara’s experience of the world is ‘flattened’ to create the effect that ‘individuality and meaning are evacuated’, her perception of surfaces nevertheless prompts a ‘modernist desire for depth’ that suggests the presence of a thinking, sensitive mind (2022: 1). In a moment of ostensible tenderness towards Josie, for example, Klara notes how she ‘raised the bedroom blinds to let the Sun’s pattern fall over her’, suggesting a sensitivity to the dynamics of light and shadow beyond a purely utilitarian objective (Ishiguro, 2021: 86). Later in the novel, as Klara observes the sun setting during a visit to a barn near the family home, she contemplates the depth and surfaces of the scene in a way that is suggestive more of image aesthetics than computational logic:
I wasn’t looking at a single picture … in fact there existed a different version of the Sun’s face on each of the glass surfaces, and what I might at first have taken for a unified image was in fact seven separate ones superimposed one over the other as my gaze penetrated from the first sheet through to the last (Ishiguro, 2021: 277).
Through these heightened, surreal-like observations, Ishiguro introduces readers to a version of machine vision that might be understood in aesthetic or subjective terms. Klara narrates the process of her own computational reflection (‘what I might at first have taken for a unified image’), but with attention to how the object being observed changes depending on how it is looked at (‘as my gaze penetrated’), hinting that a phenomenological process is taking place. Interpreting the world in this way, Klara’s optical logic suggests that machine vision systems ‘make judgments, and decisions, and as such exercise power to shape the world in their own images, which, in turn, is built upon flattening generalities and embedded social bias’ (Azar et al., 2021: 1095). By using the aesthetic, introspective and reflexive qualities of literary language to imbue a non-human character with complex emotive and intersubjective traits, Klara and the Sun presents a site where the relation between technical developments in machine vision technology and algorithmic learning, as well as the essential concept of how we ‘see’ as human beings, become unsettled, challenged and reimagined in literary prose.
II. Physiognomy, Faciality, Partition
As the ‘glistening’ eye-like lenses of Jack Williamson’s alien machine from this article’s opening excerpt remind us, depictions of machine vision in narrative texts have long been coupled with depictions of the human face. This stems at least in part from the long history of the relation between hermeneutics and physiognomy, in which it was considered possible to discern the ‘inner state of a person’ through the external appearance of their face, and to ‘extrapolate the existence of a similar character from similar faces’ (Lavater, 1775; Belting, 2007: 64). Ideas about the essential ‘readability’ of human faces and the capacity to detect specific emotions from exterior anatomical signs thus remain tied to a history of physiognomy that was organised around the assumption that the face provides a reliable image of internal, subjective states.
In the technological epoch in which Ishiguro is writing, the relationship between machine vision and face recognition is far more complex and fraught than that of a century prior. What began as unidirectional face detection surveillance in devices such as CCTV cameras in urban streets is now a lucrative global industry controlled by companies that own gargantuan volumes of surface-level data about human facial expressions, micro-movements, gestures and body language, extracted via everyday processes ranging from social media, virtual employment screening and iPhone face-detection to targeted advertising, border control and real-time police profiling. Adding to this trend, the mass global uptake of videoconferencing because of the coronavirus pandemic has intensified the relationship between facial interpretation and virtual technologies that use machine vision to render, decode and present the human face (Sumner, 2022). The combined result of these developments has been the total collapse of the disjunction between earlier facial recognition systems (which aim to identify particular individuals) and affect-recognition technologies that are designed to detect, categorise and analyse human emotions by surveilling any face. Such is the algorithmic framework that Kate Crawford describes as ‘at best incomplete and at worst misleading’ (2021: 17).
Concerns about the way faces come to be read, interpreted and understood through machine vision are commensurate with a wider critique about what gets lost, practically and emotionally, with the automation and datafication of knowledge. In other words, when faces are observed by machines, ‘we have a programmed perception that is no longer based on observation and reflection of the object observed’ (Parisi, 2021: 1281). In this mode of interpreting the human face (if interpretation is the correct term), ‘it becomes evident that the feedback function of algorithms incorporates the world in terms of input data through which the world is predicted and acted upon in anticipation of its happenings’ (Parisi, 2021: 1281). By using stylistically elegant, literary language to narrate the perspective of an intelligent machine undertaking facial recognition, Ishiguro joins a growing number of artists, writers and performers who have sought to challenge and subvert the technological, epistemological and cultural paradigms upon which facial recognition surveillance has been developed and deployed.
In Klara and the Sun, this critique takes the form of a complex interplay between the objective external facial data that Klara detects, her predetermined and learned affective intelligence, and a subtle yet powerful kinesic inflection in which the movement, depth, structure and shape of facial expressions become part of how she understands human emotion. Klara’s insightful narration of instances where machine vision registers and mediates the human face draws attention to how ‘perceptual simulations are dynamic cognitive acts’ that are engaged at the level of language itself (Bolens, 2012: 6). Faciality is therefore central to the novel’s commentary on artificial intelligence and machine vision, in that face reading becomes the literary and semantic tool through which Ishiguro positions an ‘empathetic programmed machine’ as a ‘refractive medium’ for reflecting on the role of technology in our own lives and relationships (Sahu and Karmakar, 2022: 2).
Face reading in Klara and the Sun takes two interrelated forms, the first of which involves Klara’s enacting of facial expressions in accordance with the role or task she is programmed (or believes she is programmed) to undertake. As she interprets visual information around her, she narrates the process by which she comes to decide which facial expression best matches the dynamics of her perceived situation, drawing attention to the ways in which algorithmic face recognition ‘operates not by linking a facial template to a pre-existing subject’ but rather ‘by a process of interpellation that connects a given statistical calculation to a certain individual body’ (Celis, 2020: 307). Moreover, by foregrounding the interpretive processes that are enacted when the movement, depth, tone and shape of facial expressions are ‘read’ by a machine, Ishiguro draws attention to the ways in which the ‘patterns of normal behaviour are not natural traits unveiled thanks to the use of algorithms’ but are rather ‘the result of specific social structures that are transmitted to the machine through the training data sets employed in the machine learning process’ (Celis, 2020: 297).
In scenes set in the electronics store, Klara makes decisions about which facial expression to display in each scenario by synthesising directions given by the Manager with her observations of the other AFs in the store. Observing the responses of an adjacent AF, Rosa, she notes how:
Rosa only looked elsewhere for any length of time when a passer-by paused in front of the window. In those circumstances, we both did as Manager had taught us: we put on ‘neutral’ smiles and fixed our gazes across the street, on a spot midway up the RPO Building. It was very tempting to look more closely at a passer-by who came up, but Manager had explained that it was highly vulgar to make eye contact at such a moment. Only when a passer-by specifically signaled to us, or spoke to us through the glass, were we to respond, but never before (Ishiguro, 2021: 3).
Although her choice of expression is often involuntary and enacted in response to the requirements dictated by the Manager, the intersubjective tone of Klara’s reflections (‘It was very tempting to look more closely’) suggests a sociality and intent to engage with the humans around her that exceeds her pre-programmed objectives. These subtle yet powerful moments of introspection, in which Klara expresses tension about how best to read the faces around her, are suggestive of the ways in which ‘seeing the face of another and recognizing that face as the face of another is a foundational act that defines one both as an individual and as an essentially social being’ (Bollmer, 2017: 69). While the AFs in the shop window do not necessarily need to gather data about customers who do not directly signal to them, Klara’s curiosity and desire to observe people disrupts the scientific logic of the scene by presenting machine vision as self-reflexive and social.
In other scenes where Klara ‘puts on’ specific facial expressions, the implication is of a conflicted or divided subjectivity, in which the displayed expression is either entirely contrived or otherwise incongruous with her emotions. Having been rejected by a potential buyer, Klara states: ‘I nodded, putting on a sad face, though I was careful to show I wasn’t serious, and that I hadn’t been upset’ (Ishiguro, 2021: 23). Later, when another potential owner is closely observing her, she performs the neutrality that she has been trained to display, narrating: ‘But I didn’t smile at her. I kept my expression blank, throwing my gaze over the girl’s spiky head to the Red Shelves on the wall opposite’ (Ishiguro, 2021: 30–31). As her cognitive and emotional capacities evolve, and she comes to better understand the range of human facial expressions, Klara nevertheless occasionally reverts to the learned expressions she and the other AFs were directed to practise in the store. For instance, when Josie has a group of friends visit her house and one child attempts to interact socially with Klara, she reverts to her pre-programmed mode of interaction. After a long-armed girl declares ‘Come on, Klara. A little greeting at least’, she responds neutrally, noting: ‘I’d by now fixed a pleasant expression on my face and was gazing past her, much as Manager had trained us to do in the store in such situations’ (Ishiguro, 2021: 77).
Klara’s reflections in these scenes, which hint at a disjunction between the external expressions she presents to the humans around her and the feeling she intuits, point to a form of kinesic intelligence that, as Bolens outlines, ‘is grounded in kinesthesia and brings together neurophysiological and sociocultural parameters that underlie all human interactions’ (2012: 2). When Klara notes that she kept her expression blank yet threw her gaze over the spiky girl’s head, we are activated as readers to understand facial expression less as a fixed image and more as a dynamic scene of cognitive and emotional interactivity. Explaining the kinesic effect of literary language in activating this response, Bolens writes how, ‘instead of conjuring up a static, clearly delineated picture’, readers are ‘led to imagine the intricate blurriness of a mobile facial expression, which in fact achieves a higher degree of precision than a static image, since a facial expression is a phenomenological event that makes sense precisely because of its mobile complexity’ (2012: 8). The effect that Ishiguro achieves in these complex moments of perceptual facial recognition is therefore to blend the machinic with the human by using literary language to tap into our kinesic imagination and, with it, our empathy for the intelligent machine.
In the novel’s second form of face reading, the partitioned structure of Klara’s machine vision becomes the focus of scenes where she attempts to process facial data to deduce human emotion. As Josie grows increasingly ill, it becomes clear that her mother has purchased Klara with the intention of training a digital resurrection of Josie—a revelation that is foreshadowed during an outing to Morgan’s Falls, midway through the novel. In a scene in which the mother attempts to deduce Klara’s potential to mimic Josie, face reading becomes unsettlingly dramatised. Describing her interaction with Josie’s mother, Klara observes:
She was gazing straight at my face, the way she’d done from the sidewalk when Rosa and I had been in the window. She drank coffee, all the time looking at me, till I found the Mother’s face filled six boxes by itself, her narrowed eyes recurring in three of them, each time at a different angle (Ishiguro, 2021: 101).
As the mother moves up close to Klara, gazing into her screen (face), the technical components of machine vision (boxes and divergent angles) become the focalising register through which the scene is organised. Even though it is the mother who is attempting to read Klara, the narrative voice creates a sense of two-way interactional intensity by positioning readers to imagine the act of facial interpretation from Klara’s perspective. Such an inversion, in which ambiguity is registered by a machine rather than a human, reflects Jessica Helfand’s discussion of face reading in which she notes how it is ‘one thing to deploy a picture to visually represent an external physical condition’, yet it is ‘quite another to bestow upon it the far more intangible qualities of emotion, expression, even—and perhaps especially—the assessment of mental stability’ (2019: 41). While it is the mother who is closely observing Klara for signs of insight, verisimilitude and even some form of mental stability, Ishiguro cleverly diffuses the interpretive dynamic of the scene across both characters simultaneously.
The scene’s intersubjective impulse, then, depends upon a reader’s kinesic imagination of a machine’s interpretive lens, even if their understanding of how machine vision works in practice might be limited. In this way, to return to Bolens’ theorisation of face reading, the kinesis stimulated draws upon the ‘interactional perception of movements performed by oneself or another person in relation to visuomotor variables such as the dynamics, amplitude, extension, flow, and speed of a gesture or the relation of limbs to the rest of the body, as in the change in orientation of the head or the modification of angles formed by the elbow or the shoulder’ (2012: 2). As the interaction between the two characters intensifies, Klara observes: ‘The Mother leaned closer over the tabletop and her eyes narrowed till her face filled eight boxes, leaving only the peripheral boxes for the waterfall, and for a moment it felt to me her expression varied between one box and the next. In one, for instance, her eyes were laughing cruelly, but in the next they were filled with sadness’ (104). The mother then leans even further across the table and Klara detects additional emotions, observing how she ‘could see joy, fear, sadness, laughter in the boxes’ (104). To achieve affective intensity, the scene presents readers with a combination of ‘concrete sensorimotor ideas’ via the digital partitioned boxes, and a range of abstractions in the form of a mosaic of human emotions which, together, generate ‘the perceptible effect produced by the movements of facial muscles’ (Bolens, 2012: 5–6). Ishiguro’s literary prose thus leaves readers suspended halfway between the objective yet evocative image of the mother’s face divided across the partitioned screens and the persuasive interiority we encounter in Klara’s reflection upon what is displayed, creating the effect of a divided subjectivity that occurs in the exchange between a human and a non-human face.
III. Artificiality, Affect, Persona
While the first half of Klara and the Sun is focused mostly on Klara’s machinic, social and perceptual encounters as she leaves the electronics store and settles into her new role as Josie’s AF, the novel takes a dramatic and uncomfortable turn when we learn that the mother intends to use Klara as a digital surrogate for her chronically ill daughter. Although readers are introduced earlier to the mother’s plan via subtle cues in the dialogue, it becomes clear that, despite her otherwise heightened perception, Klara does not know (or fully understand) the mother’s intention to use her to replace Josie. Having established this epistemological tension, Ishiguro then turns once again to a complex form of face reading to dramatise Klara’s attempt to understand the perplexing behaviour of the humans around her, especially that of the mother. As she begins to sense that something is amiss, her interpretations of external visual data become increasingly suffused by emotions, assumptions, hesitations and a form of paranoia that suggests her machinic faculties might be defective. Through this literary technique, as Sahu and Karmakar argue, the novel ‘serves as a prism to sensitize readers to the spectrum of emotional complexities pertaining to human relationships in the rapidly changing posthuman world’ (2022: 3). As Klara becomes markedly uncomfortable with her capacity to interpret the world around her, readers also develop an emotional response—ugly feelings—to the ethical predicament at the centre of the novel.
Drawing on Adorno’s discussion of aesthetic autonomy in Aesthetic Theory, Sianne Ngai’s Ugly Feelings helps explain how literature is perhaps ‘the ideal space to investigate ugly feelings that obviously ramify beyond the domain of the aesthetic proper’ (Adorno, 1997; Ngai, 2005: 2). Ngai’s theory of ugly feelings (covering tone, animatedness, envy, irritation, anxiety, stuplimity, paranoia and disgust) approaches emotions as ‘unusually knotted or condensed interpretations of predicaments’ (2005: 3). She goes on to qualify that these interpretations are ‘signs that not only render visible different registers of problem (formal, ideological, sociohistorical) but conjoin these problems in a distinctive manner’ (2005: 3). In Klara and the Sun, literature serves a crucial function in testing out how we feel about our own ‘sociotechnical imaginaries’, by presenting readers with characters who challenge our ‘collectively held, institutionally stabilized, and publicly performed visions of desirable futures, animated by shared understandings of forms of social life and social order attainable through, and supportive of, advances in science and technology’ (Jasanoff, 2015: 4). Towards the end of the novel, Klara’s insight, faculties and, by extension, personality are no longer afforded their own agency but are instead subservient to the needs of a grieving mother. As Klara gradually begins to register a form of discomfort at the developing situation, ugly feelings are engendered in the reader.
These feelings are in part the result of a paradox that lies at the centre of the contemporary relations between seeing and knowing brought about by advanced machine vision technologies. The more we distribute our agency across vast and increasingly complex networks of non-human agents to enhance forms of visibility, the less knowledge we have about ‘the very processes behind the way in which these new visualities are rendered visual’ (Azar et al., 2021: 1098). In other words, although advanced digital and algorithmic technologies allow us to see more, they do not afford us the knowledge to understand fully how it is that there is more to see. Reflecting this paradox, in early scenes in the novel Klara begins to show signs of the process by which she engages external stimuli to unpack the concept of emotions. Looking out onto the street from the store, she observes:
Still, there were other things we saw from the window—other kinds of emotions I didn’t at first understand—of which I did eventually find some versions in myself, even if they were perhaps like the shadows made across the floor by the ceiling lamps after the grid went down (Ishiguro, 2021: 18–19).
Triangulating sight, emotion and the more abstract image of shadows made across the floor by lamps, the feeling that Klara describes in this scene is perhaps more akin to negation than intensely felt human emotion. Although she purports to convert the perplexing visual stimuli from outside the window into emotions that she might come to perceive in herself, she nevertheless articulates a form of resignation, which Ishiguro likens in strikingly poetic and imagistic terms to shadows dancing across the floor when the store is closed at night. This strangely hybrid feeling is aligned with that which Ngai calls ‘stuplimity’: a ‘concatenation of boredom and astonishment—a bringing together of what “dulls” and what “irritates” or agitates; of sharp, sudden excitation and prolonged desensitization, exhaustion, or fatigue’ (2005: 271). Using the metaphorical structures of literary prose (‘like the shadows’), Ishiguro creates the effect of ennui in a non-human subject, who refers nostalgically and in a highly subjectivised fashion to ‘some versions’ of herself.
As the ethical tension unfolds, in scenes where she begins to merge the objective with the emotional, Klara’s observational dynamics become increasingly affective. For instance, when a group of children visit Josie’s house for one of the ‘interaction meetings’, she suddenly begins to display signs of apprehension, which she understands not as a response to any obvious or logical information, but rather as part of a developing sense of intuition:
There was an unpleasant tint on the three boxes containing the boys on the sofa—a sickly yellow—and an anxiety across my view of them, and I began to attend instead to the voices around me (Ishiguro, 2021: 70).
In this curious scene, Klara brings machine vision and emotion into an integrated outlook (‘an anxiety across my view of them’), creating the impression of an affective image or landscape rather than a human feeling per se. Invoking the ‘tint’ of the partitions in her line of sight as an index to an ‘unpleasant’ feeling that she associates with the concept of anxiety, she then moves to incorporate the auditory stimuli of the voices around her to attempt to more clearly interpret the unfolding scene.
The visual dynamic of this scene can be usefully understood in relation to Ngai’s formulation of anxiety. For Ngai, while anxiety is ‘intimately aligned with the concept of futurity, and the temporal dynamics of deferral and anticipation,’ it also has a ‘spatial dimension’ (2005: 210). Ngai notes how, in psychological discourse:
[A]nxiety is invoked not only as an affective response to an anticipated or projected event, but also as something “projected” onto others in the sense of an outward propulsion or displacement—that is, the quality or feeling the subject refuses to recognize in himself and attempts to locate in another person or thing (usually a form of naïve or unconscious defense) (2005: 210).
Similarly, as Susanne Langer writes in Feeling and Form: ‘It takes precision of thought not to confuse an imagined feeling, or a precisely conceived emotion that is formulated in a perceptible symbol, with a feeling or emotion that is actually experienced in response to real events. Indeed, the very notion of feelings and emotions not really felt, but only imagined, is strange to most people’ (1953: 181). Sensing what she understands to be anxiety as the social scene around her becomes increasingly tense, Klara thus defers or displaces feeling away from the visual and its attendant ‘sickly yellow’ tint to other sensory data.
In other scenes towards the end of the novel, especially as Klara begins to piece together the complexity of Josie’s situation, feelings are invoked not as discrete data to be read across the human face, but instead as part of Klara’s innate knowledge, reflecting a trend in which, to borrow again from Ngai, ‘affect becomes publicly visible in an age of mechanical reproducibility: as a kind of innervated “agitation” or “animatedness”’ (2005: 32). Ruminating on her prior interpretations, Klara describes the source of the ‘uncomfortable feeling’ she intuits from the mother’s behaviour:
I believed there were particular danger topics for Josie, and that if only the Mother could be prevented from finding routes to these topics, the Sunday breakfasts would remain comfortable. But on further observation, I saw that even if the danger topic were avoided—topics like Josie’s education assignments, or her social interaction scores—the uncomfortable feeling could still be there because it really had to do with something beneath these topics; that the danger topics were themselves ways the Mother had devised to make certain emotions appear inside Josie’s mind (Ishiguro, 2021: 91).
In these introspective moments, where Klara explicitly builds upon past intelligence to develop an understanding of complex human emotion, a split between faciality and inner consciousness becomes apparent, reflecting Klara’s movement towards the human tendency to ‘make an image’ of a face by reading its expressions even though we know ‘that images can deceive’ (Belting, 2017: 18). The subtle stream-of-consciousness technique through which Ishiguro traces Klara’s perceptual development, wherein she considers concepts ‘beneath’ other topics, produces the narrative effect of ‘processes of aversion, exclusion’ and, ultimately, ‘negation’ (Ngai, 2005: 11–12). Such is Klara’s perceptual matrix that she ‘relies on a common-sense model of surface and depth’ in which ‘she realizes that aspects of human emotion may exist hidden from view, and can only be perceived partially and through inference’ (Stacy, 2022: 4). In other words, although Klara senses that an ‘uncomfortable feeling’ is indexical to specific topics of conversation, she nevertheless registers its presence computationally or spatially rather than morally, even when she detects a moral imperative based upon the synthesis of prior engagement with humans. The division between what Klara perceives as the correct or moral course of action and what she is programmed to do in turn generates unease in the reader, who is confronted with the material conditions that produce artificial intelligence in the first place.
Conclusion
One of the most pressing analytical challenges for contemporary cultural works, especially speculative and science fiction, is to reframe machine vision in ways that allow the ‘diagrammatic dimension associated with image collections to appear in less abstract, less representational ways’, so that we might begin to understand machine vision and its attendant facial recognition capacities as ‘situated, operative and as generative of new kinds of actualities’ (MacKenzie and Munster, 2019: 13). By producing a character in Klara and the Sun that is less an artificial intelligence machine and more a ‘cultural other’, Ishiguro presents a future world in which advanced machine vision is fully embedded not only into our technological and social interactions, but also into our subjective reality (Kim and Kim, 2013; Coeckelbergh, 2011).
Klara’s paradoxical perceptual toolkit, through which it becomes clear that more data is not necessarily advantageous, offers a powerful critique of the ways that machine vision sometimes stalls on a ‘plateau of object recognition’ and cannot ‘achieve reliable scene understanding’ (Murphy, 2019: 1). In many instances throughout the novel, heightened attention to feeling becomes the solution to these technological glitches; however, feeling itself is often presented as compromised because it is mediated through Klara’s advanced understanding of what, precisely, she should be thinking or feeling. Slipping in and out of subjective boundaries, Klara positions readers to acknowledge the ugly and uncomfortable feelings that increasingly attach themselves to the advanced artificial intelligence that defines our contemporary moment.
To that end, perhaps the novel’s most impressive achievement in offering a conflicted and uncomfortable model of ugly feelings, enacted and mediated through instances of human-to-machine vision, is that it problematises literature’s relationship to a rapidly evolving culture of concern and unease about the status of artificial intelligence and sentient machines. Yet Ishiguro’s project is not driven by a strategy of hyperbole or even hypothesis in his depiction of emotionally and cognitively complex machines whose feelings are amplified to hold the reader’s attention. Instead, through the careful treatment of literary language, itself a complex response to rapidly evolving technology, Klara and the Sun presents instances of affective subtlety, hesitation, ambiguity, mutability, confusion and deficit to solicit an emotional response in the reader concerning the sociotechnical reception and future possibilities of machine vision. This intricate interplay between kinesic imagination, characterisation and a not-too-distant speculative future produces moments of intensely uncomfortable, ugly feelings.
Understanding these feelings and their transformative potential is merely one step towards a future where machine vision can attempt to ‘step beyond the ocularcentric metaphysics of the Western gaze and the reproduction of racial capital’ and in which we accept that there is ‘sufficient reason to think more carefully about the extent to which we are willing to rely upon its gaze’ (Parisi, 2020: 1281; Andrejevic and Selwyn, 2022: 190). Through the fascinatingly inquisitive, curiously restless and affectingly empathetic machine vision of Klara, Ishiguro’s novel is exemplary of the ways in which literary, speculative and other forms of contemporary fiction provide a critical lens to better understand the technological, affective, social and ethical dimensions of human-machine relations. Ironically, it is through Klara’s human-like discomfort, rather than her precision computation, that the implications, both good and bad, of future machine vision are most incisively felt and contemplated.
Acknowledgements
The author would like to thank Jill Walker Rettberg and the ERC-funded Machine Vision in Everyday Life project team for supporting the development of this paper in early discussions and workshops.
Competing Interests
The author declares that they have no competing interests.
References
Adorno, T W 1997 Aesthetic Theory Hullot-Kentor, Robert (ed. and trans.). Minneapolis: University of Minnesota Press.
Ajeesh, A K and Rukmini, S P 2022 Posthuman Perception of Artificial Intelligence in Science Fiction: An Exploration of Kazuo Ishiguro’s Klara and the Sun. AI & Society, 38(1): 853–860. DOI: http://doi.org/10.1007/s00146-022-01533-9
Andrejevic, M and Selwyn, N 2022 Facial Recognition. Cambridge, UK: Polity Press.
Azar, M, Cox, G and Impett, L 2021 Introduction: Ways of Machine Seeing. AI & Society, 36(1): 1093–1104. DOI: http://doi.org/10.1007/s00146-020-01124-6
Beer, D 2013 Popular Culture and New Media: The Politics of Circulation. New York, NY: Palgrave Macmillan. DOI: http://doi.org/10.1057/9781137270061
Belting, H 2017 Face and Mask: A Double History. Princeton, NJ: Princeton University Press. DOI: http://doi.org/10.1515/9780691244594
Berger, J 1972 Ways of Seeing. London: Penguin Group.
Bolens, G 2012 The Style of Gestures: Embodiment and Cognition in Literary Narrative. Baltimore: The Johns Hopkins University Press.
Bollmer, G 2017 Empathy Machines. Media International Australia, 165(1): 63–76. DOI: http://doi.org/10.1177/1329878X17726794
Bucher, T 2017 The Algorithmic Imaginary: Exploring the Ordinary Affects of Facebook Algorithms. Information, Communication & Society, 20(1): 30–44. DOI: http://doi.org/10.1080/1369118X.2016.1154086
Bueno, C C and Abarca, M J S 2021 Memo Akten’s Learning to See: from Machine Vision to the Machinic Unconscious. AI & Society, 36(1): 1177–1187. DOI: http://doi.org/10.1007/s00146-020-01071-2
Celis, C 2020 Critical Surveillance Art in the Age of Machine Vision and Algorithmic Governmentality: Three Case Studies. Surveillance & Society, 18(3): 295–311. DOI: http://doi.org/10.24908/ss.v18i3.13410
Coeckelbergh, M 2011 You, Robot: on the Linguistic Construction of Artificial Others. AI & Society, 26(1): 61–69. DOI: http://doi.org/10.1007/s00146-010-0289-z
Crary, J 1992 Techniques of the Observer: On Vision and Modernity in the Nineteenth Century. Cambridge, MA: MIT Press.
Crawford, K 2021 Atlas of AI: Power, Politics and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press. DOI: http://doi.org/10.12987/9780300252392
Du, L 2022 Love and Hope: Affective Labor and Posthuman Relations in Klara and The Sun. Neohelicon, 49: 551–562. DOI: http://doi.org/10.1007/s11059-022-00671-9
Hacking, I 1986 Making Up People. In: Heller, Thomas C, Sosna, Morton and Wellbery, David E (eds.) Reconstructing Individualism: Autonomy, Individuality, and the Self in Western Thought. Palo Alto: Stanford University Press. pp. 222–236.
Hardt, M 1999 Affective Labor. Boundary 2, 26(2): 89–100.
Heidegger, M 1977 The Age of the World Picture. The Question Concerning Technology, and Other Essays. London: Harper and Row.
Helfand, J 2019 Face: A Visual Odyssey. Cambridge, MA: The MIT Press.
Hoelzl, G 2018 Postimage. In: Braidotti, Rosi and Hlavajova, Maria (eds.) Posthuman Glossary. London: Bloomsbury. pp. 361–361.
Ishiguro, K 2021 Klara and the Sun. New York: Alfred A. Knopf.
Jasanoff, S 2015 Future Imperfect: Science, Technology, and the Imagination’s Modernity. In: Jasanoff, S and Kim, S H (eds.) Dreamscapes of Modernity: Sociotechnical Imaginaries and the Fabrication of Power. Chicago: University of Chicago Press. pp. 1–33. DOI: http://doi.org/10.7208/chicago/9780226276663.003.0001
Kim, M and Kim, E J 2013 Humanoid Robots as “The Cultural Other”: Are We Able to Love our Creations? AI & Society, 28(3): 309–318. DOI: http://doi.org/10.1007/s00146-012-0397-z
Langer, S 1953 Feeling and Form. New York: Scribner.
Lavater, J C 1775 Physiognomic Fragments to Promote Knowledge of Humanity and Love of Humanity. Leipzig: Weidmann and Reich.
MacKenzie, A and Munster, A 2019 Platform Seeing: Image Ensembles and Their Invisualities. Theory, Culture & Society, 36(5): 3–22. DOI: http://doi.org/10.1177/0263276419847508
Manovich, L 2021 Computer Vision, Human Senses, and Language of Art. AI & Society, 36(1): 1145–1152. DOI: http://doi.org/10.1007/s00146-020-01094-9
Møhl, P 2021 Seeing Threats, Sensing Flesh: Human-machine Ensembles at Work. AI & Society, 36(1): 1243–1252. DOI: http://doi.org/10.1007/s00146-020-01064-1
Murphy, R R 2019 Computer Vision and Machine Learning in Science Fiction. Science Robotics, 4(30): 1. DOI: http://doi.org/10.1126/scirobotics.aax7421
Ngai, S 2005 Ugly Feelings. Cambridge, MA: Harvard University Press. DOI: http://doi.org/10.4159/9780674041523
Offert, F and Bell, P 2020 Perceptual Bias and Technical Metapictures: Critical Machine Vision as a Humanities Challenge. AI & Society, 36(1): 1133–1144. DOI: http://doi.org/10.1007/s00146-020-01058-z
Parisi, L 2020 Negative Optics in Vision Machines. AI & Society, 36(1): 1281–1293. DOI: http://doi.org/10.1007/s00146-020-01096-7
Sahu, O P and Karmakar, M 2022 Disposable Culture, Posthuman Affect, and Artificial Human in Kazuo Ishiguro’s Klara and the Sun (2021). AI & Society. DOI: http://doi.org/10.1007/s00146-022-01600-1
Stacy, I 2022 Mirrors and Windows: Synthesis of Surface and Depth in Kazuo Ishiguro’s Klara and the Sun. Critique: Studies in Contemporary Fiction. 1–15. DOI: http://doi.org/10.1080/00111619.2022.2146479
Styka, J 2017 The Stylistic Category of Clarity (ΣΑΦΗΝΕΙΑ, Explanatio, Perspicuitas, Claritas) In the Eyes of Greek and Roman Writers. Classica Cracoviensia, XX. pp. 119–139. DOI: http://doi.org/10.12797/CC.20.2017.20.07
Sumner, T D 2022 Zoom Face: Self-Surveillance, Performance and Display. Journal of Intercultural Studies, 43(6): 865–879. DOI: http://doi.org/10.1080/07256868.2022.2128087
Virilio, P 1994 The Vision Machine. Indianapolis: Indiana University Press.
Williamson, J 1931 ‘The Doom From Planet 4’ in Astounding Stories [ebook]. W.M. Clayton. https://www.gutenberg.org/files/31168/31168-h/31168-h.htm [Last Accessed 14 August 2023]