Introduction–On Trunked and Truncated Beasts
Can we address the elephant in the room? Spaces containing complex cultural collections (CCC) pose considerable challenges to the cognitive systems of visitors. Encounters with galleries, libraries, archives, or museums require sense-making activities with a vast number of mostly unknown objects. These are frequently of high perceptual diversity and rich in detail, each one connected to many threads of further information, and are commonly arranged in physical architectures based on unfamiliar principles. Even if visitors intend only to experience leisurely pleasure, such encounters require significant amounts of perception, interpretation, and learning. In short, considerable mental effort is required in order to cope with the complexities of objects and topics. If visitors are not domain experts, there is a good chance that this mental effort will soon translate into a rather simple generic feeling like fatigue, exhaustion, decreased attention, or information overload, or (if they cannot connect to the matter at all) plain boredom (Robinson, Sherman & Curry, 1928).
So aside from their well-known marvelous and inspiring aspects, it is rarely made explicit that CCCs require considerable support from a perception and cognition perspective. Learning about collections–i.e. building up a mental model (Vandenbosch & Higgins, 1996)–can be strenuous and challenging. This applies when visitors simply stroll through collections but is amplified when they explore the multiple dimensions of associated information (on textual displays or in collection catalogs) in depth. This challenging side of cultural collections is well-documented and well-known, too: ‘Museum fatigue’ and similar effects (like early satiation, exhaustion, and distraction) have been documented and studied for a long time (Bitgood, 2009a; Bitgood, 2009b; Davey, 2005; Gilman, 1916). Combined with restricted cognitive resources, collection complexity often enforces selectivity and simplification on the observers’ side. ‘Simply put, complexity is limited understanding. It is the absence of information that makes full comprehension of a system impossible’ (Rasch, 2000: 49). Furthermore, ‘increased consciousness of complexity brings with it the realization that “total comprehension” and “absence of distortion” are unattainable’ (Rasch, 2000: 51). As a practical consequence, visitors often build up only a limited understanding of the collection, grasping only fragments of the cultural riches before exiting through the gift shop.
We are reminded of the parable of the elephant and the blind men.1 As an early reflection on the cognitive and communicative woes in the face of object complexity, the tale ponders the selectivity and apparent incompatibility of truncated system descriptions. Some sort of access to complex objects is possible for everyone, yet limited cognitive resources commonly generate idiosyncratic snapshots or locally valid impressions only. As for the reconnection of these partial perspectives and observations, the fable finds a solution either in an outside observer, who provides vision and conceptual integration; or in procedures of communication between the owners of the restricted views. We will keep those suggestions in mind, while turning back to present day cultural collections, which show no signs of simplifying as media technologies evolve.
Following decades of digitization, CCCs often exist both as traditional object collections in physical spaces, and as digital collections in data and information spaces.2 It is in these theatres of operation where GLAM professionals (i.e. the owners, curators, guides, or custodians of galleries, libraries, archives or museums) have to support activities to chase, grasp, and reassemble elephants on a daily basis. It is their challenge to make collections comprehensible in the face of limited vision and finite attention spans. Even if there is a strong belief among museum professionals that museum fatigue cannot be stopped, ‘much like death and taxes’ (Bitgood, 2009b: 195), the fight against it (diminishing, struggling, wrestling with it) is part of their daily work. Numerous approaches also show that fatigue is in fact not inevitable, ‘if we design the visitor experience [more] effectively’ (ibid.: 196).
Looking around, we find numerous design strategies which help visitors to grasp the elephant while shunning, minimizing, or ameliorating fatigue. Many of them have been applied both in physical museum spaces, as well as in digital information spaces. Prominent methods include storytelling (Bedford, 2001; Boyd Davis, Vane & Kräutli, 2016), audio guides (Kuflik et al., 2011), gamification (Champion, 2014; Rowe, Lobene, Mott, & Lester, 2014), personalization and customization (Huang, Liu, Lee & Huang, 2012), participation (Ridge, 2013), and making curatorial concepts and arrangement principles transparent (e.g. onboarding techniques or ‘advance organizers’, as described by Anderson & Lucas, 1997).
In the following section, we zoom in on approaches which utilize methods of visualization to support the understanding of complex cultural collections. A synoptic approach is outlined in section three, its exemplary implementation is presented in section four, and its evaluation in the context of a case study follows in section five.
Visualization of Complex Cultural Collections
Visualization creates graphical representations from complex data, allowing visitors to explore them interactively and to acquire insights that unaided perception would not allow for (Ferreira & Levkowitz, 2003). The purpose of such representations is thus the amplification and augmentation of human cognition. This includes the acceleration of users’ understanding, and the support of their analysis, reasoning, and sense-making activities in the face of enormous, heterogeneous, abstract, and often time-oriented data (Arias-Hernandez et al., 2012; Thomas & Cook, 2005).
Digital collections commonly integrate digitized object representations of artefacts (such as images, text, audio, videos, or 3D models) and associated metadata entries, such as place of origin, date of origin, creator, style, or inter-object relations (see Figure 1).
In some cases the databases of GLAM institutions already mirror the complexity of their physical collections, constituting a prototypical example of massive, heterogeneous, abstract, and often time-oriented data. Such digital databases are often even less amenable to human sense-making than their physical counterparts, a problem exacerbated by the fact that visitors to digital collections are often treated as if they were common information seekers on the web, and so are provided with only the most basic (search-centered) access technologies. Such search-based interfaces require a thorough understanding of the collection’s structure and available metadata to retrieve meaningful results (Goodale et al., 2014). It is against this dire background that several novel visualization-based approaches to data complexity have been developed. As the characteristics of CCC data differ from those of other data collections in various ways, these works also expanded the understanding of ways in which users’ cognition could be supported more adequately. In addition to the task-driven and deficit-oriented conception of visitors as information seekers, they provide new facets of understanding by utilizing methods to support visitors as playful, curiosity-driven, strolling, critical and exploratory subjects.3
Generosity, Serendipity and the Autotelic Reframing of Data Complexity
Let’s imagine a visitor arriving at the landing page of an art gallery, an archive or a museum, with a collection he doesn’t know well (cf. Whitelaw, 2015). We consider this visitor lucky if the website developers have already taken on board recent work reconsidering how visualizations can help visitors engage with the elephant ahead. In the following, we explore a selection of the basic ideas and design strategies they might have employed.
As a powerful paradigm for interaction with abundant information, the ‘search box’ approach to information retrieval has dominated interface and interaction design since the emergence of the web. Search boxes still are often chosen as the main method of access and are used even by the largest cultural collections such as Europeana, which, as of June 2018, contained more than 50 million artworks (europeana.eu). Exemplarily implemented by our everyday search engines, the search box paradigm builds on two assumptions: that visitors at least vaguely know what they are looking for, and that visitors do not want to engage with the complexity of the search space, which stays hidden from their perception until query algorithms have done their mediating work. However, this only works if users are able to state their needs (i.e. their information deficit), after which ten blue links to further data or information artifacts are wheeled out for closer inspection (Broder et al., 2010).
Dörk et al. (2011) reject these assumptions. Building on studies of non-experts (or ‘casual users’, as per Pousman et al., 2007), they firstly take issue with the paradoxical manner in which search engines require visitors to search for things they commonly know little or nothing about. Against this unjustified assumption, they call for methods that enable direct access and exploration, such as directly entering a data collection and strolling through its riches. Secondly, they revise the operating metaphor on data complexity. If massive data collections are not conceived as tiresome deserts or dusty archives, but for instance as vital landscapes or vibrant cities, then movement through them becomes an ‘autotelic’ activity, providing aesthetic value in and of itself. Here, the shortening of search paths and times is no longer front-and-centre to the visitor experience, but rather the provision of vertical immersion and horizontal exploration in and through datasets. The visitor is no longer positioned as a deficit-driven information seeker, but as an open-minded urban flâneur. In order to facilitate the desires of this browsing subject, interfaces should extend beyond the search-box and become ‘generous’, enabling hedonistic, open-ended, curiosity-driven and multi-perspective data engagement endeavors (Whitelaw, 2015).
This ‘generous design’ avoids starting with questions and prefers to show directly: it aims to offer rich overviews and context, as well as high quality primary content and detail on demand (Butler, 2013). Because it has the privilege to deal with data that does not have to be hidden, it can throw the doors of collections wide open and so transform databases into giving and sharing visual repositories, which represent scale and richness, but also allow multiple ways to focus on specific details. To honor the complexity and diversity of a collection, generous design offers multiple access or vantage points, and encourages multiple perspectives on the assembled riches. Understanding that any given visualization method can capture only certain aspects of a collection’s composition or structure, it calls for multiple views to be used in the presentation of objects, combining the strengths of different methods and forming complementary composites to reveal different aspects of a collection. Such a multi-perspective interface enables the ‘open-ended proliferation of partial views, rather than a single total or definitive representation’ (Whitelaw, 2015: n.pag.), an approach which, as Drucker (2013) argues, better matches the open-ended dynamics of human interpretative processes.
Another key facet of human information acquisition that visitors can utilize in such interfaces is ‘serendipitous’ engagement. In museums, libraries and other open object collections, visitors frequently find interesting and inspiring information by chance. Several studies on everyday information practices show that serendipitous encounters constitute a key component of information acquisition (Ross, 1999). Thudt et al. (2012) thus reflected on interface design methods which create options for serendipitous learning and for encountering unexpected information of interest. Based on their study, they recommend following a playful approach to information exploration and enticing curiosity through visually distinct representations of single objects. Furthermore, they recommend highlighting adjacencies between objects, providing flexible visual pathways for exploring a collection, and granting multiple visual perspectives and access points.
Advantages and Challenges of Multiple Views
The method of ‘multiple views’, or ‘coordinated multiple views’ (Andrienko & Andrienko, 2007; Roberts, 2007), has become established as a standard technique for fostering multiple entry points and a plurality of perspectives and interpretations. Offering multiple views makes it possible to ‘maximise insight, balance the strengths and weaknesses of individual views, and avoid misinterpretation’ and to ‘allow the user to select and switch between the most appropriate representations for the data and task at hand’ (Kerracher et al., 2014: 3). Instead of betting all analytical capacities on singular implementations of visualization methods like maps, networks, or treemaps (see Figure 2, left hand side), advanced interface design builds on the understanding that one view is not enough (Dörk et al., 2017): bountiful combinations of views are the way to go. As a recent review of visualization approaches to cultural collections shows, existing visual collection interfaces frequently make use of this principle, and implement on average 2.6 different spatial, structural, or cross-sectional visualization methods (Windhager et al., 2018).
However, offering multiple views can also be a way to cover a specifically interesting data dimension in a more diverse or in-depth fashion. For the cultural heritage domain, time is such a crucial data dimension (Dörk et al., 2017). Accordingly, the temporal origins of individual objects or collection parts should not only be visualized by means of simple timelines, but also by utilizing animation, layer superimposition, layer juxtaposition or space-time cube representations (see Figure 2, right hand side). Using these options in tandem can help to maximize insights and balance the strengths and weaknesses of individual views for the temporal data dimension in particular (Kerracher et al., 2014). Although analysts of cultural collections could arguably benefit from such a rich depiction of the temporal dimension, we found that collection interfaces use only 1.2 time-oriented views on average (Windhager et al., 2018), which shows a huge potential for future designs.
While we consider this generous provision of multiple (spatial, structural, and temporal) perspectives as a strength of novel interfaces, their implementation also comes with a notable downside, which has been barely mentioned or problematized up to now: multiple perspectives recreate perceptual complexity and diversity on the overview level on our screens. The resulting challenge has various consequences for macrocognitive reasoning operations (Klein & Hoffman, 2008), i.e. for sense-making in the context of complex data and tasks. We call this challenge the ‘split-attention challenge’ of complex interfaces with multiple views—and consider it to be a second-order problem of visual reasoning, and a fundamental challenge for future visualization system design (cf. Schreder et al., 2016).4
From Visual Analytics to Visual Synthetics
Split-attention challenges arise when observers of multiple views start to wonder about the bigger picture of a collection—or what the whole elephant looks like—yet their diverse information sources appear spatially or perceptually separated, and do not easily merge.5 Visual-analytical interfaces mostly focus on taking complex subject matters or data apart, separating them into their constituent elements and providing cross-sectional or longitudinal cuts with different techniques through complex objects of study.6 Figure 3 shows two screenshots taken from prominent visualization interfaces, which are frequently applied to the analysis of cultural heritage collections (Coleman et al., 2017; Jänicke et al., 2013). In the selected arrangements, they both combine the map-based representation of a collection with a time-oriented representation (i.e. a histogram and a line chart).
For a synoptic integration of the displayed data, users have to combine information from both views (i.e. from the spatial and temporal perspective at the same time) and build up a mental model that integrates both data dimensions. Cognitive science researchers have called attention to the fact that such synthetic operations are cognitively demanding in general, but require even higher cognitive effort when the aim is to construct a coherent and consistent mental model rather than a sketchy ‘cognitive collage’ (Tversky, 1993). We contend that this challenge becomes aggravated where visual-analytical systems are designed without additional ‘coherence techniques’, or in the absence of a macrocognition-supporting visual-synthetical framework (Schreder et al., 2016).
With regard to the synthesis of bigger pictures, we distinguish between possible results along a quality gradient of construction. According to Tversky’s distinction (1993), ‘cognitive collages’ equal a distorted mix-up of partial information, differing perspectives and reference points that characterize fragmentary internal representations. ‘Snippets of information are stored in memory but are not systematically or only loosely related to one another. Though this information can be recalled, it is difficult to use such ill-structured information to solve more complex problems’ (Schreder et al., 2016: 82). In contrast to cognitive collages, mental models integrate different aspects and perspectives and ‘capture the categorical or spatial relations among elements coherently, allowing perspective-taking, reorientation, and spatial inferences’ (Tversky, 1993: 15). Figure 4 illustrates the distinction with figurative regard to the CCC elephant.
While it is relatively easy to synthesize jumbled and fragmented collages from multiple views, their coherent assembly requires either more mental effort by the user—or the development of more effective techniques of visual-synthetical design on the visualization side.7
With regard to ‘coherence techniques’, which support cognition by connecting insights from different views to larger units of sense-making, we find two basic approaches: the use of consistent visual variables or design choices across multiple views (Qu & Hullman, 2018); and the use of coordinated interaction methods like coordinated selecting and highlighting or linking and brushing, as well as synchronized panning, scrolling or zooming (North & Shneiderman, 2000). Yet even if both techniques are exemplarily implemented, as in the two interfaces shown in Figure 3, cognitive challenges remain. Significant visual work is needed to bridge the distance between separated views, and conflicting design choices must be disambiguated. One such conflict arises when the horizontal axis serves as the west-east axis of the map view while representing the temporal data dimension in the other view.
As Funtowicz and Ravetz (2013: 8) put it in their reflection on the elephant, ‘[e]ach perceives his or her own elephant as it were. The task of the facilitator is to see those partial systems from a broader perspective, and to find or create some overlap among them all, so that there can be agreement or at least acquiescence’. Accordingly, we think that the facilitation and orchestration of inter-perspective agreement is a challenge worth a systematic research effort of its own. The development of future visual analytics interfaces deserves special attention from a visual synthetics perspective in order to cope with the downsides that the multiplication of perspectives brings. We do not think that ‘multiple view-fatigue’, which can hit viewers when they try to synthesize everything on their own, is inevitable if visualization designers put the synthetical challenge on their agenda. This should not be done to the detriment of the hermeneutic richness of single views, but for their mutual amplification. As such, we want to explore options to better organize visual complexity, and to do so in a coherent, consistent and interoperable manner. Guided by these targets, we introduce an approach that we expect to help significantly with the challenge of facilitating perspective overlap, integrating information and insights, and mediating between multiple views on complex cultural collections.
PolyCubism–A New Approach to Information Integration
The research project PolyCube–Towards Integrated Mental Models of Cultural Heritage Data (PolyCube, 2016; Windhager et al., 2016) addresses this challenge by developing a visually integrated interface for CCCs. The interface will work as a web-based platform for collection visualization, but could also be implemented as an (interactive, screen-based) data sculpture (Zhao & Vande Moere, 2008), which can serve as a three-dimensional ‘advance organizer’ (Ausubel, 1960; Anderson & Lucas, 1997) in the entrance hall of a gallery, library, archive, or museum.
The PolyCube emerges from the space-time cube (STC) representation, first developed and utilized in human geography to support the visual analysis of human movement patterns and the spatial diffusion of innovation (Hägerstrand, 1970). The operating principle of this method is to orthogonally blend cross-sectional views (horizontal plane) and a temporal view (vertical axis) together, allowing the mapping of the spatiotemporal origins of objects. Every event distribution in space and time thus translates into the unique shape of a point cloud, disclosing further spatiotemporal patterns to the gestalt perception of CCC visitors and analysts.
By means of a space-time cube representation, the PolyCube scaffold can arrange CCC objects as point clouds according to multiple spatio-temporal arrangement principles. On the bottom, a data plane initially features a geographic map, and each object’s place of origin determines its horizontal position. The vertical axis of the cube represents time, and thus each object’s date of origin assigns it an individual altitude above the ground (Figure 5).
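A minimal sketch may help to make this mapping concrete. It is not taken from the PolyCube code base: the record fields (`lon`, `lat`, `year`), the plain equirectangular projection, and the normalization to a unit cube are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CollectionObject:
    # Hypothetical record; the field names are illustrative, not the project's schema.
    title: str
    lon: float  # place of origin, longitude in degrees
    lat: float  # place of origin, latitude in degrees
    year: int   # date of origin


def to_cube(objects, cube_size=1.0):
    """Map each object to (x, y, z) inside a cube with edge length cube_size.

    x, y: place of origin on the data plane (plain equirectangular projection).
    z:    date of origin, scaled linearly between the earliest and latest year.
    """
    years = [o.year for o in objects]
    first, last = min(years), max(years)
    span = (last - first) or 1  # avoid division by zero for single-year corpora
    points = []
    for o in objects:
        x = (o.lon + 180.0) / 360.0 * cube_size
        y = (o.lat + 90.0) / 180.0 * cube_size
        z = (o.year - first) / span * cube_size
        points.append((x, y, z))
    return points


sample = [
    CollectionObject("A", -87.6, 41.9, 1938),
    CollectionObject("B", -122.4, 37.8, 1955),
]
cloud = to_cube(sample)
```

In this sketch the earliest object sits on the data plane (z = 0) and the latest one at the top of the cube, so that the point cloud as a whole takes on the shape of the collection's spatiotemporal distribution.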
Contemplated from a distance, this framework rearranges every corpus as a characteristically shaped ‘hyperobject’, which invites on-demand probing, zooming and close-up display. Further visual structures are sets which can delineate aggregations of objects, and links displaying relations between them. Together with possible alternative layouts for the data plane (like force-directed graphs, set diagrams or treemaps), the PolyCube approach can morph the corpora of large cultural collections into a wide range of expressive, data-driven shapes or patterns, with each constellation allowing different insights into a collection’s rich conceptual anatomy (see Figures 6 and 7).
A New Kind of Pattern Language
Figure 6 shows a lineup of basic available patterns. While basic distribution plots (left) unveil the spatiotemporal extension of a cultural collection’s origins for the visitors’ contemplation, inter-object links can provide the means to visualize narrative or curatorial pathways, as well as genealogical or inter- and intragenerational relations between artifacts (second from left and center).
For categories of objects, accumulated and delineated by sets, the framework generates expressive flow patterns (second from right and far right), which can, for example, disclose the parallel evolution of cultural styles or schools, or their mutual genealogical influences. For these accumulating perspectives (which can also indirectly visualize the associated development of cultural organizations, art schools, religions, fashions, disciplines, or any other collective entities), a simple pattern language helps users visually parse complex developments as composites of basic temporal patterns (Figure 7). Styles or schools emerge in time, and either grow, split, or differentiate into multiple subcultures (left hand side). On the other side, they can merge, de-differentiate, shrink, and cease to inspire collective reproduction or variation.
Excursus on two versus three dimensions in visualization design
When utilizing the third dimension in InfoVis, one should prepare for some additional explaining. As Munzner (2014) puts it: ‘[i]n brief, 3D is easy to justify when the user’s tasks involve shape understanding of inherently 3D structures … In all other contexts, the use of 3D needs to be carefully justified’. In light of this stance, we should check whether cultural collection data is inherently 3D, which, in a trivial sense, it is obviously not. Yet, on the other hand, the relevance of time has already been stated for the cultural heritage context, which technically adds a further dimension to any plain visualization technique that already utilizes two display dimensions.8 Following this perspective, hybrid 3D data (i.e. spatio-temporal or structural-temporal data) is omnipresent in the cultural heritage domain, which requires integrated representation solutions as provided by the STC. More specifically, a number of further arguments support the use of an STC.
Firstly, the STC integrates spatial and temporal data in a fair and balanced manner by distributing the strongest and most effective visual variable (i.e. position, cf. Mackinlay, 1986) equally to all sides: the x- and y-axes to spatial data, the z-axis to temporal data.
Secondly, this unfolds a whole new visual-analytical morphology as an expressive and technically open-ended, time-oriented pattern language, that could be parsed and read by highly trained faculties of 3D gestalt perception (see below), and which synoptically encodes time like no other method we know (see Figures 6 and 7).
Thirdly, as Bach et al. (2016) note, STC representations can act as translational hubs or as operational cognitive scaffolds. They can mediate between the temporal visualization methods mentioned above (see Figure 2); and translate from temporal to spatial perspectives, while supporting visual analysts’ navigation by seamless transitions (see Figures 8 and 10). To the best of our knowledge no other visualization method can do this.
Fourthly, empirical studies on casual users (Amini et al., 2015; Kristensson et al., 2009; Kveladze, Kraak & van Elzakker, 2015) show that they can identify spatiotemporal patterns more quickly and more accurately with an STC than with 2D visualizations. While the STC is less suited for identifying detailed data properties on one dimension, it can unfold its full power when users want to see multidimensional patterns.
Fifthly, studies show that STC representations are liked because they are ‘cool’ (Amini et al., 2015; Kristensson et al., 2009). This should not be dismissed given the importance of drawing casual users and accidental visitors into an in-depth exploration process.9
Sixthly, Sorger et al. (2015) provide a conciliatory frame for the mediation of 2D and 3D representations, which resonates with recommendations of generous design and cherishes the benefits of representational syntheses: integrating 2D and 3D visualization methods in a single interface provides users with complementary composites, which can add to the methods’ mutual contextualization and comprehension.
Drawing these arguments together, we consider STC representations to provide a powerful and largely untapped potential for visual-synthetical mediation—not in spite of but due to their use of an additional display dimension. While this also increases visual clutter and interaction costs (e.g., due to additional rotating, zooming and panning, cf. Munzner, 2014), some of the standard complaints from plain design advocates could also be returned to the sender: pleas for the minimization of interaction costs will remain acceptable only if they find alternative ways to cover the significant cognitive costs of information integration that pile up for unaided macrocognition in between multiple views. There is a final argument to be made about cognitive economics—one that strives for a balance between open-minded, pro-plurality approaches (Dörk et al., 2011; Drucker, 2013; Thudt et al., 2012; Whitelaw, 2015) and a vital defense of cognitive ergonomics. The latter could encourage a re-thinking of Ockham’s razor for the visual reasoning domain (views and entities should not be multiplied beyond necessity) and drive the orchestration of already existing perspectives (see also Section Six).
Case Study–The Charles W. Cushman Collection
To consolidate the outlined design principles, we present first patterns and insights from a digital collection case study, reshaped by the first implementation of the PolyCube framework as a visual-analytical research prototype.
PolyCube–Technical Implementation and Case Study Data
Three main considerations guided the technical implementation of the PolyCube concept: reusability; modularity for ease of reading and extension; and compatibility with document object model (DOM) selection, in order to accommodate various DOM-related libraries such as Data-Driven Documents (D3.js). We aimed to build the PolyCube 3D rendering environment on CSS3D and to do without the WebGL engine as much as possible, as WebGL is still not supported by all browsers and older devices, and given the limited exposure and instability of the current HTML5 canvas.
As for the data, we make use of the Cushman Collection (Indiana University, 2004), as it has already been developed, prepared and geo-referenced by Miriam Posner (2014) for use with the Palladio interface (Coleman et al., 2017; Edelstein et al., 2017), which also serves as a reference for comparison.10 Charles Weever Cushman was an amateur photographer and alumnus of Indiana University. The collection he bequeathed to the university encompasses 14,500 photographs, taken between the years 1938 and 1969. As our system prototype is still awaiting optimization for processing speed and visual occlusion management (Elmqvist & Tsigas, 2008), we took a closer look at a randomized sub-selection of 800 photographs dating from 1938 to 1955.
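How such a sub-selection might be drawn can be sketched as follows. The text does not specify the actual sampling procedure, so the order of operations (filter by date first, then sample), the fixed seed, the function name `subsample`, and the synthetic records are all assumptions made for illustration only.

```python
import random


def subsample(records, n=800, year_from=1938, year_to=1955, seed=0):
    """Return up to n randomly chosen records dated within [year_from, year_to]."""
    in_range = [r for r in records if year_from <= r["year"] <= year_to]
    rng = random.Random(seed)  # fixed seed so that the selection is reproducible
    return rng.sample(in_range, min(n, len(in_range)))


# Synthetic stand-in for the collection: 14,500 records spread over 1938 to 1969.
records = [{"id": i, "year": 1938 + i % 32} for i in range(14500)]
selection = subsample(records)
```

Sampling without replacement guarantees that no photograph appears twice, and fixing the seed keeps the subset stable across sessions, which matters when views of the same data are to be compared.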
PolyCubistic Perspectives on the Cushman Collection
For the case study, a geographic map and a set-diagrammatic visualization were implemented as cross-sectional visualization methods. These views have been transferred to an STC, which also offers a juxtaposition and a superimposition perspective on demand. Figure 9 shows the first representation of collection data from a space-time cube perspective. The screenshot shows the origins of Cushman’s photographs as spatiotemporally located events along the trails of his travels.
The representation allows for rotation, zooming and panning, and access to previews of photographs (see Figure 11, left hand side). Space-time cube representations provide an integrated perspective on spatiotemporal distributions (Kristensson et al., 2009), but also serve as a cognitive scaffold, which helps to create other spatial, temporal, or spatiotemporal perspectives by visual manipulations (Bach et al., 2016; see Figure 8). To support the navigation of users between different views, and to keep their spatiotemporal orientation intact, the prototype features seamlessly animated canvas transitions (cf. Federico et al., 2012) as a mediating coherence technique. Figure 10 shows how these seamless transitions visually guide the user’s perception from an STC representation to a layer juxtaposition perspective (top row), and from a juxtaposition to a superimposition perspective (bottom).
From a model-based reasoning perspective, these transitions strengthen the visual momentum of the visualization system (Bennett & Flach, 2012), and support the maintenance of the spatiotemporal mental model. For example, starting from an STC representation, the vertical time axis can be seamlessly flattened so as to arrive at an aggregated superimposition perspective (see Figure 11).
The flat superimposition layout allows for inspection of the overall spatial distribution of objects, and the precise reading of spatial positions from an orthogonal point of view. As the time-axis has been shortened, it is possible to encode time into another retinal variable like the color of the data points to allow for a balanced comparison of different spatiotemporally integrated perspectives.
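The flattening transition and the re-encoding of time as color can be illustrated with a minimal sketch. It is not the prototype's implementation: the `Event` type, the `flatten` parameter, the linear interpolation, and the blue-to-red color ramp are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A photograph reduced to where and when it was taken (hypothetical type)."""
    lon: float
    lat: float
    year: int


def normalized_time(year: int, year_min: int, year_max: int) -> float:
    """Map a year to the range 0..1; a zero-length range maps to 0."""
    if year_max == year_min:
        return 0.0
    return (year - year_min) / (year_max - year_min)


def stc_position(event: Event, year_min: int, year_max: int,
                 flatten: float = 0.0, height: float = 1.0):
    """Return (x, y, z) for an event in the cube.

    flatten = 0.0 gives the full space-time cube, flatten = 1.0 gives the
    superimposition view (all events on the base plane). Values in between
    are the intermediate frames of an animated transition.
    """
    t = normalized_time(event.year, year_min, year_max)
    z = height * t * (1.0 - flatten)
    return (event.lon, event.lat, z)


def time_to_color(event: Event, year_min: int, year_max: int):
    """Encode time as an RGB color: blue for early, red for late events."""
    t = normalized_time(event.year, year_min, year_max)
    return (round(255 * t), 0, round(255 * (1.0 - t)))


# Illustrative coordinates only, not taken from the collection metadata.
late = Event(lon=-86.5, lat=39.2, year=1955)
frames = [stc_position(late, 1938, 1955, flatten=f) for f in (0.0, 0.5, 1.0)]
```

Once `flatten` reaches 1.0, the vertical position no longer carries temporal information, which is why a second channel such as `time_to_color` is needed to keep time readable in the superimposed view.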
Figure 12 shows the prototype’s third major perspective, arranging temporal layers in a juxtaposed position. The strength of this position is the disaggregation of the superimposed view into multiple temporal panels, and the conventional reading direction from left to right. We consider this constantly available plurality of perspectives to provide an added value to the visual analysts, so that they are always able to balance the strengths and weaknesses of individual views by switching ‘between the most appropriate representations for the data and task at hand’ (Kerracher et al., 2014).
Figure 13 finally shows how the PolyCube framework is open for the implementation of various further spatial, structural, or in general ‘cross-sectional’ visualization methods (cf. Figure 2, left hand side). Using a simple set visualization, it allows objects to be aggregated per temporal segment, and conveys an integrated view of the development of every (sub)collection.
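The aggregation of objects per temporal segment can be sketched as follows. The function name, the fixed segment length, and the sample years are hypothetical; the prototype may segment time differently.

```python
from collections import Counter


def count_per_segment(years, start, end, segment_length):
    """Count objects per temporal segment.

    Segments are [start, start + segment_length), and so on. Years outside
    start..end are ignored. The result maps each segment's first year to
    the number of objects falling into that segment.
    """
    counts = Counter()
    for year in years:
        if start <= year <= end:
            index = (year - start) // segment_length
            counts[start + index * segment_length] += 1
    return dict(sorted(counts.items()))


# Made-up years for illustration, not actual collection data.
sample_years = [1938, 1939, 1944, 1950, 1955, 1960]
segments = count_per_segment(sample_years, start=1938, end=1955, segment_length=6)
```

Each count could then drive the size of a set glyph on the corresponding temporal layer, so that growth and decline of a (sub)collection become visible along the time axis.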
If such set-diagrammatic cuts through the longitudinal development of a collection are further enriched (for instance by differentiating subsets), the flow-patterns of Figure 7 will emerge, supporting the cognition and sense-making of collection visitors and analysts. Due to the openness of this imaging framework, we consider its emerging ‘data sculptures’ to provide a multi-faceted but orchestrated approach to the visualization of complex cultural collections. Exhibitions can utilize it by providing interactive 3D models on large or small screens, but also by implementing them as physical visualizations (Zhao & Vande Moere, 2008) in the entrance halls of libraries, archives and museums.
Whether for online or offline collections, such data sculptures can serve as prime exhibits among others, featuring as a bigger picture of the whole elephant, and as a novel interpretation of the advance organizer concept (Anderson & Lucas, 1997). Whilst we are aware that visitors will be required to put in a degree of work to become familiar with such models, studies point out that once someone is ‘hooked’ by a (meta-)exhibit, it becomes more likely that they will engage with subsequent experiences, while ‘boredom and tiredness are then minimized’ (Bitgood, 2009b).
We conducted a qualitative evaluation of the PolyCube prototype with three casual users: two female and one male. None of them had prior knowledge about the Cushman collection, nor any expertise in the field of information visualization. They participated voluntarily in this study without any remuneration besides some complimentary chocolates.
Following a short introduction to the Cushman Collection and the interaction techniques offered within the prototype (rotate, zoom, pan, select), participants were left to freely explore the prototype on a 24-inch screen while thinking aloud. The visual structure of the STC was not further explained, as we sought to understand how casual users make sense of the unfamiliar PolyCube system (similar to the procedure in Smuc et al., 2008). Having gained an understanding of the prototype’s visual structure, they were asked some task-like questions about the Cushman Collection (e.g. can you guess from the visualization where Cushman lived in which periods?). For the selection of questions, we drew on prior research (e.g., Amini et al., 2015) showing that the STC is more powerful for gaining spatiotemporal knowledge about broader patterns than about individual data points. In a final interview, participants were asked to compare different variants of the STC (number of layers, set-diagrammatic vs. geographic data plane), as well as the STC against the juxtaposition and superimposition views, with respect to user experience and informative value. They were encouraged to name improvements and describe problems they encountered. Overall, the evaluation procedure took between 20 and 35 minutes per participant. While the experimenter guided the evaluation, two observers noted down the most important statements and observations. Audio recordings were used to validate these protocols.
During the free exploration, Participant 1 started with an extensive phase of close reading—viewing and evaluating the different photographs—before she was encouraged to explore the arrangement of the data points (7’) and slowly gained an understanding of the visual structure (15’). The other two participants explored and understood the visual structure right from the beginning. Participant 3 rightly observed that significantly fewer than 800 data points became visible in the various perspectives, which was caused by the merging of spatiotemporally adjacent data points on the chosen scale.
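Why fewer than 800 marks appear can be illustrated with a simple grid-based merge. This is only one plausible reading of ‘merging on the chosen scale’: the cell size, the temporal bin width, and the averaging of positions are assumptions, and the prototype’s actual merging rule is not specified here.

```python
import math
from collections import defaultdict


def merge_adjacent(events, cell_deg, years_per_bin):
    """Merge events that share a spatial grid cell and a temporal bin.

    events: iterable of (lon, lat, year) tuples.
    Returns a list of (mean_lon, mean_lat, mean_year, count) tuples,
    one per occupied space-time cell.
    """
    bins = defaultdict(list)
    for lon, lat, year in events:
        key = (math.floor(lon / cell_deg),
               math.floor(lat / cell_deg),
               year // years_per_bin)
        bins[key].append((lon, lat, year))

    merged = []
    for members in bins.values():
        n = len(members)
        merged.append((sum(m[0] for m in members) / n,
                       sum(m[1] for m in members) / n,
                       sum(m[2] for m in members) / n,
                       n))
    return merged


# Illustrative coordinates only: two nearby shots from the same year,
# one distant shot, and one shot at the first location ten years later.
events = [(-86.52, 39.17, 1940), (-86.51, 39.16, 1940),
          (-87.63, 41.88, 1940), (-86.52, 39.17, 1950)]
marks = merge_adjacent(events, cell_deg=0.5, years_per_bin=1)
```

With these settings the four events yield three marks, because the first two fall into the same cell and year. A coarser scale would merge more events, and a finer one fewer.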
Participants reported no significant problems while answering our questions. They could identify spatiotemporal patterns efficiently with the STC. All three participants were able to describe where Cushman lived or travelled during each period. Participant 1 was the only one to show initial difficulties in relating the data points to the correct geographic regions, but came to grips with the task after rotating the STC. Confronted with the task to identify the time periods when Cushman was the most active or inactive, all three participants could instantaneously point out the corresponding time periods. When asked to describe the collection to someone else, they focused on their (mostly emotional) evaluation of the explored photographs rather than on the collection’s spatiotemporal characteristics. As Participant 2 phrased it: ‘a number of uninteresting photographs, but in a nice toy to play with’.
During the final interview, all three participants preferred the STC over the juxtaposition and superimposition visualizations. As Participant 1 stated: ‘you can’t feel the logic at once, but then it is becoming clear … You can compare period, territory, the main objects. This is nice’. All participants highlighted the STC’s potential to support an integrated understanding of the geographic and temporal distribution and interdependencies of the data, which cannot be as easily derived from the other views. They also highlighted the attraction and user experience of the STC. As Participant 2 put it: ‘if I have something boring [the photographs] and fun [the STC]—and something boring and no fun—I’ll take the former, obviously’.
The participants suggested numerous improvements. With respect to the visual design, Participant 2 stated they would have found the STC more logical or natural if the time-axis were inverted. For the juxtaposition perspective, all participants missed labels specifying the temporal periods. Participant 3 suggested improving the labeling on the time axis so that it can be easily read regardless of rotation. We also collected some design suggestions, such as the ability to enlarge selected photographs on demand and the addition of data layers of related (historical or political) events, so that the artworks of the collection could be contextualized in a broader space-time context.
As for the set-diagrammatic visualization (Figure 13), the participants easily understood the focus on the total amount of pictures per period, but also remarked that the abstraction from the geo-temporal details reduced the visual-analytical value. However, they recognized a potential for this perspective when dealing with the analysis of larger (or also categorically differentiated) collections.
Conclusion and Outlook
In this article, we have reflected on both the curiosity and openness that drives people to explore cultural collections and on the well-known limitations of their cognitive resources. Information visualization offers a powerful spectrum of methods to provide visitors to complex collections with facets of a bigger picture. Interaction with such representations can add to the visitor’s sense of overview and orientation – and thus facilitate conceptual understanding. Following our discussion of recent achievements of generous interface design, we focused on a second-order problem that arises from one of its central design strategies: multiple views allow visitors to inspect CCC data from diverse perspectives and support the investigation of spatial, structural, and temporal data aspects. Yet, most of these interfaces leave users to themselves when it comes to the integration or mediation of these perspectives.
We introduced the PolyCube framework as a method to mediate and integrate a diversity of local views on a global level of representation. Analogous to the provision of overviews on a local level, this enhances the ease of global cognitive syntheses and reduces cognitive efforts to integrate various perspectives without sacrificing any of the benefits offered by plain local ‘standard’ views. In particular, the options provided by seamless transitions appear as a promising technique to support visual macrocognition and as a noteworthy strategy for strengthening the visual momentum of advanced interfaces for use with and by cultural collections (Bennett & Flach, 2012). While preparing for the necessary evaluations to more thoroughly investigate and substantiate our arguments, we look forward to a discussion which needs to be had on a more fundamental basis, where methodological and epistemic positions of humanities-related research are negotiated.
Towards New Kinds of Elephants
Revolving around an organismic metaphor of complexity, we have discussed a specific combination of techniques to reassemble elephants as a whole. Towards the end of this endeavor it seems necessary to look into one of the most obvious limitations of this metaphor: cultural collections—like so many other complex phenomena—have no original (spatial or visual) superstructure that can be visually reconstructed in an isomorphic fashion. Diagrams and information visualizations are indispensable techniques because they successfully create new arrangements of abstract data, optimized for human perception by rule-driven layouts. Unfortunately, these rules have been mostly devised as independent procedures, with each visualization technique imposing its own structure and logic on the pictorial spaces of canvasses or screens. When zooming out from a multitude of such local (body part) images, they do not easily connect like pieces of an animal puzzle. Unlike naturalistic images, they cannot be directly traced back to a common 3D space, to which they hold an isomorphic part-whole relationship. And unlike words or sentences, they also cannot easily be connected to more complex descriptions because no diagrammatical ‘macrosyntax’ for the assembly of macro-pictures has been developed (Windhager et al., 2019). In the present, then, this requires designers of visualization systems to engage in the non-trivial practice of elephant creation ex nihilo. To bring the body parts of abstract and complex topics together, their anatomies and connective tissues have to be invented first. If macrocognitive syntheses should be supported, new kinds of elephants await their creation and cultivation—an objective obviously allowing multiple solutions for each topic too. 
While our concept of ‘coordinated multiple cubes’ offers such an orchestrated draft, we hope for a whole branch of visual synthetics research to emerge, to bring new kinds of bigger pictures into being in the material and mental ‘white spaces’ in between multiple views.
Mapping and Tapping into Humanities Controversies about Interface Design
If we aligned a good part of this article’s argumentation with the blind men’s quest for information integration, we know that other observers of the scenery can see things differently. To also bring in their perspectives, we close with a reflection on expected reservations about our holistic approach. Regarding various humanities approaches to interface design (Drucker, 2011, 2013), we even expect our initial problematization to be inverted: if reflections do not start from the cognitive costs of reasoning with a diversity of incoherent information, but from all-too simple and counterproductive suggestions for unification (of which there are many), the momentum can shift to the defense of interpretive diversity.11 It is in this context that we consider the calls for even multiplying ‘fragmentation and partial presentations of knowledge’ to originate (Drucker, 2013, n.pag.).
If interpretation is a central operation underlying the thinking and working of the humanities, then interface design has to support this activity, conceived as an open-ended, critical and constantly self-challenging endeavor. Related approaches thus sometimes question traditional HCI objectives like ergonomic efficiency, and instead strive to foster elaborate evaluation and reframing activities like critical reflection, intellectual argument and rhetorical engagement. To this end, flow (or pleasure-driven engagement with data) is also deemed essential, as is the acknowledgement of the subjective, situated, and partial character of every emerging result (Drucker, 2013). Interpretive approaches deliberately call for perspectival pluralism and the disaggregation of asserted totalities, while embracing ‘ambiguity and uncertainty, contradictions and the lack of fixity or singularity’ (Drucker, 2013: n.pag.). As such we are aware of positions which seemingly invert this article’s rationale, and which ask designers to create interfaces ‘which can tolerate inconsistency among [different] types of knowledge representation and organization’. From this point of view, inconsistencies and contradictions between multiple views are not only acceptable, but they ultimately also help to expose ‘the illusion of seamless wholeness’ as a useless or even counterproductive idea (Drucker, 2013: n.pag.).
As with many controversies, it is possible to tap into such lines of contrarian argumentation by mapping them within a ‘square of opposition’ (Figure 14). This notation has evolved from its Aristotelian origins to support the mediation of polarizing discussions or tensions between seemingly incompatible values or positions (Hartmann, 1926; Schulz von Thun, 2007). As a visualization technique it represents two positions (A and B) as polar opposites on the left and right hand side of a canvas. We map our advocacy for holistic or integrated representations as position A, to oppose it with the endorsement of visualization plurality and diversity at position B. Furthermore, two possible manifestations of each side are distinguished, putting the ideal conceptions (A1 and B1) at the top, while adding their less than ideal versions (A2 and B2) below, which arise either from their poor implementation (e.g., from malpractice or exaggeration) or from their external misinterpretation (including negative framing or deliberate misconstruction). As for the ongoing debate about the proper visualization of cultural complexity, we know holistic representations to be at constant risk of devolving into forced schemata of unification. On the other hand, the striving for perspectival plurality can lead to the fragmentation of any coherent picture, substituting the non-virtue of forced integration with the non-virtue of conflicting diversity.
In a reliable fashion, contrarian arguments emerge from a diagonal polarization, where charges (arrows in orange) are directed from the upper corners of a position (A1 and B1) to the opposite corners at the bottom (B2 and A2). Corresponding controversies thrive on the common self-idealization of a position in combination with the devaluation of the opposite value. Yet the square can also show ways for mediating tensions by developing dynamically balanced or hybrid positions in between (arrows in blue). While not being especially popular in the academic context, pragmatic approaches to the mediation of controversies can move both sides forward.12 While our position started close to a holistic stance (A1) motivated by problems of perceptual fragmentation (B2) we acknowledged methods of generous design (B1) but focused on the challenges of renewed fragmentation by multiple overviews (B2) to finally mediate them with an interoperable design of ‘orchestrated diversity’, dynamically balancing between A1 and B1. On the other hand, pro-plurality approaches to interface design follow a mirror-inverted pattern to problematize totalizing representations (A2), which frequently offer even less than the sum of their parts (Latour et al., 2012). As such, they plausibly argue for designs fostering plurality and diversity (B1), but to avoid the descent into conflicting diversity they also have to reflect on strategies of coordination across views so that they ‘can be integrated in various ways’ (Dörk et al., 2017: 46).13
Connectivity is key. While it is possible to enjoy many humanities controversies as explication of competing and contrarian positions, advanced interface design is well-advised to read them in a complementary fashion and to bring their best arguments into a dynamic balance. This will also allow us to take care of a more informed development of bigger pictures in the realm of the humanities, despite the damage that approaches concerned with these big pictures suffered from poststructuralist decrees. As has been stated with regard to ambitious accounts of culture and history in general: ‘[i]f the grand narratives known so far … have been seen through as unsuitable attempts to seize power over the world’s complexity, this critical realization neither delegitimizes the narration of things past nor exempts thought from striving to cast an intense light on the comprehensible details of the elusive whole’ (Sloterdijk, 2013: loc. 847).
To remake and refine visual representations of cultural collections and other complex humanities topics, we advocate synoptic visualization approaches which coordinate the best knowledge representation strategies of multiple communities. Such hybrid endeavors will generate more effective approaches to the support of macrocognition in face of data diversity, and the facilitation of switching between multiple perspectives and sense-making frames. This seems to us to be not only a design task worth strengthening, but also a cognition technique which comes close to a civic meta-competence for these times, arguably not only needed in digital humanities’ and cultural sciences’ research domains.