data nonvisualisation

Over the past two decades the diversity and the quantity of screens in our lives have proliferated. They are a defining feature of contemporary urbanisation and are dotted around city and financial centres, shopping malls, in shops within the malls and in traffic thoroughfares such as motorways and airports. Screens have even been incorporated into the architectural infrastructure of new buildings sometimes comprising an entire wall. An example of this can be found in the façade of the Kunsthaus Graz in Austria where an installation of fluorescent lights sits under the 900 square metres of acrylic glass that comprises the gallery’s eastern ‘skin’. These lights are digitally controlled to form low-resolution text and images, functioning like pixels on a digital screen. Likewise screens have infiltrated our domestic and intimate spaces; computer monitors regularly grace bedrooms and the portability of the mobile phone and iPod means that we now carry screens close to our bodies. With so many surfaces available for information to be displayed it seems more than obvious to call digital culture an age of data visualisation.

However, the more that data multiplies both quantitatively and qualitatively, the more it requires something beyond visualisation. It also needs to be managed, regulated and interpreted into meaningful patterns that are comprehensible to humans. The work and outcomes of extracting pattern and order from data are rarely visualised for screen display in daily life. Indeed this management and interpretation of data flows is undertaken by sophisticated sampling, tracking and automated techniques and the results of these are more frequently sequestered to become the property of corporations and institutions. Even when data flows do not become private or hidden property, their remixing and recombination on, for example, the web through the operations of search engines, databases, digests and feeds such as RSS (Really Simple Syndication) increasingly makes this manipulation of data invisible.

I will here refer to these mounting reserves of data about data, the software used to extract and analyse these and the social and cultural techniques accompanying this increasing trend as processes of data nonvisualisation in digital culture. By looking in some more detail at two areas in which data nonvisualisation processes dominate – Web 2.0 and data mining – we can begin to see how this marks an increasing trend in the way digital culture is organising data. At the same time, these newer less visible processes of aggregating and regulating data begin to reorganise contemporary digital culture. Whereas data visualisation characterised previous decades of digital culture in terms of tendencies in software development and the importance of digital imagery in both the arts and sciences, the invisibility of the processes involved in the manipulation of data is now ascendant.

This is not to say that these techniques for aggregating and deciphering data do not use visualisation techniques. In the area of data mining particularly, visual environments can be modelled to make sense of patterns detected in sets of information. What is invisible or rather not visualised are the parameters, relations and arrangements that are used to organise, interpret and hence make sense of the data. Additionally, the visualisation of data patterns has taken on a particular aesthetic – that of the vector/line. Examples of this can be found in abundance throughout the contemporary aesthetics of digital culture as social networks, relations between documents, corporate organisational relationships and even complex ideas are visually rendered as connections between lines and nodes. This visual style represents a type of thinning out of the visual plane of the image in contemporary culture – an attempt to streamline only the essential information-based elements of the image and eliminate ‘noise’ from the image scape. We might also think about the growing dominance of these minimal line images as a tendency toward reduced visuality within data visualisation.

An important cultural response to the proliferation of visualised data throughout the 1990s and early 2000s came from artists who re-worked scientific and medical images. The presumption that data imaging was a neutral or accurate portrayal of scientific facts has been variously investigated in the work of Aziz and Cucher, Justine Cooper, Michele Barker, Catherine Richards and others. But if we now increasingly occupy an aesthetic and social space in which the processes of making and organising data are largely invisible, what would be an appropriate aesthetic response to this trend? It may be the case that online and software artists will need to consider future artistic practices that are not visually based in order to respond to these processes of data nonvisualisation.

The ‘blackbox’ of data processing
Katherine Hayles has suggested that the use of computers for visualisation purposes has not only radically altered the ways in which mathematical operations are performed but also contributes toward a new kind of knowledge that is visually intuitive:

…with computers, a new style of mathematics is possible. The operator does not need to know in advance how a mathematical function will behave when it is iterated. Rather, she can set the initial values and watch its behaviour as iteration proceeds and phase space projections are displayed on a computer screen…The resulting dynamic interaction of operator, computer display and mathematical functions is remarkably effective in developing a new kind of intuition (Hayles, 1990: 163).

Sherry Turkle’s early analysis of the shift to online explorations of identity through chat and text-based virtual worlds indicated that interaction with digital machines became more ubiquitous the less people knew about the technical operations of those machines (Turkle, 1995). She compared the 1984 release of the Macintosh operating system and its relatively easy yet opaque ‘desktop’ interface with a previous generation of ‘nerds’ and programmers who had interacted with computers using text-based commands (Turkle, 1995: 34). The command-line interface of that earlier generation of computer-human interaction encouraged its human users to tinker with the underlying code of the interface simply in order to get the machine to work. In a sense, then, the operation and performance of computational systems had been more visible – although to a smaller and more elite group of people – if more cumbersome to operate.

There have been many debates about how graphics function in interface design, especially at the level of the Graphical User Interface (GUI). Some designers suggest that graphic representation of computational processes – the desktop as a representation of the computer’s operating system, for example – can confuse and obfuscate interaction with the computer (Norman, 1990: 216). Others have emphasised the importance of the GUI in communicating to users the complex tasks and functions that data undergoes in computation (Marcus, 1995: 425). But the use of graphics to represent both data and the processes performed upon data now definitively guides everyday interaction with computers.

By the late 1980s – and certainly by the introduction of GUIs for the web in 1994 – we were already less overtly aware of the inner processing of data and its pathways through the underlying architecture of digital machines. Computers had become the exemplary black box machine – you put something in and you get something out – and most users never really understood what happened in the middle. By the late 1990s, data visualisation, especially the animation of changes to data over time, was likewise being applauded by interface designers as a technique for making computation more human-centred:

New ways of representing data, especially changing data, allow users to gain new insights into the behaviour of the systems they are trying to understand and make the computer an invaluable tool for understanding and discovery as well as for interpretation and mundane calculation (Dix et al., 1998: 598).

During the period of the rise of computer graphics, important areas of social and economic life such as financial markets and entire disciplines such as the life sciences, geographical systems and meteorology were adopting and developing various kinds of data visualisation. In the development of these applications, data visualisation followed two main directions: the digital visualisation of information held previously in analogue form such as printed maps or of numerical data such as statistics about climate; and the creation of information spaces as visual spaces. Geographical Information Systems (GISs) – an example of the first direction – began their life in the 1960s with the development of the Canada Geographic Information System by Roger Tomlinson for the Canadian government’s Department of Energy, Mines and Resources in 1963. The digitisation and visualisation of geographic data has allowed query, analysis and editing of data using visual means and within a visual environment. During the 1980s and 1990s, GISs were standardised across a smaller number of computer operating systems and were being accessed across the internet. This greatly increased the ease and amount of user interaction. There are now a number of online applications that allow public access to certain kinds of GISs – map locators such as MapBlast and the virtual globe environment of Google Earth.

The second direction – the rendering of ‘pure’ information spaces – includes a multitude of projects for mapping cyberspace in which complex and invisible information flows and intersections such as website traffic are visualised (see Dodge and Kitchin, 2000). An example of this kind of data visualisation can also be found in the interactive three-dimensional real-time rendering of the New York Stock Exchange trading floor completed by the architectural design firm Asymptote in 1999. Traders in the exchange use this virtual information environment to, for example, visually track stock performance by individual companies and graphically detect the effect of incidents on performance. Asymptote’s Lise Anne Couture and Hani Rashid state that the complexity of data interrelations in stock markets was precisely the rationale presented by the New York Stock Exchange for commissioning the spatial visualisation of its information (Asymptote, 2006).

The fascinating paradox of all these trends toward the visualisation of data – the screen interface of the desktop computer, the dominance of GUIs in web browser design and the construction of entire information spaces as both two- and three-dimensional image-scapes – is that the structures, operations and circuits through which data move become increasingly invisible. It is often the case that the initial period in the development of a digital medium or set of technologies offers greater accessibility to these underlying structures and processes. This period of experimentation, in which technical and design protocols are less established, is often also characterised by artistic and cultural exploration of the medium/technology.

The first phase of web development and design from 1995 to 2001 (sometimes referred to as Web 1.0) required designers and artists to be versed in at least a basic level of the then broadly used scripting language for displaying information online – hypertext mark-up language (HTML). In other words, during this early phase of web design there were no pre-packaged methods for formatting the way a web page was displayed. All graphic and stylistic elements had to be laid out in HTML scripting that ‘told’ the web browser how to format the page for online display. For a relatively short period, both artists and designers had a measure of access to the ‘source code’ of the web and this resulted in a lot of play with HTML aesthetics. From the mid-1990s, the artistic duo of Joan Heemskerk and Dirk Paesmans, known as ‘jodi.org’, became infamous for their collapse of the visual levels of web display into the underlying HTML level of source code. Their early piece ‘http://wwwwwwwww.jodi.org/’ used the visual potential of HTML (using the actual ‘language’ to create a diagram of a hydrogen bomb) rather than HTML’s functionality as a piece of executing computer code (see Lunenfeld, 2000). This very simple act of using the web’s language to sketch out an image of the hydrogen bomb was jodi.org’s reminder to us of the military origins of digital computing and indeed of the internet.

In fact, jodi.org furnish us with an aesthetic example that resists the broader cultural trend toward data nonvisualisation. Rather than using the graphic interface to obscure the underlying operations of computation, jodi.org’s work insists on using visual elements to foreground the complex historical, social and economic factors that are embedded within contemporary ‘user-friendly’ interfaces. Nevertheless, web design has now moved toward less visible engagement – certainly for the everyday user – with the underlying architecture, data structures and flow of data through its various nodes and mechanisms. This is so much the case that many people are unable to clearly distinguish between the web and the net or have no sense, for example, of how different search engines operate to retrieve and display their end results. In the next section, I want to briefly examine some of the information mechanisms within the Web 2.0 environment that contribute to this increasing trend toward data nonvisualisation.

Data as pattern, automation and aggregation
After the infamous dot.com crash, the web environment dramatically changed. One of the key criticisms of earlier web interaction and transaction had been that pre-existing commerce, institutions and communications were simply relocated into the domain of cyberspace. Models and modes of interaction suited to and developing out of the web environment had not really emerged in its early phases of growth.

Web 2.0 is a phrase used to denote the many changes that have taken place in the online environment after online cultures, commerce and everyday users regrouped in the post-dot.com context. It marks a ‘new’ generation of services and relationships that are internet-based and indeed can only develop in the online context. At the core of the concept of Web 2.0 is the understanding of the network as an expanded field of interaction, interrelation and semantic generation between users, online technical infrastructure and software (see O’Reilly, 2005).

Newsblaster is an automatic news weblogger developed by the Natural Language Processing Group at Columbia University, New York, USA. The project began in 2002 and is a good example of a Web 2.0 tool. Newsblaster ‘reads’ a range of news items (from approximately 14 different sources) and, using artificial intelligence techniques, produces summaries of these stories. The tool is an example of an ‘aggregator’ – software that draws together and re-presents data in a digested and reduced form. Aggregators are a common feature of the information landscape of Web 2.0 as they are: a) automated forms of operations – such as producing digests of information – previously carried out by human labour in the Web 1.0 environment; b) methods for dealing with the explosion of online information that followed the growth of blogs from around 2002 onward; and c) able to easily link and function in relation to the straight-to-web publishing environment that has become the mainstay of contemporary online transaction.

But Newsblaster is also an example of data mining techniques – automatically extracting embedded patterns and invisible connections – to produce news digests based on keyword and common phrase relationships in the stories that it culls from online searches. It represents a textual instance of the aesthetic of making visible the invisible connectivity of data. However, what remain invisible in Newsblaster’s automated, aggregate functionality are two key aspects. First, users deploying such aggregators are not aware of the parameters used for extracting and determining pattern, and hence the processes of making data meaningful in particular ways are never visualised or made explicit. Automatic aggregation tends to perform operations that reduce the relations between data to commonalities rather than differences. This may be of crucial importance in the aggregation of news data where conflicting rather than similar perspectives about an item actually comprise the information about it. In reviewing the ‘newsworthiness’ of Newsblaster, New York Times journalist Susan Reed notes that:

in summarizing reports about President Bush’s plan for greater scrutiny of corporations, Newsblaster did not include criticism that the plan failed to call for increased financing for the Securities and Exchange Commission, which would carry out the effort. (Reed, 2002)

Aggregation therefore rests upon and contributes to the ‘image’ of networked information based upon similarity and close proximity as determinants of interconnectivity. It shares this propensity with other Web 2.0 tools and environments such as Friendster, which function by creating clusters of connections (friend and/or semantic networks) between closely proximate linked data and/or users.
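The reduction of relations between data to commonalities can be illustrated with a deliberately crude sketch. It does not reproduce Newsblaster’s actual techniques, which are not described here; it assumes a simple rule in which a keyword survives into the digest only if it occurs in a set share of the stories. The function names, the stopword list, the `min_share` threshold and the three sample headlines are all invented for illustration.

```python
import re

# Hypothetical stopword list; a real system would use a far larger one.
STOPWORDS = {"the", "a", "of", "to", "for", "and", "in", "on", "that", "is"}

def keywords(text):
    """Return the set of lowercased words in a story, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def digest(stories, min_share=0.6):
    """Keep only keywords that occur in at least min_share of the stories."""
    keyword_sets = [keywords(s) for s in stories]
    counts = {}
    for kws in keyword_sets:
        for word in kws:
            counts[word] = counts.get(word, 0) + 1
    return {w for w, c in counts.items() if c / len(keyword_sets) >= min_share}

# Invented headlines: two that agree and one that dissents.
stories = [
    "President announces plan for greater scrutiny of corporations",
    "Plan promises greater scrutiny of corporations, president says",
    "Critics say plan for scrutiny of corporations lacks financing for regulator",
]

summary = digest(stories)
print(sorted(summary))
```

Under this assumed rule the shared vocabulary of the plan survives, while the words that carry the dissenting report, such as ‘critics’ and ‘financing’, fall below the threshold and never reach the digest. The parameter that decides this, `min_share`, is exactly the kind of setting a reader of the digest never sees.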

Second, the historical, cultural and institutional contexts in which a tool such as Newsblaster operates are not so apparent in its everyday use. The Newsblaster project was funded by the US government’s National Science Foundation (NSF) and the Defense Advanced Research Projects Agency (DARPA). Although the project had been in development since 1998, NSF and DARPA funding to a range of data mining projects increased with the heightened emphasis upon security and intelligence in the post-9/11 context. Newsblaster was funded in response to a perceived need among US intelligence analysts to explore the potential of data mining for homeland security applications. According to the NSF, data mining large sets of information from television broadcasts and web pages may uncover underlying invisible relations between events and increase the predictive capacities of intelligence agencies (NSF Press Release, 2002). What is important here is not the specific development of Newsblaster but rather the boost to the Web 2.0 environment afforded by US military funding. Coincidentally or not, both the dot.com crash and 9/11 occurred in 2001 and it is after this period that the rise of Web 2.0 occurs. What, then, are the less visible forces at work driving the imaging and understanding of data as pattern and deep connectivity?

It comes as no surprise that the search for the invisible patterns and organisation of data should be driven by military requirements. Data mining is an operation that can only take place in a context where vast quantities of data are produced and circulate and where much of this data is in fact meaningless or rather redundant. The automated mining of data sets for underlying pattern supposedly sifts through redundant information and extracts only relevant information. But it is precisely redundant information – or rather the potential for redundancy – that is at the heart of the original military diagram for networked connectivity. Paul Baran, an engineer working for the American nonprofit research agency RAND, wrote a memorandum in 1964 that became the basis of the thinking and imaging of networked communications (Baran, 1964). Sponsored by the US Air Force, the memorandum details a plan for a digital communications system that could survive the event of an attack on any of its parts. It is often remarked that the distributed and mesh-like character of the diagrams Baran used to illustrate how this system would function serves as an abstraction of advanced internet connectivity. However, more fundamental to Baran’s system is the ratio of its redundancy of links and nodes to actual links and nodes needed for communication of data (Baran, 1964: 8–9). By building in a degree of redundant links and nodes, Baran sought to allow switching of information packets to alternative communications routes in the case of either systemic failure or enemy attacks carried out upon the system.
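The relation between redundant links and survivability can be made concrete with a small sketch. It is not Baran’s own model: it assumes a tiny rectangular mesh, measures redundancy simply as the number of links divided by the minimum needed to join every node (one fewer than the number of nodes), and treats an attack as the removal of a node. All function names and the two sample networks are hypothetical.

```python
from collections import deque

def grid_links(width, height):
    """Links of a rectangular mesh: each node joins its right and lower neighbour."""
    links = set()
    for x in range(width):
        for y in range(height):
            if x + 1 < width:
                links.add(((x, y), (x + 1, y)))
            if y + 1 < height:
                links.add(((x, y), (x, y + 1)))
    return links

def redundancy_ratio(node_count, links):
    """Actual links divided by the minimum (node_count - 1) that connects all nodes."""
    return len(links) / (node_count - 1)

def can_reach(links, start, goal, destroyed=frozenset()):
    """Breadth-first search for any surviving route from start to goal."""
    neighbours = {}
    for a, b in links:
        if a in destroyed or b in destroyed:
            continue  # a link attached to a destroyed node is unusable
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

mesh = grid_links(3, 3)                           # 9 nodes, 12 links
chain = {((x, 0), (x + 1, 0)) for x in range(8)}  # 9 nodes in a line, 8 links

print(redundancy_ratio(9, mesh), redundancy_ratio(9, chain))
print(can_reach(mesh, (0, 0), (2, 2), destroyed={(1, 1)}))
print(can_reach(chain, (0, 0), (8, 0), destroyed={(4, 0)}))
```

In this sketch the chain has no spare links, so the loss of a single middle node cuts communication, whereas the mesh carries half as many links again as it strictly needs and can route around the loss of its central node. The ‘surplus’ links are redundant in normal operation and decisive under attack.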

Although the contemporary internet has now outgrown its military origins, redundancy of information is perhaps the most characteristic attribute of contemporary online communications. Everyone has experienced this phenomenon in fruitless searches for an item that lead nowhere or in being the recipient of bulk or spam email. And it is precisely this prolific redundancy of data – built into the original thinking and imaging of distributed communications – that today motivates the activity of data mining; that is, producing invisible pattern from the overwhelming chaos of too much information. It is as if we have come full circle in the 40 or so years since the inception of networked thinking to the point where what was conceived as a line of protection for the US military – the production of redundant connections, links and flows of information – now sustains the intelligence arms of this same institution. Perhaps the future of networks lies not so much with their visualisation but with what lies beneath them – the institutional and intellectual cultures of their past. In order to understand the increasing trend toward the nonvisualisation of the processing and manipulation of data, then, we also need to understand the institutional, intellectual and cultural histories of data’s flows.

Recommended Readings:
Baran, Paul (1964) “On Distributed Communications: Introduction to Distributed Communications Networks”, Memorandum RM-3420-PR, Santa Monica, CA: The Rand Corporation

Dodge, Martin and Kitchin, Rob (2000) Mapping Cyberspace, London: Routledge

Hayles, N. Katherine (1990) Chaos Bound, Ithaca: Cornell University Press

Lunenfeld, Peter (2000) Snap to Grid: A User’s Guide to Digital Arts, Media and Cultures, Cambridge, Massachusetts: MIT Press

O’Reilly, Tim (2005) “What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software”, O’Reilly weblog, http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html?CMP=&ATT=2432

References
Asymptote website (2006) ‘NYSE 3D Trading Floor’, http://www.asymptote.net

Dix, Alan, Finlay Janet, Abowd, Gregory and Beale, Russell (1998), Human-Computer Interaction (Second Edition), London: Prentice-Hall Europe.

Dodge, Martin and Kitchin, Rob (2000) Mapping Cyberspace, London: Routledge

Hayles, N. Katherine (1990) Chaos Bound, Ithaca: Cornell University Press

Lunenfeld, Peter (2000) Snap to Grid: A User’s Guide to Digital Arts, Media and Cultures, Cambridge, Massachusetts: MIT Press

Marcus, Aaron (1995) ‘Principles of Effective Visual Communication for Graphical User Interface Design’, Readings in Human-Computer Interaction: Toward the Year 2000 (Second Edition), Ronald M. Baecker, Jonathan Grudin, William Buxton, Saul Greenberg eds, San Francisco: Morgan Kaufmann

National Science Foundation Press Release 02-64-1 (2002) ‘NSF, Intelligence Community to Cooperate on “Data Mining” Research’, July 30, http://www.nsf.gov/od/lpa/news/02/pr0264.htm

Norman, Donald A. (1990) ‘Why Interfaces Don’t Work’, The Art of Human-Computer Interface Design, ed. Brenda Laurel, Reading, Massachusetts: Addison-Wesley Publishing, 209–19.

O’Reilly, Tim (2005) “What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software”, O’Reilly weblog, http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html?CMP=&ATT=2432

Reed, Susan E. (2002) “A News Cocktail Mixed by a Software Genie”, New York Times Electronic Edition, March 28, http://tech2.nytimes.com/mem/technology/techreview.html?res=9C04E5DF113BF93BA15750C0A9649C8B63

Turkle, Sherry (1997) Life on the Screen: Identity in the Age of the Internet, London: Phoenix
