<?xml version="1.0" encoding="UTF-8"?>
<?oxygen RNGSchema="../schema/xmod_web.rnc" type="compact"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:xmt="http://www.cch.kcl.ac.uk/xmod/tei/1.0"
    xml:id="ab-764">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title>How Do You Visualize a Million Links?</title>
                <author>
                    <name>Brown, Susan</name>
                    <affiliation><orgName>University of Alberta</orgName> and <orgName>University of Guelph</orgName>,
                        English <reg><country>Canada</country></reg></affiliation>
                    <email>susan.brown@ualberta.ca</email>
                </author>
                <author>
                    <name>Antoniuk, Jeffery</name>
                    <affiliation><orgName>University of Alberta</orgName>, Orlando Project <reg><country>Canada</country></reg></affiliation>
                    <email>jeffery.antoniuk@ualberta.ca</email>
                </author>
                <author>
                    <name>Bauer, Michael</name>
                    <affiliation><orgName>University of Western Ontario</orgName>, Computer Science <reg><country>Canada</country></reg></affiliation>
                    <email>bauer@uwo.ca</email>
                </author>
                <author>
                    <name>Berberich, Jennifer</name>
                    <affiliation><orgName>University of Western Ontario</orgName>, Computer Science <reg><country>Canada</country></reg></affiliation>
                    <email>jenn_b19@hotmail.com</email>
                </author>
                <author>
                    <name>Radzikowska, Milena</name>
                    <affiliation>Mount Royal College <reg><orgName>Mount Royal University</orgName></reg>, Communications <reg><country>Canada</country></reg></affiliation>
                    <email>mradzikowska@gmail.com</email>
                </author>
                <author>
                    <name>Ruecker, Stan</name>
                    <affiliation><orgName>University of Alberta</orgName>, Humanities Computing <reg><country>Canada</country></reg></affiliation>
                    <email>sruecker@ualberta.ca</email>
                </author>
                <author>
                    <name>Yung, Terence</name>
                    <affiliation>Mount Royal College <reg><orgName>Mount Royal University</orgName></reg>, Communications <reg><country>Canada</country></reg></affiliation>
                    <email>terence.yung@zerom3.com</email>
                </author>
            </titleStmt>
            <publicationStmt>
                <publisher>Centre for Computing in the Humanities, King's College London</publisher>
                <address>
                    <addrLine>Strand, London WC2R 2LS, England, United Kingdom. Tel:+44 (0) 20 7836 5454</addrLine>
                    <addrLine>http://www.kcl.ac.uk/cch/</addrLine>
                </address>
            </publicationStmt>
            <sourceDesc>
                <p>No source: created in electronic format.</p>
            </sourceDesc>
        </fileDesc>
        <revisionDesc>
            <change>
                <date>2010-04-26</date>
                <name>AC</name>
                <desc>CCHLite encoding</desc>
            </change>
        </revisionDesc>
    </teiHeader>
    <text type="paper">
        <body>
            <div>
                <p>In the past quarter century, established methods of literary history have been
                    severely contested. On the one hand, syncretic, single-author histories have
                    become problematic as a result of a combination of the expanded literary canon
                    and a range of theoretical challenges. On the other, a demand for historicized
                    overviews that reflect the radical recent reshaping in all fields of literary
                    study has produced large numbers of both collectively written histories and
                    encyclopedias or companions. Literary history thus tends towards compilations in
                    which specialists treat their particular fields, at the cost of integration or
                    of coherence. Meanwhile, the primary materials are increasingly available in
                    digital form, and literary historical scholarship itself is increasingly
                    produced digitally, whether as versions of established forms such as journal
                    articles, or in resources that invoke the potential for new kinds of analysis.
                    Major digital initiatives over the past decades have focused almost exclusively
                    on digital resource creation: the increasingly pressing question is how to use
                    this expanding body of materials to its fullest potential.</p>
                <p>In this project, we investigate how literary historical analysis can be extended
                    using various forms of visualization, using the experimental Orlando Project as
                    our test bed. <hi rend="italic">Orlando: Women's Writing in the British Isles
                        from the Beginnings to the Present</hi> is recognized as the most extensive
                    and detailed resource in its field and as a model for innovative scholarly
                    resources. Composed of 1,200 critical biographies plus contextual and
                    bibliographical materials, it is extensively encoded using an interpretive
                    Extensible Markup Language (XML) tagset with more than 250 tags for everything
                    from cultural influences, to relations with publishers, or use of genre or
                    dialect. The content and the markup together provide a unique representation of
                    a complex set of interrelations of people, texts, and contexts. These
                    interrelations and their development through time are at the heart of literary
                    inquiry, and having those relations embedded in the markup, and hence
                    processable by computer, offers the opportunity to develop new forms of inquiry
                    into, and representations of, literary history. Such new opportunities of scale
                    are often invoked using Greg Crane’s seminal question, “What can you do with a
                    million books?” (2006).</p>
                <p>We need to be able to ask big, complex questions while remaining grounded in
                    particularities, and we need new ways of representing answers to those
                    questions. This requires new tools for scholarly research that can access,
                    investigate, and present new aspects of the human story and history. In this
                    context, we contend that the scholarly interface requires not only
                    experimentation but also careful assessment to see what works to make digital
                    materials of real value to humanities scholars. As argued by Ramsay (2003),
                    Unsworth (2006), and others, using computers to do literary research can
                    contribute to hermeneutic or interpretive inquiry. Digital humanities research
                    has inherited from computational science a leaning towards systematic knowledge
                    representation. This has proved serviceable in some humanities activities, such
                    as editing, but digital methods have far more to offer the humanities than this.
                    As Drucker and Nowviskie have argued, “The computational processes that serve
                    speculative inquiry must be dynamic and constitutive in their operation, not
                    merely procedural and mechanistic” (431).</p>
                <p>The <hi rend="italic">Orlando</hi> encoding system, devised for digital rather
                    than print textuality, facilitates collaboratively-authored research structured
                    according to consistent principles. The encoding creates a degree of
                    cross-referencing and textual inter-relation impossible with print
                    scholarship—not simply hyperlinking but relating separate sections of scholarly
                    text in ways unforeseen even by the authors of the sections. It represents a new
                    approach to the integration of scholarly discourse, one which allows the
                    integrating components to operate in conjunction with, rather than in opposition
                    to, historical specificity and detail (Brown et al. 2006c). However, the
                    search-and-retrieval model of the current interface for <hi rend="italic"
                        >Orlando</hi>, while user-friendly in that it resembles first-generation
                    online research tools, cannot exploit this encoding to the fullest. Search
                    interfaces only find what the user asks for, whereas visualization enables
                    exploration and discovery of patterns and relationships that one might not be
                    able to search for.</p>
                <p>For instance, the current interface permits users to search for authors by the
                    number of children they had, and thus to explore the relationship between
                    literary production and reproduction. The quantity of material in <hi
                        rend="italic">Orlando</hi> makes it difficult to see overall patterns amongst
                    the results. Recent experiments with the Mandala browser have demonstrated that
                    visualization permits one to see both interesting anomalies (e.g. in lives which
                    have demanded the use both of the childlessness tag and the children tag), and
                    larger patterns, such as the non-correlation between high literary productivity
                    and childlessness or small family size (Brown et al. 2008). These preliminary
                    investigations confirm Moretti’s argument (2005) that visual representations
                    enable kinds of literary historical inquiry that are not supported by
                    conventional search interfaces. <hi rend="italic">Orlando</hi> has the added
                    advantage of making it possible to dive back into the source material to see the
                    specifics from which the representation is produced. </p>
                <p>In addition to the Mandala experiments, we have also been working on a set of
                    designs for visually summarizing relationships in a manner that allows
                    interactive exploration (e.g. Fig. 1). Building on the large body of previous
                    literature in network visualization (e.g. Barabási 2002; Watts 2003; Christakis
                    and Fowler 2009), we are experimenting with new visual representations for
                    networks of people.</p>
                <p>
                    <figure>
                        <head>Fig. 1: One of several concepts for summarizing relationships among
                            authors in Orlando. Here the authors on the path of connection are shown
                            as coloured circles, where size is frequency and a unique colour is
                            assigned to each author.</head>
                        <graphic url="764_Fig1.jpg" xmt:type="full" rend="left-img"
                            mimeType="image/jpeg"/>
                        </figure>
                </p>
                <p>Interviews and observations of users at a recent hackfest provided some excellent
                    insights into the sorts of interfaces that are likely to appeal to users wanting
                    to explore embedded relationships in a body of texts in an open-tended way.
                    While point-to-point visualization was considered to have some value, more
                    excitement was generated by open-ended interface sketches, such as a
                    visualization that resembles a cityscape, even when these were much less
                    representational than conventional interfaces for the humanities.</p>
                <p>At the same time, Brown and Bauer have been working on a visualization tool that
                    illustrates the challenges facing the project of representing all of <hi
                        rend="italic">Orlando</hi>’s semantically interrelated data through a
                    graphical representation based on nodes and edges. It highlights the difficulty
                    of providing prospect when dealing with a large and complexly structured data
                    set, since the full set of relationships even of a moderate subset of the 1200
                    writers becomes unreadable, with over 16 million edges in the graph. We have
                    done some work to explore algorithms and interfaces to accommodate these large
                    data spaces and multitudes of relationships and tags. The challenge is to
                    provide the researcher with a means of perceiving or specifying subsets of data,
                    extracting the relevant information, building the nodes and edges, and then
                    providing means to navigate the vast number of nodes and edges, especially given
                    the limited amount of space on a computer monitor. The figures below illustrate
                    some of the aspects of the tool.</p>
                <p>
                    <figure>
                        <head>Fig. 2: several writers and their interconnections</head>
                     <graphic url="764_Fig2.jpg" rend="left" mimeType="image/jpeg"/>
                        </figure>
                    <figure>
                        <head>Fig. 3: zoomed view of relationships</head>
                       <graphic url="764_Fig3.jpg" rend="right"  mimeType="image/jpeg"/>
                       </figure>
                </p>
                <p>The nodes (at the centre of the starbursts) represent writers, while the blue
                    dots show other individuals (Fig. 2). The edges are shown as differently colored
                    lines indicating different kinds of relationships as determined by tags
                    (identified in the colored boxes). A researcher can display names, hide certain
                    edges by deselecting tags, and zoom in and move around a large graph of nodes
                    and edges (Fig. 3). The tool is a starting point for evaluating existing
                    computational approaches and graphical displays of relationships as a means of
                    exploring literary questions. It raises exciting questions regarding the
                    integration of data mining approaches with a graphical interface, particularly
                    for scholars suspicious of abstractions. Computationally, the question of how to
                    make such a tool accessible to remote users is a challenge.</p>
                <p>This paper will compare the various approaches to visualizing links that we have
                    employed to date on this data, and reflect on them in relation to both the
                    literature on visualization approaches and our user feedback as a means of
                    advancing our thinking on the challenge of creating interfaces for exploring
                    large numbers of interlinkages within or between humanities resources.</p>
            </div>
        </body>
        <back>
            <div>
                <listBibl>
                    <bibl>
                        <author>Barabási, Albert-Lászlo</author>
                        <date>2002</date>
                        <title level="m">Linked: The New Science of Networks</title>
                        <publisher>Perseus Publishing</publisher>
                        <pubPlace>Cambridge, MA</pubPlace>
                    </bibl>
                    <bibl>
                        <author>Brown, Susan</author>
                        <author>Clements, Patricia</author>
                        <author>Grundy, Isobel</author>
                        <author>Ruecker, Stan</author>
                        <author>Antoniuk, Jeffery</author>
                        <author>Balazs, Sharon</author>
                        <author>Sinclair, Stéfan</author>
                        <author>Patey, Matt</author>
                        <date>2008</date>
                        <title level="a">Beyond Text: Using the Mandala Browser to Explore <hi
                                rend="italic">Orlando</hi></title>
                        <title level="m" type="proceedings">Society for Digital Humanities
                            (SDH/SEMI) Meeting. Congress of the Humanities and Social Sciences
                            Federation of Canada</title>
                        <name type="venue">University of British Columbia</name>
                        <date type="conference">June 2008</date>
                    </bibl>
                    <bibl>
                        <author>Brown, Susan</author>
                        <author>Clements, Patricia</author>
                        <author>Grundy, Isobel</author>
                        <date>2006c</date>
                        <title level="a">Scholarly Introduction</title>
                        <title level="m"><hi rend="italic">Orlando</hi>: Women’s Writing in the
                            British Isles from the Beginnings to the Present</title>
                        <publisher>Cambridge University Press</publisher>
                        <pubPlace>Cambridge</pubPlace>
                        <ptr
                            target="http://orlando.cambridge.org/public/svDocumentation?formname=t&amp;d_id=ABOUTTHEPROJECT"
                        />
                    </bibl>
                    <bibl>
                        <author>Crane, G.</author>
                        <date>2006</date>
                        <title level="a">What do you do with a million books?</title>
                        <title level="j">D-Lib magazine</title>
                        <biblScope type="vol">12</biblScope>
                        <biblScope type="issue">3</biblScope>
                    </bibl>
                    <bibl>
                        <author>Drucker, J.</author>
                        <author>Nowviskie, B.</author>
                        <date>2004</date>
                        <title level="a">Speculative computing: Aesthetic provocations in humanities
                            computing</title>
                        <title level="m">A Companion to Digital Humanities</title>
                        <editor>Schreibman, Susan</editor>
                        <editor>Siemens, Ray</editor>
                        <editor>Unsworth, John</editor>
                        <publisher>Blackwell</publisher>
                        <pubPlace>Oxford</pubPlace>
                        <biblScope type="pp">431-447</biblScope>
                    </bibl>
                    <bibl>
                        <author>Christakis, Nicholas A.</author>
                        <author>Fowler, James H.</author>
                        <date>2009</date>
                        <title level="m">Connected: The Surprising Power of Social Networks and How
                            They Shape Our Lives</title>
                        <publisher>Little, Brown and Company</publisher>
                        <pubPlace>New York</pubPlace>
                    </bibl>
                    <bibl>
                        <author>Moretti, F.</author>
                        <date>2005</date>
                        <title level="m">Graphs, Maps, Trees: Abstract Models for a Literary
                            History</title>
                        <publisher>Verso</publisher>
                        <pubPlace>London</pubPlace>
                    </bibl>
                    <bibl>
                        <author>Ramsay, Stephen</author>
                        <date>2003</date>
                        <title level="a">Toward an Algorithmic Criticism</title>
                        <title level="j">Literary and Linguistic Computing</title>
                        <biblScope type="vol">18</biblScope>
                        <biblScope type="issue">2</biblScope>
                    </bibl>
                    <bibl>
                        <author>Unsworth, John</author>
                        <date>2005</date>
                        <title level="a">New methods for humanities research</title>
                        <title level="m" type="lecture">Lyman Award Lecture</title>
                        <name type="venue">National Humanities Center Research Triangle Park,
                            NC</name>
                        <date type="conference">11 November 2005</date>
                        <ptr target="http://www3.isrl.illinois.edu/~unsworth/lyman.htm"/>
                    </bibl>
                    <bibl>
                        <author>Watts, Duncan J.</author>
                        <date>2003</date>
                        <title level="m">Six Degrees: The Science of a Connected Age</title>
                        <publisher>W. W. Norton &amp; Company</publisher>
                        <pubPlace>New York</pubPlace>
                    </bibl>
                </listBibl>
            </div>
        </back>
    </text>
</TEI>
