Hypermedia as Integration: Recollections, Reflections and Exhortations

Keynote Address, Hypertext '96 Conference
Washington, DC, March 20, 1996

Randall H. Trigg

Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94306
Copyright © 1996 Xerox Corporation

Click on linked thumbnail images to see full-size slides together with text.

Introduction

It's truly an honor to be invited to give this keynote. Even though my involvement in hypermedia has ebbed and flowed over the years, I've always felt an ongoing commonality with this community and its goals. To give you one example, ever since the 1987 Chapel Hill conference I've been impressed and pleased by the continuing active participation and interchange of two groups Frank Halasz called the "engineers" and the "literati." This is not to say that we've always completely understood each other, but over the years, we've pushed and pulled across that "boundary" - strengthening and deepening the work of both sides. A brilliant example is Jim Rosenberg's groundbreaking theoretical work on structuring and composites, which is in part informed by the innovative structures of hypertext fiction and poetry, like his own Intergrams.[1] This bridging has always felt wonderfully appropriate for a field whose primary concern is linking. So there's a first sense of integration that we've been involved in for years.

I'll start with a brief and "coarse" history of hypermedia systems. This will point to how I'm going to be using the term "integration," and set up a very current example to show just how controversial and political these issues can get.
Then I'll present a brief "guided tour" through the hypermedia systems I've been associated with over the years. This will get my biases and prejudices about integration on the table and into a historical context.
Next, in the "reflections" part of the talk, I'll revisit Frank Halasz's landmark keynote at the HT'91 conference. Those of you who were there remember how aptly he captured the state of the field. I think you'll also see how prescient that talk was, and how relevant the issues continue to be today.
Finally, I'll issue a few exhortations - some integration perspectives we ought to be pursuing today.

Here is a collection of some of the senses of integration that have played a key role in our work at different times and in different ways. Some of these have received lots of attention, others maybe need a bit more. (For example, I've had some fun chats with Cathy Marshall about linking across the offline/online boundary, an area we might consider investigating deeper.) I'm sure you can add your own senses of integration to this list.

To get us started, here's a rashly oversimplified but, I hope, familiar history of our field, or at least the system-building part of it. Rather than looking at the full richness of the different "generations" of hypermedia as others have done, I'm focusing on these "eras" as indicative of which integration goals were predominant.

So, for instance, I'm starting the first era at the time working hypermedia systems appeared on the scene in the 60's with NLS/Augment being the outstanding example. Sometimes today, we look back on these monolithic systems as antiquated elephants, as biting off more than they could chew, trying to cover all of the user's computing needs under one roof.
But it's important to remember that the goal was the noble one of offering integration to users. Unfortunately, there wasn't much out there in the way of "applications" to link. Was there really a better text editor, or programming environment alternative to that offered by Augment? In addition, there wasn't an infrastructure for connecting separate programs - each hypermedia developer had to design their own.

The second era was marked by the "open systems" movement. Here we acknowledged the existence (and dominance) of powerful specialized application programs. The goal of integration was transformed to one of designing the glue to connect these applications and their documents together.

Today we're seeing the ascendance of other goals of integration, of global distribution, and radical decentralization via the World-Wide Web.
One thing we've learned to do in these (post-modern?) times is to mistrust a sequential, "progress" view of history. History always seems to turn out to be more cyclical upon closer inspection. Indeed, the goals of those early developers are with us still; things have just gotten complicated in some different ways. As a result, we sometimes find ourselves re-learning the same lessons over and over.

Here's the home page for a "feature" of the latest Netscape Navigator web browser, support for email services.[2] Let me read you the opening lines:
"Do I really need another mail server? If you're like most companies you probably have at least one or even several mail systems already. In fact, you probably have at least several different vendors' mail solutions. So why should you purchase a Netscape Mail Server and add another one?"
My questions exactly! Is this a step forward?

Now, we've had support for sending email from web browsers for a while - there's a "mailto" URL scheme that embeds this in HTML link tags. But email browsing and management is a whole different kettle of fish. There are good programs based on years of iterative development. I'm a fan of Eudora myself, which integrates rather well with the web. I can "command-click" on a URL in an email message on my Mac, to cause Netscape to display that page. Sure, Netscape 2.0 lets you render HTML in the message. But unless you think HTML ought to be the replacement for ASCII, this basically saves one key click.

I want to use this example to raise a few hard questions about integration. For example, whose interests are best being served here? Imagine an alternative where Netscape and Qualcomm (producers of Eudora) form an alliance, leaving each responsible for what it does best. After all, we do see such alliances. Has someone at Netscape decided that the battle with Microsoft is better served through "colonizing" more of the desktop, in effect returning to monolithic agendas?
But we need to be very careful about casting stones here. First, I know I'm not alone in worrying about the alternative if Netscape loses the battle with Microsoft. But more importantly, the question of how best to support integration is hard. Clearly the standards and infrastructure have to be in place before smooth "boundary crossings" can succeed. Here's where work done in this community has so much to offer. Witness, for example, the exciting open hypermedia systems workshop that took place earlier at this conference. I only hope the Microsofts and Netscapes of the world are listening.

Recollections

Now I'd like to follow the thread of integration at a more personal level. Some of you may be surprised (and relieved!) to see that Dexter isn't on this timeline.[3] It's not a system, but still had a big influence on my research and turned out to be extremely valuable for work in integration. After all, Dexter itself was an overt attempt to integrate the experiences of the foremost hypermedia systems of the late 1980's.

I've gotten a bit of a reputation for having been the first to get a Ph.D. in hypertext.[4] But some of my colleagues used to correct me, saying I was the first to "get away with it"!
In looking back, I think two integration goals were at the heart of my thesis. One came from having been inspired by Ted Nelson's vision of Xanadu in his classic text, Computer Lib, and by Doug Engelbart's Augment project. I thought what was needed was an approach that integrated both styles of interconnecting information. At the same time, I was intrigued by the "CSCW" side of Vannevar Bush's vision - how scientists and academics distributed across networks could work together, commenting on and critiquing each other's writings.
Though these two endeavors wound up taking up most of my thesis, there was one chapter in between that formed the glue between them, and is the only part of my thesis that I'm still asked about today.

I had decided that hypertext links needed "types" (really "labels") that could distinguish in what way the link was serving either as a traversable connection, a structuring means, or an argument representation. Within each of these, I mapped out a finer-grained semantics. Based on a study of published articles and peer commentaries, here was my proposal for link types in support of critique.
As it turned out, the most important issue raised by my thesis was how a link mechanism could integrate representation of relationships, structure, and traversal.
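For readers who want to see what a typed link might look like as data, here is a minimal sketch. It assumes three broad roles (traversal, structure, argument) with a finer-grained label inside each. The class names, field names, and the labels "part-of," "refutes," and "see-also" are hypothetical illustrations, not the link types proposed in the thesis.

```python
from dataclasses import dataclass
from enum import Enum


class LinkRole(Enum):
    """The three broad ways a link can serve (names are illustrative)."""
    TRAVERSAL = "traversal"
    STRUCTURE = "structure"
    ARGUMENT = "argument"


@dataclass(frozen=True)
class TypedLink:
    source: str
    target: str
    role: LinkRole
    label: str  # finer-grained label within the role (hypothetical values)


links = [
    TypedLink("section-2", "paper", LinkRole.STRUCTURE, "part-of"),
    TypedLink("commentary", "paper", LinkRole.ARGUMENT, "refutes"),
    TypedLink("paper", "related-note", LinkRole.TRAVERSAL, "see-also"),
]


def links_with_role(all_links, role):
    """Return only the links that serve the given role."""
    return [link for link in all_links if link.role is role]


# The same collection of links can be read as an argument map,
# an outline, or a set of paths to follow.
argument_links = links_with_role(links, LinkRole.ARGUMENT)
print([(l.source, l.label, l.target) for l in argument_links])
```

The point of the sketch is only that one mechanism carries all three kinds of information; which labels belong in each role is exactly the open question of standard versus tailorable semantics.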
One more point about my thesis that is especially interesting today: The chapter on link types was required at the end by the outside member of my committee, Dagobert Soergel, a professor from the College of Library and Information Services at the University of Maryland. In those days, "digital libraries" were the focus of study by Dagobert and others, but they had to wait just as long as we for the rest of the world to finally catch on. I'll return to the issue of learning from librarians later on.

Coming to Xerox in 1984 was in many ways like meeting old friends. We hadn't known about each other's work, but NoteCards development had followed many of the same goals.[5] Fileboxes were a link-based hierarchical structuring mechanism, showing that links could marry both structuring styles, though we later came to see that this was a mistake.[6] NoteCards also had link types, though rather than predefined sets for particular applications, there were only a few built in, together with support for user definition. Later when we built the "programmer's interface," functionality could be associated with these link types.

One day we got the idea to treat all the notefiles we could get our hands on as a kind of "data" and gather some statistics. Here is the list of all the link types we found that people had created. It's a fascinating collection - and I'm sorry we didn't look much harder at them while we still had access to their creators and contexts.
"Link types" have come a long way since then, forming a piece of what George Landow called the "rhetoric of hypertext."[7] But the question of tailorable versus standard semantics is still relevant today.

Another study conducted by Melissa Monty revealed important aspects of the way our hypertext interface was engineered.[8] Her warnings against forcing "premature" atomization, filing and the like, led us not only to support different styles, but also to allow for smooth movement between them.

Frank Halasz said in 1991 that "Happiness is Notecards in the rearview mirror." Indeed, his critique of NoteCards identified vital issues for the field (a few of which we'll get to later). But in thinking about the positive side of what we accomplished, I think first of our users, a motley crew that were crucially "out of our hands." Here's an example of how we learned from a user about a rich sense of versioning, what we might call "chronological" integration.[9] This is a tiny piece of a NoteCards user's thesis notefile. When he explained to Peggy Irish and me what he was up to, he started from these two outlines - the "old" and the "new." But those terms had a relative meaning at best. Both outlines had been around for a while and would persist usefully for some time to come. The "transition" phase between them ended up comprising much of his thesis work. Even if we could have provided automated support for "restructuring" his hypertext world, he wouldn't have wanted it. In fact, one of the things he liked best about NoteCards was the way it let him live simultaneously coming out of the old and into the new. Here's hypermedia at its best - overlapping structures supporting multiple views. This suggests defining the goal of integration as supporting work that "rides" some boundary or transition - supporting change, but at its own pace.

Jeremy Roschelle and I worked on a system meant to offer better integration across multiple media, in particular, video.[10] Here on the left, you may be able to see the shaded scroll bar "thumb" that moved down the timeline as the video played. The boxes are what our users called "landmines," link anchors that, when encountered, automatically brought up their destinations, say, a bit of transcript or graphic explanation for a piece of video. The anchors had duration; each window would close as the thumb fell off the other end of its landmine.
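The landmine behavior can be sketched in a few lines. This is not the code of the original system, only an illustration under stated assumptions: each anchor covers a half-open interval of seconds on the video timeline, and the player reports the thumb position at discrete moments. The names Landmine, active, and step, and the sample timings and destinations, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmine:
    start: float       # seconds into the video where the anchor begins
    end: float         # seconds where the anchor ends (exclusive)
    destination: str   # what to show, e.g. a transcript or a graphic


def active(landmines, t):
    """Destinations whose anchors cover playback time t."""
    return {m.destination for m in landmines if m.start <= t < m.end}


def step(landmines, prev_t, t):
    """Compare two thumb positions and report windows to open and close."""
    before = active(landmines, prev_t)
    now = active(landmines, t)
    return sorted(now - before), sorted(before - now)


mines = [
    Landmine(5.0, 12.0, "transcript-1"),
    Landmine(10.0, 20.0, "diagram-A"),
]

# The thumb moves from 4s to 11s: it has entered both anchors.
opened, closed = step(mines, 4.0, 11.0)
print(opened, closed)

# The thumb moves on to 15s: it has fallen off the end of the first anchor.
opened, closed = step(mines, 11.0, 15.0)
print(opened, closed)
```

Because this sketch only compares two sampled positions, an anchor shorter than the sampling interval could be skipped entirely. A real player would need to check the whole span between positions.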

This was a big hit with our users, social scientists using video in their analysis. They loved the ability to link from their videotapes to their descriptions, but hated the text editor we left them in. Ah, back to Era 1 integration. We were in an Allegro Lisp environment on the Macintosh which had an emacs-like editor with some minimal text formatting. But unlike the early days, there were better editors around. In particular, our users were long-time MS Word addicts. Couldn't we please link the video to MS Word?

Like any self-respecting hypermedia researcher receiving a challenge like that, I ran away. In fact, I left the country. But of course, there is no escape. There on the other side of the ocean in a city called Aarhus, Kaj Grønbæk and his colleagues were running into the same problem. This time we agreed to take it up as a central driving design goal (as did the Microcosm folks at Southampton, among others). Kaj's studies of engineering work at Great Belt had shown an overriding need for linking across diverse third-party applications as well as a need for support for multiple platforms.[11] Integration agendas were the order of the day. We chose Dexter as the basis of our framework,[12] but that's another story.

Recently along with many of the folks in this room, we've begun worrying about WWW integration and all the infrastructural, networked, decentralized issues it raises. I think we have a ways to go before we get deep integration that will let us "ride" the boundary between local and world-wide hypertexts.[13]

Reflections

Some of you remember two earlier conferences at which Frank Halasz voiced some strong, provocative closing words, in 1987 where the result was his "Seven Issues" paper, and in 1991 when he revisited that analysis.[14] Who knows, perhaps we should revisit Frank's issues again in five years to see what they might have to say about the next millennium.

Now it's incredibly hard to refrain from talking about every one of Frank's issues. Looking back in preparing this talk, they each spoke volumes in today's WWW world. Realizing that not all of you have those b&w copies of his slides handed out at the end of HT'91, I got Frank's approval to put them up on the web, along with a transcript of the talk made by John Leggett and Cindy Kunz at Texas A&M University (http://www.parc.xerox.com/halasz-keynote).
I can only look briefly at three of Frank's issues here. Please do check out the rest yourself.

The first issue that jumped out at me was "Very large hypertexts," remembering that the WWW hadn't taken off then. Here, Frank named the "barriers" to hypertexts bigger than 10K nodes. All quite reasonable issues when you look back at them. So what happened? People have big disorientation problems on the web. Until very recently (with WYSIWYG editors like Adobe PageMill), document input and link creation were painful. Privacy has only recently been addressed. Heterogeneity amounts to HTML plus a few image formats. Could it be that the success of the WWW is due to the resolution of the other two problems - addressing the problem of scale by means of the already installed base of internet users, and the resolving of LAN/WAN issues by enabling TCP at the personal workstation? Judge for yourself. I suspect that Tim Berners-Lee's focus on a particular group of users and their needs helped a lot - as did some adept politicking on his part. And let's give credit to the power of links even as impoverished as embedded "goto's" are. WWW as distributed HyperCard?

In 1991, Frank was preoccupied with the "tyranny of links" as he called it. He urged us to look hard at alternatives to linking. The way things turned out here is also fascinating. The web is certainly full of links, but there has been progress on a couple of the important alternatives, serious search & query support, and computed links (via CGI scripts).

No, the problem isn't that we're preoccupied with links, it's that we have lost track of structure! HTML is flat inside pages, and across pages there are no explicit representations of higher order structures. Fortunately there's hope on the horizon with projects like Hyper-G and DynaWeb's attempts to bring SGML to the web. It's ironic - Frank probably never expected the WWW to take up computed links and virtual structures, but leave behind composites!

Here's a quick example of what's missing. How often have you done a search for a home page for someone, and had to wade through pages of other hits (depending on how popular the person is, or how common their name)? Our search engines could easily retrieve home pages, if they were only marked as such. We'll come back to this need for cataloging in just a moment.
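To see how little it would take, here is a minimal sketch of a search that uses such a mark. It assumes each indexed page carries an optional "kind" field that a cataloger or author has filled in. The field name, the value "home-page," the sample pages, and the search function are all hypothetical and do not describe any existing search engine.

```python
# A toy index: only the first page has been marked as a home page.
pages = [
    {"url": "http://example.org/~jsmith/",
     "title": "Jane Smith", "kind": "home-page"},
    {"url": "http://example.org/papers/smith96.html",
     "title": "A paper by Jane Smith", "kind": None},
    {"url": "http://example.org/minutes.html",
     "title": "Minutes (attendees: Jane Smith)", "kind": None},
]


def search(index, name, kind=None):
    """Return URLs whose title mentions name, optionally limited by kind."""
    hits = [p for p in index if name.lower() in p["title"].lower()]
    if kind is not None:
        hits = [p for p in hits if p["kind"] == kind]
    return [p["url"] for p in hits]


# Without the mark, every mention of the name comes back.
print(search(pages, "jane smith"))

# With the mark, the search can go straight to the home page.
print(search(pages, "jane smith", kind="home-page"))
```

The filtering is trivial. The hard part, as with any cataloging, is getting the mark applied consistently in the first place.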

The example with search is just one place where structure is crucial. Here I've tried to depict another sense of integration - this time the boundary we want to "ride" is between degrees of structuredness (i.e. between the elements of this table). Each of the kinds of navigation aids so well studied in our field ought to have both kinds of instantiations. I look forward to the day when structure available, say on a local host maintained by industrious webmasters, can smoothly and transparently be made visible to a web visitor and just as smoothly degrade gracefully when she leaves for the "wide open spaces."
In short, structure may be the crucial distinction between our view of hypermedia and the WWW's.

Exhortations

Alright, here's the moment where you get to ask what in the world all this history and looking back can say for what we should be doing now. Certainly, we need to bring structure to the web. Keith Instone's workshop on hypermedia research and the WWW at this conference is a first step in that direction.

What else? Well, here are three of my personal exhortations.

First, just to recapitulate, I think we need to look back at our own field for ideas that could influence current developments. Bush, Nelson, and Engelbart have gotten well-deserved attention over the years, but there are other long-lasting and significant projects in our field. To pick just one example, I'd recommend ZOG/KMS - many of their earlier ideas including their provocative UI perspectives are quite relevant today.[15]

Second, I recommend doing more of what we've long shown a talent for - looking to other fields for inspiration. Here, I will plug the cross-fertilization that is happening with digital library work.

Finally, I want to exhort you toward a somewhat different kind of integration involving community building.

The great thing about the digital library craze is how much we're learning from librarians, not just how much we can teach them about technology. Let's take a look at one kind of library work, cataloging.

Here, I'm drawing on recent work by David Levy, a colleague at Xerox PARC and a name known to some of you from the Digital Libraries conferences. He has been studying cataloging in a particularly engaged way. His sources include histories, textbooks, and ongoing "live" debates. From these, he's starting to develop recommendations for the value and potential role cataloging could have on the web.[16]

David characterizes the work of catalogers generally as maintaining stability and controlling variability.[17] Moreover, they participate in defining the bounds of variability. Better understandings of cataloging will surely have implications for the development of URNs, URCs and the like. But I want to pick up another of David's threads, what the catalogers call "finding aids."

Long used by the archival community, these remind me a bit of hypermedia "guided tours."[18] Used to describe large stable document collections, they have both a structured and a narrative quality. A researcher at Berkeley, David Pitti, has designed a structured SGML representation for finding aids which he is proposing be adopted as a web standard.

Here, have a look at Pitti's web pages. One of the nice features is that his structures can be browsed with standard non-SGML enhanced web browsers, thanks to on-the-fly translation by DynaWeb. As I learned from the inserts in our conference packet, our own Steve DeRose deserves much of the credit here.

There are many archives in several institutions which have been cataloged with Pitti's finding aids. This one is housed at Berkeley, the records of the Nicaraguan Information Center which bequeathed all its documents to the university library in 1991 when the center closed its doors. Here's the high level structure which includes narrative descriptions of content as well as details on the physical holdings. For example, I learned that this archive occupies some 235 linear feet.

Here is a view of "Series 5," a bunch of containers holding the center's news clipping collection. Notice the structure labeled "where you are" captured at the top of the page.

If you do a search, the hits are represented "in context" very much in the SuperBook style. For example, there are three hits on "contragate" in Series 5. Clicking on the link shows us pages marked up with hits, again in context. The feeling is one of smooth transparent access - one does familiar sorts of searches, but structure is a resource and is represented as appropriate.
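The idea of reporting hits "in context" can be sketched with a small nested structure. This is an illustration only, assuming a finding aid is a tree of titled parts with some descriptive text. The collection, series, and container names below are invented, and the function find_in_context is hypothetical; it is not how DynaWeb or Pitti's SGML encoding works.

```python
def find_in_context(node, term, path=()):
    """Return each hit as the chain of titles leading down to it."""
    here = path + (node["title"],)
    hits = []
    if term.lower() in node.get("text", "").lower():
        hits.append(" > ".join(here))
    for child in node.get("children", []):
        hits.extend(find_in_context(child, term, here))
    return hits


# An invented finding aid: a collection, its series, and their containers.
aid = {
    "title": "Example collection",
    "text": "Records of a hypothetical organization.",
    "children": [
        {"title": "Series 1: Correspondence", "text": "Letters, 1980-1985."},
        {"title": "Series 2: Clippings", "text": "News clippings.",
         "children": [
             {"title": "Container 1", "text": "Clippings on elections."},
             {"title": "Container 2",
              "text": "Clippings on elections and trade."},
         ]},
    ],
}

for hit in find_in_context(aid, "elections"):
    print(hit)
```

Each hit comes back with its own "where you are" trail, so the searcher sees not just that something matched but where it sits in the collection.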

Notice the way Finding Aids provide links across the offline/online boundary, a kind of integration we haven't perhaps focused so much on.
Pitti's work is also making the knowledge and expertise of the archival community available for possible application on the web. I look forward to exploring "finding aids" as a means of structuring complex collections of evolving digitized and offline material.

Now I want to change gears slightly to look more at human activity and what hypermedia might have to do with that. (That is, human activity outside the use of technology.) On the web, one often hears a distinction made between two perspectives. The predominant one is an "information" perspective where activities include advertising, "information seeking," and "content creation." (I love the term "content," as though people's authoring is a matter of filling containers.) A less prevalent perspective is that of "communication," though of course, email is still a large part of internet use. Here one begins to see activities that are associated with the forming of communities, like collaborating, mobilizing and volunteering. My question is what role hypermedia is having and might have in supporting those activities.

I'm struck in particular by the interplay here between the functions of email and what we could call evolving shared structures (i.e. inter-linked pages). Links bring something new to the table, and I bet one could learn a lot by comparing the formations of internet communities before and after the WWW. Granted, shared WWW structures are inadequate for true collaborative work, but things were much worse in the days when we only had ftp. We could call this kind of link-augmented communicating, "networking by networking."

Let me give you one of my favorite examples of this kind of networking on the web. Jervay Place is a public housing project in Wilmington, NC that a couple of years ago was being threatened with demolition. At that time, half the housing units were boarded up - the other half were almost all households headed by single African-American women. A group of these women formed what they called the Jervay Task Force, to start negotiations with the local and federal housing authorities hoping to gain influence over what would succeed Jervay. The task force argued that they should be given an opportunity to live in the new buildings that would succeed Jervay Place, and furthermore that they should have a role in planning and even designing these buildings. Lacking expertise in architecture and community planning, they went to a local public access computer which was on the internet and which they had begun to learn their way around. They started with an email appeal for help to several Usenet lists covering alternative community housing. Lots of positive responses came back and a few architects outright volunteered their time. The task force women then sent the city's plans (which had been a struggle to procure) to these architects who commented on and critiqued them, proposing some alternatives. The task force then sat down at a meeting with the housing authority, with the experts' proposals in hand. This gave them a kind of credibility at the table they otherwise wouldn't have had. This enlisting of participation at a distance played a role in the success of their interactions.

Meanwhile the task force had gotten hold of some disk space on the city's host machine and put up web pages to keep their newfound supporters informed of the progress of the negotiations. Over the last nine months, a wealth of materials has appeared there including reports like this one of another meeting.

And pictures like these letting the "community" of supporters get a better sense for Jervay and its residents.
Though the internet part of this story started with an email call for help, the evolving network of web pages was instrumental in forming a lasting presence that could sustain existing connections. In other cases of using the web for forming social communities, things might be ordered differently, say, starting with web pages and then looking for input and volunteers to help "fill the gaps." First links, then email.
I leave it to you to count the kinds of "integration" happening in these examples. And I encourage us all to think about hypermedia's role in this, the underreported (and undervalued) side of the internet.

As Cathy Marshall said earlier this week, we're on the web alright, but we're still very much "in the box." Perhaps broadening our perspective from linking to integration in all its rich variety can pull us now and then out of the box and into the world.

One last shameless plug: If you're interested in learning more about "communication" perspectives on the web (and many other important issues), check out Computer Professionals for Social Responsibility, an organization that offers an alternative point of view to those of the sometimes overwhelming commercial interests that surround us and our work.

Last modified: Fri Feb 6 17:40:32 1998
Randy Trigg trigg@workpractice.com