HighWire at 25: John Sack looks back


Last month saw HighWire’s 25th anniversary, a huge milestone in our history. Founded by Stanford University during the early days of the web, HighWire pioneered the online revolution in scholarly publishing.

Since then, our world has transformed beyond recognition and our industry is facing disruption like never before. In the last year, we’ve all had to come to grips with the “new normal”, exploring new ways of doing, sharing and publishing science and research more rapidly and more collaboratively than ever before. 

In this blog post, HighWire co-founder John Sack gives us some insight into HighWire’s beginnings and how it grew out of Stanford University in the early days of the web.


What was your background pre-HighWire? 

During my graduate work in English at Stanford University, I spent time researching modern poetry the traditional way – flipping through a card catalog. My eyes were opened when I was helped by a public service librarian who duplicated three days of my tedious card catalog work by typing two or three commands into a computer system named “BALLOTS” (Bibliographic Automation of Large-Library Operations on a Time-sharing System), and providing me with a computer-formatted print-out.

At that point I became more interested in what computers could do for researchers than I was in my own research – into how authors ‘stored’ meaning in text so that readers could ‘retrieve’ it – and started studying natural language storage and retrieval systems. One of the first projects I worked on was SPIRES-HEP, a database that has been run by the Stanford Linear Accelerator Center since the late 1960s as a database of particle physics literature. Interestingly, SPIRES-HEP can be considered in some ways the birth of preprints; as the particle beamline was a critical resource, researchers who wished to use it were required to enter their abstracts into SPIRES-HEP to ensure that effort was not duplicated and the unique beamline resource wasted. Through this work on SPIRES I met a great mentor in my life and early computer pioneer Doug Engelbart; when I visited his lab and saw what he was doing (he invented the computer mouse, among other things mentioned below) it completely reset my research focus by inspiring a new understanding of possibilities.

This was in the early/mid-80s, just as Silicon Valley began its early growth phase. Steve Jobs sold the first Apple Macs onto campus, and researchers could print out books or articles using laser printers for the first time – Apple’s ‘LaserWriter’ printer was an eye opener after years of listening to the scratchy sounds of dot-matrix printers, because its output was close enough (at 300 dpi) to the photocopied journal page that research could be transmitted over the network. Universities, publishers, and the research community became more open to rethinking physical vs. digital media. Stanford was putting the library ‘card catalog’ online, one of my first large campus-wide projects. For me, I just happened to be in the right place at the right time, and to have a keen interest in the right space: natural language information retrieval, and workflow transformation. Both of these turned out to be part of putting scholarly communication online.

What is the HighWire origin story?

In the early 90s I began to work with Mike Keller, the co-founder of HighWire and head of Stanford University’s libraries. The ‘serials crisis’ was a growing concern for institutes and libraries, and Mike thought web-based journals might hold a potential key to the challenge. 

Of course, to get things off the ground we needed to work with a journal publisher who was willing to take the risk on a new and broadly untested technology. At its heart, the HighWire origin story is about several non-profit publishers and societies who took the chance and took that risk to explore a completely new model of publishing. The hope was that the web could help to solve some of the problems hindering research: rigid formatting, rising costs, and lack of speed. Of course, from our perspective now, the dial-up internet of the early 90s looks painfully slow! (Our early interfaces to the all-important figures in the literature used three levels of detail, just to conserve bandwidth. But in some fields the figures were the point! For example, in a physics paper, figures were used to convey equations and a paper without its equations was like a sentence without verbs and nouns.) But it was a huge step forward from the traditional way of doing things.  

We worked with students in Stanford’s Computer Science department to partner with the leadership of the Journal of Biological Chemistry, together designing what would become the industry standard for delivering web-based scholarly articles, issues and journals. This led to the birth, 25 years ago (see References), of JBC Online – which we then demoed at the Society’s annual meeting in May 1995.

You mention designing what would become industry-standard. What were the technical decisions made at the time? 

Many of the decisions we made in 1995 – none of which were foregone conclusions by any means – are still industry standards today, although of course the developments in semantic enrichment and discoverability, granularity of content, multimedia and A.I., and analytics have driven forward the way scholarly content is authored, reviewed, managed and disseminated. Our major early decisions included: 

    1. Be web-oriented, not application oriented. This was a big decision then, seemingly obvious now. But most organizations working on online journals were focused on writing applications (for Microsoft-based PCs at the time) rather than the ‘World-Wide Web’ approach. We started right off using the web, because it handled so much for us, and our readers.
    2. Be article-oriented, not page oriented. This was a move away from the page-bound approach of physical printing, and into the article as an “object” in its own right. 
    3. Be article-oriented, not journal oriented. Very few researchers were reading journals from cover-to-cover; it was specifics they were interested in. Today, enhancements in search and discoverability mean that users can drill down to extremely granular levels.   This meant, in particular, that we needed to think of a journal as a database, and that the main user interface would be a search engine, not a “browse issue” page-turning metaphor.   This was obvious in the case of JBC Online, since at the time it published about 100 articles – 1000 pages – a week!  
    4. Use HTML, not images. This was partly down to usability, as loading images took much longer than text. It was a huge step away from the physical print process, and inherent to the text and data mining capabilities which have evolved. But within just a year or two, Adobe’s PDF delivery became an essential adjunct to HTML article delivery.
    5. Make links. We could see that the true power of the web was its ability to connect and link content. This choice has been borne out over time, through the power of Google and the advent of the semantic web, and the ability to link out to related content and enable intuitive and intelligent navigation of content.
    6. Emphasize user-centered design. We wished not to emulate a print journal online (as most other systems at the time were doing), but rather to start with a user-focussed view: What were they trying to accomplish? How could we help them do it?  We took particular interest in the fact that – for research journals at least – the authors and the readers are the same people, writing for each other.
    7. Focus on recent literature, not the archive. The ‘need for speed’ in biomedical research – which was our focus – led us to focus on the recent issues of journals that might be taking months to reach researchers because of physical mail and library shelving times. The value to researchers that would motivate them to try this “new WWW thing” would be in reading material on the web that they could not read any other way.

This was taking a huge step into the unknown for publishers. What was the reception to these initial steps like at the time? 

From the researcher’s point of view, the last point above – about focusing on recent literature – turned out to be part of our huge success at the society annual meeting where we showed the JBC Online for the first time. We had decided to put the five most recent issues of the JBC online. Five issues of the weekly JBC was about 500 articles, around 5,000 pages of research! Frankly, researchers didn’t care about all our fancy technology – they just wanted to do a search for a keyword (e.g., a protein they were studying) and walk away with a printout of the most recent material on their topic. Our booth always had a line of people waiting, and the LaserWriter printer was always completely busy.

Early publisher and editor customers saw a lot of risk in moving to online journals, but they didn’t want to get left behind. There was a lot of FOMO – not that we were calling it that at the time! – and also a drive to be seen as ‘first-movers’ and innovators. Many felt that partnering with a Stanford-backed initiative mitigated the risk, as clearly these guys must know what they were doing! Our early customers were very high status in the academic publishing world (at one point, more than half of the top 200 most frequently cited journals were hosted by HighWire). They had a sense that they were at the leading edge. For us initially, of course we couldn’t know how web-based journals would be received. It’s thanks to those publishers and societies willing to take that risk that we were able to define, refine, revolutionize and standardize the way that scholarly information is published online. 

Why the name?

I came up with the name while sharing a bottle of wine with a friend. She was great fun to brainstorm with, because we could riff and laugh together and generate a lot of ideas very fast (as in true brainstorming) and then filter later. The wine may have helped! 

This was 1994, and things having to do with nets and webs were still new. Newspapers (remember those?) had to explain what the World Wide Web was. So “Wire” suggested that, along with a nod to electronics and communication. The “High” part was to suggest it was a bit of an unknown, an adventure. And the ‘joke’ was that we worked with a net (the circus-act theme was reflected in the names of our first couple of dozen servers; “sideshow” and “clowncar” were the names of two of them!). It was supposed to be fun and witty, rather than profound; whimsy was big in 90s startup culture. Recall that the main search engine of the time was “Yahoo!” and something called “Google” had just launched across the quad from HighWire at Stanford!

One of our earlier depictions of the logo was the name HighWire with a wire strung between the points on the W: the idea was of an electronic pulse such as you’d see on an oscilloscope.

This was “the HighWire act” as depicted by an artist at the time, balancing paper and electronic materials:

Credit: Andrzej Krauze and HMS Beagle

Academic publishing is a complex ecosystem made up of readers, researchers, librarians, institutions, funders, faculty, publishers, technologists. How have you seen the ecosystem evolve over time as business models and user expectations have evolved?

The biggest change in the last two decades – beginning with the introduction of NIH’s PubMed Central – is the intervention of funders in the publishing side of the workflow. Previously funders kept away from that, saying it would be improper to influence where people publish. Now they have different priorities, and they’ve become a hugely powerful player; for instance, making it a condition of grants that outcomes be published in a certain manner, or meeting certain criteria (such as open access).

The web has obviously led to huge collaboration within the ecosystem. Researchers are now all connected, and that’s enormous; there’s now more cross-pollination of knowledge between disciplines, in part because researchers now easily follow hyperlinks across many more journals than the few they would previously have read in their niche fields. The use of social media to promulgate or hype research has been another big change, as has the rise of alternative metrics, which measure attention but are hard to calibrate. The debate over the appropriateness of some uses of metrics – such as the Journal Impact Factor – and the move toward ‘alt metrics’ are both part of an attempt to understand our audiences (both writers and readers).

The speed of interaction amongst parties is obviously much quicker than it was. When we started, and even into the earlier 2000s, most manuscripts were still transmitted in FedEx envelopes. As fast as that was, it was still much slower than the digital workflows we are now accustomed to (an early demonstration of HighWire’s manuscript workflow system was a paper that was submitted, reviewed, accepted and published online in 24 hours by the JBC, taking advantage of its worldwide network of authors, editors, and reviewers). In its turn, that speed has led to further interrogation of barriers within the publishing process and whether they are necessary, or necessary in that particular order: for example, once a publisher has accepted an article, do they need to wait until it’s typeset and made pretty before publishing it? This questioning is what gave rise to publish-ahead-of-print as a (relatively) new workflow model. The American Chemical Society introduced its “ASAP” service – putting a basically-formatted paper online shortly after acceptance – and the JBC soon introduced its model of putting the author’s accepted manuscript online immediately upon acceptance.

Business models were the next big change: publishers working with us came up with the concept of free back issues. The idea came from the editor of PNAS at the time, who suggested HighWire-hosted journals make back-content free to all. In some ways that was a bad idea from a business perspective – it gave away the value of the archive. But it led in some ways to more thinking about open access – the 12-month ‘embargo’ delay in the US open access policy is the same as most HighWire-hosted journals’ free back issue policy – and to questioning why and whom we were charging for access.

Another thing that’s happened is that economies of scale are a huge advantage larger publishers have, so smaller society publishers have a new set of challenges. More and more are now going to commercial publishers to deal with that. Strangely, some of the business models which were intended to drive people toward open access have ended up to the advantage of large publishers, as they have the clout and the resources to do negotiations and sell hundreds or thousands of journals at a time, and as the Big Deal morphs into Read-and-Publish deals. 

What have been the biggest disruptions and innovations you’ve seen in publishing during your time in the industry? 

It goes without saying that the development of the web has made the biggest impact and revolutionized the industry. The development of search engines was hugely important; at first it was Medline in the life sciences via subscription then PubMed on the web, but that soon became open once we began indexing content there. Google indexed all of our content in 2002 and utilized the HighWire tagset to create the backbone of what became Google Scholar, providing researchers a more efficient way to discover information.  

There was a lot of experimentation during the early days of the web; people wanted to try things out, and it was much cheaper to do so online than in print. This led to a huge amount of innovation very quickly. Lately information-presentation innovation feels like it has slowed down, though changes to business models (such as Plan S) continue to drive huge change, and new workflows such as preprints and new technologies such as A.I. have yet to be fully incorporated. Many of the new tools, standards and initiatives that HighWire has helped to develop and champion are designed to reduce ‘friction in the workflow’ (for example, preprints, CASA, and driving discoverability through semantic tagging).

It would be hard in early 2020 not to mention the evolving role of preprint servers in the diffusion of scientific information. Within traditional publishing models, more than a year can elapse between the submission of the latest brilliant discovery and its publication. The practice was for researchers to circulate preprints informally among themselves. Digital has given researchers an extended set of possibilities for preprints, which we’ve seen utilized to their full potential during the COVID-19 pandemic. Suddenly preprints are seeing download rates comparable to the most-read articles in any HighWire journal ever hosted (both medRxiv and bioRxiv have racked up record traffic, with an enormous 40 million unique pageviews between April). Preprints are a point in the workflow of researchers, and there will be a natural evolution to capture, engage, connect and facilitate earlier and later points in the workflow. This will alter the workflow eventually, and potentially the value chain.

What was the most exciting period of your career & why? 

Meeting and being inspired by Doug Engelbart, who was responsible for the creation of the computer mouse, the development of hypertext (along with Ted Nelson), video conferencing, and precursors to graphical user interfaces, all of which he demonstrated in what is now known as “The Mother of All Demos” in 1968. Hypertext in particular was an “a ha” moment, since as a scholar I saw this as a way to implement easy-follow footnotes in scholarly papers. But hypertext became the niche “Gopher” protocol until the invention of the World Wide Web made hypertext part of the structure of writing online documents.

Meeting Doug led to a complete change in my career; I was a grad student in English until meeting him! I realized that exploring ways of organizing and presenting information that could make it useful and actionable was so interesting, and could lead to a complete transformation of the knowledge economy as we knew it. In the late 80s, Mitch Kapor – creator of Lotus 1-2-3 and responsible for the first digital spreadsheets, and a great believer that the industry should honor its forefathers and foremothers – donated (as I recall) $1m to Stanford to set up a lab for Doug, who then became my supervisor; we met weekly to talk about whatever interested him. He was a quiet genius, and a key mentor in my life.

Working with and learning from Don Kennedy, who died earlier this year, was also a huge honor. Don was Stanford University Provost and then President during the early days of my career, and later Editor-in-Chief of Science, so I was able to work with him again. Don had an extraordinary way of mentoring young people, whether they were grad students in one of his courses or young administrators like me at Stanford. A business-decision meeting was like a graduate seminar: the ‘why’ of a decision was as important as the ‘what do we decide’.

What are you most proud of during your time at HighWire? 

Building a start-up within Stanford was an incredible challenge, opportunity and honor. At the time, Yahoo! was just becoming visible; we started at the same time as Google. The exciting thing was completely rethinking how people got access to journals. From our discussions with researchers, we knew that they valued the individual article far more than the journals, which were a package of articles to most readers. We wanted to think outside of the journal as a monolithic print product; this led to us pioneering the idea of the “article economy”, with an article-first approach. 

Of course, this meant that journals then essentially became databases of articles, which didn’t support the way the business model worked. This led to a lot of conversations, challenging debate, and introspection within the community; rethinking the way in which research literature worked was a big opportunity. Acting as a community organizer for this incredibly inventive, creative community of editors and publishing leaders has been the real high point for me, and something of which I remain hugely proud. 

References: 
  1. Sack, John. “HighWire Press: ten years of publisher‐driven innovation.” Learned Publishing 18.2 (2005): 131-142.
  2. Sack, John. “How Herb Tabor’s vision for timely and accessible research led scientific publishing into the online age.” Journal of Biological Chemistry 294.5 (2019): 1721-1728.
