The fourth paradigm of science brings with it an onslaught of data: quantitative, qualitative, direct and anecdotal. It is often acknowledged that the ability to collect and share vast quantities of data is the greatest change in scientific research of our times. With this new opportunity come inherent challenges in comprehending that data.
Enter data visualisation. For the purposes of this blog, data visualisation is defined as the dynamic manipulation of data to aid understanding and provide context: visualisations that are formed on demand from criteria set by the user.
This is distinct from the many online tools, available for free or on subscription, that take inputted data and create static graphs, charts, storyboards and timelines for publication. To narrow the focus further, it also excludes tools that are interactive but limited to exploring a single, fixed type of visualisation, such as an interactive map.
In academic publishing we are only just setting out on the dynamic data visualisation journey. There are some great examples of visualisations that provide context and clarity to the exploration of datasets. Among them is SchoolDash, created by Timo Hannay, the founder of Digital Science, which provides maps, dashboards, statistics and analysis of schools data in England for the use of the public, journalists, policymakers and the schools themselves.
Also notable is a recent project completed by Semantico for McGraw-Hill Education, in which a dynamic data visualisation tool, DataVis Material Properties, was added to their Access Engineering database. It gives students and researchers alike configurable visualisations of material properties data, including cost, providing instant visual context. With the functionality to save multiple visualisations, and to ‘dig down’ into more detailed information, the tool helps students, educators and researchers tell stories with the visualisations.
The potential applications of tools of this type are myriad. In his excellent 2010 TED talk, The Beauty of Data Visualisation, David McCandless quoted research by Tor Norretranders finding that the bandwidth of our visual sense is equivalent to that of a computer network (for comparison, our sense of taste has the throughput of a pocket calculator), and that data visualisations take advantage of our brains’ capacity for spotting patterns and making connections. Using dynamic data visualisations for databases and datasets could and should become a mainstream aspect of publishing research in the future.
By simply concluding, “Data visualisations are brilliant! We should have more!”, we would be missing the real untapped potential of dynamic data visualisation. As it stands, visualisations are taken from cleaned, and therefore closed off, datasets. Imagine, then, if visualisations could be made from the vast raw datasets languishing in data dumps. Imagine if, instead of being neatly fenced off as an analysis and reporting tool, visualisation were fed by data scraped from raw datasets as an integral part of the research process. Indeed, in an ideal world, imagine if these datasets could be stitched together. This would free the data from the bounds of perspective and ideology.
Scaling back these heady imaginings to what is possible within current data quality conventions, applying data visualisations to raw data is an opportunity that, as it stands, we are missing. Dynamic visualisations, taken from raw data and output as HTML, are achievable. These would answer a number of needs: the need for rapid availability of data results, the need for further context, and the need for the research output of one scholarly discipline to be available and understandable to others.
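To make that claim a little more concrete, here is a minimal sketch of the idea in Python, using only the standard library. It is an illustration rather than a description of any tool mentioned in this post: the function names (`load_rows`, `aggregate`, `render_html`), the field names and the figures in the sample data are all invented for the example. It reads raw records, applies criteria chosen by the user, and writes the result out as an HTML page containing a simple bar chart.

```python
import csv
import io
from collections import defaultdict
from html import escape

# Invented sample data standing in for a raw data dump (figures are made up).
RAW_CSV = """material,category,cost_per_kg
Steel,Metal,0.8
Aluminium,Metal,2.1
Titanium,Metal,11.0
Nylon,Polymer,3.2
PTFE,Polymer,14.5
"""


def load_rows(text):
    """Parse raw CSV text into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def aggregate(rows, group_by, value_field, where=None):
    """Average value_field per group, keeping only rows that pass `where`."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [running sum, count]
    for row in rows:
        if where is not None and not where(row):
            continue
        bucket = totals[row[group_by]]
        bucket[0] += float(row[value_field])
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def render_html(series, title, bar_height=24, max_width=300):
    """Render a {label: value} mapping as a horizontal bar chart in inline SVG."""
    peak = max(series.values(), default=0) or 1  # avoid dividing by zero
    bars = []
    for i, (label, value) in enumerate(sorted(series.items())):
        y = i * (bar_height + 6)
        width = max_width * value / peak
        bars.append(
            f'<rect x="110" y="{y}" width="{width:.1f}" '
            f'height="{bar_height}" fill="steelblue"/>'
            f'<text x="0" y="{y + 17}">{escape(label)}</text>'
        )
    height = len(series) * (bar_height + 6)
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{escape(title)}</title></head><body>"
        f"<h1>{escape(title)}</h1>"
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{max_width + 120}" height="{height}">'
        + "".join(bars)
        + "</svg></body></html>"
    )


rows = load_rows(RAW_CSV)
# The "criteria set by the user": what to group by, and which rows to keep.
series = aggregate(
    rows,
    group_by="category",
    value_field="cost_per_kg",
    where=lambda r: float(r["cost_per_kg"]) < 12,
)
page = render_html(series, "Average cost per kg by category")
```

Changing the `group_by` field or the `where` filter produces a different chart from the same raw records, which is the sense of "dynamic" used in this post. A production tool would of course add data quality checks and a proper charting layer on top.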
There are barriers to this. Data quality is imperative, as of course is the need for raw data to be reviewable. Perhaps the main barrier for publishers, though, is our continued attachment to the print mental model. The main method of publishing research is the journal article: submitted online, reviewed online, produced online and published online, as a PDF. A flat online facsimile of the printed journal article.
Dynamic data visualisation is the scholarly, educational and, indeed, information publishing opportunity of our times. Integrated into publishing workflows and online publishing platforms, it could springboard the effectiveness and usability of information to the next level.