OPINION | The Necessity of Open-Access Data in the Digital Future

Unprecedented times call for unprecedented measures.  While it is surely an overused and somewhat bastardized expression, it still holds a certain poetic appeal, one that unfortunately tends to evoke dystopian images of Machiavellian governmental overreach, particularly in this era of constantly changing COVID protocols and lockdowns.  And though there is perhaps merit to that association, I would rather use the expression here as a kind of antithesis to those dystopian imaginings and as a ‘call to arms,’ so to speak, for those who recognize that the COVID-19 pandemic has simply set a new precedent for the challenges the world as a whole will face in the very near future (pick your poison: other pandemics, epidemics, climate catastrophe, mass migrations, and so on).

In these battles to come, the best means by which the populace can defend itself is the same as it has always been: collaborative effort.  In that spirit, I argue that a pertinent example can be found in the sharing of data resources among those on the front lines of the scientific community who are trying to contain the spread.  To be forthright, this article is not operating under a rose-colored delusion that the world has figuratively ‘come together’ in any meaningful way to defeat the common enemy of a virus.  I would argue, however, that a critical step is being taken in the ways data resources are shared to expedite the scientific response to the COVID-19 pandemic.  Furthermore, I believe not only that this step could permanently alter our global trajectory, but that it must find solid ground if we wish to withstand the tides to come.

Changing Perceptions of Open Data

This figurative step can be found in nearly any corner of the digital sphere that deals with data analysis in the COVID-19 era.  Personally, I first came across it while using Tableau for data visualization (a relatively strong and intuitive tool for those who struggle on this front).  It is important to know in this context that Tableau is proprietary software that one has to pay to use, and that it is predominantly oriented toward corporate use (though there is a lesser-known variant called Tableau Public that is mostly free).  I bring it up here because, following the emergence of COVID-19 on the global stage, Tableau used its own data engineering to create the COVID-19 Data Resource Hub and made it accessible on Tableau Public and on online catalog platforms such as data.world.  In essence, this resource made the data on the spread of COVID-19 maintained by Johns Hopkins University’s Center for Systems Science and Engineering far more accessible and usable for those working in data science.

Though this example might seem somewhat trivial and maybe even superficial, it is still fairly indicative of a more evolved way of thinking in light of the COVID-19 pandemic.  Further, it is simply one example.  Others can be found in the United States National Institutes of Health’s Open Access Data and Computational Resources to Address COVID-19, in an endorsement by UNESCO, indirectly in the Centers for Disease Control and Prevention’s provision of journals and databases, and in Canada through the privately owned company Esri (which also uses official data collected by the Canadian government).

These examples run counter to the predominant model of data management in the contemporary digital climate, which Shoshana Zuboff calls surveillance capitalism: private corporations (such as Facebook and Google) harvest users’ computer-mediated interactions and aggregate them into a product that they privately own and sell (predominantly to advertisers).  And while user data from social media platforms is not directly comparable to some of the data discussed here, it is worth mentioning because the mindset behind this form of data management is being directly challenged, and the implications of that challenge are incredibly significant.

The Use of Data in Responding to COVID-19

Open access to scientific data has been seen as so valuable in the fight against COVID-19 because open-source, collaborative scientific research is recognized as leading to more usable, more transparent, and higher-quality research.  Further, it vastly expedites scientific progress, and in a time of what could effectively be considered a global shutdown, starting things back up as quickly as possible is considered far more important than proprietary ownership of private datasets; at least it was in some of the aforementioned cases.

Data in this context has a plethora of use cases in stemming the COVID-19 crisis; essentially any scientific community supported by digital technologies can find value in it.  One of the more prominent use cases is the application of artificial intelligence and machine learning algorithms to automated COVID-19 detection from computed tomography (CT) scans.  Additionally, open data has proven invaluable for mathematicians and epidemiologists who are still developing virus-diffusion models for a variety of social-distancing scenarios.  Finally, it has been considered essential for the development of contact-tracing technologies.

And while these are just a few samples, they effectively illustrate the need for open data in the fight against COVID-19.  In fact, global research communities have been clamoring for quality datasets since the onset of the pandemic.  The organization Open Data Watch is quick to point out, however, that many issues still need to be addressed; indeed, they highlight that data has been hidden, manipulated, and deleted to serve different political agendas.  They have therefore astutely made the case for interoperability standards in the provision of open datasets, meaning an agreed-upon set of rules and guidelines for publishing accessible data.  The hope is that these standards would prevent issues such as inconsistent quality, data duplication, and the inability to compare data across systems.
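
For readers who work with data, here is a minimal sketch of what such a standard buys in practice.  It is only an illustration under invented assumptions: the two source layouts, the field names, and the shared schema (country code, ISO 8601 date, cumulative confirmed cases) are all hypothetical and are not drawn from Open Data Watch or from any real dataset.  Once both sources are mapped to one agreed-upon shape, duplicate records collapse and disagreements between publishers become visible.

```python
from datetime import datetime

# Hypothetical shared schema, invented for illustration:
# (country code, ISO 8601 date, cumulative confirmed cases)

def from_source_a(row):
    # Invented layout A: {"Country": "CA", "Date": "03/15/2020", "Confirmed": "250"}
    day = datetime.strptime(row["Date"], "%m/%d/%Y").date().isoformat()
    return (row["Country"].upper(), day, int(row["Confirmed"]))

def from_source_b(row):
    # Invented layout B: {"iso": "ca", "day": "2020-03-15", "cases": 250}
    return (row["iso"].upper(), row["day"], int(row["cases"]))

def merge(records):
    # Keep the first value seen for each (country, date) key, so exact
    # duplicates collapse; record the keys where sources disagree.
    merged, conflicts = {}, []
    for country, day, cases in records:
        key = (country, day)
        if key in merged and merged[key] != cases:
            conflicts.append(key)
        merged.setdefault(key, cases)
    return merged, conflicts
```

Without the shared schema, none of these comparisons is possible: the same day in the same country would appear as two unrelated rows with different labels and date formats.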

Efforts have also been made to study how COVID-19 is understood at a more psychological and sociological level.  Natural Language Processing (NLP), text mining, and network analysis have been applied to data gathered through Twitter’s application programming interface (API) to create datasets containing millions of tweets, which can be parsed to determine common themes and thinking patterns; this process is necessary in the fight against COVID-19 misinformation, for instance.

Conclusion

It is for these reasons and more that the idea of open-access data is coming more and more to the forefront of digital consciousness.  It is, however, not at all a new concept: the open-source movement goes back to the earliest days of the internet.  Further, there is an incredibly strong digital community researching ways in which distributed ledger technologies, more commonly referred to as blockchain, can enable digital decentralization and cooperative collaboration in nearly any field facilitated by the internet (which is effectively any field).  The organization Decentralized Science, which received funding from the European Union’s Horizon 2020 research and innovation program, is a relevant example in this context, but it is simply one of many envisioning entirely new systems of collaboration and progress.

And while I have barely scratched the surface of these potential new systems, the point I am trying to make is that these ideas are increasingly entering the mainstream consciousness through the recognition that open-access data is paramount to a global pandemic response.  With all of that said, what needs to be recognized now is that the world will undoubtedly need to respond to an incredible variety of crises in the coming decades; again, pick your poison: climate catastrophe, mass migrations, pandemics and epidemics, and so on.  Ironically, even untethered artificial intelligence merits some concern (though I am certainly in the camp that believes these concerns to be fairly misguided).

Be that as it may, since we know that the challenges facing humanity will continue to come with higher and higher levels of intensity, a level of foresight is required to develop the strongest possible infrastructure for responding to these crises.  With the COVID-19 pandemic as a figurative stepping stone, we also know the value of open-access data as one of the most effective means of expediting scientific progress and response (with the aforementioned caveat from Open Data Watch regarding interoperability standards for this data).  Therefore, it is essential that more and more communities come into the world of open access and decentralized collaboration if we are to properly develop this infrastructure.  Indeed, it is a necessity.

The views, thoughts, and opinions expressed in this article belong solely to the author and do not reflect the views of Conversationally Speaking Magazine.
Jack Smye

Jack Smye holds a master’s degree in Political Economy from Carleton University in Ottawa, Canada.  His research interests have to do with the ways in which distributed ledger technology can transform the contemporary digital climate in a liberative manner.  Particularly, he is interested in the fields of decentralized digital identity management as well as cooperative and democratic governing structures in blockchain-enabled organizations.  Further, he is looking at how these interests will intersect with digital economies, decentralized data management, and developing systems having to do with artificial general intelligence.  Jack is involved with a variety of organizations at the forefront of research in these eclectic fields and he is also a programmer in the area of data science with developing skills in 3D printing.



Categories: Science & Technology
