Datapoetic Glossary: an introduction to Datapoiesis Language
This text is part of the Datapoiesis project. During the first Datapoiesis Fall School in November 2019, Urbano Reviglio and Alessandro Longo saw the need to develop this glossary in order to create a common language for the workshop. The work was presented in Ivrea at the end of the Fall School. If you want to know more about Datapoiesis, you can visit the official website. A huge thanks goes to Oriana and Salvatore for letting me be part of this experience and to Urbano for being my companion in it.
Table of contents
0. Preamble: Tackling the Linguistic Explosion
1. Datapoiesis Process Description
2. Main Datapoietic Concepts
2.2 Preliminary Definitions
0. Preamble: Tackling the Linguistic Explosion
Dozens, hundreds, if not thousands of new words and concepts are a testament to our ever-growing symbiotic relationship with technology, data and computation. When a culture integrates something new (a new technology, for example, or an art form), new words enter the language, giving us the vocabulary we need to talk about it. There are many ways of handling this evolution: sometimes a language borrows words from another language, but more often we draw on the resources of our own, coining a portmanteau (a linguistic blend of words) or, more simply, combining two or more words. Some of these concepts have indeed been intuitively integrated into our everyday language.
Yet such a linguistic explosion is disruptive. The consequent chaos is an opportunity for the datapoietic artistic process, opening the possibility for differences and novelties to emerge. Chaos is always also chaosmos: through ambiguity, new visions can blossom. To form, invent and fabricate new concepts and artworks is precisely to work on this chaotic dimension. However, it is also necessary not to let the chaos prevail, in order to avoid both the danger of incomprehensibility and the dominance of technical language.
Therefore, we need to introduce Datapoiesis processes along with problematized (and still preliminary and transitional) keywords. In addition, we aim to create a brief glossary, the first datapoietic Glossary, to present, discuss and, eventually, map the kaleidoscopic constellation of the data realm within the datapoietic creation process. Defying technical terms, we perceived the necessity of a human touch: the data science vocabulary is nuanced with artistic and philosophical sensibility. This datapoietic Glossary, like the environment in which it was developed, has to be thought of as a living entity. The aim is ultimately to develop a common language able to clarify and trigger further discussion, within and outside Datapoiesis, about the way data and computation are changing our environment, social practices and imagination.
1. Datapoiesis Process Description
Datapoiesis uses data and Artificial Intelligence to create objects and experiences that help human beings and their societies to perceive and comprehend the complex phenomena of our globalized world, and to use these understandings to promote positive change.
More concretely, it provides a platform within a broader dynamic ecosystem that helps make visible the unexpected correlations hidden in data (which, by nature, are invisibly produced and apparently noisy), ultimately to create datapoietic objects and experiences. Data is often treated as the new oil. In the current data revolution we are witnessing a datafication of everything, which creates a data overload. While tech companies extract, mine and capture as much data as possible to create value for profit, Datapoiesis strives to explore, collect, analyze and curate data to create meaning (to create value). Metaphorically, while most businesses are likely to see such ‘oil’ as gasoline for energy, Datapoiesis shapes it like plastic to create new meaningful objects, devices and experiences. We don’t want data to be just the new oil, if this implies an extractivist, centralized, rapid and often automated model that doesn’t care about the human value of these resources.
Datapoiesis opposes and resists Data Hubris. Data do not necessarily provide meaningful information or knowledge at all. Data are elements of information consisting of symbols, and information is the result of processing data. Thus, information attributes meaning to data that, in isolation, have none. To understand this, consider the “knowledge pyramid” (see image): data sit at its base, with information above, then knowledge and, finally, wisdom at the top. Data hubris is the belief that the more data one has, the better one’s predictions and decisions will be. Yet more knowledge can reduce the accuracy of predictions of uncertain outcomes while simultaneously increasing confidence in them. Datapoiesis opposes this. Datapoiesis does not merely accumulate data and information in the hope of eventually finding correlations: it aims to turn data and information into knowledge, and then to use this knowledge to create objects and experiences that ultimately lead to wisdom.
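A toy Python sketch can make the step from data to information concrete. The readings, the unit and the context attached to them are all invented for illustration; the point is only that the bare symbols say nothing until they are processed and given context.

```python
from statistics import mean

# Data: bare symbols, meaningless in isolation (values invented for this sketch).
raw_data = ["18.0", "19.0", "21.0", "26.0"]

# Processing: parse the symbols into numbers.
values = [float(symbol) for symbol in raw_data]

# Information: processed data plus context. The context is an assumption;
# the same symbols could just as well be prices or distances.
information = {
    "quantity": "air temperature at noon",
    "unit": "degrees Celsius",
    "mean": mean(values),
    "rising": all(a < b for a, b in zip(values, values[1:])),
}

print(information)
```

The upper layers of the pyramid are deliberately left out: deciding what a rising temperature means, and what to do about it, is not something the code provides.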
To achieve this, Datapoiesis is going to be a participatory, transparent and sustainable platform. Participatory, because we consider users both as data consumers and as data producers; in one word, data prosumers. Datapoiesis considers data a commons. Transparent, in two respects: more generally in our platform governance, and more specifically in our data governance. Sustainable, because we care about our digital carbon footprint: Datapoiesis aims to raise awareness of this invisible but profound negative environmental impact.
To conclude, Datapoiesis explores the unexpressed potential of data, not to trivialize the complexity of human beings and societies but to problematize and nuance it, in order to spark awareness and reflection according to its main principles of transparency, participation and sustainability.
2. Main Datapoietic Concepts:
- Data Hubris
- Data Revolution
- Data Overload
- Data Processing
- Data Collection
- Data Curation
- Data Visualization
- Data Governance
- Data Protection
- Thick Data
- Data Commons
- Data Anonymization
- Digital Carbon Footprint
- Platform Ecosystem
- Data Prosumer
2.2 Preliminary Definitions
- Datapoiesis: Datapoiesis is a portmanteau of Data (English) and -Poiesis (from Ancient Greek Ποίησις), and means “to create something that was not there before.”
- Data: from the Latin datum, which means “given”. Data are, by definition, the quantities, characters, or symbols on which operations are performed by a computer, which may be stored and transmitted in the form of electrical signals and recorded on recording media. Yet data is never given. Critically, the nuances of its ontology need to be steadily problematized, as the notion of data also includes, in particular, its political character (data is power), its social character (data is relational) and its economic character (data redefines the economy). Data is perhaps one of the most ideological concepts of our culture.
- Big Data: One of the most buzzing and puzzling terms of this decade. There are many definitions of big data, such as: “Data that exceeds the processing capacity of conventional database systems” (O’Reilly). Big Data, then, is not only a very large amount of data, but also data that is constantly flowing and changing quickly. It comes in many formats (structured/unstructured, text/image, etc.) and is not always trustworthy. The accumulation of such a quantity of data has already changed human society forever. In Datapoiesis, we are aware that data is not what it used to be. The activity of counting has lost its centrality: in Big Data, what really matters are the shapes through which we can observe these myriads of data.
- Thick Data: Thick data involves qualitative informative materials, tools or techniques that help to gather granular, specific knowledge about a target audience.
- Data Processing: the activities conducted to obtain meaning from raw data. It is especially in this process that Datapoiesis opposes Data Hubris.
- Data Curation: the active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation activities enable data discovery and retrieval, maintain data quality, add value, and provide for reuse over time; this new field includes authentication, archiving, management, preservation, retrieval, and representation. In this creative process, Datapoiesis aims to experiment with and formalize innovative tools and skills. In a sense, we do believe that data need to be curated. Datapoiesis intends to evolve from Data Curation to Data Caring, which opens the way for an entirely new generation of professionals.
- Data Collection: the process of gathering data originating from different sources into a specific processing centre.
- Data Capture: the action or process of gathering data, especially from an automatic device, control system, or sensor (Oxford). In Datapoiesis, we consider this word problematic as it refers to an extractivist model.
- Platform: Platforms are what platforms do (Bratton, 2014). What we call “platform” is “a standards-based technical-economic system that centralize and decentralize at once, drawing many actors into a common infrastructure”. A platform is intended as a tool to help organize the interactions of an ecosystem. It creates a common language with defined roles, with the aim of outperforming its competitors.
- Ecosystem: The word “ecosystem” comes from biology and it refers to a system in which entities have some degree of mutual dependence. In a platform ecosystem, the value created by each member influences the value created by others.
- Data Prosumer: a datapoietic user who both consumes and produces data. The aim of Datapoiesis is in fact to treat users as data prosumers, and to encourage them to act as such, so as to stimulate participation and interactivity.
- Datafication: This term is used to indicate the main technological trend of our time: the becoming-data of the world. Every aspect of human life, from social relations to finance, from health to sport, to nature and space, is nowadays a source of computerized and quantified data, thus allowing for real-time tracking and predictive analysis. Datapoiesis clearly wants to explore all these data.
- Digital Carbon Footprint: The ecological impact of data activities (storage etc.). A paradigmatic example of the materiality of the Internet: Google.com “processes an approximate average of 47,000 requests every second, which represents an estimated amount of 500 kg of CO2 emissions per second.”
- Data Overload: The state of confusion generated by the excessive amount of data and information.
- Data Science: It is a “concept to unify statistics, data analysis, machine learning, and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science and information science. (Wikipedia)
- Data Visualization: It is a set of tools, techniques and knowledge aimed at communicating and representing data.
- Data Commons: “Data resources that are generally collectively created and owned or shared between or among a community and that tend to be non-exclusive, that is, be (generally freely) available to third parties. Thus, they are oriented to favor use and reuse, rather than to exchange as a commodity. Additionally, the community of people building them can intervene in the governing of their interaction processes and of their shared resources.”
- Data Revolution: “‘data revolution’ refers to the transformative actions needed to respond to the demands of a complex development agenda, improvements in how data is produced and used; closing data gaps to prevent discrimination; building capacity and data literacy in “small data” and big data analytics; modernizing systems of data collection; liberating data to promote transparency and accountability; and developing new targets and indicators.”
- Data Hubris: from Ancient Greek ὕβρις, meaning insolence and arrogance. It is the assumption that big data analytics can be used as a substitute for, rather than a supplement to, traditional means of analysis (including social and scientific analysis).
- Data Anonymization: the process of safeguarding privacy in data by removing any possible identifiers.
- Data Pseudonymization: a strategy for de-identifying data by replacing real values with fictitious entries.
- Transparency: Most data collection nowadays is hidden; in opposition to this, we want to show clearly what our data sources are and how data are collected and processed.
- Sustainability: Data collection comes at a price: an extremely polluting system sustains these technologies. Datapoiesis wants to open a conversation about this topic by developing a model that is sustainable both environmentally and socially. Our mission is to produce positive change and to make the data world more sustainable, also for our perception.
- Participation: A participatory model is the goal of Datapoiesis: involving people in positive change and building networks and new connections. The sharing of data should be democratized, in opposition to the extraction techniques developed by Big Tech.
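As a small illustration of the difference between the Data Anonymization and Data Pseudonymization entries above, here is a minimal Python sketch. The record, its field names, the choice of which fields count as identifiers, and the use of a keyed hash as the “fictitious entry” are all assumptions made for the example; none of this describes the actual Datapoiesis platform.

```python
import hashlib
import hmac

# Hypothetical record: every field name and value is invented for illustration.
record = {"name": "Ada Rossi", "email": "ada@example.org", "city": "Ivrea", "steps": 8421}

# Assumption: these are the direct identifiers in this toy dataset.
IDENTIFIERS = ("name", "email")


def anonymize(rec):
    """Return a copy of the record with the identifier fields removed."""
    return {key: value for key, value in rec.items() if key not in IDENTIFIERS}


def pseudonymize(rec, secret):
    """Return a copy with each identifier replaced by a keyed hash.

    The same input and secret always give the same pseudonym, so records
    can still be linked to each other without exposing the real value.
    """
    out = dict(rec)
    for field in IDENTIFIERS:
        digest = hmac.new(secret, str(rec[field]).encode("utf-8"), hashlib.sha256)
        out[field] = digest.hexdigest()[:12]
    return out


anonymous = anonymize(record)
pseudonymous = pseudonymize(record, secret=b"demo-secret-not-for-real-use")

print(anonymous)
print(pseudonymous)
```

Dropping direct identifiers is only a first step: remaining fields such as the city can still help to re-identify a person when combined with other data, so the sketch should not be read as a complete anonymization method.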
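The figures quoted in the Digital Carbon Footprint entry can be put on a more human scale with some simple arithmetic. The sketch below takes the two quoted numbers at face value (they are not verified here) and derives a per-request and a per-day value; the variable names are ours.

```python
# Figures as quoted in the Digital Carbon Footprint entry (not independently verified).
REQUESTS_PER_SECOND = 47_000
KG_CO2_PER_SECOND = 500

# Emissions attributable to a single request, in grams.
grams_per_request = KG_CO2_PER_SECOND * 1000 / REQUESTS_PER_SECOND

# Emissions over a full day (86,400 seconds), in metric tonnes.
SECONDS_PER_DAY = 60 * 60 * 24
tonnes_per_day = KG_CO2_PER_SECOND * SECONDS_PER_DAY / 1000

print(f"{grams_per_request:.1f} g CO2 per request")   # 10.6 g CO2 per request
print(f"{tonnes_per_day:,.0f} tonnes CO2 per day")    # 43,200 tonnes CO2 per day
```

Seen this way, the quoted rate corresponds to roughly ten grams of CO2 for each request, which is the kind of invisible cost the Sustainability principle aims to make perceptible.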