Open data
Open data are data that are openly accessible, exploitable, editable and shareable by anyone for any purpose. Open data are generally licensed under an open license.
The goals of the open data movement are similar to those of other "open" movements such as open-source software, open-source hardware, open content, open specifications, open education, open educational resources, open government, open knowledge, open access, open science, and the open web. The growth of the open data movement is paralleled by a rise in intellectual property rights. The philosophy behind open data has been long established, but the term "open data" itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives Data.gov, Data.gov.uk and Data.gov.in.
Open data can be linked data—referred to as linked open data.
One of the most important forms of open data is open government data, which is a form of open data created by ruling government institutions. The importance of open government data is born from it being a part of citizens' everyday lives, down to the most routine and mundane tasks that are seemingly far removed from government.
The abbreviation FAIR/O data is sometimes used to indicate that the dataset or database in question complies with the principles of FAIR data and carries an explicit data‑capable open license.
Overview
The concept of open data is not new, but a formalized definition is relatively new. Open data as a phenomenon denotes that governmental data should be available to anyone, with the possibility of redistribution in any form and without any copyright restriction. Another definition is the Open Definition, which can be summarized as "a piece of data is open if anyone is free to use, reuse, and redistribute it—subject only, at most, to the requirement to attribute and/or share-alike." Other definitions, including the Open Data Institute's "open data is data that anyone can access, use or share," offer an accessible short version but refer back to the formal definition. Open data may include non-textual material such as maps, genomes, connectomes, chemical compounds, mathematical and scientific formulae, medical data and practice, and bioscience and biodiversity data.
A major barrier to the open data movement is the commercial value of data. Access to, or re-use of, data is often controlled by public or private organizations. Control may be exercised through access restrictions, licenses, copyright, patents and charges for access or re-use. Advocates of open data argue that these restrictions detract from the common good and that data should be available without restrictions or fees. There are many other, smaller barriers as well.
Creators of data often do not consider the need to state the conditions of ownership, licensing and re-use, presuming instead that not asserting copyright places the data in the public domain. For example, many scientists do not consider the data published with their work to be theirs to control and consider the act of publication in a journal to be an implicit release of data into the commons. The lack of a license makes it difficult to determine the status of a data set and may restrict the use of data offered in an "Open" spirit. Because of this uncertainty, it is possible for public or private organizations to aggregate such data, claim that it is protected by copyright, and then resell it.
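The licensing problem described above can be illustrated with a minimal sketch. It assumes a dataset is described by a metadata record with a hypothetical `license` field holding an SPDX-style identifier, and it uses a small, non-exhaustive list of licenses that require at most attribution or share-alike. The function name, the field name and the list are illustrative assumptions, not part of any formal definition or portal API.

```python
# Illustrative, non-exhaustive set of SPDX-style identifiers for licenses
# that ask for attribution and/or share-alike at most (an assumption made
# for this sketch, not an authoritative list).
OPEN_LICENSES = {
    "CC0-1.0",
    "PDDL-1.0",
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "ODC-By-1.0",
    "ODbL-1.0",
}


def license_status(metadata: dict) -> str:
    """Classify a dataset metadata record by its declared license.

    Returns:
        "open"         - the declared license is in OPEN_LICENSES
        "unrecognized" - a license is declared but is not in the list
        "unknown"      - no license is declared, so the status of the
                         data cannot be determined from the record
    """
    # "license" is a hypothetical field name; real catalogs vary.
    license_id = (metadata.get("license") or "").strip()
    if not license_id:
        return "unknown"
    if license_id in OPEN_LICENSES:
        return "open"
    return "unrecognized"


if __name__ == "__main__":
    print(license_status({"title": "Rainfall 2020", "license": "CC-BY-4.0"}))
    print(license_status({"title": "Survey results"}))
    print(license_status({"title": "Maps", "license": "CC-BY-NC-4.0"}))
```

A record without a license is reported as "unknown" rather than "open", which mirrors the point that silence about licensing does not by itself make data open.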
Major sources
Open data can come from any source. This section lists some of the fields that publish a large amount of open data.
In science
The concept of open access to scientific data was established with the formation of the World Data Center system, in preparation for the International Geophysical Year of 1957–1958. The International Council of Scientific Unions oversees several World Data Centres with the mission to minimize the risk of data loss and to maximize data accessibility.
While the open-science-data movement long predates the Internet, the availability of fast, readily available networking has significantly changed the context of open science data, as publishing or obtaining data has become much less expensive and time-consuming.
The Human Genome Project was a major initiative that exemplified the power of open data. It was built upon the so-called Bermuda Principles, stipulating that: "All human genomic sequence information … should be freely available and in the public domain in order to encourage research and development and to maximize its benefit to society". More recent initiatives such as the Structural Genomics Consortium have illustrated that the open data approach can be used productively within the context of industrial R&D.
In 2004, the Science Ministers of all nations of the Organisation for Economic Co-operation and Development, which includes most developed countries of the world, signed a declaration which states that all publicly funded archive data should be made publicly available. Following a request and an intense discussion with data-producing institutions in member states, the OECD published in 2007 the OECD Principles and Guidelines for Access to Research Data from Public Funding as a soft-law recommendation.
Examples of open data in science:
- -- Journal of publications describing and linking to open scientific datasets related to Earth system sciences. Review of the dataset itself is an integral component of peer review. Launched in 2008.
- data.uni-muenster.de – Open data about scientific artifacts from the University of Muenster, Germany. Launched in 2011.
- Dataverse Network Project – archival repository software promoting data sharing, persistent data citation, and reproducible research.
- linkedscience.org/data – Open scientific datasets encoded as Linked Data. Launched in 2011, ended 2018.
- systemanaturae.org – Open scientific datasets related to wildlife classified by animal species. Launched in 2015.
In government
- relevant data is disclosed;
- the data is widely disseminated and understood by the public;
- the public reacts to the content of the data; and
- public officials either respond to the public's reaction or are sanctioned by the public through institutional means.
Several national governments have created websites to distribute a portion of the data they collect. It is a concept for a collaborative project in the municipal Government to create and organize culture for Open Data or Open government data.
Additionally, other levels of government have established open data websites. There are many government entities pursuing Open Data in Canada. Data.gov lists the open data websites of 40 US states and 46 US cities and counties, including the states of Maryland and California and New York City.
At the international level, the United Nations has an open data website that publishes statistical data from member states and UN agencies, and the World Bank published a range of statistical data relating to developing countries. The European Commission has created two portals for the European Union: the EU Open Data Portal, which gives access to open data from the EU institutions, agencies and other bodies, and the European Data Portal, which provides datasets from local, regional and national public bodies across Europe. The two portals were consolidated into data.europa.eu on April 21, 2021.
Italy was the first country to release standard processes and guidelines under a Creative Commons license for widespread use in public administration. The open model is called the Open Data Management Cycle and was adopted in several regions such as Veneto and Umbria. Major cities such as Reggio Calabria and Genova have also adopted this model.
In October 2015, the Open Government Partnership launched the International Open Data Charter, a set of principles and best practices for the release of governmental open data formally adopted by seventeen governments of countries, states and cities during the OGP Global Summit in Mexico.
In July 2024, the OECD adopted Creative Commons CC-BY-4.0 licensing for its published data and reports.
In non-profit organizations
Many non-profit organizations offer open access to their data, as long as it does not undermine the privacy rights of their users, members or third parties. Unlike for-profit corporations, they do not seek to monetize their data. OpenNWT launched a website offering open data on elections. CIAT offers open data to anybody who is willing to conduct big data analytics in order to enhance the benefit of international agricultural research. DBLP, which is owned by the non-profit organization Dagstuhl, offers its database of computer science publications as open data.
Hospitality exchange services, including BeWelcome, Warm Showers, and CouchSurfing, have offered scientists access to their anonymized data for analysis, public research, and publication.