Open data has been a cornerstone of the smart city movement for many years. Smart city technologies depend on open data from city authorities and urban services for much of their core functionality – without access to transit data or GIS data, many applications simply could not work. But the need for open data extends beyond functionality, to the governance environment itself. Open data enables more accountability in government and it is crucial to establishing open, contestable markets for technology in city services.
Given the long history of the open data movement, and the crucial role it plays in smart cities, an open data policy is an obvious early candidate for the policy framework of the G20 Global Smart Cities Alliance.
Many local and national governments have adopted open data policies ranging from general declarations of principles (e.g. Open Data Charter) to specific mandates for how data will be treated by the city (e.g. Dubai, Vienna, Seattle). The Alliance has reviewed these examples to develop a model policy for open data.
How to use this policy
This policy document can be used right along the value chain: as the basis for your city’s own open data policy and any associated legal instruments, and as guidance as you move from implementation (e.g. internal operating procedures, team structure, standards to support data publishing and platform development) to adoption (e.g. compliance and participation of stakeholders within government and engagement across the wider data community).
Open data publishing has been established for over a decade, and context matters. It is therefore natural that city authorities will seek to implement variations in this policy based on differences in local needs. We expect such variation in specific areas including:
- The scope of data to be made freely available – some governments will seek to leave open commercialization options, while others may not.
- The accountability mechanism – all governments need to adjust reporting requirements for their governance structure.
- The mechanism for accessibility – some governments will have different channels through which to make their data accessible (e.g. a regional data hub rather than a municipal data hub).
- The extent of technical administration – some governments may develop and maintain their own open data portals / platforms while others will procure a solution or use external agencies. Cities with more resources may have a specialist ‘Data Office’, while others may combine data responsibilities with an existing office that oversees multiple departments.
A City has a duty to maximize the potential in the data it generates and collects. Making administrative and operational data available in open form can increase quality of life, improve economic, social and environmental outcomes, and create more resilient communities and public services. From collection to publication and use, cities must maintain the public’s trust and respect.
Specifically, open data should:
1. Fundamentals of Open Data
A range of foundational principles must be considered as first steps. They support the specific goals of an open data policy, as well as the broader objective of a city and its ecosystem deriving benefit from open data.
- The City should make data open by default, and do so through the City’s Open Data Portal. For smaller cities, regional platforms could provide a cost-effective route to open data publishing.
- This data should be timely and comprehensive, and the processes that generate it should be clearly documented: open data is relevant only if it adds value and is legible to the information user.
- Open data should be published in a machine-readable format.
- Barriers to use should be minimised and ease of use maximised. Datasets on the Open Data Portal should be made available free of charge (subject to cost considerations below), without registration and license requirements, and be free of restrictions on their use (i.e. under open data license).
- When planning or modifying systems or data collection projects, or implementing new digital technologies (e.g. IoT), city departments, in collaboration with the City Data Office, should consider which datasets and associated metadata can be published as open data.
- This applies equally to systems, projects and technologies provided by third parties acting on behalf of, or commissioned by, City authorities.
- All parties providing to the public any of the City’s open data, or providing an application utilizing the City’s open data, must explicitly identify the source and version of the data, and a description of any modifications made.
Whilst treating open data as a public good and always using (economic, social and environmental) value creation as a starting point, in some circumstances the city could consider monetisation – and potentially commercialisation – of open data. Increases in data volumes, the prevalence of data in digital service business models, and the number of third party organisations seeking to innovate with data all create circumstances in which charging may be worth considering. This complex topic requires transparent and accurate costing of data practices and derived benefits.
The following factors should be considered when determining transparent monetisation and pricing of open data:
- Whether costs are incurred by the city through providing open data in a value-adding format (e.g. after significant pre-processing) or at high volume (through heavy calls on APIs) to a third party that will then derive an economic benefit from that data.
- Whether community benefit is delivered alongside economic benefit to third parties (e.g. a commercial parking app that draws on city open data generates profit but also reduces congestion).
- Whether the proposed applications comply with city government’s wider policies (e.g. can government data be used to enhance location based advertising of products and services for a for-profit organisation?).
- Who the third party is and the potential negative effects of charging for open data. A discounted rate could be applied to local startups as an incentive to open data value creation. Not-for-profits using open data in the interests of social equity should not be charged.
2. Wider City Technology Policy, Strategy and Initiatives
City open data initiatives have to support and build, rather than erode, digital trust in communities. They need to add to wider ecosystem and market confidence. Specifically, privacy, security, responsibility, accountability and ethical concerns around open data and its use need to be taken into account, especially as digital technology becomes increasingly embedded in physical and community infrastructure. This implies a growing need to build coherence with a wider set of data and technology related activities.
- Open data policies should be integrated into wider ICT security and privacy policies to ensure that the release of specific data attributes cannot cause privacy or safety harms to (individual members of) the public or private sector organisations, or put critical infrastructure at risk.
- Policies should build on city-wide data governance policies and regulation, so that open data practice adheres to, and extends to the broader public, important aspects of data management protocols and processes (e.g. wider data classifications and publishing approaches).
The next section sets out tools that can be used to assess data utility and quality, as well as compliance with data classification systems, related policies, and laws when determining datasets to be released as open data.
- All city data infrastructure projects must commit to publish as open by default and to only use permissions-based access as a last resort for sensitive data attributes, where anonymization or deidentification is neither possible nor practical (e.g. primary registers drawing in and linking data from numerous sources to create trusted and high-quality reference points for physical, economic, social assets).
- Open data publishing should be considered in the design and implementation of the city’s wider data infrastructure (see ‘Platform and Data Infrastructure’ section).
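The open-by-default rule in the list above can be read as a small decision procedure. The sketch below is illustrative only: the `Attribute` type, its two flags and the three `Release` outcomes are hypothetical names introduced here, not part of this policy, and a real assessment would rest on the city's own data classification framework.

```python
from dataclasses import dataclass
from enum import Enum


class Release(Enum):
    OPEN = "open"                            # publish on the open data portal
    OPEN_DEIDENTIFIED = "open_deidentified"  # publish after anonymization
    PERMISSIONED = "permissioned"            # last resort: permissions-based access


@dataclass(frozen=True)
class Attribute:
    name: str
    sensitive: bool       # could release cause privacy or safety harm?
    can_deidentify: bool  # is anonymization both possible and practical?


def release_mode(attr: Attribute) -> Release:
    """Open by default; permissions-based access only as a last resort."""
    if not attr.sensitive:
        return Release.OPEN
    if attr.can_deidentify:
        return Release.OPEN_DEIDENTIFIED
    return Release.PERMISSIONED


if __name__ == "__main__":
    # Hypothetical attributes, for illustration only.
    examples = [
        Attribute("bus_stop_location", sensitive=False, can_deidentify=False),
        Attribute("permit_applicant_age", sensitive=True, can_deidentify=True),
        Attribute("register_owner_name", sensitive=True, can_deidentify=False),
    ]
    for attr in examples:
        print(attr.name, "->", release_mode(attr).value)
```

Ordering the checks this way keeps permissions-based access as the final fallback rather than the starting assumption.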
3. Governance and Process for Accountability and Compliance
Clear and solid governance arrangements are needed to ensure that open data is managed as a strategic asset. Direct accountabilities for maintaining and publishing data sets should be supported by clear, principle-based rules for promoting re-use and innovation. These are the key elements to stimulate freedom, better access and uptake, reuse and impact whilst also managing privacy, public safety, security, commercial confidentiality and compliance with law.
Differences in maturity, complexity and scale of the operating environment will lead to variations in the core model suggested here. City laws and political oversight will also need to be weighed to ensure the ongoing sustainability of operations.
A central City Data Team (“CDT”) should be created as the organisation-wide authority – the trusted guide and steward – for data and open data management, and publishing. The team should have the support of, and be accountable to, senior executive authority. In more advanced settings, the CDT can and should play a broader role in the city’s overall data analytics functions, as well as support the design of digital services that themselves produce publishable data.
The CDT should establish processes to identify datasets to be published on the Open Data Portal, from the perspective of community need. These processes should assess the potential utility, uptake and end value. They should be informed by input from, and therefore serve the needs of, stakeholders across government, communities, academia, businesses, and data consumers generally.
The CDT should establish processes to identify datasets to be published from a technical perspective. These processes should consider minimum standards of data quality (e.g. completeness, accuracy, timeliness and permanence) as well as potential privacy risks, to encourage reliability and re-use.
For larger cities with multiple departments and organisations…
- The CDT should manage the relationships with departments and provide guidance (e.g. how to prioritize data against guidelines defined by the CDT) to ensure value focussed and efficient open data publishing. To improve the quality and overall impact of open data publishing, activities can include:
- Directing department data champions to make the department’s data holdings and accompanying metadata available on the city’s open data portal, in accordance with the policies and the operating procedures of the CDT.
- Directing department data champions to ensure that department data made open to the public adheres to the city’s privacy, security, retention and public disclosure policies and standards.
- Developing a catalogue listing each department’s data assets. These data catalogues should be combined into a master data catalogue, and with metadata, be made publicly viewable. Consideration should be given to using international standards (e.g. DCAT2) so that open data catalogues can be linked to provide larger, federated and common data resources.
- Establishing publishing goals and an accompanying plan for department open data on a regular cycle (e.g. annually). Attainment of these goals can be made part of the performance evaluation for each city department or data lead. There is also scope to set up public compliance reporting to build a sense of competition among departments.
- Publishing an Open Data Manual to document – and provide guidance and templates on – the management and publication of open data. This document or series of documents can cover a range of areas from data governance roles, how to build and manage data inventories, descriptions of the data ingestion process, and guidance on standards and classifications.
- Maintaining and updating a wider set of open data policy materials, including interpreting, updating, and modifying an Open Data Policy and supporting procedures.
- Developing and maintaining an Open Data Classification Framework (including its relationship to broader data classification systems) and processes and supporting its use by other organisations.
- Evaluating requests received through established community feedback mechanisms so that datasets can be prioritised for release, and incorporated into the work programme for the team.
- Publishing (most obviously on the open data portal) an annual Open Data Plan, which can serve both as data publishing schedule and as a description of strategic improvements to be made to CDT operations and assets.
An Open Data Plan should include:
- A proposed publication timeline for datasets to be published on the portal in the upcoming year.
- A plan for the upcoming year to improve public access to open data and maintain data quality.
- Proposals for improving the city’s open data management processes and data infrastructure to advance open data policy goals.
- Proposals for experimentation and innovation – e.g. the publication of derivative (aggregated or anonymised) datasets where full datasets cannot be published as open, or experimentation with synthetic ‘differential privacy’ approaches to allow for open publishing of high value datasets.
- Costs associated with delivering Open Data infrastructure and operations for the upcoming fiscal year, as well as benefits and use cases to prove open data value.
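As one possible form of the differential privacy experimentation mentioned in the list above, a city could publish noisy counts instead of exact ones. The sketch below uses the Laplace mechanism, a standard differential-privacy technique chosen here for illustration; the policy does not prescribe a method. The district names, counts and `epsilon` value are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Laplace mechanism for a counting query.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2024)
    # Hypothetical counts of service requests per district.
    counts = {"district_a": 120, "district_b": 7, "district_c": 43}
    for district, n in counts.items():
        print(district, round(noisy_count(n, epsilon=0.5, rng=rng), 1))
```

A smaller `epsilon` gives stronger privacy but noisier published figures, so small counts lose more of their utility than large ones.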
4. Ecosystem Creation for Trust and Value Creation
The outcomes from open data initiatives can be multiplied by a range of activities. Support and indeed demand from the highest levels of management and political leadership is important. Engagement across all parts of the data ecosystem (e.g. among data science and developer communities of interest) generates trust and, in turn, momentum around projects that demonstrably meet need and create value in the form of better governance, improved services and quality of life.
- The CDT can introduce a simple data requests service on the open data portal, inviting all-comers to make the case for the release of open data.
- Creating a permanent mechanism to solicit and act on wider data community feedback (e.g. input into broader policy discussion, open data publishing practice, and in more advanced cases, crowdsourcing of datasets) should also be considered.
- The power of blogging and well-illustrated (visualised) case studies of impactful open data value creation should not be underestimated. Publishing rights can be extended to other organisations and individuals to strengthen the sense of community contribution.
- Recognising that the broader cross-section of the public lacks the technical expertise needed to use open datasets, the CDT should actively explore non-technical ways in which the public can interact with open data, such as collaborations with app developers and platforms that share data and insights with the public.
- The value- and outcomes-based use of open data assets can be accelerated by creating opportunities for members of the public, departments and offices, and student groups to use open data to explore a specific challenge (e.g. air quality). While the effort of attracting a sponsoring department able to clearly articulate demand and action open-data-driven insights should not be underestimated, hackathons and longer open innovation competitions can be highly effective in bringing the potential of open data to wider attention.
The City Data Office should:
- Raise community awareness of open datasets via external channels (e.g. social media)
- Communicate open data in a range of formats to make datasets more understandable for community members, e.g. data visualisations
- Activate the use of open data to solve major city problems by leading or participating in community events (e.g. Hackathons, Open Innovation Competitions)
- Continuously improve datasets and platforms by seeking community feedback (e.g. annual surveys)
- Promote startups and academics that are using government data with other potential users (e.g. social media)
- Encourage other government agencies and suppliers to commit to an open data policy
5. Relationships with Principal Data Stakeholders
Because city governments vary in size, operating models, as well as the names given to departments, teams and key posts, this policy does not attempt to make the case for the actual form and positioning of the CDT within an organisation. It is better to focus on the key relationships and the outcomes to be achieved by doing so. We do assume that the CDT is headed up by a City Data Manager who exercises domain and managerial leadership for the team and the city’s open data operation.
The Chief Privacy Officer – the authority on questions or issues concerning Open Data privacy risk and mitigating the risk of privacy harms.
The Chief Data or Information Officer – for authority and decisions on wider data governance, management and quality issues, as well as matters relating to analytics.
The Chief Technology Officer or Director of ICT – for the approval of work plans as they relate to the data and technology infrastructure and plans for its development.
Departmental Data Champions (where applicable) – The CDT, through guidance, training and methods listed in this policy, will help data champions ensure departmental compliance with open data publishing standards and delivery against goals.
6. Technical Measures
Cities should use industry open standards to ensure the quality, interoperability and discoverability of open data. Technical maturity will vary between cities and departments. Understanding this maturity and introducing appropriate technical measures to make data as accessible and useable as possible by government and others working with its data will serve to increase the value generated from it.
- The City should undertake periodic assessments of data availability, quality, interoperability and discoverability as part of its Open Data Plan. This could be done at departmental level first, and over time for systems of strong interest to the public.
- For data quality assessment, the City should consider a data quality matrix to establish:
- Ownership and authority – that there is a custodian responsible for overall quality of the original data to be made available for re-use.
- Accessibility – that metadata is supplied and machine-readable formats are used.
- Accuracy – common data fields (e.g. dates, times, location) are used, and limitations and gaps in the data are explained.
- Completeness – the data makes sense as a complete dataset and should not require other data to make sense of it.
- Descriptiveness – accompanying metadata should describe how reliable data is and say how the data was created and processed. Ideally a schema should identify ranges and values in each field to show the temporal and geographic coverage, granularity and limitations for the assets described.
- From the viewpoint of interoperability (and also from data quality), particularly looking toward more abundant use of IoT data:
- A range of standardised data formats can be applied to increase the ease of reading of open data by software applications. In general these formats should be non-unique and non-proprietary.
- Data publication should progress through successive “stages”: from machine readable, to structured form, to open format, to Web API, and finally to linked data (http://5stardata.info/en/) to add context and utility.
- Common data models should be adopted. The City should align with national guidelines, if available, to ensure interoperability not just within the City but also with other cities and/or jurisdictions, for common datasets.
- For discoverability, the metadata attached to open datasets should include: title, description of the dataset, name of the publishing entity, (the open) classification, a link or copy of the open data license under which the data can be used, as well as a format description and timestamp.
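The discoverability metadata listed above can be captured as a simple record with a completeness check. This is a minimal sketch: the `DatasetMetadata` class, the `missing_fields` helper and all example values (including the license URL) are hypothetical and only mirror the fields named in the text.

```python
from dataclasses import asdict, dataclass, fields
from datetime import datetime, timezone


@dataclass
class DatasetMetadata:
    title: str
    description: str
    publisher: str       # name of the publishing entity
    classification: str  # the (open) classification
    license: str         # link to, or copy of, the open data license
    format: str          # format description, e.g. "CSV"
    timestamp: str       # ISO 8601 timestamp


def missing_fields(meta: DatasetMetadata) -> list[str]:
    """Return the names of metadata fields that are empty or whitespace."""
    return [
        f.name for f in fields(meta)
        if not str(getattr(meta, f.name)).strip()
    ]


if __name__ == "__main__":
    # Hypothetical dataset, for illustration only.
    meta = DatasetMetadata(
        title="Street tree inventory",
        description="Location and species of street trees.",
        publisher="Parks Department",
        classification="open",
        license="https://creativecommons.org/licenses/by/4.0/",
        format="CSV",
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    print(asdict(meta))
    print("missing:", missing_fields(meta))
```

A check of this kind could run before a dataset is accepted for publication, so that incomplete metadata is caught at the source.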
The following formats for structured data of different types should be used:
- Tabular data should be published as CSV
- Geospatial data should be published as GeoJSON or KML
- Other structured non-tabular data should be published in an open standard where available (e.g. JSON, XML, RDF, GTFS)
- Real-time data or data being used in real-time services should be made available via a well-documented API.
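To illustrate the first two format recommendations above, the sketch below serialises the same records once as CSV and once as GeoJSON, using only the Python standard library. The bike-station records and the `lat`/`lon` field names are hypothetical.

```python
import csv
import io
import json


def to_csv(rows: list[dict], columns: list[str]) -> str:
    """Serialise tabular rows as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_geojson(rows: list[dict]) -> str:
    """Serialise point records as a GeoJSON FeatureCollection.

    GeoJSON (RFC 7946) orders coordinates as [longitude, latitude].
    """
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
            "properties": {k: v for k, v in r.items() if k not in ("lon", "lat")},
        }
        for r in rows
    ]
    return json.dumps({"type": "FeatureCollection", "features": features}, indent=2)


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    stations = [
        {"id": "s1", "name": "Central Square", "lat": 48.2082, "lon": 16.3738},
        {"id": "s2", "name": "River Park", "lat": 48.2100, "lon": 16.3800},
    ]
    print(to_csv(stations, ["id", "name", "lat", "lon"]))
    print(to_geojson(stations))
```

Publishing both forms from one source record set keeps the tabular and geospatial versions consistent with each other.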
7. Platform and Data Infrastructure
All cities organising an open data effort must have access to an open data portal. There are technical and business requirements which should be taken into account if this platform is to relate to user needs, and ultimately be treated as the trusted home for the city’s open data effort, attracting and sustaining usage across publishers and consumers. There are also technical considerations which relate to the wider city data infrastructure.
- From a technical perspective, an open data portal should be designed and implemented – or for those already in existence, a migration strategy should be built – so that it is harmonized with the city’s overall data infrastructure. In this way, e-government workflows (e.g. municipal planning approvals) and digital services, and the data that they use, operating in this infrastructure can incorporate open data publishing. This practice will establish flexible, cost-effective, citywide data infrastructure, and promote the development and alignment of open data related strategic investments and services.
- The main elements of such an infrastructure are:
- Identified data sources, their owner and current use.
- A data pipeline to ingest the data from the source, model it using a standard schema, classify it and determine an authorisation scheme, link and compare it to other sources and check its quality, optionally transform it to an event stream to record history and change, document it and provide schema and metadata for it and offer it for distribution to the data portal.
- A data portal that automatically creates the information products (files and APIs) to publish and use the data.
- In addition, there are basic business requirements for design, functionality and content that the Open Data Portal itself should meet. These will turn it from a trusted data catalogue to a platform that drives data usage activity and value:
- Designed through a user-centred process underpinned by inclusive user research.
- Adhere to accessibility standards to ensure inclusion and ease of access for all (see model policy on ICT accessibility standards in public procurement).
- Strong search functionality (file type, category, data publisher, recency). Advanced search over attributes contained within datasets (as provided, e.g., by Google) will be a clear requirement in the near future.
- Well-indexed and categorised (e.g. economy, population, environment) datasets.
- A published open data timetable with clear labelling. In advanced cases, dataset alerts can be incorporated.
- Interactive interfaces to preview and visualise data and perform basic selection and analysis.
- Well-documented query and streaming APIs, and other services to help developers implement applications quickly, durably and reliably, and to account for the increase in big data feeds.
- Blogs and other forms of content creation to appeal to a technical and non-technical audience and to provide tangible evidence of impact for open data re-use.
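The data pipeline described earlier in this section (ingest, model to a standard schema, classify, check quality, document, offer for distribution) can be sketched as a chain of small stages. Everything named below is a hypothetical illustration: the `Dataset` type, the `SCHEMA` mapping, the `SENSITIVE_FIELDS` set and the stage functions. Linking to other sources and event-stream transformation are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Dataset:
    name: str
    records: list[dict]
    classification: str = "unclassified"
    quality_ok: bool = False
    metadata: dict = field(default_factory=dict)


Stage = Callable[[Dataset], Dataset]

# Hypothetical standard schema: target field name -> source field name.
SCHEMA = {"stop_id": "StopID", "stop_name": "Name"}
# Hypothetical field names that would block open publication.
SENSITIVE_FIELDS = {"owner_name", "phone"}


def model(ds: Dataset) -> Dataset:
    """Rename source fields to the standard schema, dropping the rest."""
    ds.records = [{dst: r.get(src) for dst, src in SCHEMA.items()} for r in ds.records]
    return ds


def classify(ds: Dataset) -> Dataset:
    """Mark the dataset restricted if any sensitive field name is present."""
    names = {key for r in ds.records for key in r}
    ds.classification = "restricted" if names & SENSITIVE_FIELDS else "open"
    return ds


def check_quality(ds: Dataset) -> Dataset:
    """Completeness check: every record has a value for every field."""
    ds.quality_ok = bool(ds.records) and all(
        v is not None for r in ds.records for v in r.values()
    )
    return ds


def document(ds: Dataset) -> Dataset:
    """Attach basic metadata describing the modelled dataset."""
    ds.metadata = {
        "title": ds.name,
        "fields": sorted(SCHEMA),
        "record_count": len(ds.records),
    }
    return ds


def run_pipeline(ds: Dataset, stages: list[Stage]) -> Dataset:
    """Apply each stage in order and return the resulting dataset."""
    for stage in stages:
        ds = stage(ds)
    return ds


def publishable(ds: Dataset) -> bool:
    """Offer for distribution only if classified open and quality-checked."""
    return ds.classification == "open" and ds.quality_ok


if __name__ == "__main__":
    raw = Dataset("bus_stops", [{"StopID": "1", "Name": "Main St", "Internal": "x"}])
    result = run_pipeline(raw, [model, classify, check_quality, document])
    print(result.metadata, "publishable:", publishable(result))
```

Keeping each stage as a separate function lets a city add or reorder steps without changing the others.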
City – Corresponding to the [city administration] and contractors or agencies acting on the [city administration]’s behalf.
Data – Includes all datasets as well as other forms of information such as documents, drawings, pictures and other artefacts.
City Data – All data created, collected and/or maintained by the [city administration] or by contractors or agencies on the [city administration]’s behalf.
Open Data – Specific datasets that are made available to the public by the [city administration].
Machine-Readable – Any widely-accepted, non-proprietary, platform-independent method for formatting data (such as JSON, XML, and APIs) which permits automated processing of such data and facilitates search capabilities.
Open Standard – A technical standard developed and maintained by a voluntary consensus standards body that is available to the public without royalty or fee.
City Data Office – Office dedicated to making [city administration] data available to the public, to partners, and internally within government to enable use of data in support of the City’s goals. Comprises City employees who administer the Open Data Portal and provide planning, review, coordination, and support to City departments and offices publishing open data. Note that this Office might be assigned to an existing office, e.g. a City Manager’s Office.
Open Data Manual – Guide defining strategies city departments and offices can implement to make their data open, encourage public use consistent with the city’s privacy and security policies, and realize benefits for their departments.
Open Data Portal – The city’s catalogue and primary repository for Open Data, created and maintained by the [city administration] for the express purpose of ensuring permanent, lasting open access to public information and enabling the development of innovative solutions.
Open Data Plan – The City’s plan for publishing Open Data.
Data Manager – A City employee who is responsible for the City’s Data Office, stewards the data made available on the Open Data Portal, and manages the Data Office employees.
Data Champion – Designated by each department, this person serves as the point of contact and coordinator for that department’s publishing of Open Data.
Andrew Collinge, Advisor, Smart Dubai
Yasunori Mochizuki, NEC Fellow, NEC Corporation
Task Force Members:
Michelle Fitzgerald, Chief Digital Officer, City of Melbourne
Berent Daan, Chief Data Scientist, City of Amsterdam
Eduardo Gomez Restrepo, Associate, C4IR Colombia
Jennifer Park, Director of Certification, What Works Cities
Contributors and reviewers:
Jacqueline Lu, Co-Founder, Helpful Places
Lilian Coral, Director/National Strategy + Technology Innovation, Knight Foundation
Michael Mattmiller, Director of Government Affairs, Microsoft
Dr Ahmad al Abdulkareem, Smart Cities Lead, C4IR Saudi Arabia
Oliver Rack, Amt für Digitales und Informationsverarbeitung, City of Heidelberg
Brigitte Lutz, Data Governance Co-ordinator, City of Vienna
Joran Van Daele, Open Data Manager, City of Ghent
Lara Medialdea, Public Policy Advisor, City of Buenos Aires