Data Ethics Policy
This page discusses Colouring Indonesia and the Colouring Cities Research Programme in the context of the data ethics agenda, and includes discussion of issues relating to data privacy, security, quality, accessibility, platform transparency, inclusivity, governance and sustainability.
Further information on privacy and security, and on our code of conduct, can be accessed on our Menu pages. Progress on open code development relating to privacy and security can also be viewed on our GitHub site at https://github.com/colouring-london/colouring-london/issues/687 and https://github.com/colouring-london/colouring-london/issues/688.
The Colouring Cities Research Programme (CCRP) has been designed to test a new type of free public information tool, to encourage knowledge and data sharing about buildings and cities, for the public good. Colouring Indonesia is being designed as a safe, positive, constructive space for users of diverse ages, genders, cultural backgrounds, skills and abilities to enjoy and benefit from. Users need to be sure their contributions will be treated with respect and that GDPR principles of lawfulness, fairness and transparency, purpose limitation, data minimisation, accuracy, storage limitation, integrity and confidentiality (security), and accountability, will be met. We in fact actively discourage users from giving us personal data wherever possible. The privacy of building occupiers is also an important issue for us and is prioritised in our data collection approach. It is carefully balanced against the increasingly urgent need to collect information on building stocks to aid emissions reduction and increase urban sustainability as a whole.
Our open code and open data licences mean that our data can be experimented with in any way. The CCRP has been set up at Turing to support testing of the Colouring London prototype design with international research partners, and to promote ethical standards and principles for information management systems that deal with built environment data.
Ongoing effort is made to make data accessible to the widest possible audience, and also to highlight uncertainty and sources of data wherever possible. Breakdown of trust in any of these areas is considered to pose a significant risk to the long-term sustainability of the project. One of the main questions asked by the CCRP is 'How do we balance the need to open up data on buildings, to increase sustainability, resilience and inclusivity in cities, with the need to protect the security and privacy of platform users, and of building users and occupiers as well?'
Key methods tested include: a) making Colouring Cities platforms accessible to everyone in view-only mode, without sign-up being required; b) requesting minimum personal data from platform editors but requiring them to adhere to a clear code of conduct; c) avoiding collection of private/potentially sensitive data on buildings (e.g. on private space within homes) through ongoing research and consultation with stakeholders; d) developing collaborative monitoring systems to pick up issues as quickly as possible; e) using security software and firewalls where applicable to manage data and prevent malicious attacks; f) constantly reassessing our security and privacy procedures and ethical framework.
Our programme's usefulness, success and longevity also rely on public trust. We try to be as transparent as possible regarding what our project is designed to do, what types of data it collects and why these are needed to support the public good, how the project is managed, and what security and privacy features/mechanisms are in place. We work with a time horizon of more than 100 years in mind. We believe that, though technologies will change, low-cost, accessible databases providing free, high-quality, detailed information on national stocks will always be required and desired. To prevent major breaches of privacy and security, and to ensure inclusivity, these databases must be built from the outset to rigorous ethical standards, which are constantly assessed.
Below, information is first provided on existing principles we follow. The Open Data Institute's Data Ethics Canvas is then used to address specific questions.
Principles & frameworks we aim to promote
Below are frameworks and principles we promote. We also assess our platform against ethical standards set by them.
Set of ethical principles the Colouring Indonesia prototype platform is checked against
1. General Data Protection Regulation (GDPR)
Colouring Indonesia is required to meet GDPR requirements with regard to personal data on individuals. GDPR principles are also applied to all types of data collected, as great care is needed when handling certain types of spatial data relating to people's homes, especially data relating to domestic building interior space/activities and to ownership. (Domestic buildings make up the vast majority of buildings in national building stocks.)
GDPR data principles:
- Lawfulness, fairness and transparency
- Purpose limitation
- Data minimisation
- Accuracy
- Storage limitation
- Integrity and confidentiality (security)
- Accountability
2. Open Knowledge Foundation (OKF) Open Definition 2.1 (https://opendefinition.org/od/2.1/en/)
The OKF defines knowledge as 'open if anyone is free to access, use, modify, and share it - subject, at most, to measures that preserve provenance and openness'.
3. The Open Data Charter (https://opendatacharter.net/principles/)
- Open by default
- Timely and comprehensive
- Accessible and useable
- Comparable and Interoperable
- For improved governance and citizen engagement
- For inclusive development and innovation
4. The Open Data Institute's Data Infrastructure Principles (https://theodi.org/article/principles-for-strengthening-our-data-infrastructure/)
- Design for Open
- Build with the web
- Respect privacy
- Benefit everyone
- Think big but start small
- Design to adapt
- Encourage open innovation
5. The Open Data Institute's personal data questions (https://theodi.org/article/openness-principles-for-organisations-handling-personal-data/)
In Colouring Cities, the following questions on handling personal data are also extended to data on people's homes:
- What are we collecting?
- How are we using it?
- How are we sharing it?
- How are we securing it?
- How are we making decisions about it?
- How are we accountable?
- How can we make analysis/outputs accessible?
See also the ODI's Data Ethics Canvas below.
6. The Gemini Principles (https://www.cdbb.cam.ac.uk/DFTG/GeminiPrinciples)
The CCRP promotes the Gemini Principles, developed by the Centre for Digital Built Britain at the University of Cambridge (2019) to provide a 'conscience' for the framework for information management systems on the built environment/infrastructure, and for national digital twins, and to ensure these remain focused on the public good.
- Public good
- Value creation
7. The New Urban Agenda (https://www.un.org/sustainabledevelopment/blog/2016/10/newurbanagenda/ and https://habitat3.org/the-new-urban-agenda/)
The CCRP promotes the UN New Urban Agenda, created to drive global commitment to the goal of sustainable, inclusive, healthy and resilient cities and stocks:
- Provide basic services for all citizens (e.g. housing, water, sanitation, food, healthcare, education, culture, communication technologies).
- Ensure that all citizens have access to equal opportunities and face no discrimination.
- Promote measures that support cleaner cities (air pollution, greenspaces, energy/transport).
- Strengthen resilience in cities to reduce the risk and the impact of disasters (better urban planning, quality infrastructure and improving local responses).
- Take action to address climate change by reducing cities' greenhouse gas emissions.
- Fully respect the rights of refugees, migrants and internally displaced persons regardless of their migration status.
- Improve connectivity and support innovative and green initiatives (including supporting cross sector partnerships).
- Promote safe, accessible and green public spaces.
8. The Universal Declaration of Human Rights (https://www.un.org/en/about-us/universal-declaration-of-human-rights)
The CCRP works to support the UDHR, and specifically the following (of 30 Articles):
- Article 1: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
- Article 2: Everyone is entitled to all the rights and freedoms set forth in this Declaration, without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status. Furthermore, no distinction shall be made on the basis of the political, jurisdictional or international status of the country or territory to which a person belongs, whether it be independent, trust, non-self-governing or under any other limitation of sovereignty.
- Article 3: Everyone has the right to life, liberty and security of person.
- Article 12: No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.
- Article 19: Everyone has the right to freedom of opinion and expression: this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers. (Note: Such speech must also respect other UDHR Articles).
- Article 21: Everyone has the right to take part in the government of his country, directly or through freely chosen representatives. Everyone has the right of equal access to public service in his country.
- Article 25: Everyone has the right to a standard of living adequate for the health and well-being of himself and of his family, including food, clothing, housing and medical care and necessary social services, and the right to security in the event of unemployment, sickness, disability, widowhood, old age or other lack of livelihood in circumstances beyond his control.
- Article 27: Everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific advancement and its benefits. Everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author.
The Data Ethics Canvas
Data ethics are described by the Open Data Institute (ODI) as "a branch of ethics that evaluates data practices with the potential to adversely impact on people and society, in data collection, sharing and use."
Ethical use of data brings about trust and helps allow data to work for everyone. The Colouring Cities Research Programme uses the ODI Data Ethics Canvas to help identify and manage ethical issues throughout the lifecycle of its prototype platform Colouring London.
As part of the process of development, existing and new features within the platform are checked against the questions posed by the Data Ethics Canvas. First-stage responses to core questions are given below.
Where are data from? Are personal or sensitive data involved?
Colouring Indonesia collects data to support research into the sustainability of Indonesia’s building stock. These relate to building location, use, type, age and history, size, materials and construction, sustainability, design/construction team, planning/designation/demolition status, streetscape/green context, whether the building is community owned, and whether the user thinks it contributes to the city.
Our data are available for download as open data from our platform. Most data we are collating/collecting relate to physical characteristics of buildings that can already be seen from the street or in satellite images. Much of this information is also already held within government or commercial databases, though in many cases access to these is restricted for the public and academia. Some government datasets, such as those on building designation and protection, are already publicly available.
Colouring Indonesia does not collect personal data, other than optional emails needed to enable users to reset site passwords. We actively discourage users, on our 'sign up' page, from contributing even their real names. Though it is helpful to understand what sectors and disciplines and groups our users are coming from to help us reach and ensure relevance to as wide an audience as possible, we believe requests for information for this purpose should, if introduced, be optional only, with minimal information asked for. We will continue to explore how this issue can best be addressed by working and consulting across sectors, disciplines and community groups.
Our sign-up agreement also tries to be as transparent as possible. It emphasises that when users make a contribution to Colouring Indonesia they are creating a permanent, public record of all data they add, remove, or change; that the database will record the username and ID of the user/editor, along with the time and date of the change and that all of this information will be made public through the website and through bulk downloads of the edit history.
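As a rough illustration of what such a public edit record might contain, the sketch below models a single entry using only the kinds of information named above: the username and ID of the editor, the time and date of the change, and the change itself. The class name, field names and example values are hypothetical and are not taken from the Colouring Indonesia codebase. The point of the sketch is that no email address or real name forms part of the public record.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class EditRecord:
    """One permanent, public entry in the edit history (hypothetical schema)."""
    building_id: int
    user_id: int
    username: str
    field_name: str
    old_value: Optional[Any]
    new_value: Optional[Any]
    edited_at: datetime


def to_public_row(record: EditRecord) -> dict:
    """Serialise a record as it might appear in a bulk download of the edit history."""
    row = asdict(record)
    # Store the timestamp as ISO 8601 text so the row is easy to export.
    row["edited_at"] = record.edited_at.isoformat()
    return row


# Illustrative values only.
example = EditRecord(
    building_id=1,
    user_id=42,
    username="mapper42",
    field_name="date_year",
    old_value=None,
    new_value=1935,
    edited_at=datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc),
)
row = to_public_row(example)
```

Making the record immutable (`frozen=True`) mirrors the idea that an edit, once made, becomes part of a permanent history rather than something that is silently overwritten.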
Data are gathered in the following ways. Firstly, by identifying and collating existing datasets held by central and regional government bodies and other organisations. Secondly, by harnessing knowledge held within the community through crowdsourcing at building level, whether this be from, for example, building professionals, local councils, local amenity societies, building users or schools. Large-scale computational data generation programmes and live streaming of planning data will also be tested.
Our job is to bring together and visualise data on the building stock that are currently highly fragmented, restricted or unavailable, to make these more accessible, and to increase data accuracy through inclusion of sources and verification. Some data may derive from observation of the building itself, some need to be extracted from historical texts and some come in the form of ready-to-go datasets comprising city-wide spatial statistics. Users are informed on sign-up that data cannot be accepted on the site where any restrictions to their open release may apply. Our 'Community' section differs slightly in that it also asks users how well buildings work and whether they contribute well to the city and/or local area.
The platform is also being designed as a collaborative data maintenance project as described in the ODI's handbook at https://collaborative-data.theodi.org/, with specialist sources (e.g. historians for age data) encouraged to add to, verify and update specific datasets. Our stewarding structure is in the process of being developed.
How are we addressing accuracy, bias and incompleteness?
Data on buildings are collected at building level. To help address issues of accuracy and bias, a number of features are being included. Each subcategory has a source box and a verification button, and a query button is planned to enable problems that cannot be addressed within the editing system to be raised. Moderated dropdown options, plus links to allow references to sources and routes to further information, are also included. Easy-to-access edit histories also allow users to assess the accuracy of data. In certain cases, subcategory questions also need to be phrased in a specific way to address uncertainty.
As with the Wikipedia and OpenStreetMap model, Colouring Cities is designed as a low-cost model overseen by expert contributors. Our landing page also contains a clear statement that data are derived from multiple sources and that accuracy of the data must, ultimately, be determined by the user.
Who are we sharing data with and under what conditions?
Colouring Indonesia has been designed as a free knowledge exchange platform that collates, collects and generates open data on Indonesia’s building stock, able to be used by everyone. We do not sell data and we will not share users' personal data (e.g. email addresses) with any other organisation.
The site is explicit in the user agreement, required to be accepted on our sign-up page, on the way that contributed data can be used. Colouring Indonesia contributions are licensed under the Open Data Commons Open Database License (ODbL) by Colouring Indonesia contributors. Users are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring Indonesia and its contributors. If users alter or build on Colouring Indonesia data, they may distribute the result only under the same licence.
The sign-up agreement emphasises that when you make a contribution to Colouring Indonesia, you are creating a permanent, public record of all data added, removed, or changed by you, as noted above. It is also explicitly stated that Colouring Indonesia is unable to accept any data derived from copyright or restricted sources, other than as covered by fair use. Data sources are encouraged to be recorded wherever possible.
Our platform code is also open and we encourage its use by other cities and towns. Code is available on our GitHub site (https://github.com/colouring-cities/colouring-indonesia) under the following licensing terms: 'Coloring Indonesia Copyright (C) 2022 Coloring Indonesia contributors'.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
What rights will the source have?
Under the Open Data Commons Open Database License (ODbL), Colouring Indonesia contributors are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring Indonesia and its contributors. If users alter or build on Colouring Indonesia data, they may distribute the result only under the same licence.
Are we sure we are not contravening ethical frameworks?
Data privacy and data ethics are of the highest priority to the Colouring Cities Research Programme. We work to the best of our ability to ensure we do not contravene any existing ethical frameworks. Each data category is widely consulted on and rigorously checked for potential issues prior to release. Possible contraventions can be raised by users on our GitHub site, which is constantly monitored. We also actively discourage the contribution of personal data, avoid the collection of data within the building fabric, incorporate controlled dropdown menus (with an internal moderation system for sources) and moderate all bulk uploads. We also state that all data uploaded must be from an open source or generated by the user themselves. CCRP’s ongoing work with internal data ethics working groups at the Alan Turing Institute and with national and local partners ensures diverse feedback routes, and we actively seek to learn from, and collaborate with, organisations advancing the data ethics agenda, such as the ODI.
Why are we collecting data? Are we replacing a service? Are we making things better and for whom?
We are collating and collecting open data on Indonesia’s building stock to provide essential information for citizens, researchers, education providers and policy makers, to support the development of sustainable and resilient building stocks. We also want to assist those designing, constructing, caring for, managing and studying Indonesia’s buildings to help solve urban problems, both by providing data and through interdisciplinary/collaborative work.
Our aim is to create a one-stop-shop for open data on Indonesia’s stock. The release of these data is also designed to stimulate the production of innovative and efficient products within the academic, non-profit and commercial sectors which promote and support the UN's Sustainable Development Goals and the UN New Urban Agenda.
We also believe that it is healthy for Colouring Cities platforms collating data on the building stock to be curated by research institutions, whose stance is impartial and whose brief is to undertake research on the built environment for the public good.
Are we clear in the way the data will be used?
We are already aware of areas of research, such as energy, where demand for accurate building level attribute data is very high. We also know, from extensive consultation, that these building attribute data are also important to the construction and property industry, housing suppliers, planning bodies and the education sector. We are therefore excited about the many ways in which the data might be used.
We are currently developing a curated data showcase facility to allow users to upload information, images and links showing how data from Colouring Indonesia are being applied to urban problems and, in doing so, to inspire and inform.
Who will be positively impacted and how? How can we maximise and measure this?
As noted above, building attribute data for Indonesia at building level are provided free for all those involved in the design, research, construction, management, maintenance and analysis of Indonesia’s buildings, and in its sustainable development. Our project is also designed to encourage use and knowledge sharing by diverse audiences. This element is central to its design.
Who could be negatively affected by the project & how is this being addressed?
The 10th item on the ODI's Data Ethics Canvas addresses the issue of negative project impact. Could the manner in which these data are collected, shared and used cause harm? Could the data be used to target, profile or prejudice people, or to unfairly restrict access? Could people perceive it as harmful?
All spatial data projects that collect information able to be linked to specific addresses need to be very careful with regard to the type of data collected and how it is held and accessed. A number of checks have had to be put in place to ensure the safety and privacy of building occupants, and platform users.
Examples of ways in which we are working to minimise negative impacts include a) discouraging the submission of personal data (e.g. email addresses, real names), b) not collecting data on the insides of homes, c) avoiding free text wherever possible and using preset dropdowns, to prevent cyberbullying and security risks for occupants, d) only allowing one vote per user on 'Like me?', e) having no negative option for 'Like me?', again to prevent cyberbullying, f) having a sign-up page that provides clear guidelines for responsible and ethical use of the site and g) only allowing the copy and paste tool to be used on one building at a time, to deter malicious behaviour, and moderating all bulk uploads.
Owing to concerns raised during consultation with regard to privacy and ownership data, Colouring Indonesia also only collects data on buildings where the freehold is held by the state or third-sector owners.
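The two voting safeguards in this list (one vote per user, and no negative option) can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the platform's actual implementation: the `LikeRegistry` class, its method names and the integer IDs are all hypothetical.

```python
from typing import Dict, Set


class LikeRegistry:
    """Positive-only 'Like me?' votes, at most one per user per building (hypothetical)."""

    def __init__(self) -> None:
        # Maps a building ID to the set of user IDs that have liked it.
        self._likes: Dict[int, Set[int]] = {}

    def like(self, building_id: int, user_id: int) -> bool:
        """Record a like. Returns False if this user has already liked the building."""
        voters = self._likes.setdefault(building_id, set())
        if user_id in voters:
            return False
        voters.add(user_id)
        return True

    def count(self, building_id: int) -> int:
        """Number of distinct users who like the building."""
        return len(self._likes.get(building_id, set()))


registry = LikeRegistry()
first = registry.like(building_id=1, user_id=7)   # first vote is accepted
repeat = registry.like(building_id=1, user_id=7)  # repeat vote is ignored
```

Storing voters in a set means a repeated vote cannot inflate the count, and the absence of any 'dislike' method reflects the decision to offer no negative option.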
How will ongoing issues relating to data ethics be monitored & discussed?
CCRP discusses data ethics issues, on an ongoing basis, with colleagues from Turing's Data Ethics Group, with Turing's 'Facilitating responsible participation in data science' Special Interest Group, and with our project partners. Users are also able to raise issues for consideration on our discussion threads.
What data ethics actions are our current priorities?
Our current data ethics priorities are to improve our user feedback forms and alert features; to try to address issues relating to security/privacy in relation to free text boxes; and to identify areas of concern with our Colouring Cities Research Programme partners.