Definitions

Agencies collecting information from people and organisations have a legal and ethical responsibility to ensure that:
- they respect the privacy of those providing the information; and
- individuals and organisations cannot be identified in a disseminated dataset.

There is a clear relationship between confidentiality and privacy. A breach of confidentiality can result in the disclosure of information that might intrude on the privacy of a person or an organisation.

Confidentiality refers to the obligation of data custodians (agencies that collect information) to keep secret the confidential information entrusted to them.

Why is confidentiality important?

Agencies collecting data often rely on the trust and goodwill of the Australian people to provide information. Maintaining public trust helps to achieve better quality data and higher response rates to data collections.
This leads to reliable data to inform governments, researchers and the community.

Confidentiality, and therefore trust, can be broken when a person or organisation can be identified in a disseminated dataset, either directly or indirectly. For example, a person could be directly identified in a dataset if that dataset contains their name and address. However, a person or an organisation could also be indirectly identified if the dataset contains a combination of information from which their identity can be deduced.

Example: the combination of date of birth and a detailed area code (for example, a town where 300 people live) may enable identification, as there will be some unique dates of birth in such a small area.

What does ‘confidentialise’ mean?

The term confidentialise refers to the steps a data custodian must take to mitigate the risk that a particular person or organisation could be identified in a dataset, either directly or indirectly. Confidentialisation requires two key steps:
1. de-identification of the data, that is, the removal of any direct identifiers (e.g. name and address) from the data; and
2. assessment and management of the risk of indirect identification occurring in the de-identified dataset.
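As a minimal sketch of step 1 only, the Python below strips direct identifiers from a small list of records. The field names (`name`, `address`, `date_of_birth`, `area_code`), the sample values and the `deidentify` helper are hypothetical, chosen purely for illustration; they do not reflect any agency's actual systems or standards.

```python
# Illustrative sketch: field names and helper are assumptions, not an agency standard.
DIRECT_IDENTIFIERS = {"name", "address"}


def deidentify(records, direct_identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with the direct identifier fields removed."""
    return [
        {field: value for field, value in record.items()
         if field not in direct_identifiers}
        for record in records
    ]


# Made-up sample records for demonstration only.
records = [
    {"name": "A. Citizen", "address": "1 Example St",
     "date_of_birth": "1950-02-03", "area_code": "T001"},
    {"name": "B. Resident", "address": "2 Example St",
     "date_of_birth": "1982-11-30", "area_code": "T001"},
]

deidentified = deidentify(records)
```

The de-identified records still carry date of birth and area code, which is exactly the kind of combination that step 2 has to assess.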
Removing identifying information such as name and address protects data providers from direct identification. However, it may still be possible to indirectly identify a person or an organisation in a de-identified dataset. If enough detail is available, the identity of a particular person or organisation may be derived from the presence of a very rare characteristic or the combination of unique or remarkable characteristics.

Example: the identity of a person could be deduced if a dataset indicates the person is over 85 years old, has yearly income of more than one million dollars, and resides in a town of 400 people.

Example: the identity of a person with a very rare disease or health condition could be deduced even in highly aggregated data.
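One simple way to look for this kind of risk, sketched below in Python, is to count how many records share each combination of potentially identifying characteristics and flag the combinations that occur rarely. The `rare_combinations` helper, the field names, the sample values and the threshold of 2 are all assumptions made for illustration. A real assessment would also need to consider what other information a data user might already hold, so a count like this is only a starting point.

```python
from collections import Counter


def rare_combinations(records, fields, threshold=2):
    """Return each combination of values (over `fields`) that is shared by
    fewer than `threshold` records, mapped to its record count."""
    counts = Counter(
        tuple(record[field] for field in fields)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < threshold}


# Made-up de-identified records for demonstration only.
records = [
    {"age_group": "85+", "income_band": "1M+", "town": "Smalltown"},
    {"age_group": "25-34", "income_band": "50-100k", "town": "Smalltown"},
    {"age_group": "25-34", "income_band": "50-100k", "town": "Smalltown"},
]

at_risk = rare_combinations(records, ["age_group", "income_band", "town"])
```

Here only the first record's combination is flagged, because no other record shares it; the two identical records are not flagged at this threshold.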
Confidentialising data involves removing or altering information, or collapsing detail, to ensure that no person or organisation is likely to be identified in the data (either directly or indirectly).

There are various methods used to confidentialise data. These methods aim to protect the identity of individuals and organisations while enabling sufficiently detailed information to be released to make the data useful for statistical and research purposes. The main techniques for confidentialising data are described in Confidentiality Information Sheet 4: ‘How to confidentialise data: the basic principles’. For more information about assessing and managing the risks of indirect identification in microdata see Confidentiality Information Sheet 5: ‘Managing the risk of disclosure in the release of microdata’.

The confidentiality information series

This information sheet is part of a series designed to explain, and provide advice on, a range of issues around confidentialising data, comprising:
– Sheet 1: ‘Confidentiality: what is it and why is it important?’;
– Sheet 2: ‘Confidentiality: the obligation to protect identity and privacy’;
– Sheet 3: ‘Confidentiality: managing identification risks’;
– Sheet 4: ‘How to confidentialise data: the basic principles’;
– Sheet 5: ‘Managing the risk of disclosure in the release of microdata’; and
– Glossary.

This series will be expanded in the future to provide further information about aspects of confidentiality. For more information about confidentiality, or to provide feedback on this series, please email: statistical.data.integration@nss.gov.au