Agencies involved in the collection of data have a responsibility to implement procedures that protect the confidentiality of the data and the privacy of the respondents who provided it. The primary goal is to ensure that there is no risk that an individual or organisation is identifiable from the output released by a statistical collection agency.
Confidentiality and privacy are usually achieved by ensuring that identifiable information about individuals, households and businesses is not released outside the collection agency, is available inside the agency on a need-to-know basis only, and cannot be derived from released data.
Legislative obligations governing the collection and release of information are set out in Acts of Parliament and in the other government policies and guidelines outlined below. These obligations seek to strike a balance between the need to collect and use demographic and sensitive information and the need to protect respondent/provider identity.
These obligations relate to the collection and dissemination of information in both public and private agencies.
11.1.1 Privacy Act 1988
The Commonwealth Privacy Act 1988 enacted the principles outlined in the Organisation for Economic Co-operation and Development Guidelines on the Protection of Privacy and Transborder Flows of Personal Data, 1984. The 'Privacy Act 1988' provides protection for personal information that is handled by Federal and ACT government agencies.
Section 14 of the Privacy Act sets out eleven Information Privacy Principles that govern the conduct of Commonwealth agencies in their collection, management and use of data containing personal information. The Information Privacy Principles do not permit agencies to use or disclose, in identifiable form, records of personal information for research and statistical purposes, unless specifically authorised or required by another law, or the individual has consented to the use or disclosure.
The Privacy Amendment (Private Sector) Act 2000 amended the Privacy Act 1988 to ensure that most private organisations are bound by the ten National Privacy Principles outlined in Schedule 3 of the Act. The National Privacy Principles in the Privacy Act 1988 set out how private sector organisations and health service providers should collect, use, keep secure and disclose personal information. The principles give individuals a right to know what information an organisation holds about them and a right to correct that information if it is wrong.
The principles and guidelines of the Privacy Act are enacted either through state legislation or practice guidelines in each of the states and territories not covered directly by the Privacy Act 1988.
For a copy of the Privacy Act, Information Privacy Principles, National Privacy Principles, data-matching guidelines or other related legislation see the website of the Office of the Federal Privacy Commissioner.
11.1.2 Privacy Commissioner
The Office of the Federal Privacy Commissioner has responsibilities under the federal Privacy Act 1988.
The Privacy Commissioner gives advice:
· to individuals on their rights under the Privacy Act and related legislation
· about the Privacy Act and privacy issues more generally, including the promotion of best practice in privacy standards
· to Federal and ACT government agencies and other organisations on how to comply with the Privacy Act and related legislation.
The Privacy Commissioner also provides policy advice on privacy issues in response to written requests from Ministers, Federal and ACT government agencies and the private sector, examines proposed legislation for privacy implications and conducts research into technological and social developments that affect individual privacy. The Privacy Commissioner investigates complaints from individuals about instances of interference with privacy, conducts audits of the personal information handling practices of Federal and ACT government agencies and organisations under the Privacy Act, and monitors the conduct of Federal government data-matching programs.
All Federal, State and Territory privacy legislation and policies can be accessed through the website of the Office of the Federal Privacy Commissioner.
11.1.3 Freedom of Information Act (FOI)
The Freedom of Information Act 1982 provides individuals with the right to obtain access to documents (rather than information) in the possession of Ministers, departments and public authorities, other than exempt documents. This right is not restricted to documents containing personal information. The Act also allows an individual to request the amendment of records containing personal information that is incomplete, incorrect, out of date or misleading.
The 'Freedom of Information Act' lists several grounds on which an agency may deny access. These include cases where disclosure of the information may be injurious to law enforcement or could threaten the safety of individuals.
All Australian States and Territories have their own Freedom of Information legislation which can be accessed through the website of the Department of the Prime Minister and Cabinet.
11.1.4 Ethics Committees
In sensitive areas, such as health and medical issues, research ethics committees may exist or can be created directly for a specific purpose or project. The Guidelines Under Section 95 of the Privacy Act 1988, issued by the National Health and Medical Research Council provide a framework for the conduct of medical research using information held by Commonwealth agencies where identified information needs to be used without consent. In these situations, a Commonwealth agency may collect or disclose, in identifiable form, records for medical research purposes without infringing the 'Privacy Act' if the proposed medical research has been approved by a properly constituted Human Research Ethics Committee in accordance with the 'Guidelines Under Section 95 of the Privacy Act 1988'.
11.1.5 Data Matching
In the data-matching guidelines, the Privacy Commissioner has defined data matching as "the large scale comparison of records or files of personal information, collected or held for different purposes, with a view to identifying matters of interest."
The Commissioner has issued advisory 'Guidelines for the use of data-matching in Commonwealth administration' for voluntary adoption by agencies conducting matching other than the programs specifically regulated by the Data-matching Program (Assistance and Tax) Act 1990. Because that Act regulates matching that uses the tax file number, the guidelines apply when the tax file number is not used in the matching process.
Data matching involves the bringing together of two or more datasets, at the unit record level, to form a composite record. It includes statistical matching as well as data linking (using identifiers) of datasets. The use of data matching techniques for statistical purposes is likely to increase as they have clear advantages in reducing provider load (compared with asking additional questions) and supporting longitudinal analysis. However, there are privacy issues, both real and perceived, that need to be managed.
Statistical matching involves selecting core items that have been collected in different surveys or administrative datasets and using statistical matching techniques to synthesise records so that a richer dataset can be used. These core items may be from different household surveys (e.g. age, sex, ethnicity, income, geographic location) or economic collections (e.g. industry, geography and numbers of employees). Statistical matching techniques can be used to provide integrated microdatasets for supporting micro-simulation and other analytical techniques.
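As a rough illustration of the idea, the Python sketch below attaches a variable from one survey to the records of another by matching on shared core items. The matching rule (exact agreement on sex and region, then nearest age), the function name statistical_match and all of the sample records are assumptions made for this example. It is not a method prescribed by this guide, and statistical matching in practice normally relies on more careful techniques.

```python
def statistical_match(recipients, donors, exact_keys, distance_key, donate_keys):
    """Attach donor variables to each recipient record using shared core items."""
    matched = []
    for rec in recipients:
        # Candidate donors agree with the recipient on every exact-match core item.
        candidates = [d for d in donors if all(d[k] == rec[k] for k in exact_keys)]
        combined = dict(rec)
        if candidates:
            # Among the candidates, take the donor closest on the numeric core item.
            best = min(candidates, key=lambda d: abs(d[distance_key] - rec[distance_key]))
            for k in donate_keys:
                combined[k] = best[k]
        else:
            # No donor shares the core items, so the donated variable stays missing.
            for k in donate_keys:
                combined[k] = None
        matched.append(combined)
    return matched


# Hypothetical records from two separate collections with common core items.
survey_a = [
    {"sex": "F", "region": "NSW", "age": 34, "income": 52000},
    {"sex": "M", "region": "VIC", "age": 61, "income": 38000},
]
survey_b = [
    {"sex": "F", "region": "NSW", "age": 31, "health_score": 7},
    {"sex": "F", "region": "NSW", "age": 50, "health_score": 4},
    {"sex": "M", "region": "VIC", "age": 58, "health_score": 5},
]

synthetic = statistical_match(survey_a, survey_b, ["sex", "region"], "age", ["health_score"])
```

The resulting records are synthetic: the donated value comes from a similar unit, not necessarily from the same person or business.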
Data linking (exact matching) involves drawing together datasets at the unit record level on the basis of a common identifier, for example, the Australian Business Number.
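A minimal Python sketch of exact matching on a common identifier is shown here. The field name "abn", the function name link_on_identifier and the sample values are hypothetical; the identifiers are placeholders, not valid Australian Business Numbers.

```python
def link_on_identifier(left, right, key):
    """Exact matching: combine unit records that share the same identifier."""
    # Index the second dataset by identifier for direct lookup.
    right_by_id = {rec[key]: rec for rec in right}
    linked, unlinked = [], []
    for rec in left:
        partner = right_by_id.get(rec[key])
        if partner is None:
            # No record with this identifier exists in the second dataset.
            unlinked.append(rec)
        else:
            # Merge the two unit records into one composite record.
            linked.append({**rec, **partner})
    return linked, unlinked


# Hypothetical unit records held by two different collections.
business_survey = [
    {"abn": "ABN-001", "employees": 12},
    {"abn": "ABN-002", "employees": 250},
]
administrative_data = [
    {"abn": "ABN-001", "industry": "Retail"},
]

linked, unlinked = link_on_identifier(business_survey, administrative_data, "abn")
```

Records left in the unlinked list are worth reviewing, since a missing link may reflect an identifier change and not a genuinely absent unit.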
While linked datasets which bring together information from different agencies, potentially on a longitudinal basis, offer rich sources of statistical information, the feasibility of these datasets will depend on resolving related privacy, confidentiality and associated legislative concerns.
When record linkage of longitudinal data is to be performed, the record identifier should be used to ensure that once records are selected, they are followed over time. On the surface, it seems relatively straightforward to link units over time by matching the record identifiers at different points in time. However, potential complexities include tracking entries and exits, and identifying statistical units and appropriate record identifier links for complex units. For businesses, not all entries and exits will be genuine births or deaths; many will be the result of restructures, mergers and similar changes. Entries and exits need to be categorised so that appropriate units can be linked from one year to the next. Since businesses continually change the way they are organised or structured, it is necessary to ensure that like businesses are being linked.
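The categorisation of entries and exits can be sketched in Python as follows. This is an illustration under stated assumptions only: the function name classify_units, the category labels and the idea of a restructure lookup (a mapping from an old identifier to the identifier that replaced it) are hypothetical and would have to come from the agency's own register maintenance.

```python
def classify_units(ids_year1, ids_year2, restructures):
    """Categorise units between two periods.

    restructures maps an old identifier to the identifier that replaced it
    (for example after a merger); it is assumed to be supplied separately.
    """
    year1, year2 = set(ids_year1), set(ids_year2)
    continuing = year1 & year2      # same identifier present in both periods
    exits = year1 - year2           # identifier disappears
    entries = year2 - year1         # identifier appears
    # An exit paired with an entry through a known restructure is a re-link,
    # not a death followed by a birth.
    relinked = {old: new for old, new in restructures.items()
                if old in exits and new in entries}
    deaths = exits - set(relinked)
    births = entries - set(relinked.values())
    return {"continuing": continuing, "relinked": relinked,
            "births": births, "deaths": deaths}


# Hypothetical identifiers: unit B was restructured and now reports as D.
result = classify_units(["A", "B", "C"], ["A", "D", "E"], {"B": "D"})
```

In this example A continues, B is linked to D, C is treated as a death and E as a birth.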
11.1.6 Methods to Maintain Privacy
There are a number of ways in which the privacy of respondents can be maintained:
· code numbers, instead of names, can be used on the questionnaires to minimise the links that can be made between questionnaires and respondents
· suppression of personal identification information such as names, addresses and telephone numbers should be undertaken at the earliest possible stage in data processing or analysis
· questionnaires and sources should be destroyed as early as possible in the process
· when creating tables, broad cross-classifications can be used to avoid cells with only a small number of contributing units.
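The first two methods in the list above (code numbers in place of names, and early suppression of identifying details) can be sketched in Python as below. The function name pseudonymise, the field names and the sample respondent are assumptions for illustration; how the key table is stored and who may access it is a matter for agency policy and is not shown.

```python
import secrets


def pseudonymise(records, identifying_fields):
    """Replace identifying fields with a random code number.

    Returns the working records (no identifying fields) and a separate
    key table linking each code back to the identifying details.
    """
    key_table = {}
    working = []
    for rec in records:
        # Draw a random code; redraw in the unlikely event of a collision.
        code = secrets.token_hex(4)
        while code in key_table:
            code = secrets.token_hex(4)
        key_table[code] = {f: rec.get(f) for f in identifying_fields}
        # Keep only the non-identifying items in the working record.
        cleaned = {k: v for k, v in rec.items() if k not in identifying_fields}
        cleaned["code"] = code
        working.append(cleaned)
    return working, key_table


# Hypothetical questionnaire responses.
responses = [
    {"name": "Respondent One", "phone": "0000 0000", "age": 42, "income": 61000},
]
working, key_table = pseudonymise(responses, ["name", "phone"])
```

Processing and analysis would then use only the working records, with the key table held separately or destroyed once it is no longer needed.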
11.2 TECHNIQUES TO CONFIDENTIALISE DATA
The aim of confidentialising is to ensure that no respondent can be identified, while maximising the statistical usefulness of the data. Confidentialising often has a negative impact on the usefulness of data as some of the detail may be suppressed or modified.
11.2.1 Tabular Data
Threshold rules and cell concentration rules can be established for tables. A threshold rule specifies the minimum number of units that must contribute to the value of a cell. Where the number of units contributing to the value of a cell is less than a pre-specified threshold value, the cell would be suppressed in order to prevent disclosure.
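A threshold rule can be sketched in Python as follows. The threshold of 3, the function name apply_threshold_rule and the sample cells are assumptions chosen for illustration; the actual threshold is whatever value the agency has pre-specified.

```python
def apply_threshold_rule(cells, threshold):
    """Suppress any cell with fewer contributing units than the threshold.

    cells maps a cell label to a dict with the cell 'value' and the number
    of 'contributors'. Suppressed cells are returned as None.
    """
    return {
        label: (cell["value"] if cell["contributors"] >= threshold else None)
        for label, cell in cells.items()
    }


# Hypothetical table cells.
cells = {
    "Region A": {"value": 1250, "contributors": 12},
    "Region B": {"value": 90, "contributors": 2},
}
published = apply_threshold_rule(cells, threshold=3)
```

Here Region B has only two contributors, so its value is withheld while Region A is published unchanged.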
The cell concentration rule (also called a cell dominance rule) prevents the publication of cells where a small number of respondents contribute a large percentage to the cell total. For example, it may be decided that if any respondent/contributor accounts for a large percentage of a cell total, the cell will not be published.
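One simple form of the concentration rule, checking the share of the largest single contributor, might look like this in Python. The 80 per cent limit, the function name is_dominated and the sample contributions are hypothetical; agencies set their own limits and may apply rules involving the largest several contributors.

```python
def is_dominated(contributions, max_share):
    """Return True if the largest contributor exceeds max_share of the cell total."""
    total = sum(contributions)
    if not contributions or total <= 0:
        # An empty or zero-valued cell has no dominant contributor.
        return False
    return max(contributions) / total > max_share


# Hypothetical contributions of individual respondents to two cells.
concentrated_cell = [900, 50, 50]   # largest contributor holds 90% of the total
balanced_cell = [400, 300, 300]     # largest contributor holds 40% of the total

suppress_first = is_dominated(concentrated_cell, max_share=0.8)
suppress_second = is_dominated(balanced_cell, max_share=0.8)
```

Under the assumed limit, the first cell would be withheld and the second could be published.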
There are several techniques that have been developed to minimise the risk of disclosure of information that can be traced back to the responding units. These techniques fall into three main categories, listed below:
· Data Suppression
This technique simply involves not releasing information which may identify individuals in a cell. There are some simple automated suppression algorithms available for two-dimensional tables. However, there are complex problems associated with tabulations of higher dimensions. Producing an efficient and practical automated system requires high resource input and much fine tuning. The primary suppression of sensitive or small cells may need to be complemented by the secondary (or consequential) suppression of other cells. For example, generally at least one other cell in a row or column containing a suppressed value also needs to be suppressed to prevent the original cell value from being recalculated by subtraction from the total.
· Data Rounding
Random rounding replaces small values that would appear in a table with other small random numbers. Because cells and totals are altered independently, a randomly rounded table is generally not additive (additivity means that table totals, either between or within tables, equal the sum of the relevant cell values or subtotals). This technique can be unbiased if done in an appropriate manner. A value is biased if the expected value of the data after a confidentiality technique has been applied does not equal the value of the original entry it is replacing.
Controlled rounding is a combination of conventional rounding and random rounding. Controlled rounding may result in additivity, unbiasedness, and reduction in data distortion (when compared to other rounding methods). However, this method may not provide consistency among tables.
· Category Collapsing
Data items may be collapsed across classifications. Classifications which are very detailed, such as geography, country of birth, industry or occupation, can be collapsed down to a broader level.
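To make the unbiasedness point under Data Rounding concrete, the Python sketch below uses one common formulation of random rounding: each value is rounded up or down to a multiple of a base, with the probability of rounding up proportional to the remainder. This formulation, the base of 3, and the function names random_round and expected_rounded_value are assumptions for illustration; this guide does not prescribe a particular scheme.

```python
import random
from fractions import Fraction


def random_round(value, base, rng):
    """Randomly round a non-negative integer to a multiple of base."""
    remainder = value % base
    if remainder == 0:
        return value                      # already a multiple of the base
    lower = value - remainder
    # Round up with probability remainder/base, otherwise round down.
    if rng.random() < remainder / base:
        return lower + base
    return lower


def expected_rounded_value(value, base):
    """Exact expected value of random_round, used to check unbiasedness."""
    remainder = value % base
    lower = value - remainder
    p_up = Fraction(remainder, base)
    return p_up * (lower + base) + (1 - p_up) * lower


rng = random.Random(0)                    # fixed seed so the sketch is repeatable
cells = [1, 2, 4]                         # hypothetical small cell counts
rounded_cells = [random_round(c, 3, rng) for c in cells]
rounded_total = random_round(sum(cells), 3, rng)
# sum(rounded_cells) need not equal rounded_total: the table is not additive.
```

Because the expected value of each rounded cell equals the original value, this scheme is unbiased, but the independently rounded total will not always match the sum of the rounded cells.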
11.2.2 Confidentialising Unit Record Data (Microdata)
Statistical information generated and released from surveys and administrative record collections usually includes data that are available at a detailed level, both in terms of the characteristics of individuals and their geographic location, such as postcode. Although personal information such as name and address may be removed, individuals may still be identified by combining the data provided with information that is already known.
The issue for agencies to address is what level of aggregation of data is required to avoid compromising the confidentiality of the individual's information and still produce meaningful data.
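One way to explore this trade-off is to count how many records share each combination of potentially identifying characteristics, and then see how the count changes as detail is collapsed. The Python sketch below does this under stated assumptions: the function name risky_records, the minimum group size of 2, the choice of characteristics and the sample records are all hypothetical.

```python
from collections import Counter


def risky_records(records, characteristics, min_group_size):
    """Return records whose combination of characteristics is shared by
    fewer than min_group_size records in the file."""
    def key(rec):
        return tuple(rec[c] for c in characteristics)

    counts = Counter(key(rec) for rec in records)
    return [rec for rec in records if counts[key(rec)] < min_group_size]


# Hypothetical unit records with names and addresses already removed.
records = [
    {"postcode": "2600", "age_group": "30-39", "occupation": "Nurse"},
    {"postcode": "2600", "age_group": "30-39", "occupation": "Nurse"},
    {"postcode": "2601", "age_group": "30-39", "occupation": "Surgeon"},
    {"postcode": "2602", "age_group": "60-69", "occupation": "Nurse"},
]

detailed = risky_records(records, ["postcode", "age_group", "occupation"], 2)

# Collapse geography to a broader area and drop occupation, then re-check.
coarse = [{"area": r["postcode"][:2], "age_group": r["age_group"]} for r in records]
aggregated = risky_records(coarse, ["area", "age_group"], 2)
```

In this example the detailed file leaves two records unique on the chosen characteristics, while the collapsed file leaves one, at the cost of losing postcode and occupation detail.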