GLOSSARY


Accreditation

An accredited Integrating Authority (IA) must be used for all high-risk data integration projects involving Commonwealth data. The interim accreditation process includes: an assessment by the applicant against eight accreditation criteria (these include the ability to ensure secure data management, availability of appropriate skills, transparency of operation, and a culture and values that ensure the protection of confidential information and support the use of data); an audit by an independent third party to assess whether a prospective agency or organisation meets the criteria; and a final decision made by the Cross Portfolio Data Integration Oversight Board. For more information on the accreditation process and criteria see ‘The interim accreditation process for Integrating Authorities’.

Administrative data

Information collected by agencies for the administration of programs, policies or services (e.g. Medicare data, taxation data) with the potential to be used for statistical or research purposes.

Agency head

The person legally accountable for the activities of an organisation, and those of its staff and affiliates. For example,
- Government Department – the Secretary of the Department
- Private sector – CEO, Company Secretary or Managing Director
- University – Vice-Chancellor, Pro Vice-Chancellor, Deputy Vice-Chancellor or University Registrar.

Commonwealth dataset

Includes any dataset containing information collected by, or on behalf of, the Australian government or any dataset containing information collected by another Australian jurisdiction and provided to the Australian government for the common good.

Confidentialise

To protect data confidentiality by removing or altering information, collapsing detail within a dataset, or applying aggregation and related techniques.

For more information about confidentiality and how to confidentialise data, see the Confidentiality Information Series.


Confidentiality

The legal and ethical obligation to the provider of information to maintain and protect the privacy and secrecy of that information. Also see Confidentialise.

For more information about confidentiality and how to confidentialise data, see the Confidentiality Information Series.


Content data

Refers to service or clinical information contained in a health record. It does not include demographic information.

Cross Portfolio Data Integration Oversight Board

This Board was established in 2011 to oversee the development of a cross government environment that is safe and effective for data integration involving Commonwealth data for statistical and research purposes. The Board is chaired by the Australian Statistician and membership includes the Secretaries of the Department of Families, Housing, Community Services and Indigenous Affairs; the Department of Health and Ageing; and the Department of Human Services. For more information see Cross Portfolio Data Integration Oversight Board Terms of Reference.
Data custodians

Agencies responsible for managing the use, disclosure and protection of source data used in a statistical data integration project. Data custodians collect and hold information on behalf of a data provider. The role of data custodians may also extend to producing source data, in addition to their role as a holder of datasets.

Data integration

See Statistical Data Integration.

Data linking

An element in the process of data integration. Data linking creates links between data from different sources based on common features present in those sources. Also known as 'data linkage'.

Data provider

An individual, household, business or other organisation which supplies data either for statistical or administrative purposes.

Data user

A person involved in accessing and investigating integrated datasets for statistical and research purposes. Data users include academics working in research institutions and employees undertaking research in Commonwealth and State/Territory agencies.

De-identified data

De-identified data is data that has had any identifiers (i.e. information that directly establishes the identity of an individual or organisation, such as name, address, Australian Business Number) removed.
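
As a rough illustration of the definition above, the following Python sketch drops direct identifiers from a record. The field names (`name`, `address`, `abn`) and the sample record are hypothetical, chosen only to mirror the examples in the definition; they are not taken from any real dataset.

```python
# Hypothetical field names mirroring the identifiers named in the definition
# (name, address, Australian Business Number).
IDENTIFIER_FIELDS = {"name", "address", "abn"}


def de_identify(record, identifier_fields=IDENTIFIER_FIELDS):
    """Return a copy of the record with the listed direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in identifier_fields}


# Made-up example record.
record = {
    "name": "Company X",
    "address": "1 Example St",
    "abn": "00000000000",
    "industry": "Retail",
    "profit_band": "1-5m",
}
de_identified = de_identify(record)
print(de_identified)
```

Removing identifiers in this way does not by itself rule out indirect identification from a combination of the remaining fields (see Identifiable data).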

Deterministic (exact) linking

Linking records belonging to the same unit by using a unique identifier such as Australian Business Number.
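
A minimal sketch of exact linking in Python, assuming two small lists of records that both carry a unique identifier. The field name `abn`, the function name `link_exact` and the sample values are hypothetical and serve only as illustration.

```python
def link_exact(left, right, key="abn"):
    """Pair records from two sources whose unique identifier matches exactly."""
    # Index the right-hand source by identifier so each lookup is cheap.
    index = {}
    for rec in right:
        index.setdefault(rec[key], []).append(rec)
    # Keep only pairs whose identifier appears in both sources.
    return [(l, r) for l in left for r in index.get(l[key], [])]


# Made-up records from two hypothetical sources.
tax = [{"abn": "111", "turnover": 500}, {"abn": "222", "turnover": 900}]
survey = [{"abn": "222", "employees": 12}, {"abn": "333", "employees": 4}]

links = link_exact(tax, survey)
print(links)
```

Only the unit present in both sources is linked; records whose identifier appears in one source only are left unlinked.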
End users

People who examine research findings rather than produce outputs. Examples include employees undertaking research in public and private sector organisations, representatives from media outlets and consumer advocacy groups, and members of the wider community.

Ethics approval

A judgement made by an approved Human Research Ethics Committee that a human research proposal meets the requirements of the National Statement on Ethical Conduct in Human Research and is ethically acceptable before the commencement of such research.


Ethics committee

Shortened form of Human Research Ethics Committee (HREC). HRECs protect the welfare and rights of participants involved in research. HRECs review proposals for research that involves humans, monitor the conduct of research and deal with complaints that arise from research. In the context of data integration involving Commonwealth data, some data custodians require an ethics committee to approve a data integration project prior to the release of data. Ethics approval does not, however, guarantee that approval for data release will be given. More information on HRECs, including a list of registered HRECs, is available from the National Health and Medical Research Council.


Exact linking

See Deterministic linking.
Governance arrangements

The way that decisions are made, how they are communicated, how they are monitored and the extent to which sanctions are imposed for non-compliance.

Identifiable data

Identifiable data enables a person to establish the identity of a person or organisation to which some data relate. The identity of a person or organisation could be established directly if the dataset contains identifiers such as name and address, or indirectly if there is a combination of information in the dataset from which their identity can be deduced.

Identifier

Information that directly establishes the identity of an individual or organisation. Examples of identifiers are: name, address, driver's licence number, Medicare number and Australian Business Number. Also known as direct identifier.

Institutional arrangements

The organisation of activities associated with data integration, along with the characteristics and roles of institutions involved in such activities.

Integrated dataset

A dataset created by bringing together two or more datasets, generally at the unit level (i.e. for an individual person or business) or micro level (e.g. information for a small geographic area), for statistical and research purposes.

Integrating Authority

An Integrating Authority (IA) is the single agency ultimately accountable for the sound conduct of the statistical data integration project, leading it through its approval and implementation. For more information, see the paper on ‘Rights, roles and responsibilities of Integrating Authorities’.
Microdata

Microdata are unit record data, i.e. data for an individual person or organisation.

Personal information

Information or an opinion (including information or an opinion forming part of a database), whether true or not, and whether recorded in a material form or not, about an individual whose identity is apparent, or can reasonably be ascertained, from the information or opinion.
Source: Privacy Act 1988

Privacy

An individual’s right to have their personal information managed so that it is kept confidential except where informed consent has been given, or a legal authority exists, in accordance with the requirements of the Privacy Act 1988.

Privacy Act

The Privacy Act 1988 regulates the collection, storage, use and disclosure of personal information by Commonwealth and ACT government agencies and certain private sector organisations. Section 14 of the Privacy Act sets out 11 Information Privacy Principles that govern the conduct of Commonwealth agencies in their collection, management and use of data containing personal information. The Information Privacy Principles do not permit agencies to use or disclose, in identifiable form, records of personal information for research and statistical purposes, unless specifically authorised or required by another law, or the individual has consented to the use or disclosure. The states and territories have their own regulations governing privacy of personal information.

Privacy Impact Assessment

An assessment tool that describes the personal information flows in a project, and analyses the possible impacts that those flows, and the project as a whole, may have on the privacy of individuals. The aim of a Privacy Impact Assessment is to identify and recommend options for managing, minimising or eradicating privacy impacts. For more information on Privacy Impact Assessments see Privacy Impact Assessment Guide, August 2006, Office of the Australian Information Commissioner, www.privacy.gov.au.


Probabilistic linking

Data linking based on the relative likelihood that two records belong to the same unit given a set of similarities/differences between the values of the linking variables (e.g. name, date of birth, sex) on the two records.
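
One common way to express "relative likelihood" is as a sum of agreement weights in the style of the Fellegi-Sunter model; the glossary itself does not prescribe a method, so the Python sketch below is only an illustration. The `m` and `u` probabilities, the field names and the sample records are all hypothetical.

```python
import math

# Hypothetical parameters for each linking variable:
#   m = probability the values agree when two records belong to the same unit
#   u = probability the values agree when they belong to different units
PARAMS = {
    "name": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def match_weight(rec_a, rec_b, params=PARAMS):
    """Sum log2 likelihood ratios over the linking variables."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


# Made-up records.
a = {"name": "john smith", "date_of_birth": "1970-01-01", "sex": "M"}
b = {"name": "john smith", "date_of_birth": "1970-01-01", "sex": "M"}
c = {"name": "jon smith", "date_of_birth": "1970-01-01", "sex": "M"}
d = {"name": "mary jones", "date_of_birth": "1985-06-30", "sex": "F"}

for other in (b, c, d):
    print(round(match_weight(a, other), 2))
```

A pair with a higher weight is more likely to belong to the same unit. In practice the weight would be compared against thresholds to decide whether a pair is treated as a link, a non-link, or a case for clerical review; such thresholds are a design choice and are not shown here.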

Providers

See Data provider.
Re-identifiable data

Data from which identifiers have been removed and replaced by a code, but for which it remains possible to re-identify a specific individual by, for example, using the code or linking different datasets.


Research purposes

Activities to investigate or explain phenomena, which result in statistical outputs or conclusions drawn in relation to population groups and not in relation to specific individuals, households, businesses or organisations.


Separation principle

The separation principle is one mechanism to protect the identities of individuals and organisations in datasets. The separation principle means that no-one can see the identifying or demographic information, used to identify which records relate to the same person or organisation (e.g. name, address, date of birth), in conjunction with the content data (e.g. clinical information, benefit information, company profits). Instead, staff can see only the information they need to do the linking or analysis. So, rather than someone being able to see that John Smith has a rare medical condition, or the profits earned by Company X, the person doing the linking sees only the information needed to do the linking (e.g. John Smith’s name and address) and the analyst just sees a record, with no identifying information, showing that a person has a rare medical condition together with any other variables needed for analysis (e.g. broad age group, sex).
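
The split described above can be sketched in Python as two views of the same records: one for the person doing the linking and one for the analyst. The field names, the `record_id` key and the sample record are hypothetical; a real implementation would also rely on access controls and separate storage, which this sketch does not attempt to show.

```python
# Hypothetical fields used to work out which records relate to the same person.
LINKING_FIELDS = ("name", "address", "date_of_birth")


def separate(records, linking_fields=LINKING_FIELDS):
    """Split each record into a linking view and an analysis view.

    The two views share only an arbitrary record_id, so neither view on its
    own shows identifying information together with content data.
    """
    linking_view, analysis_view = [], []
    for record_id, rec in enumerate(records):
        linking_view.append(
            {"record_id": record_id, **{f: rec[f] for f in linking_fields}}
        )
        analysis_view.append(
            {"record_id": record_id,
             **{k: v for k, v in rec.items() if k not in linking_fields}}
        )
    return linking_view, analysis_view


# Made-up record echoing the John Smith example in the text.
records = [{
    "name": "John Smith",
    "address": "1 Example St",
    "date_of_birth": "1970-01-01",
    "age_group": "50-59",
    "sex": "M",
    "condition": "rare medical condition",
}]

linking_view, analysis_view = separate(records)
print(linking_view[0])
print(analysis_view[0])
```

The linking view holds the name and address but no clinical information, while the analysis view holds the condition, broad age group and sex but no identifying information.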


Statistical data integration

Involves bringing together datasets, generally at the unit level (i.e. for an individual person or business) or micro level (e.g. information for a small geographic area), based on information common to both datasets, to provide new datasets for statistical and research purposes. Statistical data integration includes the full range of management and governance practices around the data linkage process, encompassing the key phases of data acquisition, data approval, data linking and data release/dissemination. This is also known as 'data integration'.

Statistical and research purposes

Using data for statistical and research purposes means using it to describe characteristics of groups within the population, and relationships that might exist between variables such as social and economic conditions, behaviours and outcomes. It precludes use of a dataset for administrative or client management purposes (e.g. it cannot be used for detecting fraud nor for ensuring compliance), where there is an impact on specified individuals.


Statistical disclosure control

Involves managing the risk of an individual or organisation being identified, either directly or indirectly, through released data. This risk is managed by confidentialising the data.


Statistical disclosure control techniques

Techniques for confidentialising a dataset to minimise the risk that the identity of a particular individual or organisation may be disclosed. Two broad groups of statistical disclosure control techniques are data reduction methods, which aim to control or limit the amount of detail available without compromising the usefulness of the information for research, and data modification methods (perturbation), which involve changing the data slightly to reduce the risk of disclosure.
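
To make the two groups concrete, the Python sketch below shows two simple data reduction steps (collapsing ages into bands and suppressing small cells) and one data modification step (random rounding of counts). These particular methods, the band width, the suppression threshold and the rounding base are illustrative assumptions, not rules stated in this glossary.

```python
import random
from collections import Counter


def age_band(age, width=10):
    """Data reduction: collapse an exact age into a band such as '20-29'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def suppress_small_cells(counts, threshold=3):
    """Data reduction: replace counts below the threshold with None."""
    return {k: (v if v >= threshold else None) for k, v in counts.items()}


def random_round(value, base=3, rng=random):
    """Data modification: randomly round a count to a multiple of the base.

    The count is rounded up with probability remainder/base, so the expected
    value of the rounded count equals the original count.
    """
    remainder = value % base
    if remainder == 0:
        return value
    lower = value - remainder
    return lower + base if rng.random() < remainder / base else lower


# Made-up ages.
ages = [23, 27, 31, 38, 45, 71]
counts = Counter(age_band(a) for a in ages)
suppressed = suppress_small_cells(counts, threshold=2)
rounded = random_round(7, rng=random.Random(0))
print(dict(counts), suppressed, rounded)
```

Both groups trade some detail or accuracy for a lower disclosure risk; how much is acceptable depends on the intended use of the data.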


Statistical outputs

The result of any collection, storage, analysis and transformation of data where the individual statistical unit is of no interest in itself, and the results are presented in a form that does not reveal information about identifiable individuals.


Statistical purposes

Purposes which support the collection, storage, compilation, analysis and transformation of data for the production of statistical outputs, and the dissemination of those outputs and information describing them. Statistical purposes include the collection or use of information to provide for the drawing of a sample of statistical units for data collection.

  • This website is managed and maintained by the Australian Bureau of Statistics.