Statistical infrastructure refers to tools which support the operation of a statistical system. These tools can help to organise the statistical system, improve efficiency, add value, create new outputs or simply perform tasks within the system. Examples of statistical infrastructure include computer systems, metadata repositories, legislation, standards and classifications, frameworks and information development plans. Standards, classifications, frameworks and information development plans are discussed in detail in this chapter. Legislation, together with privacy and confidentiality issues, is discussed in Chapter 11 - Confidentiality and Privacy.
Standards are a comprehensive set of statistical and methodological concepts and definitions used to achieve uniform treatment of statistical issues within a collection, across collections, and across time and space.
Standards assist in maximising the effectiveness of statistical outputs and the efficiency of the production process in terms of comparability (over time, space, industry, etc) and coherence (i.e. the capacity for integration) of the statistics.
While comparability and coherence are important for any dataset, they are particularly important where data is obtained from multiple sources and has to be combined, or where outputs are used in a wide variety of contexts. For example, the use of standard collection units (e.g. families, households and businesses) helps in the compilation, comparison and dissemination of statistics for these standardised units.
There are two broad types of standards. The first type applies to the structure and content of data, and includes classifications and standard collection methodologies.
The second type of standard is that applying to the structure and content of metadata.
Statistical classifications facilitate the accurate and systematic arrangement of data into categories based on shared properties. The use of standard classifications aids in the production of consistent and comparable statistics over time, across regions and across different collections.
Comparison of statistics across different geographical regions or demographic groups is an important focus for many administrative and statistical collections. The use of standard classifications makes such comparisons possible. Administrative data is also an important source of small area data which can be combined with other data sources to provide a more complete picture of the population of interest.
Statistical classifications can also aid in the analysis of data at different hierarchical levels by aggregation or disaggregation of data. They can also be used to derive new variables from data gathered in diverse collections. For example, the socio-economic status of individuals or households can be derived from data such as education, labour force status, occupation and income, which may be acquired through different collections. Coding these inputs to the relevant standard classifications makes them consistent enough to be combined.
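As a minimal sketch of aggregation across hierarchical levels, the Python example below sums values at two levels of a classification. The codes and values are hypothetical and are not taken from any ABS classification; the sketch assumes only that the leading characters of a code identify its broader group.

```python
from collections import defaultdict

# Hypothetical records coded to a two-level classification:
# the first character is the broad group, the full code is the detailed class.
records = [
    ("A01", 120),
    ("A02", 80),
    ("B01", 50),
    ("B03", 30),
]

def aggregate(records, level_length):
    """Sum values by the leading `level_length` characters of each code."""
    totals = defaultdict(int)
    for code, value in records:
        totals[code[:level_length]] += value
    return dict(totals)

detailed = aggregate(records, 3)  # one total per detailed class
broad = aggregate(records, 1)     # one total per broad group
```

Because every detailed code carries its broad group as a prefix, the same records can be reported at either level without being collected again.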
10.2.1 Australian Standards and Classifications
The ABS has developed a number of classifications to serve a wide range of its data collections. These are largely integrated or closely aligned with comparable international classifications. For example, the Australian and New Zealand Standard Industrial Classification (ANZSIC) is aligned to the United Nations' International Standard Industrial Classification of All Economic Activities (ISIC). A list of ABS classifications and links to the relevant documentation can be accessed on the ABS website (www.abs.gov.au).
Other government agencies have also developed standards and classifications for collections within their statistical domain. For example, the Australian Institute of Health and Welfare (AIHW) publishes the National Health Data Dictionary, which contains a set of definitions for use in Australian health data collections. The data dictionary is produced in consultation with a number of agencies including all Australian health departments, the Australian Bureau of Statistics, the National Centre for Classification in Health, the Department of Veterans' Affairs, the Australian Private Hospitals Association, representatives of the private insurance industry and Medicare Australia.
10.2.2 International Standards and Classifications
Many international organisations have also developed standard classifications for the collection of statistics, or for the compilation and comparison of statistics provided by different countries. The United Nations Statistical Office (UNSO), the International Labour Office (ILO), Eurostat, and the Organisation for Economic Co-operation and Development (OECD) are some of the international agencies that have developed international standard classifications.
Frameworks exist for integrating and presenting data in many fields. The use of standards and classifications in frameworks greatly reduces the effort required for integration and reconciliation of data.
The potential benefits of statistical integration include:
· more coherent data - statistics from different collections can be compared through the use of common data items, classifications, and terminology.
· more efficient systems - use of common systems such as statistical standards and classifications can avoid duplication or reduce resources required to develop concepts and processing systems.
· reduced provider load - for example, a person may supply employment status data to three different surveys, but if a standard definition for employment status were used, the data supplied to one survey could serve the purposes of all three.
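To illustrate how a standard definition makes data from separate collections comparable, the following Python sketch maps each collection's local employment status codes to one standard set of categories before combining them. The survey names, local codes and category labels are all hypothetical.

```python
from collections import Counter

# Hypothetical concordances: each collection's local codes mapped to one
# standard set of employment status categories.
CONCORDANCE = {
    "survey_a": {
        "FT": "Employed",
        "PT": "Employed",
        "UN": "Unemployed",
        "NLF": "Not in the labour force",
    },
    "survey_b": {
        "working": "Employed",
        "looking": "Unemployed",
        "neither": "Not in the labour force",
    },
}

def standardise(source, local_code):
    """Translate a collection-specific code into the standard category."""
    try:
        return CONCORDANCE[source][local_code]
    except KeyError:
        raise ValueError(
            f"No standard category for {local_code!r} in {source!r}"
        ) from None

# Hypothetical responses from the two collections.
responses = [
    ("survey_a", "FT"),
    ("survey_a", "UN"),
    ("survey_b", "working"),
    ("survey_b", "neither"),
]

# Once every response is in the standard categories, counts can be combined.
combined = Counter(standardise(source, code) for source, code in responses)
```

If both collections had used the standard categories from the outset, the concordance step would be unnecessary, which is the efficiency gain described above.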
The Australian Bureau of Statistics has developed a framework for Integrated Economic Statistics under which most of its business surveys are conducted. This framework requires that the statistical structure of each business entity is based on a standard units model, and that an industry code is allocated based on predominant activity. A key component of the framework is a centralised Business Register which stores this information about each business and which is used to produce population frames for collections of economic statistics, based on the industry code. A set of standard classifications is used to describe characteristics of businesses, such as their industry, number of employees, and type of legal organisation. One of the classifications used is the Standard Institutional Sector Classification of Australia (SISCA). In addition, standard data item definitions have been developed, and these are accompanied by the use of standard questions in questionnaires.
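The idea of drawing a population frame from a central register by industry code can be sketched as follows. This is an illustration only: the `BusinessUnit` fields, the industry codes and the `population_frame` helper are hypothetical and do not describe the actual ABS Business Register.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BusinessUnit:
    unit_id: str
    industry_code: str  # hypothetical code; leading letter is the broad industry
    employees: int
    legal_type: str

# Hypothetical register contents.
REGISTER = [
    BusinessUnit("U001", "C11", 250, "company"),
    BusinessUnit("U002", "C12", 8, "sole proprietor"),
    BusinessUnit("U003", "G41", 40, "company"),
    BusinessUnit("U004", "C11", 15, "partnership"),
]

def population_frame(register, industry_prefix, min_employees=0):
    """Select the units in scope for a collection, by industry and size."""
    return [
        unit
        for unit in register
        if unit.industry_code.startswith(industry_prefix)
        and unit.employees >= min_employees
    ]

frame_c = population_frame(REGISTER, "C")
frame_c_large = population_frame(REGISTER, "C", min_employees=20)
```

Because every collection draws its frame from the same register and the same codes, a business is classified identically wherever it is surveyed.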
A framework is a set of assumptions, concepts, principles, values and practices that underpin statistical collections in particular areas of interest. Frameworks provide context and guidance, especially in the planning phase. Frameworks can be simple, developed for a narrow subject of interest, or can be highly complex and encompass an entire subject matter area.
All frameworks have the following attributes. They:
· show the key relationships, processes or flows between elements;
· have a logical structure;
· are comprehensive but concise;
· allow for change; and
· are consistent with other frameworks, classifications and standards.
The development of a framework can:
· act as a catalyst for further development work (e.g. new standards);
· help standardise the language used for a field of statistics; and
· inform the development of future statistical collections.
10.5 INFORMATION DEVELOPMENT PLAN (IDP)
Information Development Plans (IDPs) map the broad issues and information needs for a given field to the available information sources, in order to determine information gaps, overlaps and deficiencies. An IDP presents priorities and a plan for action to improve information. It also provides a framework for the systematic improvement, integration and use of data sources. IDPs are developed through consultation and collaboration, with responsibility for specific data development actions accepted by appropriate agencies.
In simple terms, each IDP embodies three kinds of knowledge and shared commitment to statistical development activity:
· demand for information - a picture of the statistics that would, ideally, support informed design and evaluation of policy, other decision-making, research and community discussion.
· supply of information (including raw data that might be used to create statistics) - a picture of the existing data pool that might satisfy the demand for information.
· agreed statistical development activity, identified through the comparison of demand and supply, which defines and prioritises:
    · information gaps (such as key variables arising in policy, decision making, research or debate that have not yet been given statistical expression);
    · information overlaps (such as variables for which competing or inconsistent measures are available); and
    · other information deficiencies (such as missing low-level data, for example by region, industry or sub-population; differing definitions or counting rules; or only rough approximations to the desired socioeconomic concept).
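The comparison of demand and supply described above can be sketched with simple set operations. The topics and source names in this Python example are hypothetical, and a topic with more than one source is treated as a candidate overlap only; deciding whether measures actually compete, or whether a source is deficient, still requires subject matter judgement.

```python
# Hypothetical information needs identified through consultation (demand).
demand = {
    "child health",
    "school attendance",
    "youth employment",
    "housing stability",
}

# Hypothetical existing sources, listed against the topic they cover (supply).
supply = {
    "child health": ["Health survey"],
    "school attendance": ["Schools census", "Education administrative data"],
    "youth employment": ["Labour force survey"],
}

# Gaps: topics in demand with no source at all.
gaps = sorted(demand - set(supply))

# Candidate overlaps: topics covered by more than one source.
overlaps = sorted(
    topic for topic in demand if len(supply.get(topic, [])) > 1
)
```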
For an example of an IDP developed by the ABS see the IDP on Improving Statistics on Children and Youth 2006.
Steps in developing an IDP
1) Identify and describe the key issues and policy concerns associated with the topic chosen for the information development plan. This is achieved through consultation with key stakeholders.
2) Determine existing data sources and information needs. Match existing data sources to information needs and identify data gaps and deficiencies.
3) Draft a set of priorities and a plan for action based on data gaps and deficiencies identified in the previous process. Prioritise the data needs and plan for improvement of data sources across agencies.
4) Draft an IDP. The IDP, as a living document, will outline the current priorities, actions to address them, and the agencies responsible for each priority. Several rounds of consultation and discussions may be needed to finalise the IDP.
5) Provide a basis for monitoring progress and for the ongoing review of information development needs.