This chapter discusses the main issues involved in planning a new statistical data collection or process to extract statistical information from an administrative collection. It identifies planning issues related to developing objectives, budget and resources, setting timeframes, identifying existing data sources etc. It also covers other important aspects such as project management, confidentiality, data analysis and reporting which will need to be considered during this phase.
2.1.1 Develop objectives
Creating a list of the key end users and stakeholders at the planning stage will greatly assist in developing objectives for the statistical collection. A common problem when developing objectives is a narrow focus, which limits the usefulness of resulting data. Wider consultation at the planning stage can enhance the usefulness of data.
Potential stakeholders include users from within and outside the organisation undertaking the collection. It is also important to consider the wider corporate objectives and ensure the focus is not restricted to a section, branch or division in the organisation. External users play an important role in informing and influencing policy directions through their data requirements.
Agencies setting up administrative data collections should consider the likely external users of their statistics. Organisations planning an administrative data collection should enlist the support of such potential external stakeholders while developing objectives for their data collection.
2.1.2 Define objectives
Poorly defined objectives will produce data which do not meet the needs of stakeholders. Well defined objectives would take into account crucial information such as concepts to be used, populations to be surveyed and the level of accuracy required. These are largely determined by the intended use of the data and may also impact on the cost of collecting data through surveys or extracting data from administrative collections.
2.1.3 Prioritise objectives
The planning phase should also balance users’ expectations with available resources. The collection can be made easier if objectives are prioritised, as not all objectives can generally be met by a single collection. As the planning progresses to the detailed development phase it will become clear whether the number of objectives needs to be reduced or can be expanded.
2.1.4 Information Development Plans
Developing objectives of a statistical collection can be enhanced through the use of Information Development Plans (IDPs). The aim of an IDP is to identify user needs for statistics in a particular field and develop statistical data collection strategies to meet these needs. IDPs also identify and develop strategies to address deficiencies in frameworks, classifications and standards to be used in the statistical collection. IDPs are usually developed jointly by all stakeholders in an area of interest and provide a road map for statistical activities to be undertaken. See Chapter 10 - Statistical Infrastructure for further details on Information Development Plans.
2.2 BUDGET AND RESOURCES
Shortage of funds is one of the major constraints to a statistical collection activity. During early stages of planning it may be difficult to estimate all costs accurately. Costs largely depend on choices such as whether a new survey needs to be undertaken or if data can be extracted from an existing administrative collection. Hence, it is advisable to consider a number of different statistical collection options for a range of funding scenarios.
When estimating costs it is important to consider all possible resource requirements which usually fit into one of these three categories:
· Salaries - include both office staff and any temporary or field staff that may be required. Also consider whether you need to include full staff costs (i.e. include superannuation, information technology (IT) costs) or just direct costs (i.e. only the wage component).
· IT costs - include computing costs beyond the normal staff IT use costs. Such costs may include software, IT infrastructure development and data storage.
· Administrative costs - include consultancy costs, training, travel, printing, postage, publication and dissemination costs and the purchase of sample lists if required (for the survey frame).
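The three cost categories above can be tallied for each collection option and compared against a range of funding scenarios. The sketch below is illustrative only: the `CostEstimate` class, the `affordable_options` helper, the option names and all dollar figures are hypothetical and are not taken from this handbook.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """One collection option, costed under the three categories above."""
    name: str
    salaries: float         # office, temporary and field staff
    it_costs: float         # software, IT infrastructure, data storage
    administrative: float   # consultancy, training, travel, printing etc.

    def total(self) -> float:
        """Sum of the three cost categories."""
        return self.salaries + self.it_costs + self.administrative


def affordable_options(options, budget):
    """Return the names of options whose total cost fits within the budget."""
    return [o.name for o in options if o.total() <= budget]


# Hypothetical figures for illustration only.
options = [
    CostEstimate("new survey", 250_000, 40_000, 60_000),
    CostEstimate("administrative extract", 80_000, 30_000, 10_000),
]

# Compare the options under two assumed funding scenarios.
for budget in (150_000, 400_000):
    print(budget, affordable_options(options, budget))
```

Costing each option in the same three categories makes it easier to see which category drives the difference between, for example, a new survey and an extract from an existing administrative collection.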
In addition to the budget, availability of other resources needs to be considered in the planning stage. Some statistical activities may require specialised resources such as staff skilled in survey design or computer programming. Sometimes even if sufficient funding is available it may be difficult to get these resources when supply is scarce. In such instances, either the collection has to be conducted within available resources or the timeframe needs to be adjusted to fit resource availability.
Staff with appropriate skills may be needed to carry out aspects of the project. Project staff may therefore need to be provided with timely training to acquire the required skills. Organisations should assess staff skill and training needs as part of their planning and put in place strategies to address the needs, including recruiting skilled staff if required. For further information on statistical skills and training see Chapter 12 - Statistical Capabilities.
Users would generally want data as quickly as possible. A collection activity which delivers data after a lengthy time lag is not likely to meet the needs of users and may even become obsolete.
The timeframe for a collection should be consistent with its objectives. Issues such as whether the data is required for policy development or for monitoring effects of a policy change may have a major impact on the timeliness of collection. The timeframe to undertake a statistical collection can often be shortened by making more funds and/or more resources available. However, if resources are limited then prioritising objectives may be the only option if timeliness is crucial.
2.4 EXISTING DATA SOURCES
Before initiating a new statistical collection activity, research should be undertaken to establish whether any existing data is available and how useful it is for the purpose. Sometimes, the information sought by organisations through a new collection activity may be available in administrative collections of other organisations. Using existing data can save time and cost provided the existing data can meet user requirements.
In considering the usefulness of the existing data it is important to know why and how the data was collected. If appropriate methodologies were not used in an existing statistical collection then the data may not meet the needs of the proposed statistical activity. It is also necessary to understand the legislative and confidentiality issues of using administrative data.
In addition to the Australian Bureau of Statistics, Commonwealth and State/Territory Government agencies, local councils, universities and other research agencies are rich sources of published and/or unpublished statistics. Sometimes existing survey or administrative data may not entirely satisfy the current need but may fulfil part of the requirement.
2.5 DATA QUALITY
A specific set of criteria or framework should be used to assess the suitability of data and data sources. The ABS uses a Data Quality Framework (DQF) for a comprehensive, multi-dimensional assessment of statistical dataset or release quality. It is recommended that producers of statistics consider the following seven quality dimensions before designing collections, collecting statistics or producing outputs.
The Institutional Environment refers to the institutional and organisational factors which may have a significant influence on the effectiveness and credibility of the agency producing the statistics. Consideration of the institutional environment associated with a statistical product is important as it enables an assessment of the surrounding context, which may influence the validity, reliability or appropriateness of the product.
Relevance refers to how well the statistical product or release meets the needs of users in terms of the concept(s) measured, and the population(s) represented. Consideration of the relevance associated with a statistical product is important as it enables an assessment of whether the product addresses the issues most important to policy-makers, researchers and to the broader Australian community.
Timeliness refers to the delay between the reference period (to which the data pertain) and the date at which the data become available. This is an important aspect in assessing quality, as lengthy delays between the reference period and data availability can have implications for currency and/or reliability.
Accuracy refers to the degree to which the data correctly describe the phenomenon they were designed to measure. This is an important component of quality as it relates to how well the data portray reality, which has clear implications for how useful and meaningful the data will be for interpretation or further analysis.
Coherence refers to the internal consistency of a statistical collection, product or release, as well as its comparability with other sources of information, within a broad analytical framework and over time. The use of standard concepts, classifications and target populations promotes coherence, as does the use of common methodology across surveys.
Interpretability refers to the availability of information to help provide insight into the data. Information available which could assist interpretation may include the variables used, the availability of metadata, including concepts, classifications, and measures of accuracy.
Accessibility refers to the ease of access to data by users, including the ease with which the existence of information can be ascertained, as well as the suitability of the form or medium through which information can be accessed. The cost of the information may also represent an aspect of accessibility for some users.
While it is advisable to consider all seven quality dimensions, users and producers are encouraged to consider which quality dimensions are most relevant and important for their particular purpose.
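Of the seven dimensions, timeliness lends itself most readily to a direct calculation. The sketch below is a minimal illustration and not part of the ABS DQF: the function name `timeliness_days` is hypothetical, the dates are invented, and measuring the delay from the end of the reference period is an assumption made here for simplicity.

```python
from datetime import date


def timeliness_days(reference_period_end: date, release_date: date) -> int:
    """Days between the end of the reference period and data availability.

    Assumption: the delay is measured from the last day of the reference period.
    """
    if release_date < reference_period_end:
        raise ValueError("release date precedes the end of the reference period")
    return (release_date - reference_period_end).days


# Hypothetical dates for illustration only.
lag = timeliness_days(date(2009, 6, 30), date(2009, 9, 15))
print(lag)  # 77
```

A lag measured this way can be compared with the timeframe agreed with users during planning.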
For further details of the ABS DQF see 1520.0 ABS Data Quality Framework, May 2009
The collection and use of sensitive data, or data considered sensitive by respondents, can provoke adverse respondent reactions and jeopardise response rates to surveys. The time taken to collect sensitive data is generally greater than for non-sensitive data.
Development time and costs for this type of survey are likely to be greater as more detailed testing of the effects of collecting sensitive data needs to be undertaken.
Being well informed of the potential sensitivity of a collection will greatly assist management of provider relationships. Assurances of confidentiality and good collection design will help to improve response rates to sensitive information.
Confidentiality provisions and Privacy legislation must be considered when determining whether data can be used for the intended statistical purposes.
When collecting administrative data, organisations usually inform providers that their data will be used for the organisations’ own purpose or will be made available only to selected agencies. When sharing data, the organisations that own the data should manage inherent risks to avoid compromising the privacy of their clients. It is also important to consider whether any legislation, rules or regulations specific to your own organisation govern the sharing of data from other organisations. See Chapter 11 – Confidentiality and Privacy for further details on some of the procedures followed by the ABS to confidentialise data.
For a comprehensive coverage of legislation on privacy issues see Chapter 11 - Confidentiality and Privacy in the Handbook.
2.7 PROJECT MANAGEMENT
A detailed project management strategy should be developed as part of the planning phase to ensure collections and administrative data are well managed, and a balance is achieved between objectives, budget and resources, and timeliness. Usually it is the balancing between objectives and the budget which requires greater attention. It is desirable to fix at least one of these parameters with adjustments or alignments made to other parameters as the situation unfolds. For example, if the budget is limited then the objectives need to be brought into line with available budget.
Project management for statistical collection also involves the capacity to effectively use committees, working groups, service level agreements with users etc. See Chapter 13 – Managing Risks in Statistical Collections for further information on managing risks associated with project management.
2.8 ANALYSIS AND REPORTS
Decisions should be made at the planning phase on the analysis to be conducted and final reports to be produced. Expert assistance may be needed on what particular analytical and publication tools are appropriate. Publications on survey results may range from a comprehensive report to a condensed summary. In all cases, however, it is important that details of the survey methodology (sample selection, method of data collection etc) are documented because the use of specific methodologies can affect analysis of data and interpretation of results. These need to be built into the overall project plan.
Research should be undertaken at an early stage to become familiar with issues related to the statistical collection. The objective of the research is to identify issues which may impact on the proposed statistical collection, and it should cover specific issues associated with the subject matter as well as processes used in the collection. Some potential issues for research in the planning phase are:
· context surrounding the objectives
· availability of existing survey data or administrative collection
· suitability of current standards, classifications and concepts
· likely impact of any recent changes on the collection (e.g. legislative changes)
Research should also be undertaken on statistical processes with the objective of identifying issues which will affect the procedures adopted for collection, processing and analysis of data. These issues will affect your ability to manage the project effectively.