Transformation of the processed data into statistics is the next step in the statistical cycle. Transformation involves analysis and interpretation of data to identify important characteristics of a population and provide insights into the topic being investigated. The tools and techniques used range from simple (e.g. mean and median) to quite complex (e.g. econometric modelling and regression). An appropriate analytical tool or technique is necessary to achieve the best outcome for the statistical collection. The statistical analysis and interpretation are usually at the core of outputs produced in statistical collections. This chapter examines the major issues to consider when analysing results from statistical collections.
6.1 APPROACHES TO DATA ANALYSIS
There are many different approaches for analysing data. One method, referred to as the research question based approach, consists of the following steps:
I. Identify issues and formulate questions to be addressed by the research or investigation.
II. Develop models or hypotheses to test in the analysis.
III. Gather appropriate data from administrative data sources or survey collections.
IV. Analyse the data (i.e. test hypotheses, relationships and models).
6.4 USING APPROPRIATE DATA
By identifying issues and formulating research objectives you will be able to select the most relevant data and apply the most appropriate statistical techniques. If the survey was not designed to measure the items of interest, then it may not be appropriate to use the survey data for the proposed analysis. See Section 2.5 – Data Quality for more information on criteria that can be useful to assess the quality of statistical data.
Before starting to analyse a dataset you should become familiar with the data, for example by reviewing the documentation and checking that the values and ranges of variables are in line with expectations. Although most data may have already been subjected to editing in the processing stage, additional edits may be required to ensure the data is fit for analysis.
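A check of values and ranges can be sketched in a few lines of Python. This is only an illustration: the records, variable names and expected ranges below are hypothetical and are not taken from any actual collection.

```python
# Hypothetical records and expected ranges, invented for illustration only.
records = [
    {"id": 1, "age": 34, "weekly_hours": 38},
    {"id": 2, "age": 212, "weekly_hours": 40},   # implausible age
    {"id": 3, "age": 51, "weekly_hours": None},  # missing value
]
expected_ranges = {"age": (0, 120), "weekly_hours": (0, 168)}


def find_range_problems(rows, ranges):
    """Return (id, variable, problem) for each missing or out-of-range value."""
    problems = []
    for row in rows:
        for variable, (low, high) in ranges.items():
            value = row.get(variable)
            if value is None:
                problems.append((row["id"], variable, "missing"))
            elif not low <= value <= high:
                problems.append((row["id"], variable, "out of range"))
    return problems


problems = find_range_problems(records, expected_ranges)
for problem in problems:
    print(problem)
```

Records flagged in this way would still need to be reviewed against the documentation before any additional edit is applied.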
6.5 ANALYSING THE DATA
People sometimes find it easier to interpret statistics if they are presented visually or in text. The process of turning data into information can be thought of as the conversion of numbers into text or other forms, such as graphs. Summary measures such as the mean, standard deviation and percentiles are useful because they can highlight aspects that readers might otherwise overlook. Summary measures also allow data to be compared with other data, or allow aspects of the data (e.g. characteristics of population groups) to be compared.
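As a minimal sketch, the summary measures mentioned above can be computed with Python's standard statistics module. The income values are hypothetical and were chosen so that one large value pulls the mean well above the median.

```python
import statistics

# Hypothetical weekly incomes, invented for illustration only.
incomes = [420, 510, 530, 610, 640, 700, 980, 2500]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
sd_income = statistics.stdev(incomes)           # sample standard deviation
quartiles = statistics.quantiles(incomes, n=4)  # three quartile cut points

print("mean:", mean_income)
print("median:", median_income)
print("standard deviation:", round(sd_income, 1))
print("quartiles:", quartiles)
```

Comparing the mean with the median here shows how a single extreme value can affect one summary measure much more than another, which is the kind of feature a reader might otherwise overlook.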
Statistical analysis helps in understanding complex relationships among the different data measures. Modelling techniques such as linear regression, logistic regression and time series analysis are some ways to explore these relationships. Seek assistance from experienced analysts when undertaking complex statistical analysis.
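To illustrate the simplest of these techniques, the sketch below fits a straight line to two variables by ordinary least squares. The function name fit_line and the data are hypothetical; a real analysis would normally use an established statistical package and would also examine the quality of the fit.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the ordinary least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for illustration only.
years_of_education = [10, 12, 12, 14, 16, 18]
weekly_income = [500, 620, 580, 700, 820, 900]

slope, intercept = fit_line(years_of_education, weekly_income)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
```

The slope estimates the average change in the second variable for a one-unit change in the first. It describes an association in the data and does not by itself establish cause and effect.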
Analytical methods may also vary according to the type of data. Statistical surveys usually deal with cross-sectional data, that is, data collected from many different individuals, households or business units at a single point in time. Time series analysis deals with data collected from similar types of entities across time. Longitudinal data analysis blends characteristics of both cross-sectional and time series data analysis.
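The relationship between these types of data can be shown with a small sketch. The dataset below is hypothetical: it follows two units over three years, so it is longitudinal, and the two helper functions (whose names are illustrative) extract a cross-section and a time series from it.

```python
# Hypothetical longitudinal data: (unit, year) -> measured value.
panel = {
    ("A", 2020): 10, ("A", 2021): 12, ("A", 2022): 15,
    ("B", 2020): 7,  ("B", 2021): 8,  ("B", 2022): 8,
}


def cross_section(data, year):
    """All units observed at a single point in time."""
    return {unit: value for (unit, y), value in data.items() if y == year}


def time_series(data, unit):
    """One unit observed across time, in year order."""
    return [(y, value) for (u, y), value in sorted(data.items()) if u == unit]


print(cross_section(panel, 2021))
print(time_series(panel, "B"))
```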
Any statistical analysis should be documented so that others can assess or duplicate the analysis and interpret the results if required.
See the Basic Survey Design Manual (Chapter 11) at www.nss.gov.au for more information on statistical analysis techniques and the statistical methods used to summarise results. An introduction to the theories underlying time series analysis, and to some of the major issues relating to seasonal adjustment, can be found in the ABS Information Paper An Introductory Course on Time Series Analysis (ABS cat. no. 1346.0.55.001).
6.6 PRESENTING A COHERENT STORY
The ultimate aim of all statistical analysis is to inform decision making by telling a plausible and coherent story through text, tables, graphs and other forms of presentation.
The report or presentation of statistical results should:
· satisfy the potential users of the results;
· convey the main findings clearly;
· follow a logical progression;
· minimise the use of jargon;
· provide clear insights into the data;
· include information on how the data was collected, compiled, processed, edited and validated; and
· provide information on aspects such as data quality and data limitations.
As a general rule the report should include the following:
· Introduction (sets out the purpose and aims of the survey and the background to the research, defines terms and concepts, etc.)
· Methodology (describes the method of sampling and provides information on the survey population, data analysis and statistical procedures used)
· Analysis and findings (gives details of sample numbers, response rates, results and interpretation of tabulations)
· Conclusions and recommendations (summarises major findings and outlines future actions)
· Appendices and references (e.g. questionnaire).
6.6.1 Statistical Presentation in Graphs
Graphs are often the simplest and most effective presentation tools, because they allow users to quickly observe, understand and interpret relationships among different data measures.
Although graphs are generally easier to interpret than tables, they are also easily misinterpreted. Graphs are also not ideal for presenting a large number of data points, for which tables may be more suitable. You should therefore consider using a mix of text, tables and graphs for the interpretation and presentation of data. There are many different types of graphs, and the decision as to which one to use will depend on the type and complexity of the data to be presented. See the Basic Survey Design Manual (Chapter 12) at www.nss.gov.au for more information on presenting statistics in graphs.