Description
|
The Australian Priority Investment Approach to Welfare (PIA) policy initiative was established as part of the 2015-16 Budget, following a comprehensive review of Australia’s welfare system. The initiative uses data analysis to identify groups at risk of long-term welfare dependence. These analyses provide insights into how the system is working, and those insights are used to find innovative ways of helping more Australians live independently of welfare. As part of the PIA, in September 2016, the Minister for Social Services announced a plan to allow limited public access to PIA data.
A synthetic version of the PIA data has been created for use by researchers and teachers. The synthetic data relates to individuals who have made a claim for, are receiving or have received payments or services administered under social security law and family assistance law. This includes benefit types such as Age Pension, Youth Allowance, Newstart and Disability Support Pension. The synthetic data contains a limited number of variables suitable for research, while maintaining the privacy and confidentiality of individuals.
The synthetic dataset has been created by applying a privacy-preserving algorithm to the original PIA data. This process modifies each person’s true data so that the overall group data very closely represents that of the original dataset, yet no individual’s data can be identified in the synthetic dataset. That is, a line of data that would normally represent an individual no longer does. The dataset is a collection of synthetic records that, taken together, reflect the shape of the original dataset.
The synthetic PIA data contains a series of point-in-time quarterly snapshots dated from July 2001 to June 2015. This gives 56 separate quarters of administrative data (14 years of 4 quarters each). Each quarter includes the same 31 variables, which are listed in the ‘PIA Data Dictionary – Variable and Codes’ file. There are approximately 5 million individual records in each quarter.
|
Notes
| The Synthetic PIA Data files are provided as 56 separate zipped .csv files, one per quarter, to reduce the resources users need for download. Please note that spreadsheet programs such as Excel are not recommended for use with the Synthetic PIA Data files, because an Excel worksheet is limited to 1,048,576 rows. That is, a quarterly file with approximately 5 million records will not load in full in Excel. |
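As an illustration of working with a quarterly file outside Excel, the following is a minimal Python sketch that streams a zipped .csv and counts its records without loading the whole file into memory. It is not part of the official dataset documentation. It assumes that each archive contains a single .csv file with one header row and that the file is UTF-8 encoded; the function names and the example file name are hypothetical.

```python
import csv
import io
import zipfile

EXCEL_MAX_ROWS = 1_048_576  # Excel worksheet row limit cited in the note above


def count_records(zip_path):
    """Count data rows (excluding the header) in the first .csv inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: each archive holds a single .csv file for one quarter.
        csv_name = next(n for n in archive.namelist() if n.lower().endswith(".csv"))
        with archive.open(csv_name) as raw:
            # Assumption: the file is UTF-8 encoded; adjust if it is not.
            reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
            next(reader, None)  # skip the header row
            # Rows are read one at a time, so memory use stays small.
            return sum(1 for _ in reader)


def fits_in_excel(n_records):
    """Return True if the records plus one header row fit within Excel's row limit."""
    return n_records + 1 <= EXCEL_MAX_ROWS
```

For example, `fits_in_excel(count_records("pia_quarter.zip"))` (with a hypothetical file name) would be expected to return `False` for a quarter of roughly 5 million records, which is why a scripting language or statistical package is the more practical choice.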