Showing posts with label big data. Show all posts

Friday, March 15, 2024

How the digital world is getting better at measuring us up

These days we hear incessantly about “data”. The media is full of reports of new data about this or that, and there’s a new and growing occupation of data analysts and even data scientists. So, what is data, where does it come from, what are people doing with it, and why should I care?

Google “data” and you find it’s “facts and statistics collected together for reference or analysis”. The advent of computers has allowed businesses and governments to record, calculate, play with and store huge amounts of data.

Businesses have data about what goods and services they’re making, buying and selling, importing or exporting, and paying their workers, going back for 30 or 40 years.

Our banks have data about what we earn and what we spend it on, especially when we use a credit or debit card – or our phone – to pay for something.

Much of this data is required to be supplied to government agencies. If you ever go onto the Australian Taxation Office’s website to do your annual tax return, it will offer to “pre-fill” your return with stuff it already knows about your income from wages, bank interest and dividends.

Try it sometime. You’ll be amazed by how much the taxman knows and how accurate his data are.

Another dimension of the “information revolution” is how advances in international telecommunications – including via satellites – have allowed us to be in touch with people and institutions around the world in real time via email and the web – news, entertainment, social media, whatever.

Last month, the Australian Statistician – aka the boss of the Australian Bureau of Statistics – Dr David Gruen, gave a speech outlining some of the ways these huge banks of “big data” about the economic activities of the nation’s businesses, workers, consumers and governments can be used to improve the way we measure the economy in all its aspects: employment, inflation, gross domestic product and the rest.

We’re getting more information and more accurate information, and we’re getting it much sooner than we used to. But we’re still in the early days of exploiting this opportunity to be better informed about what’s happening in the economy and to have better information to guide the government’s decisions about its policies to improve the economy’s performance.

Gruen starts by describing the Tax Office’s “single-touch” payroll system, software that automatically receives information about employees’ payments every time an employer runs its payroll program.

Not all employers have the software, but those who do account for more than 10 million of our 14 million employees.

Gruen says the arrival of the pandemic in early 2020 made access to this “rich vein of near real-time information” an urgent priority. The taxman pulled out the stops, and the stats bureau began receiving these data in early April 2020.

With a virus spreading through the land and governments ordering lockdowns and border closures, they couldn’t afford to wait a month or more to find out what was happening in the economy. Thus, the whole project of using big data to help measure the economy received an enormous kick along – here and in all the other rich economies.

So, in addition to the longstanding monthly sample survey of the labour force, we now have a new publication: Weekly Payroll Jobs and Wages Australia. These data allowed the econocrats – and the rest of us – to chart the dramatic collapse in jobs across the economy over the three weeks from mid-March 2020.

They show employment in the accommodation and food services industry falling by more than a quarter in just three weeks. Employment in the arts and recreation services industry fell by almost 20 per cent. By contrast, falls in utilities and education and training were minor.

The monthly labour force survey has a sample size of about 50,000 people, compared with the payroll program’s 10 million-plus people, meaning the payroll data can provide information on far more dimensions of the workforce than the survey can.
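
To see why sample size limits how finely a survey can be sliced, here is a rough sketch in Python. It assumes simple random sampling, which is a simplification and not a description of how the bureau designs its survey, and the industry and regional shares are made-up figures for illustration only. The function name `margin_of_error` is likewise just a label chosen here.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for an estimated share,
    assuming simple random sampling (a simplification)."""
    return z * math.sqrt(share * (1 - share) / sample_size)

SURVEY_SIZE = 50_000  # approximate sample size quoted in the text

# Hypothetical slices of the sample; the fractions are illustrative assumptions.
slices = [
    ("whole survey", 1.0),
    ("one industry (assumed 7% of the sample)", 0.07),
    ("one industry in one region (assumed 0.5% of the sample)", 0.005),
]

for label, fraction in slices:
    n = round(SURVEY_SIZE * fraction)
    # A share of 0.5 is the worst case for the margin of error.
    moe = margin_of_error(0.5, n)
    print(f"{label}: n = {n}, margin of about +/- {moe * 100:.1f} percentage points")
```

Each extra cut leaves fewer respondents in the group and a wider margin, which is why a data set covering 10 million-plus people can support breakdowns a 50,000-person survey cannot.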

So, the bureau’s access to payroll data taught it new ways of doing things. And the pandemic increased econocrats’ appetite for more info about the economy that was available in real time.

With household consumption – consumer spending – accounting for about half of gross domestic product, improving the timeliness and detail of the data was a great idea.

So, in February 2022, the bureau released the first monthly household spending indicator using (note this) aggregated and de-identified data on credit and debit card transactions supplied by the major banks. This indicator provides two-thirds coverage of household consumption, compared with the less than one-third coverage provided by the usual survey of retail trade.

The bureau has also begun publishing a monthly consumer price index in addition to the usual quarterly index. This is possible because big data – in the form of data from scanners at checkout counters and data scraped from the websites of supermarket chains – is much cheaper to gather than the old way.

The bureau has also started integrating different but related sets of big data from several sources, so analysts can study the behaviour of individual consumers or businesses. It has developed two large integrated data assets.

The one for individuals links families and households with data sets on income and taxation, social support, education, health, migrants and disability.

The one for businesses links them with a host of surveys of aspects of business activity, income and taxation, overseas trade, intellectual property and insolvency.

The purpose is to allow analysts from government departments, universities or think tanks to shed light on policy problems from multiple dimensions.

For instance, one study showed that people over 65 who’d had their third COVID vaccination within the previous three months were 93 per cent less likely to die from the virus than an unvaccinated person.

But that’s just the tiniest example of what we’ll be able to find out.


Monday, November 19, 2018

Benefits from big data at risk from untrustworthy politicians

The digital revolution holds the potential to use mere “data” to improve the budget and the economy, and hence our businesses and our lives. But you have to wonder whether our politicians are up to the challenge.

In a speech last week, the Australian Statistician, boss of the Australian Bureau of Statistics, David Kalisch, said the new statistical frontier is “data integration” – you take two or more separate sets of statistics and put them together in ways that reveal new information. Things you didn’t know about how bits of the world work.

This is just exploring the huge, still largely untapped potential of computers to manipulate a lot of figures and produce useful information about what’s going on in this field or that. But it also involves new statistical techniques for combining data in ways that make sense and don’t mislead.
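
As a toy illustration of what “putting data sets together” means, the sketch below links two made-up data sets that share a de-identified key. Everything in it is hypothetical: the keys, the figures and the `integrate` function are invented for this example, and real data integration involves far more careful matching and privacy protection than a simple lookup.

```python
# Two separate, made-up data sets keyed by the same de-identified ID.
incomes = {"a1": 52_000, "b2": 87_000, "c3": 31_000}   # annual income
finished_training = {"a1": True, "c3": False, "d4": True}  # completed a program?

def integrate(left, right):
    """Link records that appear in both data sets (an inner join on the key)."""
    shared_keys = left.keys() & right.keys()
    return {key: (left[key], right[key]) for key in shared_keys}

linked = integrate(incomes, finished_training)

# Only people present in both sources can be studied together.
for key in sorted(linked):
    income, trained = linked[key]
    print(key, income, trained)
```

Neither data set alone says anything about how income relates to training; the linked records do, and only for the people who turn up in both sources.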

(This, BTW, raises a bugbear of mine. Digitisation, which allows us to measure any number of aspects of a company’s performance cheaply and easily, has given rise to the enthusiasm for “metrics”. But bosses who allow their metrics to be chosen and presented by people who know a lot about IT but nothing about the science of statistics, or who draw conclusions from those metrics without any knowledge of stats, are asking to be led up the garden path. They never know when the metric is answering a different question to what they imagine.)

Kalisch says data integration is already delivering new insights, such as improved estimates of Indigenous life expectancy, understanding outcomes for successive cohorts of migrants, and the importance of small to medium enterprises for job creation (not as outstanding as the propaganda would lead you to expect).

There’s much more of that kind of thing we could do. But Kalisch points also to the considerable untapped potential to use data integration to assess the performance of government policies and programs, and thus to target budget funding to programs assessed as more likely to be effective.

Kalisch says “Australia does not have a strong tradition of rigorously evaluating outcomes of government programs and policies”. That’s putting it politely. The Americans do (because Congress insists on it) and so do many other countries – even those backward and poverty-stricken Kiwis do.

Why don’t we? Because too many ministers and department heads fear the embarrassment if rigorous assessment showed a program was a waste of money, as many would. And also because Treasury and Finance don’t bother pushing it – perhaps because program evaluation costs money upfront, and only saves money down the track.

But that’s only one reason we risk failing to exploit all the benefits of big data analysis. The biggest is the very real probability that bully-boy politicians and over-zealous agency heads will try to ram through data aggregation schemes over the worries of people concerned about breaches of their privacy.

Consider the hash they’re making of My Health Record where, among other things, the instigators are relying more on slick ads than honest explanation. Consider the long running attempt by the masterful Alan Tudge, the department and the Centrelink PR man to deny there was any problem with robodebt, until the full extent of the fiasco – and the hurt it caused many innocent victims – could no longer be concealed.

Then consider the way Tudge used the shield of Parliament to reveal very private information about a woman who’d had the temerity to criticise him. And he escaped uncensured.

Such episodes, and many years of spin doctor-led politicians playing the true-but-misleading game, have hugely reduced the public’s trust in politicians and their happy assurances that nothing could possibly go wrong.

We stand on the cusp of reaping huge benefits from data analysis, or stuffing it up so badly the electorate punishes any government that touches it.

Part of this is the risk that government penny-pinching doesn’t give the data gatherers enough funding to install adequate privacy safeguards, or enough resources to respond honestly and adequately to the public’s questioning.

But that’s just part of a bigger money question: data integration isn’t particularly dear relative to the benefits of greater understanding, better public policy and more effective government spending it offers, but that doesn’t mean the pollies have the sense to cough up.

Operational funding of our bureau of statistics has been cut by 30 per cent in real terms over the past decade, by governments of both colours.

An independent benchmarking exercise in 2016 found that our bureau’s funding was about half the funding provided to Statistics Canada for roughly equivalent work. Even New Zealand’s official statistician got more than ours did. Smart thinking.