How the Experts Are Turning Big Data Into Insights
Meet the rising stars of data: the experts using its insights to do good and making sure we can trust it.
Tapping into the collective intelligence to solve complex problems
Anthony Goldbloom, CEO & Founder of Kaggle
In the emergent global data industry, Anthony Goldbloom’s star has already risen. Last year, in a multimillion-dollar deal, the former Australian Treasury and Reserve Bank economist sold Kaggle to Google.
He co-founded Kaggle in 2009 to solve complex problems by tapping the collective intelligence of the worldwide data-science community through machine-learning competitions. The light-bulb moment for the startup came to the Melbourne University graduate when he was asked, as an intern at The Economist, to write an article on the nascent topic of big data. “When I started Kaggle, I thought how cool it would be if I could make a living from it,” he recalls.
The San Francisco-based company’s success story can be told in numbers. Today, data scientists across the globe – about 10,000 individuals or 6000 teams – vie for prize money (up to $US1.5 million at a time) in Kaggle competitions. So far, they have created about seven million models, including algorithms that improve diagnoses of lung cancer from CT scans or heart failure from MRIs, accelerate airport security-scanning processes, predict American house-sale prices (within a few per cent) and outstrip teachers at marking high-school essays.
In its first years, Kaggle had an early-mover disadvantage, notes Goldbloom, because it could only run competitions for the relatively small number of organisations – NASA and other government entities or progressive corporations – that were already working with artificial intelligence (AI) and machine learning. So Kaggle branched out, maximising opportunities for both its community and itself. Now the company offers Kaggle Kernels (a cloud-based workbench where data scientists share code and analysis), a public datasets platform, short AI education courses and a jobs board for employers to source hard-to-find data-science talent.
“There are almost two million people in the Kaggle community – and there’s not a data scientist today who doesn’t have a Kaggle account,” boldly ventures Goldbloom.
The grunt of Kaggle’s machine-learning and AI capabilities for the cloud caught the eye of Google as it raced to catch up with the dominant Amazon Web Services. Competitions now account for just 15 to 20 per cent of Kaggle’s activity, says Goldbloom, who heads a 55-strong team. He points to the reciprocal smarts – software, engineering power, infrastructure, security, recruiting and legal expertise – that his new boss, Google, has opened up to Kaggle.
“We’re in the golden age of machine learning,” he adds. “Our aim is to make Google the best cloud for analytics and machine learning and we’re well positioned to do that.”
Making health care more accessible with wearable technology
Samaneh Movassaghi, Telecommunications research fellow, Data61 CSIRO
Multi-award-winning innovator Samaneh Movassaghi is developing next-generation wearable technology. The Iranian-born, Australian-educated research scientist received a Google fellowship in 2017 for her breakthroughs in wireless body area networks (WBANs). Her work – integrating a network of low-power sensory devices placed all over the human body to collect data on an individual’s vital signs – allows medical practitioners to monitor patients while they get on with their everyday lives. It has the potential to revolutionise e-health and make health care more affordable.
Can you recall when your interest in data and e-health began?
As a child, I was fascinated with how communications worked in remote-controlled toys. I would break them apart and look at every electronic component. My parents weren’t always happy but I’m from an academic family so they were pleased that I was curious about science. Then, when I was about 13, a relative died of breast cancer and I remember watching what she went through and thinking there had to be a better way to manage someone’s health.
How did that influence your tertiary studies and research?
I did a science degree with a major in electronics at the University of Tehran. After I did my master’s degree by research in telecommunications engineering at the University of Technology Sydney in 2012, I started looking at WBANs, which were then being standardised. I wanted to find ways to make them safer and more effective. That was also the focus of my PhD.
What breakthroughs did you achieve?
The sensors and actuators used for data collection in WBANs are battery-powered, have limited capacity and need to be adaptive, as there’s a lot of variation in body movements and the environments they’re placed in. I devised protocols that prolong battery life, minimise interference, increase the speed of data transfer and allow the devices to adapt to variations by self-organising. I was inspired by nature; by how thousands of fireflies synchronise [their lightshow]. My work simplified the process to avoid calculations that eat up the network’s energy and to allow simultaneous, interference-free communication among coexisting networks.
What difference will that make to health care?
The patient data from WBANs is transmitted to a base station and then to a server, where a professional or specialist can immediately spot abnormalities and decide on changes in diet, medication, sleep and other things that impact health.
What does this mean for the future?
New low-power sensors that minimise risks for the people who wear them or have them implanted are being developed for WBANs. Allowing for testing and regulatory approvals, products are probably five to 10 years from coming to market.
Where do you envision your career going?
I’m a strong believer in change, in making a difference and building a path when there isn’t a way. My work extends beyond WBANs to the IoT [Internet of Things] and I’ve been involved in projects for smart cities, the mining industry, blockchain and cloud technologies. Ultimately, I see myself leading a team of dedicated researchers and engineers, solving problems that are crucial to the evolution and wellbeing of humankind.
A robot data scientist named Hyper Anna
Natalie Nguyen, CEO, Hyper Anna
Before big data rolled into mainstream consciousness, Natalie Nguyen had a refined appreciation of its potential. “My entire career has involved the critical nature of data for businesses,” says Nguyen, a University of Sydney design computing graduate who spent seven years working in data analytics and commercialisation for corporates. “At the same time, I could see that getting insights from data is not easy.”
In December 2015, Nguyen quit her job with Quantium, Australia’s largest analytics company, to “play around” with a few ideas; she determined that finding a smarter, easier way for organisations to surface data insights was the way to go. Two months later, Hyper Anna, a data-discovery and real-time-analysis company based in Sydney, was born.
It’s a simple concept: just as you might ask AI assistants Siri (Apple), Alexa (Amazon) or Cortana (Microsoft) for help or information, businesspeople can pose a data-related question to Anna via a dashboard on a computer screen. Using natural language processing, she replies by voice or email within 20 seconds. Anna’s data insights come as narrative and as charts. “Her outputs are the same as you’d expect from any analyst or data scientist,” says Nguyen.
Even those who aren’t data- or tech-savvy can work with Anna, no training required. “To really democratise data and insights, the application has to be as intuitive to use as Google Search,” explains Nguyen.
Hyper Anna eliminates the wait for busy human data analysts to deliver the goods on time-critical matters, such as the cause of last week’s sales spike or why people are quitting a company division in droves. The technology also allows businesses to dodge the global talent war for data scientists. (According to a Harvard Business Review article, it’s “the sexiest job of the 21st century”. Due to the scarcity of candidates, the median annual salary in the United States is $US120,000, reportedly rising to $US200,000 at top companies. Australia isn’t far behind.)
Hyper Anna’s big break came when it landed its first client, insurer IAG, less than six months in. IAG soon became an investor, as did Reinventure, a venture capital group backed by Westpac, another early client.
Following a $16 million funding round in 2017 led by global big-deal tech venture capital firm Sequoia China, Nguyen now heads a team of 56 working across offices in Sydney, Singapore and Hong Kong. The investment has fuelled Hyper Anna’s growth into Asia, where the company is working with several top-tier financial institutions, reports Nguyen, whose sights are shifting ever further afield. Will the next stop be the US and a move into the telco and retail sectors? Better ask Anna.
Decoding data politics
Julia Powles, Research Fellow, New York University School of Law and Cornell Tech
The data revolution carries considerable risks, argues Julia Powles, an outspoken “big thinker” whose interest in the conjunction of law and technology started with her Australian science and law degrees, followed by stints at Oxford and Cambridge universities. New rules and regulations are needed for data governance and control – and the time is now, says the former MinterEllison lawyer whose voice as a global influencer is growing louder.
How would you define the nature of data?
Think of data as observations about and constructions of the world. Data helps us navigate but it also blinkers us. Data depends on who’s doing the observing, their purpose and what they notice – as well as what they don’t. It’s a grave error to think of data in objective or neutral terms. Data is political.
And what are the problems associated with how data is used?
By observing and identifying patterns among people similar to you – but not, ironically, actually you – companies and governments can act on that proxy information in ways that affect you and your life chances. You won’t know and you won’t have the chance to contest, resist or refuse. This affects everything from how Google Maps dictates discovery to Facebook’s sway over news and elections.
New data breach notification laws came into effect in Australia in February. Is the world at a critical point in terms of regulating data?
For half a century, policymakers, led by Europe, have refined rules to curb the negative consequences for individuals of data accumulation and use. The challenge we have now is that the biggest companies in the world – like Google, Amazon and Facebook – have made a business of ignoring those rules. The question is whether any state or entity has the power and will to take them on. Bold yet sensible regulation should impose costs on data hoarding to anticipate and prevent harmful use.
Are some people more vulnerable than others?
There’s this trope about data and technology that somehow they will unleash a fairer world. But here’s the thing: data-driven systems revere and replicate the observed world, past and present. They embed and deepen the patterns of inequality that are seared into human history. The idea that anyone is engaging in a considered transaction of data when they use apps, social media and the internet is a farce.
How can individuals protect themselves?
We can support investigative journalism that asks how data is used, who gains and who loses. We can demand that cities, agencies and institutions are accountable in handling our data and ask for rules to address the imbalance between individuals and those who aggregate data. And we can diversify our tools by using Signal instead of WhatsApp and Messenger, or Citymapper instead of Google Maps.