Digitalisation of healthcare is gaining momentum, and Finland is a forerunner in health data. Finland’s digital health data vaults, such as the vast HUS DataLake and Aalto University’s new environment for very sensitive data, are showing what Big Data can do for healthcare and offering exciting opportunities for research.
As it turns out, “dive” is not the worst word to describe what HUS Helsinki University Hospital researchers are doing. Over the years, all the medical information has been collected into one vast, digital vault – known as HUS DataLake.
Miika Koskinen, D.Sc., Docent, and Development Manager from HUS Information Management, says that HUS occupies a rather unique position in the digital race:
“HUS is one of the biggest university hospital organisations in Europe, and with each patient being treated, information is collected. This data has tremendous value going forward,” he says.
Koskinen sees two major application areas for the data: first of all, hospitals can use data as an engine of sorts, making processes and operations run more efficiently.
“This kind of optimisation via data will also drive down costs.”
The other application area is healthcare itself. Knowing every illness, treatment and medicine a single citizen has had over their lifetime is one thing. But with that same data from thousands and thousands of people, you can use advanced analytics to really zero in on the causes of problems, for example.
“With HUS being responsible for providing specialist healthcare in South Finland for 2.2 million residents, we’re in a great position to see the big picture through data use,” says Koskinen.
HUS has collected data digitally for decades: in fact, electronic health records have been stored since the 1980s. For the past 10 to 20 years, laboratory tests, X-rays and other health records of the 2.8 million annual patient visits at HUS have been accessible in digital format.
A few years ago, HUS ICT management started looking for a way to make the most of this digital treasure. Together with IT company TietoEVRY, they set out to create a data lake to explore the data to its fullest.
HUS is one of the biggest university hospital organisations in Europe, and with each patient being treated, information is collected. This data has tremendous value.
Miika Koskinen, Development Manager, HUS Information Management
The end product – HUS DataLake – enables the analysis of huge health datasets in order to create, for example, healthcare-related predictions – and, further down the road, truly game-changing innovations. Risto Renkonen, Professor of Glycobiology and former Dean of the Faculty of Medicine at the University of Helsinki, talks about a “paradigm shift” in modern healthcare that craves Big Data.
“Under the old system, we’ve gone as far as we can go. We need DataLake and analytics to get to the next level.”
Renkonen believes that the imagination of the researchers is the only limiting factor in the utilisation of HUS DataLake, once the researchers get familiar with its use.
“DataLake does require a new kind of mindset. We must now ask ourselves: what can we accomplish with this new tool – what’s the most useful thing we can do for patients?”
Researchers’ imagination is the only limiting factor in the utilisation of HUS DataLake.
Risto Renkonen, Professor of Glycobiology and former Dean of the Faculty of Medicine at the University of Helsinki
Having been involved with the development of HUS DataLake for a couple of years, Renkonen is quite familiar with its evolution and present status. So far, mostly early adopters have embraced the new tool.
“It takes some time and effort before you learn to use DataLake properly, meaning that we don’t have that many researchers using it effectively just yet,” he says, adding that this is sure to change as younger generations of researchers in particular catch wind of the innovation.
Dr. Koskinen adds that large sets of preprocessed clinical real-world data – together with the leading-edge analytics platform HUS Acamedic, which enables high-performance computing – are also a “real calling card” in attracting international attention.
“It’s one of the biggest data lakes in Europe, which makes HUS a very appealing research partner, also from the perspective of the corporate world.”
And HUS DataLake keeps getting bigger: a constant stream of 2-3 million annual visits enriches the HUS registers with a variety of information about care episodes, diagnoses, laboratory results, medical imaging, genomics, surgeries, medications…
Koskinen talks about the creation of entire ecosystems driven by data:
“With their ability to accommodate many large customers at once, DataLake and the data services of HUS present a new kind of competitive advantage.”
The broad, secure use of health-related data is also on the agenda at Aalto University. Last year, Aalto launched a secure IT environment called SECDATA for very sensitive research data, such as health data and special categories of personal data. Health & Wellbeing is one of the seven key research areas at Aalto.
“Researchers may use SECDATA in projects involving research data that needs a highly secure, controlled and restricted environment,” explains Ilari Lähteenmäki, Aalto Project Manager for SECDATA.
The environment was audited in summer 2022 and shown to be compliant with the Act on the Secondary Use of Health and Social Data as well as Findata requirements. The SECDATA environment is now ready to be used by research projects that handle secret or sensitive data, such as health data or special categories of personal data.
“We already have some research teams who are making use of the environment, but it’s still early going, since we’re really just getting started,” says Lähteenmäki, adding that the SECDATA project parameters can accommodate as many as 10 research projects per year.
“The data handled in each case is specific to that project and accessed only in the context of that project,” explains Lähteenmäki.
Lähteenmäki is the head of the seven-person project team that put SECDATA together. He believes that SECDATA has “a big role” to play in enabling cutting-edge, data-driven research.
“We’re excited about doing modern research that utilises new technology, while still working within the legislative framework. With such high volumes of information available, we expect that many new discoveries can be made.”
We believe Aalto SECDATA has a big role to play in enabling cutting-edge, data-driven research.
Ilari Lähteenmäki, Project Manager for SECDATA at Aalto University
A brand new example of the potency of HUS DataLake is found in a fresh study which targeted subgroups of the most common diseases. In an internationally unique data analysis, the 100 most common diagnoses were plucked from the records of HUS’ 1.28 million patients. After that, the joint effects of these diseases were investigated in more detail in more than 520,000 patients during a four-year follow-up period.
The recently published study provides remarkable insight into just how diverse different diseases and their combined effects can be. Dr. Miika Koskinen from HUS, the first author of the study, explains that in typical clinical trials, the patient groups are strictly defined and do not necessarily represent the population.
“In this study, both the amount of data and the scope of the analysis were exceptionally large,” Koskinen says, adding that without HUS DataLake to perform the “heavy lifting” there would be little chance of successfully undertaking such projects.
The study revealed that two out of three patients had several simultaneous diseases and almost half had at least three disease diagnoses. Subgroups were found for all 100 diseases. Of special interest to the data-cracking researchers were the subgroups whose lab results or life expectancy differed from those of other patients with the same underlying disease.
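For readers curious about what such an analysis looks like in practice, the first two steps described above (picking the most common diagnoses, then counting how many patients carry several of them at once) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method used in the study: the records, the diagnosis codes and the function names `top_diagnoses` and `multimorbidity_shares` are all hypothetical, and the shares are computed among patients who have at least one of the selected diagnoses.

```python
# Hypothetical toy records of (patient_id, diagnosis_code). Not real HUS data.
records = [
    ("p1", "I10"), ("p1", "E11"), ("p1", "J45"),
    ("p2", "I10"),
    ("p3", "E11"), ("p3", "I10"),
    ("p4", "J45"),
]


def top_diagnoses(records, n):
    """Return the n diagnosis codes carried by the most distinct patients."""
    counts = {}
    # Deduplicate so each patient is counted at most once per diagnosis.
    for _, code in set(records):
        counts[code] = counts.get(code, 0) + 1
    # Sort by descending patient count, then by code for a stable order on ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [code for code, _ in ranked[:n]]


def multimorbidity_shares(records, codes):
    """Share of patients with at least two and at least three of the given codes.

    The denominator is the number of patients with at least one of the codes.
    """
    per_patient = {}
    for patient, code in records:
        if code in codes:
            per_patient.setdefault(patient, set()).add(code)
    total = len(per_patient)
    if total == 0:
        return 0.0, 0.0
    at_least_two = sum(len(found) >= 2 for found in per_patient.values())
    at_least_three = sum(len(found) >= 3 for found in per_patient.values())
    return at_least_two / total, at_least_three / total


common = top_diagnoses(records, n=3)
share_two, share_three = multimorbidity_shares(records, set(common))
print(common, share_two, share_three)
```

In the study itself the same kind of counting would run over the 100 most common diagnoses and hundreds of thousands of patients, and the subgroup discovery that follows requires considerably heavier computational modeling than this sketch shows.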
“The study shows the potential of computational modeling. This may indicate that different patient subgroups need different treatments. This is one step towards more customised care,” says Koskinen.
“But it also demonstrates how central computational modeling is in generating evidence from real-world data,” he adds.
Text: Sami J. Anteroinen