By 2023, approximately 120 zettabytes of data had been collected worldwide. To put this into perspective: it is estimated that a single zettabyte would be enough to stream HD movies continuously for many millions of years. Yet less than 1% of those 120 zettabytes is actually used. A large portion of this data is structured, for example tables, spreadsheets, and relational databases, which typically drive important decision-making processes in healthcare, government, and finance. Smaller companies, non-profit organisations, and public institutions are falling behind large corporations in developing their data analytics capabilities, creating an inequality in data literacy.
The project
Artificial Intelligence (AI) could help, as it has proven highly effective in applications involving unstructured data such as text and images. However, comparable progress on structured data is currently lacking. With the DataLibra project, Hulsebos and her colleagues want to narrow these gaps by democratizing insight retrieval from (semi-)structured data through Table Representation Learning.
The goal is to provide trustworthy, secure, and responsible data analytics, empowering everyone to make data-driven decisions easily, effectively, and efficiently. The DataLibra project aims to tackle challenges throughout the entire data analytics pipeline, including efficient data storage and query execution, automated and responsible data quality improvement, multimodal data integration and querying, and retrieval systems. Because of its multidisciplinary nature, the five-year project will involve collaborations across various knowledge institutes and innovation labs.
About the fellowship
Hulsebos is one of five researchers who received a National Growth Fund AiNed Fellowship Grant. With this grants programme, NWO aims to attract AI talent to Dutch academic research organizations. The programme, which was developed by the Netherlands AI Coalition, promotes the development and application of AI in Dutch businesses and governments. It started in 2022 and has awarded 14 grants in total. The DataLibra project receives over 900,000 euros from NWO.