The Database and Data Mining research group was established to design and develop methods, and to implement prototypes, for integrating heterogeneous information from multiple sources. Our goal is to advance research and integrate technologies to support real applications through data analysis, algorithm modeling, and prototype realization. Privacy during data federation is one of our main concerns.

Regarding the types of data to integrate, we consider structured (e.g., records in a database), semi-structured (e.g., XML, JSON), and unstructured sources (e.g., web pages). In terms of core techniques, our group draws on databases, data mining, natural language processing, machine learning, and ontology-based semantic web technology. As application-driven research, we aim to build a general data integration framework that adapts to multiple applications (e.g., information retrieval, recommendation systems, online advertising) while exploiting the unique characteristics of domain data to improve integration accuracy in specialized domains (e.g., micro-data on demographics, public health, and medicine).

Research Topics

Information extraction
Entity resolution
Data-mining-based privacy preservation

Project Introduction

Our main ongoing project is "Privacy-aware Federated Database Infrastructure Construction for Heterogeneous Data Analysis on Micro-Data". This project is central to the investment made by Umeå University in federated databases, including related security solutions. Through it, the group is to participate actively in the design and selection of technology for the university's federated database and data integrity solutions.

The research covers the development of both theory and methodologies that are highly relevant for a federated database infrastructure, with research on integrity and security in database systems as the most important example. Our group plays a leading technical role in a university-wide steering group, which also includes representatives of database owners, users, operating staff, and individuals with legal and ethical competence.

Group Coordinator: Lili Jiang (homepage)

Advisory Committee: Prof. Erik Elmroth (homepage)

PhD Co-supervision with Prof. Erik Elmroth

Computational support and collaboration with the Distributed Systems group

