Database and Data Mining

The Database and Data Mining research group was established to design and develop methods, and to implement prototypes, for integrating heterogeneous information from multiple sources. Our goal is to advance research and integrate technologies that support real applications through data analysis, algorithm modeling, and prototype realization. Privacy is one of our main concerns during data federation.

Regarding the types of data to integrate, we consider structured (e.g., database records), semi-structured (e.g., XML, JSON), and unstructured sources (e.g., web pages). Broadly, our core techniques draw on databases, data mining, natural language processing, machine learning, and ontology-based semantic web technology. As application-driven research, we aim to build a general data integration framework that adapts to multiple applications (e.g., information retrieval, recommender systems, online advertising), while also exploiting the unique characteristics of domain data to improve integration accuracy in specialized domains (e.g., micro-data on demographics, public health, and medicine).

Research Topics

Information extraction
Entity resolution
Machine-learning-based differential privacy

Project Introduction

Our main ongoing project is "Privacy-aware Federated Database Infrastructure Construction for Heterogeneous Data Analysis on Micro-Data". This project is central to Umeå University's investment in federated databases, including related security solutions. The group actively participates in the design and selection of technology for the university's federated database and data integrity solutions.

The research covers the development of both theory and methodology relevant to a federated database infrastructure, with integrity and security in database systems as the most important example. Our group plays a leading technical role in a university-wide steering group, which also includes representatives of database owners, users, operating staff, and individuals with legal and ethical expertise.

Group Coordinator: Lili Jiang (homepage)

Postdoc Researcher: TBD

Doctoral Student: Xuan-Son Vu (homepage)

PhD co-supervision with Prof. Erik Elmroth (homepage)

Computational support and collaboration with the Distributed Systems group

Page Editor: Frank Drewes

News and Events

Congratulations to Xuan-Son Vu, who received the Best Student Paper Award at CICLing 2018 for the paper "Self-adaptive Privacy Concern Detection for User-generated Content".

Congratulations to Xuan-Son Vu, who had a short paper accepted at the International Conference on Knowledge Capture (K-CAP) 2017. Paper title: "Personality-Based Knowledge Extraction for Privacy-preserving Data Analysis".

Congratulations to Xuan-Son Vu, who had a paper accepted at the 9th Global WordNet Conference (GWC), 2018. Paper title: "Lexical-semantic resources: yet powerful resources for automatic personality classification".

In focus

Privacy-aware Data Federation Infrastructure

Heterogeneous Data Analysis on Register Data

Contact Information

Lili Jiang
Department of Computing Science, Umeå University
901 87 Umeå 

Visiting Address
Floor 4, MIT-huset, B415

Tel: +46 (0) 90 786 5827
