Oct 23rd, 2006 by Cristiana Bolchini
In October Giorgio Orsi graduated, defending a thesis on Data Integration; the supervisor is Letizia Tanca and the co-supervisor is Carlo Curino.
Interoperability among automatic systems is a well-known problem, especially for information systems. After the '80s, the massive adoption of database systems inside organizations led to the need to integrate different data
repositories with possibly incompatible data schemata. The process of combining data residing at different sources to provide a unified view of this information is known as the Data Integration problem.
This integration can be carried out with many different techniques, which may involve several levels of the database architecture.
One of the best-known approaches to data integration is Data Warehousing; in this approach, data originating from different sources are submitted to a process called ETL (Extraction, Transformation and Loading)
and then stored in a new database with a single and usually denormalized schema. This final database is often structured to store various aggregations of the sources' data to speed up query processing. From an architectural point of view, Data Warehousing can be seen as a tightly coupled approach because the integrated data reside in a single place at query time.
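As a rough illustration of the ETL process described above, here is a minimal Python sketch. Everything in it is an assumption made for the example and does not come from the thesis: the two sources (`SOURCE_A`, `SOURCE_B`), their field names, the table names `sales` and `sales_by_customer`, and the choice of an in-memory SQLite database as the warehouse.

```python
import sqlite3

# Two hypothetical sources with incompatible schemata (illustrative data only).
SOURCE_A = [
    {"customer": "Alice", "amount_eur": 10.0},
    {"customer": "Bob", "amount_eur": 5.5},
]
SOURCE_B = [
    {"client_name": "alice", "total_cents": 250},
    {"client_name": "Carol", "total_cents": 1200},
]


def extract():
    """Extraction: read raw records, tagging each with its source."""
    for record in SOURCE_A:
        yield "A", record
    for record in SOURCE_B:
        yield "B", record


def transform(tagged_records):
    """Transformation: map both schemata onto (customer, amount_cents, source)."""
    for source, record in tagged_records:
        if source == "A":
            name = record["customer"]
            cents = round(record["amount_eur"] * 100)
        else:
            name = record["client_name"]
            cents = record["total_cents"]
        # Normalize names so "alice" and "Alice" refer to the same customer.
        yield name.strip().title(), cents, source


def load(rows):
    """Loading: store rows in a single denormalized table plus an aggregate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (customer TEXT, amount_cents INTEGER, source TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # Precomputed aggregation, kept to speed up later queries.
    conn.execute(
        "CREATE TABLE sales_by_customer AS "
        "SELECT customer, SUM(amount_cents) AS total_cents "
        "FROM sales GROUP BY customer"
    )
    conn.commit()
    return conn


def build_warehouse():
    return load(transform(extract()))


def totals(conn):
    """Query the aggregate table; the original sources are not touched."""
    return dict(conn.execute("SELECT customer, total_cents FROM sales_by_customer"))


if __name__ == "__main__":
    print(totals(build_warehouse()))
```

Queries run only against the loaded database, never against the original sources, which is what makes the approach tightly coupled in the sense described above.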