algorithms; duplicate detection; data sources; data management