Database Answers Beinecke Reference Library, Yale University

  Data Analysis for Cleansing
'Data Cleansing' is the process of removing errors in source data, typically from operational systems.
This is our approach:
  1. Use a Data Dictionary to record each Data Item being cleaned up.
  2. For each Item, identify the responsible Owner and the validation criteria, specified in both English and SQL.
  3. Choose an appropriate Tool.
  4. Run batch jobs repeatedly and review the results with the Owners after each Run.
  5. Continue until the required level of correctness is reached.
    This will depend on the Data Items, and will not always be 100%.
  6. Ensure that full documentation is produced for Audit purposes.
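
The steps above can be sketched in a few lines of Python, using the standard sqlite3 module. This is only a minimal illustration under stated assumptions, not a recommendation of any Tool: the customer table, the sample rows, the Owner names, the target percentages, and the names DictionaryEntry and run_batch are all hypothetical. Each Data Dictionary entry carries its rule in both English and SQL, and one batch Run produces a record per Data Item that can be reviewed with the Owner and kept for Audit.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class DictionaryEntry:
    """One Data Dictionary row: the item, its owner, and its validation rule."""
    item: str          # column being cleaned
    owner: str         # person who reviews the results of each run
    rule_english: str  # validation criterion in plain English
    rule_sql: str      # the same criterion as a SQL boolean expression
    target: float      # required fraction of valid rows (not always 1.0)


def run_batch(conn, table, entries):
    """Run every rule once and return one audit record per Data Item.

    The table name and rule_sql are trusted text from the Data Dictionary,
    so they are interpolated directly; never do this with user input.
    """
    results = []
    for e in entries:
        total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        valid = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {e.rule_sql}"
        ).fetchone()[0]
        rate = valid / total if total else 1.0
        results.append({
            "item": e.item, "owner": e.owner, "rule": e.rule_english,
            "valid": valid, "total": total, "rate": rate,
            "passed": rate >= e.target,
        })
    return results


# Hypothetical sample data standing in for an operational system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, email TEXT, postcode TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "ann@example.com", "GU21 6XX"),
        (2, "bob-at-example.com", "GU21 6XX"),   # no @ sign
        (3, "cy@example.com", None),             # missing postcode
        (4, "di@example.com", "SW1A 1AA"),
    ],
)

# Hypothetical Data Dictionary with an assumed target for each item.
dictionary = [
    DictionaryEntry("email", "Sales Ops",
                    "Email must contain an @ sign", "email LIKE '%_@_%'", 0.95),
    DictionaryEntry("postcode", "Customer Services",
                    "Postcode must be present", "postcode IS NOT NULL", 0.75),
]

audit_log = run_batch(conn, "customer", dictionary)
for record in audit_log:
    print(record)
```

In practice the Run would be repeated after each round of corrections, and the records from every Run would be kept so that the Audit documentation shows how each Data Item reached its required level.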
This table lists some representative Products. Please let me have any recommendations or comments.

Pricing: prices are often difficult to find on the Web Sites of expensive Products, with an entry-level cost of $100 to $250.
When I email vendors, they are often reluctant to give me costs, which I take to mean 'If you have to ask the price, then you can't afford our product.'

Product                  Price    Comments
Ascential                $250k
Database Help
DataFlux                 $$$      An SAS Company.
DigDB                    $49      "30 power add-ins in 1 toolkit for only $49". For Excel only; gets great reviews.
First Logic                       Successful in the States and getting started in the UK (in April 2006). Recently acquired by Business Objects.
Group 1                           Looks good, and has European HQ in England. A Pitney-Bowes Company. Offers GeoCoding software.
Identity Systems                  Previously called Search Software America.
Innovative Systems                Also provide CRM and Customer Data Integration. UK Office: Phone 01483-730446 (Woking).
MatchIt                  $2,995   Consists of four Modules for deduplication. A comprehensive Toolkit.
Prism Solutions
Trillium                          Outstanding global data capabilities. Plugs into many different DB applications. A Harte-Hanks Company.
WinPure ListCleaner Pro  $249     "A stand-alone product with eight Data Cleansing modules."

