Friday, December 14, 2007

SQL Duplicate Records - Distinct keyword

Every run into a situation where records in your tables have no unique field and for some reason there are duplicate records? Well, this happened to me in our EDI database in our invoice tables and was cause some errors during the import process in Gentran Server. Here is what I did:

First I created temporary tables that mirrored my production tables. (You can get the script for this through SQL Enterprise Manager by selecting the tables and under Action->All Tasks->Generate SQL Script).

Now pull only one of each of the duplicate records into the temp database using the the 'distinct' keyword:

insert into invoiceheadertemp
select distinct * from invoiceheader where tranno = 217589 or tranno = 217590 or tranno = 217591

insert into invoicedetailtemp
select distinct * from invoicedetail where tranno = 217589 or tranno = 217590 or tranno = 217591


Now delete the records from the production database:

delete from invoiceheader where tranno = 217589 or tranno = 217590 or tranno = 217591

delete from invoicedetail where tranno = 217589 or tranno = 217590 or tranno = 217591


Now insert the unique records from the temp tables:

insert into invoiceheader select distinct * from invoiceheadertemp where tranno = 217589 or tranno = 217590 or tranno = 217591

insert into invoicedetail select distinct * from invoicedetailtemp where tranno = 217589 or tranno = 217590 or tranno = 217591


Someone might ask: "Why not use real temp tables with the # prefix?" Well, these temp tables are also my testing tables that I use to test modified stored procedures as well as testing maps in Gentran Server, so it made sense for me to go ahead and duplicate the tables for this dual purpose.

I hope this helps someone who finds themselves in a similar situation and let me know if it does!

No comments:

Post a Comment