Deleting Duplicates Across A Staggering Amount of Data

  • My job responsibilities recently increased drastically, and I need help. I've gone from interpreting data to needing to sort it. I have been given an Excel CSV of 13 sheets, each with 2 columns of half a million entries, for a total of ~5.6 million rows, and told to get rid of the duplicates across all the sheets. I can combine or rearrange the sheets as much as I want, as long as there aren't any duplicates at the end. I'm running on a pretty beefy machine, but I do not know what VBA to write to do this. Any help would be immensely appreciated.

    Thank you so much for any help you can give.

  • I should clarify that if an entry appears twice, both entries need to be deleted, not just the duplicate. Thank you again for any help you can offer.

  • I've solved it by pasting each column from the sheets into one document as separate columns, then running the duplicate check on this super document. It isn't pretty or elegant, but it works.
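
For anyone who would rather script the same idea than paste columns by hand, here is a minimal Python sketch of the rule described above: any entry that appears more than once anywhere is dropped entirely, not just its extra copies. This is not the method used in the thread. The function name `drop_all_repeated` and the sample column names are hypothetical, and it assumes each column has already been read into a list of strings (for example by exporting each sheet to CSV and reading it with the standard `csv` module). It also assumes entries match only when they are identical after trimming surrounding whitespace, so the comparison is case-sensitive.

```python
from collections import Counter
from typing import Dict, Iterable, List


def drop_all_repeated(columns: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Keep only entries that occur exactly once across all columns.

    Hypothetical helper: `columns` maps a column label to its entries.
    Blank cells are skipped; surrounding whitespace is trimmed (assumption).
    """
    # Materialize each column once so it can be scanned twice.
    cleaned = {
        name: [v.strip() for v in values if v.strip()]
        for name, values in columns.items()
    }
    # First pass: count every entry across every column.
    counts = Counter(v for values in cleaned.values() for v in values)
    # Second pass: keep an entry only if it appeared exactly once overall.
    return {
        name: [v for v in values if counts[v] == 1]
        for name, values in cleaned.items()
    }


if __name__ == "__main__":
    # Made-up sample data standing in for the real sheets.
    sample = {
        "Sheet1_A": ["a", "b", "c"],
        "Sheet1_B": ["b", "d", "d"],
        "Sheet2_A": ["e", "a"],
    }
    print(drop_all_repeated(sample))
    # {'Sheet1_A': ['c'], 'Sheet1_B': [], 'Sheet2_A': ['e']}
```

Counting first and filtering second is what makes both copies of a repeated entry disappear; a plain "remove duplicates" pass would keep one of them. A few million short strings should fit in memory on a reasonably equipped machine, though that depends on the length of the entries.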
