Title: REMOVING DUPLICATIONS IN EMAIL ADDRESSES IN LINEAR TIME

Issue Number: Vol. 1, No. 4
Year of Publication: Dec - 2011
Page Numbers: 1011-1017
Authors: Eyas El-Qawasmeh
Journal Name: International Journal of New Computer Architectures and their Applications (IJNCAA), Hong Kong

Abstract:


Currently, many government offices and companies use mailing lists to reach their clients. Every mailing list needs continuous updating, which includes removing unsubscribed emails, inserting new emails, and removing duplicates. Duplication can occur when two mailing lists are merged into one master mailing list and both contain the same email, or when a single mailing list contains the same email more than once. Existing algorithms for removing duplicates from a mailing list have time complexity greater than linear. Most of them sort the emails in alphabetical order and then remove the duplicates, which takes O(n log n) time. However, we are able to reduce the time complexity to O(n) using hashing. This saves the senders time and effort.
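
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of hash-based duplicate removal, not the author's implementation. It assumes Python's built-in `set` (a hash table with average-case constant-time lookup and insertion), and it assumes that two addresses count as the same email when they match after trimming whitespace and lowercasing. The function name `remove_duplicate_emails` and the sample addresses are hypothetical.

```python
def remove_duplicate_emails(emails):
    """Return the emails with duplicates removed, keeping first-seen order.

    Each address is looked up in a hash set once, so the expected running
    time is linear in the number of addresses (assuming average-case
    constant-time hashing), with no sorting step.
    """
    seen = set()      # hash table of addresses already encountered
    unique = []       # output list, in order of first appearance
    for email in emails:
        # Assumption: addresses are compared case-insensitively and
        # without surrounding whitespace.
        key = email.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Merging two mailing lists into one master list (hypothetical data).
list_a = ["ann@example.org", "bob@example.org", "ann@example.org"]
list_b = ["Bob@Example.org ", "cat@example.org"]
master = remove_duplicate_emails(list_a + list_b)
print(master)
```

Concatenating the two lists before the single pass covers both cases described in the abstract: an email shared between the merged lists and an email repeated within one list.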