Hi guys, I'm new to the forum and have a rather complex problem that I really need help with. I'm not learning VBA nearly fast enough to handle this myself! ☹
I have a spreadsheet with a list of clients collected from various state sites, but the formatting of the information is inconsistent from one source to the next. I've manually parsed the combined lists, but as we keep them updated in the future, we need a simpler way to check for duplicates.
In Sheet1, we have the manually combined list with the following column headers: Client Name, DBA Name 1, DBA Name 2, DBA Name 3, Address_Line1, Address_Line2, Address_City, Address_State, Address_Zip, phone number, Website URL, email 1, email 2, email 3, email 4, email 5. All the records are complete in Sheet1.
Now in Sheet2, we'll paste the partially complete state records in the same field format as Sheet1. The main problem with simply running Remove Duplicates is that the states often don't require the same formatting for LLC, LP, LTD, etc. So in Sheet1, client names (and the DBA fields) will read like "ABC, LLC", but many state agencies will list the name as "ABC LLC" or just "ABC", so when we try to remove duplicates, Excel doesn't realize they're the same. In addition, client "ABC, LLC" might have a DBA Name 1 of "ABC1, LLC", so Remove Duplicates doesn't catch a newly added entry that has "ABC1, LLC" in the Client Name field.
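To make the formatting problem concrete, here is a rough sketch of the kind of name clean-up I think is needed. It's in Python rather than VBA because my VBA isn't there yet. The function name `normalize_name` and the `SUFFIXES` list are just my own assumptions, not anything from our actual workbook.

```python
import re

# Assumed list of entity suffixes to ignore; ours would likely need more entries.
SUFFIXES = {"llc", "lp", "ltd", "inc", "llp", "corp", "co"}

def normalize_name(name):
    """Lowercase a name, drop punctuation, and strip trailing entity suffixes."""
    # Remove periods and apostrophes outright so "L.L.C." collapses to "llc".
    cleaned = re.sub(r"[.']", "", name.lower())
    # Turn any other punctuation into spaces, then split into words.
    words = re.sub(r"[^a-z0-9 ]", " ", cleaned).split()
    # Drop suffix words from the end only, so the core name is untouched.
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)
```

With something like this, "ABC, LLC", "ABC LLC" and "ABC" would all compare as the same name, while "ABC1, LLC" would stay distinct.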
Ideally, this is what we need: we have a complete database in Sheet1, and we need to remove the duplicates in Sheet2 using Client Name, DBA Name 1, DBA Name 2, DBA Name 3, Address_City, and Address_State. We also need to check the client name in Sheet2 against the DBA names we have in Sheet1 and treat those as duplicates as well. The non-duplicates in Sheet2 should be copied to Sheet3 (or the duplicates could just be deleted from Sheet2, leaving the non-duplicates there) with the correct "ABC, LLC" formatting. Client Name almost has to be a bit of a fuzzy match in order to work (I'm assuming), and we're at our wits' end trying to get this figured out. Manually parsing the lists took us over 2 weeks.
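In case it helps to see the logic spelled out, this is roughly the check I have in mind, again sketched in Python rather than VBA. Each row is treated as a dictionary keyed by our column headers. The names `normalize_name`, `record_keys`, `new_records` and the `SUFFIXES` list are my own assumptions, and it only does an exact match on cleaned-up names, not a true fuzzy match.

```python
import re

SUFFIXES = {"llc", "lp", "ltd", "inc", "llp", "corp", "co"}  # assumed list
NAME_FIELDS = ["Client Name", "DBA Name 1", "DBA Name 2", "DBA Name 3"]

def normalize_name(name):
    """Lowercase, drop punctuation, and strip trailing entity suffixes."""
    cleaned = re.sub(r"[.']", "", name.lower())
    words = re.sub(r"[^a-z0-9 ]", " ", cleaned).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

def record_keys(record):
    """Return one (name, city, state) key per non-empty name field in a row."""
    city = (record.get("Address_City") or "").strip().lower()
    state = (record.get("Address_State") or "").strip().lower()
    keys = set()
    for field in NAME_FIELDS:
        name = normalize_name(record.get(field) or "")
        if name:
            keys.add((name, city, state))
    return keys

def new_records(master, incoming):
    """Keep incoming rows whose names match no client or DBA name in master."""
    known = set()
    for record in master:
        known |= record_keys(record)
    # A row is a duplicate if any of its keys is already known.
    return [r for r in incoming if not (record_keys(r) & known)]

# Made-up rows standing in for Sheet1 (master) and Sheet2 (incoming).
sheet1 = [{"Client Name": "ABC, LLC", "DBA Name 1": "ABC1, LLC",
           "Address_City": "Austin", "Address_State": "TX"}]
sheet2 = [
    {"Client Name": "ABC1 LLC", "Address_City": "austin", "Address_State": "tx"},
    {"Client Name": "XYZ LP", "Address_City": "Austin", "Address_State": "TX"},
    {"Client Name": "ABC", "Address_City": "Dallas", "Address_State": "TX"},
]
sheet3 = new_records(sheet1, sheet2)
```

Here the "ABC1 LLC" row is dropped because it matches a DBA name in Sheet1 with the same city and state, while the other two rows are kept. This sketch does not rewrite names into the "ABC, LLC" format, and misspelled names would still slip through; something like `difflib.SequenceMatcher` from the Python standard library might cover that part, but I haven't tried it.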
I attached an example file of what we're dealing with (no info is real).
THANK YOU in advance!