combine 24,000 text files and append filename details to each record

  • I receive 24,000 text files once a month that need to be combined into one csv/txt file and/or spreadsheet(tab).


    About a year ago I posted a thread on the same topic which received a fantastic response from jindon that worked great( found here:http://www.ozgrid.com/forum/sh…php?t=165341&goto=newpost**)


    Unfortunately, the format in which the text files are ouput has changed, as has the filename layout. The files are now output with filenames such as:
    (lic#, company name, displaying # records found, date, type.txt)


    40298827_Windham Professionals Inc _Displaying records 1 through 10 of 100_041813_AGENTS.txt
    40298827_Windham Professionals Inc _Displaying records 11 through 20 of 100_041813_AGENTS.txt
    40303726_HEARTLAND CREDIT RESTORATION INC _EANF_041913_AGENTS.txt


    (files with EANF in the filename have no records inside them and can be skipped)


    While the contents of each file look like this: (see attached text file reference)
    --------------------------------------------------------
    Displaying records 1 through 10 of 100
    1020304050


    Full Name
    License Number
    License Type

    Audette , Anthony
    40305196
    Sales Provider

    Burritt , Kimberly
    40313800
    Sales Provider
    --------------------------------------------------


    I would like to combine the contents of the text files while appending the lic#, company name and date from the filenames to each record so the resulting file looks like this:


    40305196 Audette , Anthony Sales Provider 40298827 Windham Professionals Inc 041813
    40313800 Burritt , Kimberly Sales Provider 40298827 Windham Professionals Inc 041813


    As far as I can tell jindon's code is fine except the regex expression needs to be modified to handle the new layout, however that is far beyond me. If anyone can help me with this I would appreciate it very much as I have been spending many many hours combining these records manually. cheers!forum.ozgrid.com/index.php?attachment/53194/

  • Re: combine 24,000 text files and append filename details to each record


    I think this does the job.

  • Re: combine 24,000 text files and append filename details to each record


    This might be sufficient:

  • Re: combine 24,000 text files and append filename details to each record


    Thank you Jindon! I just did a quick test and your code worked great.


    Thank you for your reply as well snb. Unfortunately I get a 'path not found' error on

    Code
    [COLOR=#333333]c00 = c00 & vbLf & Replace(Join(Filter(Split(Replace(Replace(Replace(.opentextfile(ThisWorkbook.Path & "\" & sn(j)).readall, vbCrLf, "|") & "||||", [/COLOR][COLOR=blue]String[/COLOR][COLOR=#333333](4, "|"), ";" & Join(Filter(Filter(Split(sn(j), "_"), ".txt", [/COLOR][COLOR=blue]False[/COLOR][COLOR=#333333]), "Display", [/COLOR][COLOR=blue]False[/COLOR][COLOR=#333333]), ";") & "|"), [/COLOR][COLOR=blue]String[/COLOR][COLOR=#333333](2, "|"), ";"), "|"), ","), vbLf), ";", ",")

    . If it helps, both my spreadsheet and the text files are in c:\test4[/COLOR]

  • Re: combine 24,000 text files and append filename details to each record


    You might have adapted it yourself.....


    If Jindon's code works you gave the wrong information:
    my code looks for TXT files, Jindon's code looks for CSV files



  • Re: combine 24,000 text files and append filename details to each record


    Thank you snb. You are right I specified txt files instead of csv. Sorry about that. I actually did notice that while I was trying to figure out why your code wasn't working and I had already tried changing ".txt" to ".csv" but still received the same error, however instead of appearing instantly it processed for about 20 seconds before popping up. I tried your modified code above just now and received the same 'Path not found' error after 20 seconds or so. Thanks again for looking at this.

  • Re: combine 24,000 text files and append filename details to each record


    Did you use ?


    You can step through the code using F8 in the VBEditor and report which line produces the error.

  • Re: combine 24,000 text files and append filename details to each record


    If that's the same code as your previous post then yes. I just copied it on top of what I had. When I choose the Debug option after it stops it highlights the same line as before:

    Code
    [COLOR=#333333]c00 = c00 & vbLf & Replace(Join(Filter(Split(Replace(Replace(Replace(.opentextfile("C:\test4\" & sn(j)).readall, vbCrLf, "|") & "||||", [/COLOR][COLOR=blue]String[/COLOR][COLOR=#333333](4, "|"), ";" & Join(Filter(Filter(Split(sn(j), "_"), ".csv", [/COLOR][COLOR=blue]False[/COLOR][COLOR=#333333]), "Display", [/COLOR][COLOR=blue]False[/COLOR][COLOR=#333333]), ";") & "|"), [/COLOR][COLOR=blue]String[/COLOR][COLOR=#333333](2, "|"), ";"), "|"), ","), vbLf), ";", ",")

    I will try it again using F8 as you suggest and let you know what I see in the watch window. Tks[/COLOR]

  • Re: combine 24,000 text files and append filename details to each record


    You can see it's not the same code.


    what is the value of sn(j) in that line ? (it's the only variable)

  • Re: combine 24,000 text files and append filename details to each record


    The value of j in the locals window is 5141, which equals the current # of csv files in the folder.
    I can also see the names of all the files it processed by expanding sn.

  • Re: combine 24,000 text files and append filename details to each record


    I assume the directory doesn't contain files that do not match the required contents as indicated in the sample file you posted.
    It the error occurs processing the last file.
    You better tell the value of sn(j).
    I think this might prevent the error:


  • Re: combine 24,000 text files and append filename details to each record


    The directory contains the .xlsm file containing my project and a few csv files I had changed to ".txt" while troubleshooting previously. In any case, the code runs great now and it is FAST :) Thank you so much!!


    I did notice upon examining the results that my data contains about 50 names (out of 100,000) with 2 commas instead of 1 so they don't align properly. I didn't see any of those in the sample files I was working with originally or I would have mentioned it. Is that something that can be adjusted for without too much difficulty? I realize that is a new question and may require a separate post. I can fix them manually in a few minutes so it's not that big a problem but this is a process I have to repeat on a regular basis so it would be nice if it could accommodate them as well. Thanks again for all your help!


    BELK, III , JAMES
    Burton Jr. , Anthony
    ROBERTSON JR. , RICHARD
    Cupp, Jr. , DonaldPerkins, Jr. , Gary

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!