spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel conv.

  • Hi, I'm looking for help extracting data that resides in a single cell into separate cells for each data point separated by a space with in the original 'merged' cell. That's the redux solution I have come up with from the overall problem, but I'll also address the bigger picture in case there is a more comprehensive solution.


    I have mountains of pdf pages of data in a visual column/row format that I wish to convert to Excel in order to perform further analysis, however when exporting to Excel, Adobe is primarily separating by column and not by row. This is ok if I can figure out some other way to divide the column data into rows such that each piece of individual data resides in it's own cell, unless someone has an even more efficient solution to my overall PDF to Excel conversion problem. I have attached sample documents of both the source pdf and the extracted Excel cells. Each data set in my mountain is two pdf pages and the number of rows are not always consistent in each set. There are many formatting challenges that occur with the PDF to Excel conversion. I can work around them, but at minimum (see 'at minimum' tab in the Excel file) if I can learn how to take a single cell/column that contains many non-separated rows and separate the data to their individual cells, that would be most helpful! Thanks in advance for your consideration/solutions.


    forum.ozgrid.com/index.php?attachment/56575/


    forum.ozgrid.com/index.php?attachment/56576/

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    I know it may be confusing at least for me to try to convey the issue with sentences, feel free to ask questions. I have updated the sample spreadsheet with a tab 'ultimate outcome' as this is how the two page data sets will eventually be assembled. I'm not as concerned with the column/row titles as they will be the same for the thousands of data sets I have. Essentially any short cut for even a small part of this will shave hours/days off the conversion process and reduce the potential for keystroke error from entering by hand. Thanks again!


    forum.ozgrid.com/index.php?attachment/56579/

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Check your pm. Working on helping you, but you need to look at one thing first.

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Will the Pdf format change, or is it ok to plan on this format being constant? (even with multiple pairs of pdf sheets)

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    It will be fairly consistent but not completely, let me expand:
    Every two pages of the larger pdf should be stitched side by side, the column headings on each of the two pages will remain the same throughout the full document. Since you can see the blacked out parts (essentially not sensitive info), for every municipality there will be two years of line data. What will change is the number of rows contained in each of the 2 page sets, plus occasionally there will be a new county added and a summary line once the county section is finished, it's these slight row variations which has made many of my other efforts useless. does this help?

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    I'm about to leave work to head home. It will be later tonight before I can revisit this, but I will work more on it this evening.

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Thank you for your effort. I didn't want to bombard the original post with too much info/files and/or requesting without demonstrating trying but since your looking into a more comprehensive solution I have attached another file for consideration.


    2007_sect_7-edit-II - this is the format which I will eventually be putting in the data. Essentially Eliminating the county summation lines, I can run totals and apply the county designation along with other analysis with formulas elsewhere. forum.ozgrid.com/index.php?attachment/56581/


    for those also perusing, the original document is : http://www.njsp.org/info/ucr2007/pdf/2007-sect-7.pdf


    I have also explored other pdf to excel converters... the best one so far does the odd pages perfectly but the even pages have the merged cell syndrome. I didn't follow your previous merged cell solution, perhaps I need more practice/exploration/explanation.


    I will be doing this for all previous years to 2007

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Tried for a while to get around the importing from pdf, not going very well as of yet.


    While dealing with your example though, had a little progress.


    Ok... few steps here, but best I can do as of right now...


    Lets look at cell F3 in your example.
    Right now the soft line breaks are causing excel to not seperate the numbers.
    If you highlight that cell, and select replace, then under the replace what section hold alt down, then press 010 on the keypad.
    Under the replace with section, enter a single space.


    That will add a space between every item in that cell, by replacing every soft line break with one.


    Next, select the cell and go to the data tab.
    Select text to columns.
    Delimited
    Only select spaces when given the option,
    and leave general in the next section.


    Once that is done, you'll have all your information in one row.


    From there, select the entire row, and Copy it.
    Right click, and select paste special. Then select transpose.


    That will paste the information in a vertical format for you, which will allow you to select the columns you need and paste the information where you need it.


    It will work with any cell that has multiple information in it, like F3 does in your example.


    Still quite a few steps to get where you want, sorry for that. Will try to work on this as I can, but at least as it stands now you can start sorting the data. Also been working on basic macros to to text to columns and then transpose, but not sure if that would be a route you'd like to go.

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Something like this?


    This is for the first uploaded file.


  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Wow. Very nice. Started going through that code looking at what each bit does, and began truly realizing the scope of what I don't know.
    Very nice indeed Jindon.

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Yes indeed, both answers are great! the find/replace space transposing steps I'm putting in my back pocket as I've come across this scenario in the past. But Jindon's code works incredibly well for my particular situation. I'm a vba nubie so this will escalate my knowledge by leaps and bounds. Thanks for all the efforts!


    I'm going to leave the thread open temporarily while I explore the code incase I have further questions as I apply/adapt it to the other pages.

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    I was experimenting with jindon’s code and I believe I’ve come up with a process that will shave an immense number of hours off my efforts to the end goal. Keep in mind I’m an extreme nubie to vba code. I don’t understand how all of the processes work but I studied it long enough and think I was able to correctly assign new variables and change cell references and variable arrays. Essentially I created tab with the vba code such that I can copy and paste selected cells of the merged cell data from the original pdf conversion such that it spits out the results I want in the order I want. I couldn't attach the new file due to the size limitations however I have pasted the modified code below - Thanks for all the help!



  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Edit your post, select your code, then click the # key above this area... that will show your code properly.
    Thanks.

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Thanks for the # key visual fix. I also edited variables e, d, c, b, as I found some errors when applying the macro. I guess my next question would be: This code spits out to a new sheet, 25 columns, my question is how can I adjust the code such that I get two blank columns between column 11 & 12 or rather variables q & p?

  • Re: spliting single cell data w/ spaces btwn the data into new cells / pdf to Excel c


    Good.
    It can be tidy up, but no problem if it runs as you wanted...


    Quote from Nerpilis;686045

    TI guess my next question would be: This code spits out to a new sheet, 25 columns, my question is how can I adjust the code such that I get two blank columns between column 11 & 12 or rather variables q & p?


    1)
    Change

    Code
    Redim a(1 To .Execute(myYears).Count, 1 To 25)


    to

    Code
    Redim a(1 To .Execute(myYears).Count, 1 To 27)


    2) Change


    to

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!