I've been trying to solve this issue for a couple of hours now, from either the MongoDB side or the CSV side. It seems like it will be easier to do it from the CSV side of things.
What I am trying to do is remove duplicates in a column called guid based on another column called time. Each guid appears in multiple rows, and the time values are unique within each guid.
Sample data:
GUID, Time
6, 1
6, 2
6, 3
7, 4
7, 5
7, 6
The output needs to be:
6, 3
7, 6
In other words, I want to eliminate the duplicate guids, keeping only the row with the most recent time for each guid.
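To make the goal concrete, here is a minimal Python sketch of the result I'm after. It makes one pass over the rows and remembers the largest time seen for each guid. The function name `latest_per_guid` is just something made up for illustration, and it assumes the time column holds integers and that the file has a single header row.

```python
import csv
import io

def latest_per_guid(rows):
    """Return {guid: largest time} from an iterable of (guid, time) pairs."""
    latest = {}
    for guid, time in rows:
        t = int(time)  # assumption: time is an integer, so compare as a number
        if guid not in latest or t > latest[guid]:
            latest[guid] = t
    return latest

# The sample data from above, read as if it were a CSV file.
sample = """GUID, Time
6, 1
6, 2
6, 3
7, 4
7, 5
7, 6
"""

reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
next(reader)  # skip the header row
result = latest_per_guid(reader)

for guid, t in result.items():
    print(f"{guid}, {t}")
# prints:
# 6, 3
# 7, 6
```

For a real file, `io.StringIO(sample)` would be replaced by `open(path, newline="")`, where `path` is wherever the export lives.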
I know I can kind of do this by sorting on guid, adding a second sort level on time (greatest first), and then eliminating duplicates. But when I compared the data afterwards, I found that it had actually eliminated the wrong duplicates, and I was left with rows that were older than the most recent.
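The sort-then-dedupe idea can be sketched in Python as below. One possible reason for the wrong rows surviving, and this is only a guess I have not confirmed, is that the time column gets sorted as text rather than as numbers, in which case "9" sorts after "10". The rows here are hypothetical values chosen to show that effect, not my real data, and `dedupe_after_sort` is a made-up name.

```python
def dedupe_after_sort(rows):
    """Sort by guid, then by time descending (numerically), keep the first row per guid."""
    # int() forces a numeric comparison; this assumes time is an integer.
    ordered = sorted(rows, key=lambda r: (r[0], -int(r[1])))
    seen = set()
    kept = []
    for guid, time in ordered:
        if guid not in seen:  # the first row per guid is the one with the largest time
            seen.add(guid)
            kept.append((guid, time))
    return kept

# Hypothetical rows, as strings, the way they come out of a CSV file.
rows = [("6", "9"), ("6", "10"), ("7", "4"), ("7", "6")]

print(dedupe_after_sort(rows))

# Comparing the same values as text picks the wrong "latest" row:
print(max(["9", "10"]))       # text comparison
print(max(["9", "10"], key=int))  # numeric comparison
```

If the times are timestamps rather than plain integers, the `int()` call would need to be swapped for an appropriate parser, but the shape of the approach stays the same.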
I don't mean to seem like I'm putting this work on someone else; I am also trying to figure out how to do this on my own.