Selection by Date and Time

  • Help needed!

    I have the access to following type of data, where the first black numbers are dates located in a column refered to as Tm, the red numbers are located in a column refered to as B and the blue numbers are located in a column refered to as A:

    2004-09-13 09:31:23.0 [COLOR="Red"]44.9[/COLOR] [COLOR="Blue"]45.0[/COLOR]
    2004-09-13 09:31:36.0 [COLOR="Red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:31:43.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:31:43.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:31:43.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:31.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:31.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:36.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:53.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:53.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:53.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:53.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:58.0 [COLOR="red"]44.8[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:32:59.0 [COLOR="red"]44.8[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:33:16.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:33:34.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:33:45.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:34:28.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:34:28.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:34:28.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:35:07.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]
    2004-09-13 09:35:46.0 [COLOR="red"]44.9[/COLOR] [COLOR="blue"]45.0[/COLOR]

    where the first column are dates specified in days, hours and seconds.

    I would like create a formula or macro that does the following:

    a.) allows me to set an intervall in time, t, say 15 seconds starting from the first observation
    b.) that searches through column Tm and picks out each observation closest in time to t1, t2, t3 etc.
    c.) picks out the corresponding value in column B and A
    d.) places the t:s and the corresponding B:s and A:s in a separate sheet


    t 1 = 2004-09-13 09:31:23:0 (first observation)
    B = 44,9
    A = 45,0.

    t 2 = t 1 + 15 seconds = 2004-09-13 09:31:38:0
    B = 44,9
    A = 45,0

    where the value of A and B are taken from row 2 because the distance in seconds is smaller from t 2 to Tm in row 2 then to all other nearby observations in Tm:

    |2004-09-13 09:31:38:0 (t 2) – 2004-09-13 09:31:36:0 (row 2)| < |2004-09-13 09:31:38:0 ( t 2) – 2004-09-13 09:31:43:0 (row 3)|

    And so on. The data stretches 6 months on intraday basis. Does anyone have a clue on how to make this sorting more managable? I guess a good start is by transforming the dates to numbers, but such an insight wont carry me far. Any help is truly much appreciated.

    Kind regards


  • Re: Selection by Date and Time

    I'm a little confused about what you're asking here.

    I understand that you want to look at the first three time values, and then compare them to each other to determine which values among the three are the closest--right?

    In the example you gave, the first two values are closer than the second and third values.

    But where do we go from there? Are we then comparing the second, third, and fourth values, as the first choice might be inclusive to itself?

    Or should we be comparing the third, fourth and fifth values, since we've already used the first and second data values?

    Still, perhaps we should be examining the fourth, fifth, and sixth values, since we've already done a comparison with the first three values?

    If we're simply comparing every three rows, you could put a formula in every three rows. Say Cell A1 is the label "TM", Cell B1 is "B", and Cell C1 is "A"--and then you'd have your data immediately beneath that (in Cells A2, B2, and C2), in Cell D2, you could put a formula like this:
    =IF(A4-A3>A3-A2,"T1 and T2","T2 and T3")

    That would then compare three values at once. But I'm really not sure where to go next?

    What if the first and third values were closest, what to do with the second value? Do we then compare it to the fourth and fifth values?

    Also, do you disregard the end-product if there are only two rows of data left?

    What about one row at the end?

    Does T1, T2, T3, etc. always refer to values down the list (i.e., the fourth row down is T4)?

    What if the time-values become the same, or would that never occur?

  • Re: Selection by Date and Time

    Hi MrrkFrl,

    Excuse me for my very unpolitely late answer. I hope you are still there :).

    I'll try to clarify, it was quite confusing I must admit:

    Firstly we have three columns, one column containing dates and two columns containing numbers.

    The dates are financial data presented on a high frequency basis, that means it is presented chronologically regarding to the day, hour and second when the actual trades have been done. Normal financial data is presented in fixed time intervalls, say days, or hours. So you might see the quote at the ending of the specific day or hour of your interest. The high frequency data on the other hand shows the real transactions, that is all transactions, with no regard to any time intervall, and the actual times for those transactions. Plainly speaking, high frequency data shows all transactions that have been made. Se column Tm.

    I would like to search through the columns on the following basis:

    • Pick out the first two observations in the sheet, these are 44.9 and 45.0 and where observed at 2004-09-13 09:31:23.0 (called t1)

    • Proceed picking out observations in the numbers columns on the following basis:

    o The next numbers that are closest in time to t1 + 15 seconds.
    The problem is that there might be no observation that is EXACTLY t1 + 15 seconds, so we must pick the actual numbers that are closest in time to t1 + 15 seconds.

    o Continue this process for t2, t3,, untill all the data has been searched through.

    o The time intervall we want to create is 15 seconds, so t2 = t1 + 15 seconds, t3 = t2 + 15 seconds etc. The whole point of this is to present the highfrequency based on a 15 seconds intervall, that is, make it intervall based data, not high frequency based.

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!