Event processing, interval processing in Excel

(And by Excel, I mean MS Excel, Open Office, and Google Docs.)

I was recently working with a large amount of computer-generated event data. I wanted to analyze the data, but was only concerned with events (rows) that occurred within intervals demarcated by certain start and end events.

At the time, I had no answer for this in Excel. I wrote a small computer program that read the file one line at a time and ignored lines that occurred outside the intervals of interest. Recently I came up with a solution for this problem in Excel, so I thought I would share it here.

In this example, I am going to use a highly simplified traffic study as my example. A computer at a traffic light records 2 kinds of events:

sensor events
on or off, indicating whether or not there is a car in the intersection
light events
red, amber, or green, indicating the new light color

Here are some sample data collected by this computer:

seconds event state
0 light green
7 sensor on
8 sensor off
15 sensor on
16 sensor off
25 light amber
30 light red
60 light green
85 light amber
90 light red
92 sensor on
93 sensor off
120 light green
145 light amber
150 light red
180 light green
199 sensor on
200 sensor off
204 sensor on
205 light amber
206 sensor off
210 light red
240 light green
265 light amber
269 sensor on
270 light red
271 sensor off
300 light green

Let’s say we want to find out how many cars drove through a red light–that is, the light was red when the car started driving through the intersection.

First, add a new column. This column will indicate the current state of the light for each event. That’s trivial for each light event, but associating the state of the light with each sensor event is what we’re after. In this column, add the following formula:

Excel and Google Sheets:
=IF(B2="light",C2,D1)

Open Office Spreadsheets:
=IF(B2="light"; C2; D1)

That formula means:

  • IF the current event is a light event
  • THEN set this cell to the current state
  • ELSE set this cell to the most recent light state.

Next, add another column. This column will indicate whether the row represents a driving through a red light. In this column, add the following formula:

Excel and Google Sheets
=IF(B2="sensor", IF(C2="on", IF(D2="red", 1, 0), 0), 0)

Open Office Spreadsheets
=IF(B2="sensor"; IF(C2="on"; IF(D2="red"; 1; 0); 0); 0)

The above is a nested series of if statements:

  • IF the row contains a sensor event AND
  • IF the sensor event is an on event AND
  • IF the current state of the light is red
  • THEN it is a traffic violation
  • ELSE it is not a traffic violation

Copy these formulae to the other rows, via Edit–Fill–Down (Excel and Open Office) or ctrl-d (or cmd-d on Mac). The spreadsheet should now indicate that there was one incident of running a red light, which occurred at second 92.

One thought on “Event processing, interval processing in Excel”

  1. I came up with the above method because I was attempting to provide a technique to users who were comfortable with Excel (or other spreadsheet software).

    For users familiar with command line tools, the following one-liner produces the same result:

    $ sed -n -e '/red/,/green/!d' -e '/on/p' data.file
    92      sensor  on
    

    An explanation of the options and commands to sed:

    • -n suppresses printing each line
    • -e starts a match expression
    • '/red/,/green/!d' deletes all lines that do not occur between lines matching red and ending with green
    • -e starts another match expression
    • '/on/p' prints only lines containing on

Leave a Reply

Your email address will not be published. Required fields are marked *