Alrighty folks! Today is all about stats!

I want to know which (if any) of my pages are getting hits, and ideally if they're getting hits from not-robots.

I can easily add a middleware that grabs some stats and posts to a queue to be added to a db (so it doesn't slow down the main thread too badly). Question is, what should I keep?

I very much don't want to collect PII, so I can't grab raw IPs. I'd like to know the difference between bots and not-bots, so I'll need the User Agent (or at least, the result of userAgent.Contains("bot")). Obviously I need the URL. I'm not expecting very much traffic, so I can bucket into hours (which also helps with the PII issue).
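
Roughly what I have in mind for the collection side, assuming ASP.NET Core minimal hosting with System.Threading.Channels playing the part of the queue (the HitRecord shape and all the names here are placeholders, nothing final):

using System.Threading.Channels;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// One unbounded channel shared between the middleware (producer)
// and a background writer (consumer).
var hits = Channel.CreateUnbounded<HitRecord>();

app.Use(async (context, next) =>
{
    var userAgent = context.Request.Headers["User-Agent"].ToString();

    // Round "now" down to the start of the containing hour,
    // as seconds since 1970.
    var now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();
    var hourBucket = now - (now % 3600);

    // Placeholder check; as it turns out below, "bot" alone isn't enough.
    var isBot = userAgent.Contains("bot", StringComparison.OrdinalIgnoreCase);

    // TryWrite never blocks on an unbounded channel, so the request
    // isn't held up by the stats bookkeeping.
    hits.Writer.TryWrite(new HitRecord(context.Request.Path.ToString(), hourBucket, isBot));

    await next();
});

app.Run();

// Shape of one queued hit.
public record HitRecord(string Path, long HourBucket, bool IsBot);

(A bounded channel with BoundedChannelFullMode.DropWrite would be the safer choice if traffic ever spiked, since dropping a hit beats queueing without limit.)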

On the database side, that looks something like:

CREATE TABLE HitsByHour (
  Path TEXT,
  Count INTEGER,
  Date INTEGER,  -- seconds since 1970 to the start of the containing hour
  IsBot INTEGER  -- 1 yes, 0 no
);
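
The other end of the queue could then be a background loop that bumps the hourly buckets; here's a sketch using Microsoft.Data.Sqlite (the update-then-insert dance is because the table above has no unique index, and the connection lifetime handling is hand-waved):

using Microsoft.Data.Sqlite;
using System.Threading.Channels;

// Background consumer: drain the channel and bump the hourly buckets.
async Task WriteHitsAsync(ChannelReader<HitRecord> reader, string connectionString)
{
    using var conn = new SqliteConnection(connectionString);
    conn.Open();

    await foreach (var hit in reader.ReadAllAsync())
    {
        // Try to bump an existing bucket first...
        var update = conn.CreateCommand();
        update.CommandText =
            @"UPDATE HitsByHour SET Count = Count + 1
               WHERE Path = $path AND Date = $date AND IsBot = $isBot";
        update.Parameters.AddWithValue("$path", hit.Path);
        update.Parameters.AddWithValue("$date", hit.HourBucket);
        update.Parameters.AddWithValue("$isBot", hit.IsBot ? 1 : 0);

        // ...and only insert a fresh row if there wasn't one yet.
        if (update.ExecuteNonQuery() == 0)
        {
            var insert = conn.CreateCommand();
            insert.CommandText =
                @"INSERT INTO HitsByHour (Path, Count, Date, IsBot)
                  VALUES ($path, 1, $date, $isBot)";
            insert.Parameters.AddWithValue("$path", hit.Path);
            insert.Parameters.AddWithValue("$date", hit.HourBucket);
            insert.Parameters.AddWithValue("$isBot", hit.IsBot ? 1 : 0);
            insert.ExecuteNonQuery();
        }
    }
}

With a unique index on (Path, Date, IsBot) that pair of statements could collapse into a single INSERT ... ON CONFLICT ... DO UPDATE, but a single consumer loop keeps the plain version race-free anyway.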

I think I also want a view like:

CREATE VIEW IF NOT EXISTS
  HitsTotal (Path, IsBot, Count) AS
    SELECT Path, IsBot, SUM(Count)
      FROM HitsByHour
      GROUP BY Path, IsBot;
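
Which should let me answer the original question (which pages are getting hits from not-robots) with something like:

SELECT Path, Count
  FROM HitsTotal
  WHERE IsBot = 0
  ORDER BY Count DESC;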

Did some research ("cat access.log") and it turns out a simple check for "bot" isn't going to cut the mustard.
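
For what it's worth, even a wider (still naive) substring check, with a token list that's just a guess on my part, only catches the crawlers that bother to announce themselves:

using System;
using System.Linq;

static class BotSniff
{
    // Hypothetical token list; plenty of crawlers send a perfectly
    // ordinary-looking User Agent, or none at all.
    private static readonly string[] BotTokens =
        { "bot", "crawler", "spider", "curl", "python-requests" };

    public static bool LooksLikeBot(string userAgent) =>
        string.IsNullOrEmpty(userAgent)
        || BotTokens.Any(t => userAgent.Contains(t, StringComparison.OrdinalIgnoreCase));
}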