In a time series, how do I add a counter and cumulative sum that resets every 30 minutes?
In a time series, how do I add a counter and cumulative sum that resets every 30 minutes?
I keep forgetting about R’s seq_len()
function. I’m writing about it here, so when I inevitably google for it I can find it here. Let’s see how it works.
We will be using data.table for this example.
library(data.table)
library(lubridate)
#---- fake data ----
times <- seq.POSIXt(
from = as.POSIXct("2021-03-12 08:00:00"),
to = as.POSIXct("2021-03-12 12:00:00"),
by = "sec"
)
accounts <- seq(1111, 9999, by = 1000)
amounts <- seq(1, 5000, by = 1)
set.seed(865)
dt <- data.table(
datetime = sample(times, 1500, replace = TRUE),
account = sample(accounts, 1500, replace = TRUE),
amount = sample(amounts, 1500, replace = TRUE)
)
# order by account and datetime
setkeyv(dt, c("account", "datetime"))
After making our fake time series data, we can uses seq_len(.N)
along with the by
argument in data.table to add a counter that resets for each 30-minute group. Notice the transformation of a variable within the by =
of the data.table.
Cumulative sum by 30-minute periods
#---- counts and sums ----
# count and sum by half hour using seq_len() to show
# each observation's position within its 30-minute group
dt[,
`:=`(n_by_30_mins = seq_len(.N), cumulative_amount = cumsum(amount)) ,
by = .(account, floor_date(datetime, "30 minutes"))]
head(dt, 50)
datetime account amount n_by_30_mins cumulative_amount
1: 2021-03-12 08:00:00 1111 3798 1 3798
2: 2021-03-12 08:02:06 1111 303 2 4101
3: 2021-03-12 08:03:55 1111 1620 3 5721
4: 2021-03-12 08:04:10 1111 4926 4 10647
5: 2021-03-12 08:04:24 1111 3112 5 13759
6: 2021-03-12 08:07:02 1111 3515 6 17274
7: 2021-03-12 08:08:06 1111 1767 7 19041
8: 2021-03-12 08:09:02 1111 1411 8 20452
9: 2021-03-12 08:12:12 1111 1082 9 21534
10: 2021-03-12 08:14:26 1111 2164 10 23698
11: 2021-03-12 08:16:49 1111 3050 11 26748
12: 2021-03-12 08:19:03 1111 4329 12 31077
13: 2021-03-12 08:21:25 1111 4462 13 35539
14: 2021-03-12 08:23:53 1111 969 14 36508
15: 2021-03-12 08:25:41 1111 4287 15 40795
16: 2021-03-12 08:31:25 1111 2323 1 2323
17: 2021-03-12 08:33:29 1111 3629 2 5952
18: 2021-03-12 08:34:25 1111 2697 3 8649
19: 2021-03-12 08:34:57 1111 2878 4 11527
20: 2021-03-12 08:35:41 1111 2921 5 14448
21: 2021-03-12 08:36:12 1111 3286 6 17734
22: 2021-03-12 08:38:18 1111 2393 7 20127
23: 2021-03-12 08:40:44 1111 2245 8 22372
24: 2021-03-12 08:43:19 1111 3969 9 26341
25: 2021-03-12 08:44:26 1111 3517 10 29858
26: 2021-03-12 08:44:28 1111 291 11 30149
27: 2021-03-12 08:45:12 1111 1545 12 31694
28: 2021-03-12 08:45:30 1111 2679 13 34373
29: 2021-03-12 08:47:30 1111 114 14 34487
30: 2021-03-12 08:48:18 1111 958 15 35445
31: 2021-03-12 08:48:45 1111 4566 16 40011
32: 2021-03-12 08:50:03 1111 3496 17 43507
33: 2021-03-12 08:50:55 1111 1823 18 45330
34: 2021-03-12 08:51:06 1111 4881 19 50211
35: 2021-03-12 08:54:06 1111 35 20 50246
36: 2021-03-12 08:55:10 1111 484 21 50730
37: 2021-03-12 08:57:55 1111 2192 22 52922
38: 2021-03-12 09:01:45 1111 461 1 461
39: 2021-03-12 09:02:30 1111 4523 2 4984
40: 2021-03-12 09:04:10 1111 2560 3 7544
41: 2021-03-12 09:05:29 1111 1884 4 9428
42: 2021-03-12 09:06:18 1111 4933 5 14361
43: 2021-03-12 09:07:30 1111 100 6 14461
44: 2021-03-12 09:07:52 1111 99 7 14560
45: 2021-03-12 09:10:21 1111 2865 8 17425
46: 2021-03-12 09:13:05 1111 2584 9 20009
47: 2021-03-12 09:14:44 1111 3649 10 23658
48: 2021-03-12 09:18:51 1111 1139 11 24797
49: 2021-03-12 09:19:13 1111 2582 12 27379
50: 2021-03-12 09:23:20 1111 2875 13 30254
datetime account amount n_by_30_mins cumulative_amount
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Allen (2021, March 13). jeremydata: Counts and Cumulative Sums by Group. Retrieved from https://jeremydata.com/posts/2021-03-13-counts-and-cumulative-sums-by-group/
BibTeX citation
@misc{allen2021counts, author = {Allen, Jeremy}, title = {jeremydata: Counts and Cumulative Sums by Group}, url = {https://jeremydata.com/posts/2021-03-13-counts-and-cumulative-sums-by-group/}, year = {2021} }