versioninfo()
Julia Version 1.6.0-beta1.0
Commit b84990e1ac (2021-01-08 12:42 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.0 (ORCJIT, sandybridge)
using DataFramesMeta
panel_data = DataFrame(
    sample_id = [1, 1, 2, 2, 3, 4, 5, 5],
    treatment = ["T", "T", "F", "F", "F", "T", "F", "T"],
    measure   = [2, 3, 1, 1, 4, 3, 4, 5]
)
| Row | sample_id | treatment | measure |
|---|---|---|---|
| | Int64 | String | Int64 |
| 1 | 1 | T | 2 |
| 2 | 1 | T | 3 |
| 3 | 2 | F | 1 |
| 4 | 2 | F | 1 |
| 5 | 3 | F | 4 |
| 6 | 4 | T | 3 |
| 7 | 5 | F | 4 |
| 8 | 5 | T | 5 |
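# Return the measure at the last treated ("T") observation,
# or `missing` if the group was never treated.
function latest(treatment, measure)
    i = findlast(==("T"), treatment)   # index of the last "T", or `nothing`
    isnothing(i) ? missing : measure[i]
end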
latest (generic function with 1 method)
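A quick check of `latest` on plain vectors, before it is used inside the grouped pipeline. The inputs are the treatment and measure values of samples 1, 2, and 5 from `panel_data`; this is only a sanity-check sketch that needs nothing beyond Base Julia.

```julia
# Same definition as in the notebook, repeated so the snippet runs on its own.
function latest(treatment, measure)
    i = findlast(==("T"), treatment)
    isnothing(i) ? missing : measure[i]
end

latest(["T", "T"], [2, 3])   # 3: the last "T" is at index 2
latest(["F", "F"], [1, 1])   # missing: the group was never treated
latest(["F", "T"], [4, 5])   # 5
```

Returning `missing` rather than throwing is what lets the result column hold both integers and missing values (shown as `Int64?` in the output).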
@linq panel_data |>
    groupby(:sample_id) |>
    combine(
        has_treated = any(==("T"), :treatment),
        initial_value = first(:measure),
        latest_value_when_treated = latest(:treatment, :measure)
    )
| Row | sample_id | has_treated | initial_value | latest_value_when_treated |
|---|---|---|---|---|
| | Int64 | Bool | Int64 | Int64? |
| 1 | 1 | 1 | 2 | 3 |
| 2 | 2 | 0 | 1 | missing |
| 3 | 3 | 0 | 4 | missing |
| 4 | 4 | 1 | 3 | 3 |
| 5 | 5 | 1 | 4 | 5 |
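For comparison, the same aggregation can be sketched without the `@linq` macro, using the `source => function => target` pairs that plain DataFrames.jl `combine` accepts. This assumes a DataFrames.jl version that supports this pair syntax; the variable names `grouped` and `result` are introduced here for illustration and are not part of the notebook above.

```julia
using DataFrames

panel_data = DataFrame(
    sample_id = [1, 1, 2, 2, 3, 4, 5, 5],
    treatment = ["T", "T", "F", "F", "F", "T", "F", "T"],
    measure   = [2, 3, 1, 1, 4, 3, 4, 5]
)

# Same helper as in the notebook.
function latest(treatment, measure)
    i = findlast(==("T"), treatment)
    isnothing(i) ? missing : measure[i]
end

grouped = groupby(panel_data, :sample_id)

result = combine(
    grouped,
    # one input column, passed to the function as a vector per group
    :treatment => (t -> any(==("T"), t)) => :has_treated,
    :measure => first => :initial_value,
    # two input columns are passed as two positional arguments
    [:treatment, :measure] => latest => :latest_value_when_treated
)
```

Each pair names its input column(s) explicitly, which makes it clear that `latest` receives the per-group `treatment` and `measure` vectors in that order.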