Daru provides a host of functions for analyzing and visualizing time series data. This notebook goes over a few of them with examples.
For details on the statistical analysis functions offered by daru, see this blog post.
require 'distribution'
require 'daru'
require 'gnuplotrb'
true
rng = Distribution::Normal.rng
index = Daru::DateTimeIndex.date_range(:start => '2012-4-2', :periods => 1000, :freq => 'D')
vector = Daru::Vector.new(1000.times.map {rng.call}, index: index)
Daru::Vector:17602640 size: 1000 | |
---|---|
nil | |
2012-04-02T00:00:00+00:00 | 1.8929310014208862 |
2012-04-03T00:00:00+00:00 | -1.7877272477227173 |
2012-04-04T00:00:00+00:00 | 0.13713043059789104 |
2012-04-05T00:00:00+00:00 | 0.08248983987417105 |
2012-04-06T00:00:00+00:00 | 0.5016724046503049 |
2012-04-07T00:00:00+00:00 | 1.4755023856805087 |
2012-04-08T00:00:00+00:00 | 1.3250840528296892 |
2012-04-09T00:00:00+00:00 | -1.4449599562106255 |
2012-04-10T00:00:00+00:00 | 0.8618890299729314 |
2012-04-11T00:00:00+00:00 | 1.1155691054056835 |
2012-04-12T00:00:00+00:00 | 0.5335788810730238 |
2012-04-13T00:00:00+00:00 | -0.7589896655108529 |
2012-04-14T00:00:00+00:00 | 0.043391207814487326 |
2012-04-15T00:00:00+00:00 | 0.3163352310810718 |
2012-04-16T00:00:00+00:00 | -1.397212944702625 |
2012-04-17T00:00:00+00:00 | 0.11548292990715667 |
2012-04-18T00:00:00+00:00 | -1.490174232857472 |
2012-04-19T00:00:00+00:00 | -0.9393382026618173 |
2012-04-20T00:00:00+00:00 | -1.4113215880007444 |
2012-04-21T00:00:00+00:00 | 0.6958532971395335 |
2012-04-22T00:00:00+00:00 | -0.8904903017202142 |
2012-04-23T00:00:00+00:00 | -0.2837001343416957 |
2012-04-24T00:00:00+00:00 | -1.917197520319854 |
2012-04-25T00:00:00+00:00 | 0.7012337714486957 |
2012-04-26T00:00:00+00:00 | 1.1666246183803257 |
2012-04-27T00:00:00+00:00 | 0.29611920958332577 |
2012-04-28T00:00:00+00:00 | 0.8804928443175081 |
2012-04-29T00:00:00+00:00 | -1.6403348075359634 |
2012-04-30T00:00:00+00:00 | -0.31762253519595485 |
2012-05-01T00:00:00+00:00 | -0.12294936853487166 |
2012-05-02T00:00:00+00:00 | 2.4227489569733893 |
2012-05-03T00:00:00+00:00 | 0.11947772841630783 |
... | ... |
2014-12-27T00:00:00+00:00 | 0.18341906625828117 |
vector = vector.cumsum
Daru::Vector:28616380 size: 1000 | |
---|---|
nil | |
2012-04-02T00:00:00+00:00 | 1.8929310014208862 |
2012-04-03T00:00:00+00:00 | 0.10520375369816892 |
2012-04-04T00:00:00+00:00 | 0.24233418429605996 |
2012-04-05T00:00:00+00:00 | 0.32482402417023104 |
2012-04-06T00:00:00+00:00 | 0.826496428820536 |
2012-04-07T00:00:00+00:00 | 2.3019988145010446 |
2012-04-08T00:00:00+00:00 | 3.627082867330734 |
2012-04-09T00:00:00+00:00 | 2.182122911120109 |
2012-04-10T00:00:00+00:00 | 3.04401194109304 |
2012-04-11T00:00:00+00:00 | 4.159581046498724 |
2012-04-12T00:00:00+00:00 | 4.693159927571748 |
2012-04-13T00:00:00+00:00 | 3.934170262060895 |
2012-04-14T00:00:00+00:00 | 3.9775614698753823 |
2012-04-15T00:00:00+00:00 | 4.293896700956454 |
2012-04-16T00:00:00+00:00 | 2.896683756253829 |
2012-04-17T00:00:00+00:00 | 3.0121666861609855 |
2012-04-18T00:00:00+00:00 | 1.5219924533035134 |
2012-04-19T00:00:00+00:00 | 0.5826542506416961 |
2012-04-20T00:00:00+00:00 | -0.8286673373590483 |
2012-04-21T00:00:00+00:00 | -0.13281404021951482 |
2012-04-22T00:00:00+00:00 | -1.023304341939729 |
2012-04-23T00:00:00+00:00 | -1.3070044762814246 |
2012-04-24T00:00:00+00:00 | -3.2242019966012787 |
2012-04-25T00:00:00+00:00 | -2.522968225152583 |
2012-04-26T00:00:00+00:00 | -1.3563436067722572 |
2012-04-27T00:00:00+00:00 | -1.0602243971889314 |
2012-04-28T00:00:00+00:00 | -0.17973155287142328 |
2012-04-29T00:00:00+00:00 | -1.8200663604073868 |
2012-04-30T00:00:00+00:00 | -2.1376888956033415 |
2012-05-01T00:00:00+00:00 | -2.2606382641382132 |
2012-05-02T00:00:00+00:00 | 0.1621106928351761 |
2012-05-03T00:00:00+00:00 | 0.2815884212514839 |
... | ... |
2014-12-27T00:00:00+00:00 | -24.80984292989138 |
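Taking the cumulative sum turns the independent normal draws into a random walk: each entry is the sum of all draws up to and including that date. For instance, the second value above (0.1052...) is the sum of the first two draws, 1.8929... and -1.7877.... The following is a minimal plain-Ruby sketch of that idea, not daru's implementation; `cumulative_sum` is a hypothetical helper name used only for illustration.

```ruby
# Plain-Ruby sketch of a cumulative sum (illustrative only, not daru's code).
# `cumulative_sum` is a hypothetical helper, not part of the daru API.
def cumulative_sum(values)
  total = 0
  # `total += v` evaluates to the updated running total, which map collects.
  values.map { |v| total += v }
end

steps = [1, -2, 3, 0.5]
p cumulative_sum(steps) # => [1, -1, 2, 2.5]
```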
Daru::Vector provides many functions for statistical analysis of time series data. See this blog post for a comprehensive overview of the statistics functions available on Daru::Vector.
For example, you can calculate the rolling mean of a Vector by calling #rolling_mean with the lookback length (the number of most recent observations to average over) as the argument:
rolling = vector.rolling_mean 60
rolling.tail
Daru::Vector:29080000 size: 10 | |
---|---|
nil | |
2014-12-18T00:00:00+00:00 | -19.821585248153262 |
2014-12-19T00:00:00+00:00 | -19.808139883503745 |
2014-12-20T00:00:00+00:00 | -19.781216028083545 |
2014-12-21T00:00:00+00:00 | -19.73020939331389 |
2014-12-22T00:00:00+00:00 | -19.755920206890632 |
2014-12-23T00:00:00+00:00 | -19.766270574147697 |
2014-12-24T00:00:00+00:00 | -19.785619794017194 |
2014-12-25T00:00:00+00:00 | -19.795712631164005 |
2014-12-26T00:00:00+00:00 | -19.837207021104312 |
2014-12-27T00:00:00+00:00 | -19.889931452290522 |
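To make the lookback length concrete, here is a rough plain-Ruby sketch of what a rolling mean computes: each output is the average of the current value and the preceding `window - 1` values. It assumes that positions without a full window are left as nil, which is my understanding of how daru pads the start of the result but is not confirmed by this notebook. `rolling_mean` below is a standalone hypothetical helper, not the Daru::Vector method.

```ruby
# Plain-Ruby sketch of a rolling mean (illustrative only, not daru's code).
# Assumption: positions lacking a full window are filled with nil.
def rolling_mean(values, window)
  raise ArgumentError, 'window must be positive' unless window.positive?

  # Leading positions that cannot see `window` values yet.
  padding = [nil] * [window - 1, values.size].min
  # each_cons yields every run of `window` consecutive values.
  means = values.each_cons(window).map { |slice| slice.sum / window.to_f }
  padding + means
end

p rolling_mean([1, 2, 3, 4, 5], 3) # => [nil, nil, 2.0, 3.0, 4.0]
```

With a window of 60 on the 1000-point vector above, this scheme would produce 59 leading nil entries followed by 941 averages.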
Using the GnuplotRB gem, you can plot the vector and its rolling mean directly as line plots on the same graph:
GnuplotRB::Plot.new([vector, with: 'lines', title: 'Vector'], [rolling, with: 'lines', title: 'Rolling Mean'])
df = Daru::DataFrame.new({
  a: 1000.times.map { rng.call },
  b: 1000.times.map { rng.call },
  c: 1000.times.map { rng.call }
}, index: index)
Daru::DataFrame:18785760 rows: 1000 cols: 3 | |||
---|---|---|---|
a | b | c | |
2012-04-02T00:00:00+00:00 | -0.6947536205170924 | 2.047322364819309 | -0.8312388511803154 |
2012-04-03T00:00:00+00:00 | 0.4288182884252429 | -0.16866673109830102 | 1.8619871129533594 |
2012-04-04T00:00:00+00:00 | 1.1122260305119145 | 2.401373414519374 | -0.22086231994040165 |
2012-04-05T00:00:00+00:00 | -0.5491357553143638 | 1.0593306090541381 | -1.2358490555536528 |
2012-04-06T00:00:00+00:00 | -0.352435933142354 | 1.036125369825205 | 2.051707999011653 |
2012-04-07T00:00:00+00:00 | -1.5459343423326082 | 1.4145046975407094 | 0.479501909302452 |
2012-04-08T00:00:00+00:00 | -0.9126814651497038 | 0.08997451440228435 | -0.33244467366719316 |
2012-04-09T00:00:00+00:00 | 1.0942284409025929 | -0.5914256631603069 | -0.10965690286961519 |
2012-04-10T00:00:00+00:00 | -1.189599766618536 | -1.6504362069360503 | -1.7834684774901928 |
2012-04-11T00:00:00+00:00 | 0.2555596824456879 | 0.09444524135559265 | -1.573776911863845 |
2012-04-12T00:00:00+00:00 | 0.08915341016926566 | -0.26820372617029165 | -0.9867829661340854 |
2012-04-13T00:00:00+00:00 | 0.06877942837600515 | 0.8308696093317287 | 0.9932475109122552 |
2012-04-14T00:00:00+00:00 | 0.5812462684559282 | 0.828455475213921 | 1.794974039065598 |
2012-04-15T00:00:00+00:00 | 0.5717653544338255 | 0.7968497134039435 | -0.2137281706627073 |
2012-04-16T00:00:00+00:00 | -0.1835472698237467 | 0.11375353633297033 | 1.3995365019881125 |
2012-04-17T00:00:00+00:00 | -1.803631620231082 | 1.1202312653723845 | -1.8772466220257593 |
2012-04-18T00:00:00+00:00 | -0.2394511541502766 | -0.35929726781506643 | 1.2625165836476817 |
2012-04-19T00:00:00+00:00 | 0.9449696065324419 | -1.2238915741892322 | -0.3445182971483625 |
2012-04-20T00:00:00+00:00 | -2.6156262185361383 | 0.8665968401408657 | 0.41715577129962633 |
2012-04-21T00:00:00+00:00 | 0.2759161208244279 | -0.02479736654918991 | -1.0281218944966948 |
2012-04-22T00:00:00+00:00 | 2.2760135389649654 | 0.4279361038038636 | -0.06980266563482719 |
2012-04-23T00:00:00+00:00 | -0.12478400655401789 | 0.19299065428255166 | 0.672079999341098 |
2012-04-24T00:00:00+00:00 | -1.0636558539565024 | 0.8628992768191084 | 0.08168988302828417 |
2012-04-25T00:00:00+00:00 | 1.1740110448821357 | -0.39046297065410035 | 0.8258712835867261 |
2012-04-26T00:00:00+00:00 | 0.9265831360275995 | -0.07846575584946325 | -0.18251048994431143 |
2012-04-27T00:00:00+00:00 | 0.9879069917975268 | -1.532686297548485 | -1.3414889817000768 |
2012-04-28T00:00:00+00:00 | -0.00995997219489966 | 1.856826171470321 | 0.01995383034537257 |
2012-04-29T00:00:00+00:00 | -0.7934548909844359 | 1.2440565724873829 | 1.0453568260856172 |
2012-04-30T00:00:00+00:00 | 0.3819023421725202 | 0.3731272622353217 | 0.10998010272644088 |
2012-05-01T00:00:00+00:00 | 0.7911424676513545 | -0.8697434179710826 | 0.4474602612770729 |
2012-05-02T00:00:00+00:00 | 1.8197435137607467 | 0.00131620355975935 | 0.8575841779364537 |
2012-05-03T00:00:00+00:00 | -0.38642762200568864 | 0.5458009874522263 | 0.14343686573268757 |
... | ... | ... | ... |
2014-12-27T00:00:00+00:00 | -1.3722456421067706 | 1.2428443635021815 | 0.3617790549036807 |
df = df.cumsum
Daru::DataFrame:25605060 rows: 1000 cols: 3 | |||
---|---|---|---|
a | b | c | |
2012-04-02T00:00:00+00:00 | -0.6947536205170924 | 2.047322364819309 | -0.8312388511803154 |
2012-04-03T00:00:00+00:00 | -0.26593533209184955 | 1.8786556337210079 | 1.0307482617730441 |
2012-04-04T00:00:00+00:00 | 0.8462906984200649 | 4.280029048240381 | 0.8098859418326425 |
2012-04-05T00:00:00+00:00 | 0.2971549431057011 | 5.339359657294519 | -0.42596311372101026 |
2012-04-06T00:00:00+00:00 | -0.05528099003665288 | 6.3754850271197245 | 1.6257448852906426 |
2012-04-07T00:00:00+00:00 | -1.601215332369261 | 7.789989724660434 | 2.1052467945930946 |
2012-04-08T00:00:00+00:00 | -2.513896797518965 | 7.879964239062718 | 1.7728021209259015 |
2012-04-09T00:00:00+00:00 | -1.4196683566163721 | 7.288538575902411 | 1.6631452180562862 |
2012-04-10T00:00:00+00:00 | -2.609268123234908 | 5.6381023689663605 | -0.12032325943390654 |
2012-04-11T00:00:00+00:00 | -2.3537084407892204 | 5.732547610321953 | -1.6941001712977515 |
2012-04-12T00:00:00+00:00 | -2.2645550306199547 | 5.464343884151662 | -2.680883137431837 |
2012-04-13T00:00:00+00:00 | -2.1957756022439496 | 6.2952134934833905 | -1.687635626519582 |
2012-04-14T00:00:00+00:00 | -1.6145293337880213 | 7.123668968697311 | 0.10733841254601595 |
2012-04-15T00:00:00+00:00 | -1.0427639793541958 | 7.920518682101255 | -0.10638975811669135 |
2012-04-16T00:00:00+00:00 | -1.2263112491779427 | 8.034272218434225 | 1.2931467438714213 |
2012-04-17T00:00:00+00:00 | -3.029942869409025 | 9.154503483806609 | -0.5840998781543381 |
2012-04-18T00:00:00+00:00 | -3.2693940235593013 | 8.795206215991543 | 0.6784167054933437 |
2012-04-19T00:00:00+00:00 | -2.3244244170268593 | 7.571314641802311 | 0.33389840834498113 |
2012-04-20T00:00:00+00:00 | -4.940050635562997 | 8.437911481943177 | 0.7510541796446075 |
2012-04-21T00:00:00+00:00 | -4.664134514738569 | 8.413114115393986 | -0.27706771485208725 |
2012-04-22T00:00:00+00:00 | -2.3881209757736035 | 8.84105021919785 | -0.34687038048691443 |
2012-04-23T00:00:00+00:00 | -2.5129049823276213 | 9.034040873480402 | 0.32520961885418354 |
2012-04-24T00:00:00+00:00 | -3.5765608362841235 | 9.89694015029951 | 0.4068995018824677 |
2012-04-25T00:00:00+00:00 | -2.402549791401988 | 9.50647717964541 | 1.2327707854691938 |
2012-04-26T00:00:00+00:00 | -1.4759666553743886 | 9.428011423795947 | 1.0502602955248823 |
2012-04-27T00:00:00+00:00 | -0.4880596635768618 | 7.895325126247462 | -0.2912286861751945 |
2012-04-28T00:00:00+00:00 | -0.49801963577176145 | 9.752151297717782 | -0.2712748558298219 |
2012-04-29T00:00:00+00:00 | -1.2914745267561973 | 10.996207870205165 | 0.7740819702557953 |
2012-04-30T00:00:00+00:00 | -0.9095721845836771 | 11.369335132440487 | 0.8840620729822362 |
2012-05-01T00:00:00+00:00 | -0.11842971693232252 | 10.499591714469403 | 1.331522334259309 |
2012-05-02T00:00:00+00:00 | 1.7013137968284242 | 10.500907918029162 | 2.189106512195763 |
2012-05-03T00:00:00+00:00 | 1.3148861748227356 | 11.046708905481388 | 2.3325433779284506 |
... | ... | ... | ... |
2014-12-27T00:00:00+00:00 | 4.228210982672566 | -7.571943878928588 | 52.135586977188005 |
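# Rolling sum over a 60-observation window, computed for each column
rs = df.rolling_sum(60)
plots = []
# Build one line plot per column, titled with the column name
rs.each_vector_with_index do |vec, name|
  plots << GnuplotRB::Plot.new([vec, with: 'lines', title: name.to_s])
end
# Stack the three plots vertically in a single figure
GnuplotRB::Multiplot.new(*plots, layout: [3,1], title: 'Rolling sums')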
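The column-wise rolling sum used above can be sketched in plain Ruby with a sliding window that adds the newest value and drops the one that just left the window, so each column is processed in a single pass. This is an illustration under stated assumptions, not daru's implementation: `rolling_sum` here is a hypothetical standalone helper, the columns are a plain Hash of Arrays rather than a Daru::DataFrame, and positions without a full window are assumed to be nil.

```ruby
# Plain-Ruby sketch of a per-column rolling sum (illustrative only).
# Assumption: positions lacking a full window are filled with nil.
def rolling_sum(values, window)
  running = 0.0
  values.each_with_index.map do |v, i|
    running += v                                # newest value enters the window
    running -= values[i - window] if i >= window # oldest value leaves it
    i >= window - 1 ? running : nil
  end
end

# A Hash of Arrays stands in for the DataFrame's columns.
columns = { a: [1, 2, 3, 4], b: [10, 20, 30, 40] }
rolled = columns.transform_values { |col| rolling_sum(col, 2) }
p rolled[:a] # => [nil, 3.0, 5.0, 7.0]
p rolled[:b] # => [nil, 30.0, 50.0, 70.0]
```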