Daru provides a range of functions for analyzing and visualizing time series data. This notebook walks through a few of them with examples.
For details on the statistical analysis functions offered by daru, see this blog post.
require 'distribution'
require 'daru'
require 'gnuplotrb'
true
rng = Distribution::Normal.rng
index = Daru::DateTimeIndex.date_range(:start => '2012-4-2', :periods => 1000, :freq => 'D')
vector = Daru::Vector.new(1000.times.map {rng.call}, index: index)
Daru::Vector:89556300 size: 1000 | |
---|---|
nil | |
2012-04-02T00:00:00+00:00 | 0.9679512581213839 |
2012-04-03T00:00:00+00:00 | 0.022459748645852044 |
2012-04-04T00:00:00+00:00 | -0.6274818246927284 |
2012-04-05T00:00:00+00:00 | -0.3967321369721622 |
2012-04-06T00:00:00+00:00 | -0.3640815613462954 |
2012-04-07T00:00:00+00:00 | -0.2655176999615409 |
2012-04-08T00:00:00+00:00 | 0.45448990105777315 |
2012-04-09T00:00:00+00:00 | -1.455655143209679 |
2012-04-10T00:00:00+00:00 | -0.09323505444387158 |
2012-04-11T00:00:00+00:00 | -0.6752958700462365 |
2012-04-12T00:00:00+00:00 | 0.6619095714438166 |
2012-04-13T00:00:00+00:00 | -0.19145774779908523 |
2012-04-14T00:00:00+00:00 | -0.6263870072028367 |
2012-04-15T00:00:00+00:00 | 0.5536198902207046 |
2012-04-16T00:00:00+00:00 | 1.196208119637418 |
2012-04-17T00:00:00+00:00 | 1.628417216135157 |
2012-04-18T00:00:00+00:00 | -1.0560283956421732 |
2012-04-19T00:00:00+00:00 | -0.02865231840955376 |
2012-04-20T00:00:00+00:00 | 1.1618475815249387 |
2012-04-21T00:00:00+00:00 | -0.20794226211471037 |
2012-04-22T00:00:00+00:00 | 0.9184260477372104 |
2012-04-23T00:00:00+00:00 | 2.0659449521627065 |
2012-04-24T00:00:00+00:00 | 1.1790764197995465 |
2012-04-25T00:00:00+00:00 | -1.3483676256628216 |
2012-04-26T00:00:00+00:00 | 0.35278495620537664 |
2012-04-27T00:00:00+00:00 | -0.5574390609364652 |
2012-04-28T00:00:00+00:00 | 0.898034418947359 |
2012-04-29T00:00:00+00:00 | 1.9099858187147065 |
2012-04-30T00:00:00+00:00 | 0.07235717115859852 |
2012-05-01T00:00:00+00:00 | 1.290704778668627 |
2012-05-02T00:00:00+00:00 | 0.37114466610068964 |
2012-05-03T00:00:00+00:00 | -0.11335398274711217 |
... | ... |
2014-12-27T00:00:00+00:00 | 0.3143039329138767 |
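To make the index above more concrete, here is a minimal sketch of a daily date range built with Ruby's standard `date` library only. It is not how Daru::DateTimeIndex is implemented; the names `start` and `dates` are illustrative. It simply shows why 1000 daily periods starting on 2012-04-02 end on 2014-12-27, the last label in the output above.

```ruby
require 'date'

# Build 1000 consecutive daily dates, mirroring :periods => 1000, :freq => 'D'.
start = Date.new(2012, 4, 2)
dates = Array.new(1000) { |i| start + i } # Date + Integer advances by whole days

puts dates.first # 2012-04-02
puts dates.last  # 2014-12-27
```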
vector = vector.cumsum
Daru::Vector:83472220 size: 1000 | |
---|---|
nil | |
2012-04-02T00:00:00+00:00 | 0.9679512581213839 |
2012-04-03T00:00:00+00:00 | 0.990411006767236 |
2012-04-04T00:00:00+00:00 | 0.36292918207450764 |
2012-04-05T00:00:00+00:00 | -0.03380295489765456 |
2012-04-06T00:00:00+00:00 | -0.39788451624394994 |
2012-04-07T00:00:00+00:00 | -0.6634022162054909 |
2012-04-08T00:00:00+00:00 | -0.20891231514771774 |
2012-04-09T00:00:00+00:00 | -1.6645674583573966 |
2012-04-10T00:00:00+00:00 | -1.7578025128012682 |
2012-04-11T00:00:00+00:00 | -2.4330983828475046 |
2012-04-12T00:00:00+00:00 | -1.771188811403688 |
2012-04-13T00:00:00+00:00 | -1.9626465592027733 |
2012-04-14T00:00:00+00:00 | -2.58903356640561 |
2012-04-15T00:00:00+00:00 | -2.0354136761849055 |
2012-04-16T00:00:00+00:00 | -0.8392055565474876 |
2012-04-17T00:00:00+00:00 | 0.7892116595876695 |
2012-04-18T00:00:00+00:00 | -0.26681673605450373 |
2012-04-19T00:00:00+00:00 | -0.29546905446405747 |
2012-04-20T00:00:00+00:00 | 0.8663785270608813 |
2012-04-21T00:00:00+00:00 | 0.6584362649461709 |
2012-04-22T00:00:00+00:00 | 1.5768623126833812 |
2012-04-23T00:00:00+00:00 | 3.6428072648460876 |
2012-04-24T00:00:00+00:00 | 4.821883684645634 |
2012-04-25T00:00:00+00:00 | 3.4735160589828125 |
2012-04-26T00:00:00+00:00 | 3.8263010151881893 |
2012-04-27T00:00:00+00:00 | 3.268861954251724 |
2012-04-28T00:00:00+00:00 | 4.166896373199084 |
2012-04-29T00:00:00+00:00 | 6.07688219191379 |
2012-04-30T00:00:00+00:00 | 6.149239363072389 |
2012-05-01T00:00:00+00:00 | 7.439944141741016 |
2012-05-02T00:00:00+00:00 | 7.811088807841705 |
2012-05-03T00:00:00+00:00 | 7.697734825094593 |
... | ... |
2014-12-27T00:00:00+00:00 | -50.324192577321014 |
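The cumsum call above replaces each value with the running total of everything up to and including it, which turns the independent normal draws into a random walk. A minimal pure-Ruby sketch of the same idea on a plain Array follows; the helper name `cumsum` here is a hypothetical stand-in and not Daru's implementation.

```ruby
# Running total of an Array of numbers (illustrative helper, not Daru's code).
def cumsum(values)
  total = 0.0
  values.map { |v| total += v } # each element becomes the sum so far
end

noise = [1.0, -2.0, 3.0, 0.5]
walk = cumsum(noise)
p walk # [1.0, -1.0, 2.0, 2.5]
```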
Daru::Vector provides many functions for statistical analysis of time series data. See this blog post for a comprehensive overview of the statistics functions available on Daru::Vector.
For example, you can calculate the rolling mean of a Vector by calling #rolling_mean with the lookback length (the window size) as its argument:
rolling = vector.rolling_mean 60
rolling.tail
Daru::Vector:89221690 size: 10 | |
---|---|
nil | |
2014-12-18T00:00:00+00:00 | -48.63936545993331 |
2014-12-19T00:00:00+00:00 | -48.78082245412963 |
2014-12-20T00:00:00+00:00 | -48.94126415692764 |
2014-12-21T00:00:00+00:00 | -49.089789216908876 |
2014-12-22T00:00:00+00:00 | -49.23217144486508 |
2014-12-23T00:00:00+00:00 | -49.36591686347972 |
2014-12-24T00:00:00+00:00 | -49.4965672112104 |
2014-12-25T00:00:00+00:00 | -49.57885611565935 |
2014-12-26T00:00:00+00:00 | -49.6638742352537 |
2014-12-27T00:00:00+00:00 | -49.74601918836437 |
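As a rough illustration of what a trailing rolling mean computes, here is a pure-Ruby sketch over a plain Array. The helper `rolling_mean` below is hypothetical and is not Daru's implementation. It assumes that positions without a full window of history are reported as nil; check Daru's documentation for how it treats the first entries.

```ruby
# Trailing rolling mean: each position averages itself and the previous
# (window - 1) values. Illustrative helper, not Daru's code.
def rolling_mean(values, window)
  values.each_index.map do |i|
    next nil if i < window - 1 # assumption: no value until the window is full
    values[(i - window + 1)..i].sum / window.to_f
  end
end

series = [1.0, 2.0, 3.0, 4.0, 5.0]
p rolling_mean(series, 3) # [nil, nil, 2.0, 3.0, 4.0]
```

With a window of 60, as in the notebook, each reported value would be the average of that day and the 59 days before it, which is why the rolling series is smoother than the original.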
Using the GnuplotRB gem, you can also plot the vector and its rolling mean as line plots on the same graph:
GnuplotRB::Plot.new([vector, with: 'lines', title: 'Vector'], [rolling, with: 'lines', title: 'Rolling Mean'])
df = Daru::DataFrame.new({
a: 1000.times.map {rng.call},
b: 1000.times.map {rng.call},
c: 1000.times.map {rng.call}
}, index: index)
Daru::DataFrame:79698410 rows: 1000 cols: 3 | |||
---|---|---|---|
a | b | c | |
2012-04-02T00:00:00+00:00 | 1.0616143453186424 | 2.3266221496435744 | 0.800149709055792 |
2012-04-03T00:00:00+00:00 | 0.12026688538504984 | -1.4504577526844833 | -1.5388358524911092 |
2012-04-04T00:00:00+00:00 | 0.25427284864154065 | 0.13889056199985192 | -1.3185311292454616 |
2012-04-05T00:00:00+00:00 | -0.9855017281117469 | 0.9599929644102919 | -0.06439155830088318 |
2012-04-06T00:00:00+00:00 | 1.2655032740550267 | -0.16545686767846277 | 1.2928485625454482 |
2012-04-07T00:00:00+00:00 | -0.42422600474001293 | 0.7584060310346138 | 0.3821816460314443 |
2012-04-08T00:00:00+00:00 | -1.151343900683176 | 0.50805593667415 | -0.7715409166526194 |
2012-04-09T00:00:00+00:00 | 1.0912599199241675 | -0.005145772637998991 | -1.5678298940523423 |
2012-04-10T00:00:00+00:00 | -0.5047629277222628 | -0.2293931788827708 | 0.3232569177826808 |
2012-04-11T00:00:00+00:00 | -1.5764636801088165 | 0.169815232084743 | -0.639404370665545 |
2012-04-12T00:00:00+00:00 | 0.6014562570218327 | 1.2915533193414312 | 0.8853050742535737 |
2012-04-13T00:00:00+00:00 | -0.8007837626375747 | 0.5992933303631207 | 0.23395149626064318 |
2012-04-14T00:00:00+00:00 | -0.5299940902338399 | -0.7133117985562556 | 0.6165175207413627 |
2012-04-15T00:00:00+00:00 | -1.9946303094697215 | -0.2044581205118237 | 0.6771987060372404 |
2012-04-16T00:00:00+00:00 | 0.6184239088338834 | 1.5079964111771995 | 0.10776292460877393 |
2012-04-17T00:00:00+00:00 | 0.8323623906391976 | 0.10034871135135542 | -0.5136728725376927 |
2012-04-18T00:00:00+00:00 | 0.01286756851656907 | -0.8558774364266106 | -0.055980047247675926 |
2012-04-19T00:00:00+00:00 | 0.7161728013135716 | 0.9918779781177881 | -0.1575490720665611 |
2012-04-20T00:00:00+00:00 | 0.6040247255379995 | -0.14112190384703183 | -1.4682887231669144 |
2012-04-21T00:00:00+00:00 | 0.676529502082872 | -2.1552939100657813 | 1.0856411670212278 |
2012-04-22T00:00:00+00:00 | 0.5286804807866233 | -1.4110403152836055 | -0.6552554892142989 |
2012-04-23T00:00:00+00:00 | 1.9816648319971175 | -0.4121922929588636 | -0.634750036062993 |
2012-04-24T00:00:00+00:00 | 0.06779561790187136 | 0.6698057472955746 | -0.002108120154503972 |
2012-04-25T00:00:00+00:00 | -2.1150332133713228 | 0.7803197096200907 | -1.24725638253146 |
2012-04-26T00:00:00+00:00 | -0.17466378292901155 | 0.015456147582551595 | 0.17530022953772142 |
2012-04-27T00:00:00+00:00 | -0.047363922542546566 | -0.3422004085257192 | -0.60786037531044 |
2012-04-28T00:00:00+00:00 | 0.7014623590611182 | -0.32172119652522674 | 0.27950429450967074 |
2012-04-29T00:00:00+00:00 | -0.5152692094845971 | 0.5277933769847741 | 0.38844682209511927 |
2012-04-30T00:00:00+00:00 | 0.09252667224353309 | 1.130514516829213 | 1.0972521225096232 |
2012-05-01T00:00:00+00:00 | -0.16487177242938128 | 0.26241287649123896 | -0.5481012155160905 |
2012-05-02T00:00:00+00:00 | -0.012434541914728222 | -0.785001089927654 | 0.10035857307925267 |
2012-05-03T00:00:00+00:00 | -0.5128929539120438 | -0.047910903631057045 | -0.4370187065453966 |
... | ... | ... | ... |
2014-12-27T00:00:00+00:00 | -0.5667844582713343 | -0.7474991238031873 | -0.04629196201863243 |
df = df.cumsum
Daru::DataFrame:90682160 rows: 1000 cols: 3 | |||
---|---|---|---|
a | b | c | |
2012-04-02T00:00:00+00:00 | 1.0616143453186424 | 2.3266221496435744 | 0.800149709055792 |
2012-04-03T00:00:00+00:00 | 1.1818812307036923 | 0.8761643969590911 | -0.7386861434353172 |
2012-04-04T00:00:00+00:00 | 1.436154079345233 | 1.015054958958943 | -2.057217272680779 |
2012-04-05T00:00:00+00:00 | 0.4506523512334861 | 1.9750479233692348 | -2.1216088309816623 |
2012-04-06T00:00:00+00:00 | 1.7161556252885128 | 1.8095910556907722 | -0.8287602684362141 |
2012-04-07T00:00:00+00:00 | 1.2919296205484998 | 2.567997086725386 | -0.4465786224047698 |
2012-04-08T00:00:00+00:00 | 0.14058571986532375 | 3.076053023399536 | -1.2181195390573891 |
2012-04-09T00:00:00+00:00 | 1.2318456397894912 | 3.070907250761537 | -2.7859494331097316 |
2012-04-10T00:00:00+00:00 | 0.7270827120672284 | 2.841514071878766 | -2.462692515327051 |
2012-04-11T00:00:00+00:00 | -0.8493809680415881 | 3.011329303963509 | -3.102096885992596 |
2012-04-12T00:00:00+00:00 | -0.24792471101975544 | 4.30288262330494 | -2.2167918117390224 |
2012-04-13T00:00:00+00:00 | -1.0487084736573302 | 4.902175953668062 | -1.9828403154783791 |
2012-04-14T00:00:00+00:00 | -1.57870256389117 | 4.188864155111806 | -1.3663227947370165 |
2012-04-15T00:00:00+00:00 | -3.573332873360892 | 3.984406034599982 | -0.689124088699776 |
2012-04-16T00:00:00+00:00 | -2.954908964527008 | 5.492402445777182 | -0.5813611640910021 |
2012-04-17T00:00:00+00:00 | -2.1225465738878104 | 5.592751157128537 | -1.0950340366286948 |
2012-04-18T00:00:00+00:00 | -2.1096790053712415 | 4.736873720701926 | -1.1510140838763707 |
2012-04-19T00:00:00+00:00 | -1.3935062040576698 | 5.728751698819714 | -1.3085631559429318 |
2012-04-20T00:00:00+00:00 | -0.7894814785196703 | 5.587629794972682 | -2.776851879109846 |
2012-04-21T00:00:00+00:00 | -0.11295197643679833 | 3.432335884906901 | -1.6912107120886182 |
2012-04-22T00:00:00+00:00 | 0.415728504349825 | 2.0212955696232955 | -2.346466201302917 |
2012-04-23T00:00:00+00:00 | 2.3973933363469424 | 1.609103276664432 | -2.98121623736591 |
2012-04-24T00:00:00+00:00 | 2.4651889542488137 | 2.2789090239600065 | -2.983324357520414 |
2012-04-25T00:00:00+00:00 | 0.3501557408774909 | 3.059228733580097 | -4.230580740051874 |
2012-04-26T00:00:00+00:00 | 0.17549195794847935 | 3.0746848811626486 | -4.0552805105141525 |
2012-04-27T00:00:00+00:00 | 0.1281280354059328 | 2.7324844726369295 | -4.663140885824593 |
2012-04-28T00:00:00+00:00 | 0.829590394467051 | 2.4107632761117026 | -4.383636591314922 |
2012-04-29T00:00:00+00:00 | 0.31432118498245387 | 2.9385566530964766 | -3.9951897692198024 |
2012-04-30T00:00:00+00:00 | 0.40684785722598693 | 4.06907116992569 | -2.8979376467101794 |
2012-05-01T00:00:00+00:00 | 0.24197608479660565 | 4.331484046416929 | -3.4460388622262697 |
2012-05-02T00:00:00+00:00 | 0.22954154288187742 | 3.5464829564892746 | -3.345680289147017 |
2012-05-03T00:00:00+00:00 | -0.28335141103016637 | 3.4985720528582176 | -3.7826989956924137 |
... | ... | ... | ... |
2014-12-27T00:00:00+00:00 | -5.223160748687952 | 32.26613292375032 | 5.648229057252015 |
rs = df.rolling_sum(60)
plots = []
rs.each_vector_with_index do |vec,n|
plots << GnuplotRB::Plot.new([vec, with: 'lines', title: n])
end
GnuplotRB::Multiplot.new(*plots, layout: [3,1], title: 'Rolling sums')
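The rolling sum above is applied to each column of the DataFrame independently. A minimal pure-Ruby sketch of that column-wise behaviour, using a Hash of Arrays as a stand-in for the DataFrame, might look like this. The helper `rolling_sum` and the `columns` data are hypothetical, and the nil padding for incomplete windows is an assumption rather than a statement about Daru's output.

```ruby
# Trailing rolling sum over one column (illustrative helper, not Daru's code).
# Assumes values.size >= window.
def rolling_sum(values, window)
  # Pad the first (window - 1) positions with nil, then sum each full window.
  Array.new(window - 1) + values.each_cons(window).map(&:sum)
end

# A Hash of Arrays stands in for a three-column DataFrame.
columns = { a: [1, 2, 3, 4], b: [10, 20, 30, 40], c: [0, 1, 0, 1] }
sums = columns.transform_values { |col| rolling_sum(col, 2) }

sums.each { |name, col| puts "#{name}: #{col.inspect}" }
# a: [nil, 3, 5, 7]
# b: [nil, 30, 50, 70]
# c: [nil, 1, 1, 1]
```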