Application showing the interpolation of a single species: Biddulphia Sinensis¶

The interpolation of the full dataset (loop, plots and netCDF generation) is performed in interp_presence_absence_prod

In [1]:

using DIVAnd
using PyPlot
using Proj4
using DelimitedFiles
using PyCall
using Dates
using NCDatasets
include("../scripts/PhytoInterp.jl")
doplot = true

Out[1]:

true

Files and directories¶

In [2]:

figdir = "../product/figures/"
outputdir = "../product/"
datadir = "../data/"
datafile = joinpath(datadir, "Biddulphia sinensis-1995-2020.csv")
isdir(datadir) ? " " : mkpath(datadir)
isdir(figdir) ? " " : mkpath(figdir)
isfile(datafile) ? @info("File already downloaded") : download("https://dox.ulg.ac.be/index.php/s/VgLglubaTLetHzc/download", datafile)

┌ Info: File already downloaded
└ @ Main In[2]:7

Prepare mask¶

Interpolation grid¶

In [3]:

longrid = -2.:0.1:10.
latgrid = 51.:0.1:56.

Out[3]:

51.0:0.1:56.0

Download file¶

In [14]:

bathname = joinpath(datadir, "gebco_30sec_4.nc")
if !isfile(bathname)
    download("https://dox.ulg.ac.be/index.php/s/RSwm4HPHImdZoQP/download", bathname)
else
    @info("Bathymetry file already downloaded")
end

Out[14]:

"../data/gebco_30sec_4.nc"

Read bathymetry¶

In [15]:

bx, by, b = load_bath(bathname, true, longrid, latgrid)
@show size(b)

if doplot
    fig = PyPlot.figure()
    ax = PyPlot.subplot(111)
    pcolor(bx,by,b', vmin=0.); 
    colorbar(orientation="horizontal")
    title("Depth")
    show()
end

size(b) = (121, 51)

Metrics¶

In [16]:

mask, (pm, pn),(xi, yi) = DIVAnd.DIVAnd_rectdom(longrid, latgrid);
xi, yi, mask = DIVAnd.load_mask(bathname, true, longrid, latgrid, 0.0);
xx, yy = ndgrid(xi, yi);

Interpolation¶

In [17]:

speciesname = get_species_name(basename(datafile))
speciesslug = get_species_slug(basename(datafile))
@info("Working on $(speciesname)")
        
# Data reading
dates, lons, lats, occurs = read_data_phyto(datafile);

# Conversion of coordinates
lon, lat = transform_coords(lons, lats)

data_presence = occurs .== 1;
data_absence = .!(data_presence);
npre = sum(data_presence)
nabs = sum(data_absence);
@info("Presence: $(npre), absence: $(nabs)");

┌ Info: Working on Biddulphia sinensis
└ @ Main In[17]:3
┌ Info: Presence: 6957, absence: 18820
└ @ Main In[17]:15

Compute heatmap¶

We compute:

the heatmap using all the observation (presence or absence),
the heatmap using only the presence observations,
the heatmap only the absence observations.

The first provides an indication of the sampling effort: high values in an area means a lot of measurements have been made in that area.

In [23]:

inflation = ones(length(lon));
@time dens1, LHM1, LCV1, LSCV1 = DIVAnd_heatmap(mask, (pm,pn), (xx, yy), 
    (lon, lat), inflation, 0.2);
@time dens2, LHM2, LCV2, LSCV2 = DIVAnd_heatmap(mask, (pm,pn), (xx, yy), 
    (lon[data_presence], lat[data_presence]), inflation[data_presence], 0.2);
@time dens3, LHM3, LCV3, LSCV3 = DIVAnd_heatmap(mask, (pm,pn), (xx, yy), 
    (lon[data_absence], lat[data_absence]), inflation[data_absence], 0.2);

┌ Warning: Data array size (25777,) is larger then maximum 10000. Superobservations will be created. To avoid, increase nmax to the desired number of superobs
└ @ DIVAnd /home/ctroupin/.julia/packages/DIVAnd/ZfEqh/src/DIVAnd_heatmap.jl:48

  1.142954 seconds (928.38 k allocations: 1.171 GiB, 4.46% gc time)
  4.177570 seconds (621.36 k allocations: 4.755 GiB, 4.26% gc time)

┌ Warning: Data array size (18820,) is larger then maximum 10000. Superobservations will be created. To avoid, increase nmax to the desired number of superobs
└ @ DIVAnd /home/ctroupin/.julia/packages/DIVAnd/ZfEqh/src/DIVAnd_heatmap.jl:48

  0.974854 seconds (698.74 k allocations: 1.013 GiB, 3.90% gc time)

Finally, the probability field is computed as:

d = npre * dens2 / (npre * dens2 + nabs * dens3)

The reason for this equation is that the heatmap are computed so that their integral over the domain is 1, whatever the number of observations.

In [22]:

d = npre .* dens2 ./ (npre .* dens2 .+ nabs .* dens3);

Plotting¶

In [10]:

if doplot
    plot_heatmap(longrid, latgrid, d, lon, lat, occurs,
        "$(speciesname): normalised heatmap")
    pcolormesh(longrid, latgrid, mask', cmap=PyPlot.cm.binary_r)
end

UndefVarError: d not defined

Stacktrace:
 [1] top-level scope at In[10]:2

Write results in netCDF¶

In [11]:

create_nc_results(joinpath(outputdir, "$(speciesname)_heatmap.nc"), 
    longrid, latgrid, d, speciesname);

UndefVarError: d not defined

Stacktrace:
 [1] top-level scope at In[11]:1