This notebook serves as explanation for a stackoverflow solution:

333403 ¶

More answers can be found at https://stackoverflow.com/search?q=user:333403+[pandas].

Please remember: upvoting answers keeps people motivated. (And reputation helps to place bounties and thus get answers to hard questions).

In [1]:

import time
print(time.ctime())

try:
    import ipydex
    # this optional and enables the magic comment `##:` to view intermediate results
    %load_ext ipydex.displaytools
except ImportError:
    pass

Sun Dec 25 10:54:25 2022

In [2]:

# original data

import numpy as np
import pandas as pd
df = pd.DataFrame(
    {"Timestamp" : ['1993-11-01' ,'1993-11-02', '1993-11-03', '1993-11-04','1993-11-15'],
     "Austria" : [6.18 ,6.18, 6.17, 6.17, 6.40],
     "Belgium" : [7.05, 7.05, 7.2, 7.5, 7.6],"France" : [7.69, 7.61, 7.67, 7.91, 8.61]
    },
    index = [1, 2, 3,4,5]
) ##:

	Timestamp	Austria	Belgium	France
1	1993-11-01	6.18	7.05	7.69
2	1993-11-02	6.18	7.05	7.61
3	1993-11-03	6.17	7.20	7.67
4	1993-11-04	6.17	7.50	7.91
5	1993-11-15	6.40	7.60	8.61

---

In [3]:

import itertools as it

# omit timestamp-col
cols = list(df.columns)[1:] ##:

col_pairs = list(it.combinations(cols, 2)) ##:

# new data frame (yet empty) 
res = pd.DataFrame()

cols := ['Austria', 'Belgium', 'France']

---

col_pairs := [('Austria', 'Belgium'), ('Austria', 'France'), ('Belgium', 'France')]

---

In [4]:

# complete solution

for pair in col_pairs:
    # select the first three letters of each name of the pair
    corr_name = f"{pair[0][:3]}_{pair[1][:3]}_corr"
    res[corr_name] = df[list(pair)].\
                    rolling(min_periods=1, window=3).\
                    corr().iloc[0::2, -1].reset_index(drop=True)

print(str(res))

   Aus_Bel_corr  Aus_Fra_corr  Bel_Fra_corr
0           NaN           NaN           NaN
1           NaN           NaN           NaN
2     -1.000000     -0.277350      0.277350
3     -0.755929     -0.654654      0.989743
4      0.693375      0.969346      0.849167

In [5]:

# one iteration (i.e. one column):

pair = col_pairs[0] ##:
corr_name = f"{pair[0][:3]}_{pair[1][:3]}_corr" ##:

pair := ('Austria', 'Belgium')

---

corr_name := 'Aus_Bel_corr'

---

In [6]:

# select the corresponding columns
tmp = df[list(pair)] ##:

	Austria	Belgium
1	6.18	7.05
2	6.18	7.05
3	6.17	7.20
4	6.17	7.50
5	6.40	7.60

---

In [7]:

# this rolling objects does not yet specify the operation
tmp2 = tmp.rolling(min_periods=1, window=3) ##:

tmp2 := Rolling [window=3,min_periods=1,center=False,axis=0,method=single]

---

In [8]:

# now specify the operation to `corr`
tmp3 = tmp2.corr() ##:

		Austria	Belgium
1	Austria	NaN	NaN
1	Belgium	NaN	NaN
2	Austria	NaN	NaN
2	Belgium	NaN	NaN
3	Austria	1.000000	-1.000000
3	Belgium	-1.000000	1.000000
4	Austria	1.000000	-0.755929
4	Belgium	-0.755929	1.000000
5	Austria	1.000000	0.693375
5	Belgium	0.693375	1.000000

---

In [9]:

# The previous result contains to much data. We are interested only in every second row.
# It also has a "messed up" index:
tmp3.index

Out[9]:

MultiIndex([(1, 'Austria'),
            (1, 'Belgium'),
            (2, 'Austria'),
            (2, 'Belgium'),
            (3, 'Austria'),
            (3, 'Belgium'),
            (4, 'Austria'),
            (4, 'Belgium'),
            (5, 'Austria'),
            (5, 'Belgium')],
           )

In [10]:

# To only select every second row, we use the iloc property which is documentend here:
# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html

# in short: it allows access to the dataframe cells by purely integer indexing (without using column names etc.)

print(tmp3.iloc.__doc__[:800])

Purely integer-location based indexing for selection by position.

``.iloc[]`` is primarily integer position based (from ``0`` to
``length-1`` of the axis), but may also be used with a boolean
array.

Allowed inputs are:

- An integer, e.g. ``5``.
- A list or array of integers, e.g. ``[4, 3, 0]``.
- A slice object with ints, e.g. ``1:7``.
- A boolean array.
- A ``callable`` function with one argument (the calling Series or
  DataFrame) and that returns valid output for indexing (one of the above).
  This is useful in method chains, when you don't have a reference to the
  calling object, but would like to base your selection on some value.

``.iloc`` will raise ``IndexError`` if a requested indexer is
out-of-bounds, except *slice* indexers which allow out-of-bounds
indexing (this conforms

In [11]:

# the slicing-syntax of iloc is like for numpy arrays: `start:stop:step` for rows and columns 
# `0::2` means: beginning with row index 0 until the end (no stop index) with a stepsizt of 2
# the `-1` means: just select the last column (index -1), no colon -> no slice

# for more information see: https://numpy.org/doc/stable/user/basics.indexing.html#slicing-and-striding

tmp4 = tmp3.iloc[0::2, -1] ##:\n
type(tmp4) ##:

tmp4 :=
1  Austria         NaN
2  Austria         NaN
3  Austria   -1.000000
4  Austria   -0.755929
5  Austria    0.693375
Name: Belgium, dtype: float64

---

(type(tmp4)) := pandas.core.series.Series

---

In [12]:

# finally, we install a new index beginning at 0:
tmp5 = tmp4.reset_index(drop=True)##:\n

tmp5 :=
0         NaN
1         NaN
2   -1.000000
3   -0.755929
4    0.693375
Name: Belgium, dtype: float64

---

See https://stackoverflow.com/a/64832684/333403¶

See https://stackoverflow.com/a/64832684/333403 ¶