Recommendation List Data Prep¶

This notebook does the data preparation for the recommendation list analysis.

Setup¶

In [1]:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from itertools import product

In [2]:

import ujson

In [3]:

from bookgender.config import data_dir

Load Data¶

Load book gender data and clean it up:

In [4]:

book_gender = pd.read_csv('data/author-gender.csv.gz')
book_gender = book_gender.set_index('item')['gender']
book_gender.loc[book_gender.str.startswith('no-')] = 'unknown'
book_gender.loc[book_gender == 'unlinked'] = 'unknown'
book_gender = book_gender.astype('category')
book_gender.describe()

Out[4]:

count     12234574
unique           4
top        unknown
freq       7271039
Name: gender, dtype: object

In [5]:

book_gender.head()

Out[5]:

item
0       male
1    unknown
2       male
3    unknown
4       male
Name: gender, dtype: category
Categories (4, object): [ambiguous, female, male, unknown]

And load hashes:

In [6]:

book_hash = pd.read_parquet('data/book-hash.parquet').rename(columns={'cluster': 'item'})
book_hash['dcode'] = book_hash['md5'].apply(lambda x: int(x[-1], 16) % 2)
book_hash = book_hash.set_index('item')
book_hash.head()

Out[6]:

	nisbns	md5	dcode
item
0	17	3781b82fabd530590c70cac955b52bb0	0
1	2	4c6606ab43bfbe946a436c0ce7633a7a	0
2	38	e16249d40bf94b35d8a784d73d0511c5	1
3	2	289071ab1041c090ac252616a76fe079	1
4	4	7308735b39347b616ee6be0ab093541e	0

Load the user profile data:

In [7]:

profiles = pd.read_pickle('data/profile-data.pkl')
profiles.head()

Out[7]:

		count	linked	ambiguous	male	female	dcknown	dcyes	PropDC	Known	PropFemale	PropKnown
Set	user
AZ	529	8	8	2	1	4	8	3	0.375000	5	0.800000	0.625000
	1723	25	24	3	15	6	25	14	0.560000	21	0.285714	0.840000
	1810	14	6	0	6	0	8	1	0.125000	6	0.000000	0.428571
	2781	8	8	1	5	1	8	5	0.625000	6	0.166667	0.750000
	2863	6	6	0	6	0	6	4	0.666667	6	0.000000	1.000000

In [8]:

datasets = list(profiles.index.levels[0])
datasets

Out[8]:

['AZ', 'BX-E', 'BX-I', 'GR-E', 'GR-I']

And load the recommendations:

In [9]:

recs = pd.read_parquet('data/study-recs.parquet')
recs.rename(columns={'dataset': 'Set', 'algorithm': 'Algorithm'}, inplace=True)
recs.head()

Out[9]:

	Set	Algorithm	item	score	user	rank
0	BX-E	user-user	49809	11.815094	206877	1
1	BX-E	user-user	481460	11.815094	206877	2
2	BX-E	user-user	559704	11.815094	206877	3
3	BX-E	user-user	572967	11.815094	206877	4
4	BX-E	user-user	2340806	11.815094	206877	5

The original paper truncated recommendation lists to 50. Let's do that too:

In [10]:

recs = recs[recs['rank'] <= 50]

In [11]:

recs.Set.unique()

Out[11]:

array(['BX-E', 'BX-I', 'AZ', 'GR-E', 'GR-I'], dtype=object)

In [12]:

recs.Algorithm.unique()

Out[12]:

array(['user-user', 'item-item', 'als', 'wrls', 'bpr', 'user-user-imp',
       'item-item-imp', 'wrls-imp', 'bpr-imp'], dtype=object)

We will need to extract implicit/explicit from those. In the new paper, we are going to separate out implicit and explicit data for presentation; these functions will help with that.

In [13]:

def select_implicit(data, reset=True):
    if reset:
        data = data.reset_index()
    implicit = data['Algorithm'].str.endswith('-imp') | data['Set'].str.endswith('-I')
    data = data[implicit].assign(Set=data['Set'].str.replace('-I', ''),
                                 Algorithm=data['Algorithm'].str.replace('-imp', ''))
    data['Algorithm'] = data['Algorithm'].str.replace('wrls', 'als')
    return data

In [14]:

def select_explicit(data, reset=True):
    if reset:
        data = data.reset_index()
    implicit = data['Algorithm'].str.endswith('-imp') | data['Set'].str.endswith('-I')
    data = data[~implicit].assign(Set=data['Set'].str.replace('-E', ''))
    return data

And give ourselves a handy way to relable algorithms:

In [15]:

algo_labels = {
    'als': 'ALS',
    'bpr': 'BPR',
    'item-item': 'II',
    'user-user': 'UU'
}

Analyze Rec List Composition¶

In the mean time, let's proceed by computing recommendation list gender data.

In [16]:

recs.drop(columns=['gender'], errors='ignore', inplace=True)
recs = recs.join(book_gender, on='item', how='left')
recs['gender'] = recs['gender'].fillna('unknown')
recs['gender'].describe()

Out[16]:

count     4883258
unique          4
top          male
freq      1935586
Name: gender, dtype: object

And mix in the dummy code data:

In [17]:

recs.drop(columns=['dcode'], errors='ignore', inplace=True)
recs = recs.join(book_hash['dcode'], on='item', how='left')
recs.head()

Out[17]:

	Set	Algorithm	item	score	user	rank	gender	dcode
0	BX-E	user-user	49809	11.815094	206877	1	female	1.0
1	BX-E	user-user	481460	11.815094	206877	2	male	0.0
2	BX-E	user-user	559704	11.815094	206877	3	male	1.0
3	BX-E	user-user	572967	11.815094	206877	4	male	1.0
4	BX-E	user-user	2340806	11.815094	206877	5	male	0.0

Count up the statistics for each list by gender:

In [18]:

rec_stats = recs.groupby(['Set', 'Algorithm', 'user'])['gender'].value_counts().unstack(fill_value=0)
rec_stats.columns = rec_stats.columns.astype('object')
rec_stats['Total'] = rec_stats.sum(axis=1)
rec_stats['Known'] = rec_stats['male'].fillna(0) + rec_stats['female'].fillna(0)
rec_stats['PropKnown'] = rec_stats['Known'] / rec_stats['Total']
rec_stats['PropFemale'] = rec_stats['female'] / rec_stats['Known']
rec_stats

Out[18]:

		gender	ambiguous	female	male	unknown	Total	Known	PropKnown	PropFemale
Set	Algorithm	user
AZ	als	529	2	8	19	21	50	27	0.54	0.296296
		1723	0	12	9	29	50	21	0.42	0.571429
		1810	2	6	9	33	50	15	0.30	0.400000
		2781	1	8	17	24	50	25	0.50	0.320000
		2863	2	4	25	19	50	29	0.58	0.137931
...	...	...	...	...	...	...	...	...	...	...
GR-I	wrls	874933	4	32	14	0	50	46	0.92	0.695652
		875157	8	5	34	3	50	39	0.78	0.128205
		875408	14	7	29	0	50	36	0.72	0.194444
		875441	3	40	7	0	50	47	0.94	0.851064
		875516	25	1	24	0	50	25	0.50	0.040000

99146 rows × 8 columns

In [19]:

rec_stats.info()

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 99146 entries, ('AZ', 'als', 529) to ('GR-I', 'wrls', 875516)
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   ambiguous   99146 non-null  int64  
 1   female      99146 non-null  int64  
 2   male        99146 non-null  int64  
 3   unknown     99146 non-null  int64  
 4   Total       99146 non-null  int64  
 5   Known       99146 non-null  int64  
 6   PropKnown   99146 non-null  float64
 7   PropFemale  98926 non-null  float64
dtypes: float64(2), int64(6)
memory usage: 6.6+ MB

Mix in info from dummy codes:

In [20]:

rec_dc_stats = recs.groupby(['Set', 'Algorithm', 'user'])['dcode'].agg(['count', 'sum', 'mean'])
rec_dc_stats.rename(columns={'count': 'dcknown', 'sum': 'dcyes', 'mean': 'PropDC'}, inplace=True)
rec_dc_stats['dcyes'] = rec_dc_stats['dcyes'].astype('i4')
rec_dc_stats.head()

Out[20]:

			dcknown	dcyes	PropDC
Set	Algorithm	user
AZ	als	529	44	23	0.522727
		1723	39	17	0.435897
		1810	31	16	0.516129
		2781	35	17	0.485714
		2863	37	20	0.540541

In [21]:

rec_stats = rec_stats.join(rec_dc_stats)
rec_stats.head()

Out[21]:

			ambiguous	female	male	unknown	Total	Known	PropKnown	PropFemale	dcknown	dcyes	PropDC
Set	Algorithm	user
AZ	als	529	2	8	19	21	50	27	0.54	0.296296	44	23	0.522727
		1723	0	12	9	29	50	21	0.42	0.571429	39	17	0.435897
		1810	2	6	9	33	50	15	0.30	0.400000	31	16	0.516129
		2781	1	8	17	24	50	25	0.50	0.320000	35	17	0.485714
		2863	2	4	25	19	50	29	0.58	0.137931	37	20	0.540541

Quick status-check on the number of recommendation lists per algorithm, implicit feedback:

In [22]:

select_implicit(rec_stats).groupby(['Set', 'Algorithm'])['Total'].count().unstack()

Out[22]:

Algorithm	als	bpr	item-item	user-user
Set
AZ	5000	5000	5000	5000
BX	5000	5000	4994	4987
GR	5000	5000	5000	4994

Explicit feedback:

In [23]:

select_explicit(rec_stats).groupby(['Set', 'Algorithm'])['Total'].count().unstack()

Out[23]:

Algorithm	als	item-item	user-user
Set
AZ	5000.0	4986.0	4503.0
BX	5000.0	4988.0	4818.0
GR	NaN	5000.0	4876.0

Non-personalized Recommendations¶

We also want to compute the makeup of non-personalized recommendations, to get a baseline level for each algorithm.

In [24]:

az_ratings = pd.read_parquet('data/AZ/ratings.parquet')
bxi_ratings = pd.read_parquet('data/BX-I/ratings.parquet')
bxe_ratings = pd.read_parquet('data/BX-E/ratings.parquet')
gre_ratings = pd.read_parquet('data/GR-E/ratings.parquet')
gri_ratings = pd.read_parquet('data/GR-I/ratings.parquet')

Popularity¶

In [25]:

istats = pd.concat({
    'AZ': az_ratings.groupby('item')['user'].count().nlargest(50),
    'BX-I': bxi_ratings.groupby('item')['user'].count().nlargest(50),
    'BX-E': bxe_ratings.groupby('item')['user'].count().nlargest(50),
    'GR-I': gri_ratings.groupby('item')['user'].count().nlargest(50),
    'GR-E': gre_ratings.groupby('item')['user'].count().nlargest(50)
}, names=['Set'])
istats = istats.reset_index(name='count')
istats.head()

Out[25]:

	Set	item	count
0	AZ	785	21575
1	AZ	8022	19922
2	AZ	1264	18640
3	AZ	27660	15105
4	AZ	5668	13905

In [26]:

irecs = istats.join(book_gender, on='item', how='left')
irecs['gender'] = irecs['gender'].fillna('unknown')
irecs.head()

Out[26]:

	Set	item	count	gender
0	AZ	785	21575	female
1	AZ	8022	19922	female
2	AZ	1264	18640	ambiguous
3	AZ	27660	15105	ambiguous
4	AZ	5668	13905	male

In [27]:

pop_gender = irecs.groupby(['Set', 'gender']).item.count().unstack().fillna(0).astype('i4')
pop_gender.columns = pop_gender.columns.astype('object')
pop_gender['Total'] = pop_gender.sum(axis=1)
pop_gender['Known'] = pop_gender['male'] + pop_gender['female']
pop_gender['PropKnown'] = pop_gender['Known'] / pop_gender['Total']
pop_gender['PropFemale'] = pop_gender['female'] / pop_gender['Known']
pop_gender

Out[27]:

gender	ambiguous	female	male	unknown	Total	Known	PropKnown	PropFemale
Set
AZ	8	17	19	6	50	36	0.72	0.472222
BX-E	8	20	22	0	50	42	0.84	0.476190
BX-I	8	17	25	0	50	42	0.84	0.404762
GR-E	18	13	19	0	50	32	0.64	0.406250
GR-I	17	14	19	0	50	33	0.66	0.424242

Highest Average Rating¶

In [28]:

astats = pd.concat({
    'AZ': az_ratings.groupby('item')['rating'].mean().nlargest(50),
    'BX-E': bxe_ratings.groupby('item')['rating'].mean().nlargest(50),
    'GR-E': gre_ratings.groupby('item')['rating'].mean().nlargest(50)
}, names=['Set'])
astats = astats.reset_index(name='count')
astats.head()

Out[28]:

	Set	item	count
0	AZ	13	5.0
1	AZ	23	5.0
2	AZ	26	5.0
3	AZ	39	5.0
4	AZ	66	5.0

In [29]:

arecs = astats.join(book_gender, on='item', how='left')
arecs['gender'] = arecs['gender'].fillna('unknown')
arecs.head()

Out[29]:

	Set	item	count	gender
0	AZ	13	5.0	male
1	AZ	23	5.0	unknown
2	AZ	26	5.0	male
3	AZ	39	5.0	unknown
4	AZ	66	5.0	unknown

In [30]:

avg_gender = arecs.groupby(['Set', 'gender']).item.count().unstack().fillna(0).astype('i4')
avg_gender.columns = avg_gender.columns.astype('object')
avg_gender['Total'] = avg_gender.sum(axis=1)
avg_gender['Known'] = avg_gender['male'] + avg_gender['female']
avg_gender['PropKnown'] = avg_gender['Known'] / avg_gender['Total']
avg_gender['PropFemale'] = avg_gender['female'] / avg_gender['Known']
avg_gender

Out[30]:

gender	ambiguous	female	male	unknown	Total	Known	PropKnown	PropFemale
Set
AZ	0	7	17	26	50	24	0.48	0.291667
BX-E	3	8	31	8	50	39	0.78	0.205128
GR-E	0	1	7	42	50	8	0.16	0.125000

Recommendation Coverage & Diversity¶

We want to understand how the recommendation lists work to better understand how many items we get.

In [31]:

list_counts = recs.groupby(['Set', 'Algorithm'])['user'].nunique()
list_counts.name = 'Lists'

In [32]:

item_counts = recs.groupby(['Set', 'Algorithm'])['item'].agg(['count', 'nunique'])
item_counts.rename(columns={'count': 'Recs', 'nunique': 'Distinct'}, inplace=True)
item_counts = item_counts.join(list_counts)
item_counts['FracDistinct'] = item_counts['Distinct'] / item_counts['Recs']

What does this look like for implicit?

In [33]:

df = select_implicit(item_counts).set_index(['Algorithm', 'Set']).stack().reorder_levels([0, 2, 1]).unstack().unstack()
df = df.rename(index=algo_labels)
df

Out[33]:

Set	AZ				BX				GR
	Recs	Distinct	Lists	FracDistinct	Recs	Distinct	Lists	FracDistinct	Recs	Distinct	Lists	FracDistinct
Algorithm
ALS	250000.0	17757.0	5000.0	0.071028	250000.0	10658.0	5000.0	0.042632	250000.0	16382.0	5000.0	0.065528
BPR	250000.0	13006.0	5000.0	0.052024	250000.0	42161.0	5000.0	0.168644	250000.0	98105.0	5000.0	0.392420
II	249949.0	120791.0	5000.0	0.483263	249700.0	56902.0	4994.0	0.227881	250000.0	25506.0	5000.0	0.102024
UU	249957.0	47142.0	5000.0	0.188600	248439.0	17978.0	4987.0	0.072364	249383.0	16542.0	4994.0	0.066332

In [34]:

def f_n(n):
    return '{:,.0f}'.format(n)
def f_pct(n):
    return '{:.1f}%'.format(n * 100)

In [35]:

print(df.swaplevel(axis=1).loc[:, ['Recs', 'Distinct', 'FracDistinct']].to_latex(formatters=[
    f_n, f_n, f_pct,
    f_n, f_n, f_pct,
    f_n, f_n, f_pct
]))

\begin{tabular}{lrrrrrrrrr}
\toprule
{} &    Recs & Distinct & FracDistinct &    Recs & Distinct & FracDistinct &    Recs & Distinct & FracDistinct \\
Set &      AZ &       AZ &           AZ &      BX &       BX &           BX &      GR &       GR &           GR \\
Algorithm &         &          &              &         &          &              &         &          &              \\
\midrule
ALS       & 250,000 &   17,757 &         7.1\% & 250,000 &   10,658 &         4.3\% & 250,000 &   16,382 &         6.6\% \\
BPR       & 250,000 &   13,006 &         5.2\% & 250,000 &   42,161 &        16.9\% & 250,000 &   98,105 &        39.2\% \\
II        & 249,949 &  120,791 &        48.3\% & 249,700 &   56,902 &        22.8\% & 250,000 &   25,506 &        10.2\% \\
UU        & 249,957 &   47,142 &        18.9\% & 248,439 &   17,978 &         7.2\% & 249,383 &   16,542 &         6.6\% \\
\bottomrule
\end{tabular}

And explicit?

In [36]:

df = select_explicit(item_counts).set_index(['Algorithm', 'Set']).stack().reorder_levels([0, 2, 1]).unstack().unstack()
df = df.rename(index=algo_labels)
df

Out[36]:

Set	AZ				BX				GR
	Recs	Distinct	Lists	FracDistinct	Recs	Distinct	Lists	FracDistinct	Recs	Distinct	Lists	FracDistinct
Algorithm
ALS	250000.0	47308.0	5000.0	0.189232	250000.0	65.0	5000.0	0.000260	NaN	NaN	NaN	NaN
II	239412.0	113365.0	4986.0	0.473514	248316.0	18588.0	4988.0	0.074856	245944.0	90333.0	5000.0	0.367291
UU	191553.0	109755.0	4503.0	0.572975	219082.0	43475.0	4818.0	0.198442	241523.0	67473.0	4876.0	0.279365

In [37]:

print(df.swaplevel(axis=1).loc[:, ['Recs', 'Distinct', 'FracDistinct']].to_latex(formatters=[
    f_n, f_n, f_pct,
    f_n, f_n, f_pct,
    f_n, f_n, f_pct
]))

\begin{tabular}{lrrrrrrrrr}
\toprule
{} &    Recs & Distinct & FracDistinct &    Recs & Distinct & FracDistinct &    Recs & Distinct & FracDistinct \\
Set &      AZ &       AZ &           AZ &      BX &       BX &           BX &      GR &       GR &           GR \\
Algorithm &         &          &              &         &          &              &         &          &              \\
\midrule
ALS       & 250,000 &   47,308 &        18.9\% & 250,000 &       65 &         0.0\% &     nan &      nan &         nan\% \\
II        & 239,412 &  113,365 &        47.4\% & 248,316 &   18,588 &         7.5\% & 245,944 &   90,333 &        36.7\% \\
UU        & 191,553 &  109,755 &        57.3\% & 219,082 &   43,475 &        19.8\% & 241,523 &   67,473 &        27.9\% \\
\bottomrule
\end{tabular}

Dist. Table¶

In [38]:

select_implicit(rec_stats).groupby(['Algorithm', 'Set']).PropFemale.mean().unstack()

Out[38]:

Set	AZ	BX	GR
Algorithm
als	0.408754	0.403938	0.438804
bpr	0.406626	0.423552	0.440484
item-item	0.387916	0.455593	0.484237
user-user	0.417181	0.389010	0.424384

In [39]:

np.sqrt(select_implicit(rec_stats).groupby(['Algorithm', 'Set']).PropFemale.var()).unstack()

Out[39]:

Set	AZ	BX	GR
Algorithm
als	0.308367	0.188344	0.285130
bpr	0.284455	0.273894	0.316049
item-item	0.310644	0.207831	0.247467
user-user	0.278917	0.168030	0.265406

In [40]:

select_explicit(rec_stats).groupby(['Algorithm', 'Set']).PropFemale.mean().unstack()

Out[40]:

Set	AZ	BX	GR
Algorithm
als	0.405716	0.301085	NaN
item-item	0.388304	0.433522	0.403719
user-user	0.345117	0.400959	0.381393

In [41]:

np.sqrt(select_explicit(rec_stats).groupby(['Algorithm', 'Set']).PropFemale.var()).unstack()

Out[41]:

Set	AZ	BX	GR
Algorithm
als	0.149709	0.012124	NaN
item-item	0.243808	0.137623	0.231388
user-user	0.236402	0.160681	0.161416

Rec List Distributions¶

Now that we have all of this, we can start to look at recommendation list distributions. How is Proportion Female distributed?

In [42]:

grid = sns.FacetGrid(col='Set', row='Algorithm', data=select_implicit(rec_stats), sharey=False, margin_titles=True)
grid.map(sns.distplot, 'PropFemale', kde=False, norm_hist=True)

Out[42]:

<seaborn.axisgrid.FacetGrid at 0x28c3d5ea048>

In [43]:

grid = sns.FacetGrid(col='Set', row='Algorithm', data=select_implicit(rec_stats), sharey=False, margin_titles=True)
grid.map(sns.distplot, 'PropKnown', kde=False, norm_hist=True)

Out[43]:

<seaborn.axisgrid.FacetGrid at 0x28c3eeeb1c8>

In [44]:

grid = sns.FacetGrid(col='Set', row='Algorithm', data=select_explicit(rec_stats), sharey=False, margin_titles=True)
grid.map(sns.distplot, 'PropFemale', kde=False, norm_hist=True)

Out[44]:

<seaborn.axisgrid.FacetGrid at 0x28c45d54d48>

In [45]:

grid = sns.FacetGrid(col='Set', row='Algorithm', data=select_explicit(rec_stats), sharey=False, margin_titles=True)
grid.map(sns.distplot, 'PropKnown', kde=False, norm_hist=True)

Out[45]:

<seaborn.axisgrid.FacetGrid at 0x28c705f9788>

Dummy Code Distributions¶

In [46]:

grid = sns.FacetGrid(col='Set', row='Algorithm', data=select_implicit(rec_stats), sharey=False, margin_titles=True)
grid.map(sns.distplot, 'PropDC', kde=False, norm_hist=True)

Out[46]:

<seaborn.axisgrid.FacetGrid at 0x28c731a9fc8>

In [47]:

grid = sns.FacetGrid(col='Set', row='Algorithm', data=select_explicit(rec_stats), sharey=False, margin_titles=True)
grid.map(sns.distplot, 'PropDC', kde=False, norm_hist=True)

Out[47]:

<seaborn.axisgrid.FacetGrid at 0x28c73e33d88>

Prepare for Modeling¶

With this analysis, we need to prepare our recommendation data for modeling.

Because ALS on BX-E behaves so badly, we can't really use it. Drop from further analysis.

In [48]:

rec_stats = rec_stats.drop(('BX-E', 'als'))

In [49]:

rec_stats.to_pickle('data/rec-data.pkl')

We also want to save this data for STAN.

In [50]:

def inf_dir(sname):
    return data_dir / sname / 'inference'

In [51]:

for sname, frame in rec_stats.groupby('Set'):
    print('preparing STAN input for', sname)
    lists = frame.reset_index().astype({'Algorithm': 'category'})
    algos = lists['Algorithm'].cat.categories
    print(sname, 'has algorithms', algos)
    
    # set up the users
    users = profiles.loc[sname, :]
    users = users.assign(unum=np.arange(len(users), dtype='i4') + 1)
    lists = lists.join(users[['unum']], on='user')
    
    data = {
        'A': len(algos),
        'J': len(users),
        'NL': len(lists),
        'n': users['Known'],
        'y': users['female'],
        'ru': lists['unum'],
        'ra': lists['Algorithm'].cat.codes + 1,
        'rn': lists['Known'],
        'ry': lists['female']
    }
    
    # and write
    dir = inf_dir(sname)
    dir.mkdir(exist_ok=True)
    in_fn = dir / 'full-inputs.json'
    in_fn.write_text(ujson.dumps(data))
    # in_fn.write_text(ujson.dumps(stan_inputs(frame, 'Known', 'female')))

preparing STAN input for AZ
AZ has algorithms Index(['als', 'bpr-imp', 'item-item', 'item-item-imp', 'user-user',
       'user-user-imp', 'wrls-imp'],
      dtype='object')
preparing STAN input for BX-E
BX-E has algorithms Index(['item-item', 'user-user'], dtype='object')
preparing STAN input for BX-I
BX-I has algorithms Index(['bpr', 'item-item', 'user-user', 'wrls'], dtype='object')
preparing STAN input for GR-E
GR-E has algorithms Index(['item-item', 'user-user'], dtype='object')
preparing STAN input for GR-I
GR-I has algorithms Index(['bpr', 'item-item', 'user-user', 'wrls'], dtype='object')

In [ ]: