RusAge

Data from RusAge (https://www.kaggle.com/oldaandozerskaya/fiction-corpus-for-agebased-text-classification)

RusAge: Corpus for Age-Based Text Classification. The corpus contains previews of Russian fiction books, each labelled with an age rating.

In [ ]:
import pandas as pd

Columns in the dataset

  • name of the file (books for adults start with 'adults' and children's books start with 'children')
  • book title
  • author
  • age rating according to the Russian age rating system
  • genres
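
The filename convention described above can be turned into an explicit audience label. The following is a minimal sketch on made-up rows: the sample filenames and the helper `audience_from_filename` are hypothetical, and the real filenames in RusAge.csv may look different beyond the 'adults'/'children' prefix.

```python
import pandas as pd

# Hypothetical sample rows; only the 'adults'/'children' prefixes come from
# the column description, the rest of each filename is invented.
sample = pd.DataFrame({
    "filename": ["adults_0001.txt", "children_0002.txt", "adults_0003.txt"],
})

def audience_from_filename(filename: str) -> str:
    """Map a filename prefix to 'adults', 'children', or 'unknown'."""
    for prefix in ("adults", "children"):
        if filename.startswith(prefix):
            return prefix
    return "unknown"

# Derive a new column from the prefix and count rows per audience.
sample["audience"] = sample["filename"].map(audience_from_filename)
audience_counts = sample["audience"].value_counts()
```

Applied to the full dataset, the same mapping on `df['filename']` would give a quick cross-check against the `age_rating` column.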
In [ ]:
df = pd.read_csv('RusAge.csv', sep=';', names=["filename","book_title","author","age_rating","genres"])
df.head() 
In [ ]:
df.count()
In [ ]:
df[0:50]
In [ ]:
df['age_rating'].value_counts().plot(kind='bar')
In [ ]:
df['author'].value_counts().plot(kind='bar', figsize=(25,8))
In [ ]:
author_counts = df['author'].value_counts()  # compute the counts once
author_counts[author_counts > 20].plot(kind='bar', figsize=(20,8))  # authors with more than 20 entries
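
The threshold filter above could be wrapped in a small reusable helper, so the same cut-off can be applied to other columns. This is only a sketch: `frequent_values` and the sample author names are hypothetical and not part of the dataset or of pandas.

```python
import pandas as pd

def frequent_values(series: pd.Series, min_count: int) -> pd.Series:
    """Return value counts for entries occurring more than min_count times."""
    counts = series.value_counts()
    return counts[counts > min_count]

# Hypothetical author names, used only to illustrate the filter.
authors = pd.Series(["A", "A", "A", "B", "B", "C"])
frequent = frequent_values(authors, min_count=1)
```

With the real data, `frequent_values(df['author'], 20).plot(kind='bar', figsize=(20,8))` should produce the same chart as the cell above.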
In [ ]:
df['genres'].value_counts().plot(kind='bar', figsize=(15,8))
In [ ]:
filt = df['age_rating'] == 12  # boolean mask for books rated 12
genres_count = df.loc[filt, 'genres'].value_counts()
genres_count

References

Sherratt, Tim. (2019, November 17). GLAM-Workbench/csv-explorer (Version v0.1.0). Zenodo. http://doi.org/10.5281/zenodo.3544712

RusAge (https://www.kaggle.com/oldaandozerskaya/fiction-corpus-for-agebased-text-classification)