
In [1]:

#!pip install -q textgenrnn
import os
import pandas as pd
from textgenrnn import textgenrnn

Using TensorFlow backend.

Data

In [16]:

BASE_DIR = os.getcwd()
DATA_DIR = os.path.join(BASE_DIR, '..', 'datasets')

# from: https://github.com/amauboussin/arxiv-twitterbot
arxiv_df = pd.read_csv(os.path.join(DATA_DIR, 'brundage_bot.csv'))
print('Total number of papers:', len(arxiv_df))
arxiv_df.head()

Total number of papers: 27188

Out[16]:

	link	time	favorites	rts	authors	category	published	summary	title	tweeted
0	arxiv.org/abs/1611.10003	NaN	NaN	NaN	[Tom A. F. Anderson, C. -H. Ruan]	q-bio.NC	2016-11-30 05:17:11	In summary of the research findings presented ...	Vocabulary and the Brain: Evidence from Neuroi...	0
1	arxiv.org/abs/1611.10007	NaN	NaN	NaN	[M. Amin Rahimian, Amir G. Aghdam]	cs.SY	2016-11-30 05:37:11	In this paper, structural controllability of a...	Structural Controllability of Multi-Agent Netw...	0
2	arxiv.org/abs/1611.10010	NaN	NaN	NaN	[Debidatta Dwibedi, Tomasz Malisiewicz, Vijay ...	cs.CV	2016-11-30 06:00:47	We present a Deep Cuboid Detector which takes ...	Deep Cuboid Detection: Beyond 2D Bounding Boxes	0
3	arxiv.org/abs/1611.10012	2016-12-01 01:46:12	11.0	2.0	[Jonathan Huang, Vivek Rathod, Chen Sun, Mengl...	cs.CV	2016-11-30 06:06:15	In this paper, we study the trade-off between ...	Speed/accuracy trade-offs for modern convoluti...	1
4	arxiv.org/abs/1611.10014	NaN	NaN	NaN	[Yoones Hashemi, Amir H. Banihashemi]	cs.IT	2016-11-30 06:12:45	In this paper, we propose a characterization o...	Characterization and Efficient Exhaustive Sear...	0

In [3]:

# cs.AI Artificial Intelligence -- cs.CL Computation and Language -- cs.CV Computer Vision and Pattern Recognition
# cs.LG Learning -- cs.NE Neural and Evolutionary Computing -- stat.ML Machine Learning

dl_categories = ['cs.AI', 'cs.CL', 'cs.CV', 'cs.LG', 'cs.NE', 'stat.ML']
arxiv_df = arxiv_df.loc[arxiv_df.category.isin(dl_categories)]
print('Number of deep learning papers:', len(arxiv_df))
arxiv_df.head()

Number of deep learning papers: 10003

Out[3]:

	link	time	favorites	rts	authors	category	published	summary	title	tweeted
2	arxiv.org/abs/1611.10010	NaN	NaN	NaN	[Debidatta Dwibedi, Tomasz Malisiewicz, Vijay ...	cs.CV	2016-11-30 06:00:47	We present a Deep Cuboid Detector which takes ...	Deep Cuboid Detection: Beyond 2D Bounding Boxes	0
3	arxiv.org/abs/1611.10012	2016-12-01 01:46:12	11.0	2.0	[Jonathan Huang, Vivek Rathod, Chen Sun, Mengl...	cs.CV	2016-11-30 06:06:15	In this paper, we study the trade-off between ...	Speed/accuracy trade-offs for modern convoluti...	1
5	arxiv.org/abs/1611.10017	NaN	NaN	NaN	[Gou Koutaki, Keiichiro Shirai, Mitsuru Ambai]	cs.CV	2016-11-30 06:35:39	In this paper, we propose a learning-based sup...	Fast Supervised Discrete Hashing and its Analysis	0
10	arxiv.org/abs/1611.10031	NaN	NaN	NaN	[Peng Liu, Hui Zhang, Kie B. Eom]	cs.LG	2016-11-30 07:34:46	Active deep learning classification of hypersp...	Active Deep Learning for Classification of Hyp...	0
11	arxiv.org/abs/1611.10038	NaN	NaN	NaN	[Si Li, Nianwen Xue]	cs.CL	2016-11-30 07:53:34	A patent is a property right for an invention ...	Towards Accurate Word Segmentation for Chinese...	0

In [4]:

arxiv_df.loc[32, 'title']

Out[4]:

'Fusion of EEG and Musical Features in Continuous Music-emotion\n  Recognition'

In [5]:

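# remove line-wrap newlines in arXiv titles: wrapped titles contain '\n' plus two spaces,
# so dropping '\n ' leaves a single space between the words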
arxiv_df.title = arxiv_df.title.apply(lambda x: x.replace('\n ', ''))

arxiv_df.loc[32, 'title']

Out[5]:

'Fusion of EEG and Musical Features in Continuous Music-emotion Recognition'
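The replace call above only works when a wrapped title contains exactly a newline followed by two spaces. A slightly more defensive sketch, assuming nothing about the wrapping beyond "some whitespace that includes a newline", collapses every whitespace run into one space. `normalize_title` is a hypothetical helper and is not used elsewhere in this notebook.

```python
import re

def normalize_title(title: str) -> str:
    """Collapse every run of whitespace (including newlines) into one space."""
    return re.sub(r'\s+', ' ', title).strip()

# The wrapped title shown in Out[4]
wrapped = 'Fusion of EEG and Musical Features in Continuous Music-emotion\n  Recognition'
print(normalize_title(wrapped))
```

With pandas this could be applied as `arxiv_df.title.apply(normalize_title)`, at the cost of also collapsing any intentional double spaces.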

In [6]:

paper_len = (arxiv_df.title.str.len()).mean()
print(f'mean paper title length: {paper_len:.2f} chars')

mean paper title length: 69.33 chars

In [7]:

songs_df = pd.read_json(os.path.join(DATA_DIR, 'song_titles_5yrs.json'))
songs_df.columns = ['title']
print('Total number of songs:', len(songs_df))
songs_df.head()

Total number of songs: 9850

Out[7]:

	title
0	Silver Lining
1	Four Winds
2	Half Love
3	Sowa (Alex Garett & Greg Herma Edit)
4	Killing My Time

In [8]:

song_len = (songs_df.title.str.len()).mean()
print(f'mean song title length: {song_len:.2f} chars')

mean song title length: 22.43 chars

In [9]:

arxiv_titles = '\n'.join(arxiv_df.title)
song_titles = '\n'.join(songs_df.title)

print('arxiv_titles:\n', arxiv_titles[:100])
print('\nsong_titles:\n', song_titles[:50])

arxiv_titles:
 Deep Cuboid Detection: Beyond 2D Bounding Boxes
Speed/accuracy trade-offs for modern convolutional o

song_titles:
 Silver Lining
Four Winds
Half Love
Sowa (Alex Gare

In [25]:

print('arxiv_titles vocab_len:', len(set(arxiv_titles)))
print('song_titles vocab_len:', len(set(song_titles)))
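# NOTE: the smaller vocab size is reused below as the batch size;
# vocabulary size does not actually cap the batch size
bs = min(len(set(arxiv_titles)), len(set(song_titles)))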
print('min vocab_len (and max batch_size):', bs)

arxiv_titles vocab_len: 111
song_titles vocab_len: 142
min vocab_len (and max batch_size): 111
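`len(set(text))` counts distinct characters, including the newline used to join the titles. To see which characters make up that vocabulary (and how rare some of them are), a `Counter` is a small step up. This is only a sketch on stand-in data, not the real corpora; `char_vocab` is a hypothetical helper name.

```python
from collections import Counter

def char_vocab(text: str) -> Counter:
    """Count how often each distinct character occurs in a corpus string."""
    return Counter(text)

# Tiny stand-in corpus: titles joined by newlines, as in the cell above
sample = '\n'.join(['Silver Lining', 'Four Winds', 'Half Love'])
vocab = char_vocab(sample)

print('vocab size:', len(vocab))  # same number as len(set(sample))
print('most common:', vocab.most_common(3))
```

The vocabulary size determines the width of the model's output layer. Batch size is a separate training hyperparameter, so any value that fits in memory should work.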

In [11]:

with open(os.path.join(DATA_DIR, 'arxiv_titles.txt'), 'w', encoding='utf-8') as f:
    f.write(arxiv_titles)
    
with open(os.path.join(DATA_DIR, 'song_titles.txt'), 'w', encoding='utf-8') as f:
    f.write(song_titles)
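The training logs further down report 10,002 and 9,849 texts, one fewer than the 10,003 papers and 9,850 songs written here. One plausible explanation is that the loader treats the first line of each file as a header and skips it. This is an assumption and is not verified in this notebook. A quick way to rule out the writing step is to read a file back and compare line counts. The sketch below does that on stand-in data in a temporary directory.

```python
import os
import tempfile

titles = ['Silver Lining', 'Four Winds', 'Half Love']  # stand-in data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'titles.txt')
    # Write one title per line, exactly as the cell above does
    with open(path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(titles))
    # Read the file back and strip the line terminators
    with open(path, encoding='utf-8') as f:
        lines_on_disk = [line.rstrip('\n') for line in f]

print(len(titles), len(lines_on_disk))
```

If the counts match on disk but not in the training log, the missing title is being dropped at load time rather than at write time.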

Approach 1: Pretraining

DL paper titles -> song titles

In [12]:

model_cfg = {
    'rnn_size': 128,
    'rnn_layers': 4,
    'rnn_bidirectional': False, #True,
    'max_length': int(paper_len),
    #'max_words': 10000,
    'dim_embeddings': 100,
    'word_level': False,
}

train_cfg = {
    'num_epochs': 10,
    'gen_epochs': 2,
    'batch_size': bs, # 1024,
    'train_size': 0.8,
    'dropout': 0.0,
    'max_gen_length': int(paper_len*2), #300,
    'validation': True, # False,
    'is_csv': False
}

paper_model_name = 'deep_paper_titles'
textgen = textgenrnn(name=paper_model_name)

textgen.train_from_file(
    file_path=os.path.join(DATA_DIR, 'arxiv_titles.txt'),
    new_model=True,
    num_epochs=train_cfg['num_epochs'],
    gen_epochs=train_cfg['gen_epochs'],
    batch_size=train_cfg['batch_size'],
    train_size=train_cfg['train_size'],
    dropout=train_cfg['dropout'],
    max_gen_length=train_cfg['max_gen_length'],
    validation=train_cfg['validation'],
    is_csv=train_cfg['is_csv'],
    rnn_layers=model_cfg['rnn_layers'],
    rnn_size=model_cfg['rnn_size'],
    rnn_bidirectional=model_cfg['rnn_bidirectional'],
    max_length=model_cfg['max_length'],
    dim_embeddings=model_cfg['dim_embeddings'],
    word_level=model_cfg['word_level']
)

10,002 texts collected.
Training new model w/ 4-layer, 128-cell LSTMs
Training on 562,837 character sequences.
Epoch 1/10
8794/8794 [==============================] - 406s 46ms/step - loss: 2.0727 - val_loss: 1.3352
Epoch 2/10
8794/8794 [==============================] - 398s 45ms/step - loss: 1.2272 - val_loss: 1.1812
####################
Temperature: 0.2
####################
Deep Learning of Computer Neural Networks with Deep Convolutional Neural Networks

State Recognition of Control with Deep Learning of Convolutional Neural Networks

State of Deep Learning of Convolutional Neural Networks for Structured Matrix Faction

####################
Temperature: 0.5
####################
Structural Sparse Shallow Analysis of Humanoratic Models for Prediction of Concerving Reconstruction and Typerspectral Regression

A Deep Learning Learning of Image Sparse Recognition with Automatic Deep Learning

Learning the Encoder Transformation in the Encoder in Stochastic Driven Selection of Consistent and Textures

####################
Temperature: 1.0
####################
A Droma of Treatforms: Can Learning to evide neural network for Two Imaging Structures

Grutacurs Assolonithms with semantic Graphical RCV

Deep Sentiment of Character ag Deval Improve Imaging intervention simulations

Epoch 3/10
8794/8794 [==============================] - 489s 56ms/step - loss: 1.1083 - val_loss: 1.1110
Epoch 4/10
8794/8794 [==============================] - 398s 45ms/step - loss: 1.0401 - val_loss: 1.0838
####################
Temperature: 0.2
####################
A Statistical Representation for Behavioral Subspace Classification

Semantic Segmentation in Semantic Segmentation

Semi-supervised Learning of Semantic Segmentation

####################
Temperature: 0.5
####################
Interpretation of Solvers from Saliency of Semantic Constrained Based on State Selection

Statistical Layor Matching Algorithms Based on Propagation

Network programming for the convolutional neural networks

####################
Temperature: 1.0
####################
Focus trafform Generation in space from a Building in widding and syntactc, and mlo a

InduV and Response Margin in Incremental Pindired Datasets for High Document Detection

Mislocationzant Detection Using Deep-Grammatical Images

Epoch 5/10
8794/8794 [==============================] - 397s 45ms/step - loss: 0.9978 - val_loss: 1.0565
Epoch 6/10
8794/8794 [==============================] - 398s 45ms/step - loss: 0.9664 - val_loss: 1.0459
####################
Temperature: 0.2
####################
A New Semantic Parsing with Sparse Representations

Product Networks for Predictive Adversarial Networks

Learning to Predictive Adversarial Networks for Structured Sparsity Alignment and Transfer Learning

####################
Temperature: 0.5
####################
Synthesizing Random Forest Control for Context-Supervision Applications

A New Head Network Function for Presence of Semantic Segmentation

Coupled Data of Convolutional Neural Network Through Structured Detection in Supervision Make Recognition

####################
Temperature: 1.0
####################
Deep Frame based Correction in Reasoning, Algorithms

Madiods error closures

Ehher-classifiers: Madgmability through Transfer Learning

Epoch 7/10
8794/8794 [==============================] - 398s 45ms/step - loss: 0.9214 - val_loss: 1.0258
Epoch 8/10
8794/8794 [==============================] - 398s 45ms/step - loss: 0.8805 - val_loss: 1.0247
####################
Temperature: 0.2
####################
A Deep Learning Approach to Deep Learning Models

A New Method for Deep Learning Approaches for Deep Learning

Semi-supervised learning for spectrum interaction and classification

####################
Temperature: 0.5
####################
A Deep Network Model for Spectrum Reconstruction and Selective Learning

A Deep Learning Approach to Deep Neural Networks for Real-time Transfer Learning

GPU-based Event Detection using Tree Simple Dependencies and Linear Density Estimation and Semi-supervised Learning of Medical Image Ret

####################
Temperature: 1.0
####################
Event-Shallow Using Faster Curriculum Learning Lifte Predictability ildexming

Semi-supervised dependent selection to drive modeling in the surget videos

Long Short Memory Arlience of CNNs evolving Techniques

Epoch 9/10
8794/8794 [==============================] - 398s 45ms/step - loss: 0.8375 - val_loss: 1.0179
Epoch 10/10
8794/8794 [==============================] - 398s 45ms/step - loss: 0.7921 - val_loss: 1.0204
####################
Temperature: 0.2
####################
A Survey on Deep Neural Network for Sequence Labeling and Sequence Machine Translation

A Comparison of Neural Network Architectures for Semantic Segmentation

A Survey on Deep Learning Approach for Semantic Segmentation

####################
Temperature: 0.5
####################
A Multi-View Structure of Stream Selection in Continuous Functions and Minimization

Semi-Supervised Learning for Learning a Multi-task Matching

Automatic Structure of Semantic Parsing with Convolutional Neural Networks

####################
Temperature: 1.0
####################
IDAN: Play-Mo: Reading Hand synaphrous Conversations in Linear Regression with Optimal CT Dictionary Learning

Penal-related to nissating harvetryctive multi-view system

Scale-based Graph-Machine Learning Localization using Swarmprint Anatomety in First-Person Content
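A note on reading the progress bars: the step count per epoch appears to be the number of training sequences divided by the batch size, rounded down. That rounding is inferred from the logged numbers, not taken from the library's source. Under that assumption, the 8794 steps logged above for 562,837 sequences correspond to a batch size of 64, not the 111 printed for `bs`. That would fit the cell numbering, since this cell ran as In [12] and `bs` was defined in In [25]. `steps_per_epoch` is a hypothetical helper for the arithmetic.

```python
def steps_per_epoch(num_sequences: int, batch_size: int) -> int:
    """Number of full batches in one epoch (any partial batch is ignored)."""
    return num_sequences // batch_size

# Sequence count taken from the training log above
print(steps_per_epoch(562_837, 64))
print(steps_per_epoch(562_837, 111))
```

The same check on the later song-title run (184,530 sequences, 1662 steps) is consistent with a batch size of 111.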


In [18]:

# load the trained paper-title model, then fine-tune it on song titles
textgen_song = textgenrnn(
    weights_path=os.path.join(DATA_DIR, 'models', f'{paper_model_name}_weights.hdf5'),
    vocab_path=os.path.join(DATA_DIR, 'models', f'{paper_model_name}_vocab.json'),
    config_path=os.path.join(DATA_DIR, 'models', f'{paper_model_name}_config.json'),
)
                          
textgen_song.train_from_file(
    file_path=os.path.join(DATA_DIR, 'song_titles.txt'),
    num_epochs=1,
    gen_epochs=1,
    batch_size=train_cfg['batch_size'],
    train_size=train_cfg['train_size'],
    dropout=0.5,
    max_gen_length=train_cfg['max_gen_length'],
    validation=train_cfg['validation'],
    is_csv=train_cfg['is_csv'],
    rnn_layers=model_cfg['rnn_layers'],
    rnn_size=model_cfg['rnn_size'],
    rnn_bidirectional=model_cfg['rnn_bidirectional'],
    max_length=model_cfg['max_length'],
    dim_embeddings=model_cfg['dim_embeddings'],
    word_level=model_cfg['word_level']
)

9,849 texts collected.
Training on 184,005 character sequences.
Epoch 1/1
2875/2875 [==============================] - 131s 46ms/step - loss: 1.9834 - val_loss: 1.8403
####################
Temperature: 0.2
####################
Way (Roothine Remix)

Will In The We (Kays Remix)

Way To Feel (Bootleg)

####################
Temperature: 0.5
####################
Home (feat. K................5 ...........5.........................5...........5..............5................5....5..................

Without Keep (Flum Remix)

Inter Love

####################
Temperature: 1.0
####################
Reson feat. Jeryni Map (Rockon Text Remix)

Psners (GFURS Reqish)

Lex The Wire In Get Oiv Like French

In [22]:

textgen_song.generate_samples(max_gen_length=100, n=5)

####################
Temperature: 0.2
####################
In The Wend (RAC Mix)

Wond In The We (Roothing Remix)

Stay (Robother Remix)

Stay (Prod. by Know)

Bellon The We Downt (feat. All In The Wook Remix)

####################
Temperature: 0.5
####################
Forest (Jaman Chainsmokers Remix)

Way The Changer (Now Remix)

When In Me (Kill Remix)

Commer

Green My Hand (Boul Club Remix)

####################
Temperature: 1.0
####################
About Un

Glassica

Ku Georgetting

City (The Main Remax)

Goodly Wwin Vice (Koelien Moh Remix)
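The samples get stranger as the temperature goes up because temperature reshapes the next-character distribution before sampling. Low values sharpen it toward the most likely character and high values flatten it. Here is a minimal, generic sketch of that idea using only the standard library. It is not textgenrnn's actual sampling code, and the probabilities are made up.

```python
import math
import random

def apply_temperature(probs, temperature):
    """Rescale a distribution of strictly positive probabilities: p_i ** (1/T), renormalized."""
    logits = [math.log(p) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]

def sample_index(probs, temperature, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]

next_char_probs = [0.6, 0.3, 0.1]  # made-up next-character probabilities
for t in (0.2, 0.5, 1.0):
    print(t, [round(p, 3) for p in apply_temperature(next_char_probs, t)])
```

At a temperature of 1.0 the distribution is unchanged. At 0.2 almost all of the mass sits on the first entry, which matches the repetitive, "safe" titles in the 0.2 blocks above.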

Song titles -> Paper titles

In [36]:

model_cfg = {
    'rnn_size': 128,
    'rnn_layers': 4,
    'rnn_bidirectional': False, #True,
    'max_length': int(song_len),
    #'max_words': 10000,
    'dim_embeddings': 100,
    'word_level': False,
}

train_cfg = {
    'num_epochs': 10,
    'gen_epochs': 2,
    'batch_size': bs, # 1024,
    'train_size': 0.8,
    'dropout': 0.0,
    'max_gen_length': int(song_len*2), #300,
    'validation': True, # False,
    'is_csv': False
}

song_model_name = 'deep_song_titles'
textgen = textgenrnn(name=song_model_name)

textgen.train_from_file(
    file_path=os.path.join(DATA_DIR, 'song_titles.txt'),
    new_model=True,
    num_epochs=train_cfg['num_epochs'],
    gen_epochs=train_cfg['gen_epochs'],
    batch_size=train_cfg['batch_size'],
    train_size=train_cfg['train_size'],
    dropout=train_cfg['dropout'],
    max_gen_length=train_cfg['max_gen_length'],
    validation=train_cfg['validation'],
    is_csv=train_cfg['is_csv'],
    rnn_layers=model_cfg['rnn_layers'],
    rnn_size=model_cfg['rnn_size'],
    rnn_bidirectional=model_cfg['rnn_bidirectional'],
    max_length=model_cfg['max_length'],
    dim_embeddings=model_cfg['dim_embeddings'],
    word_level=model_cfg['word_level']
)

9,849 texts collected.
Training new model w/ 4-layer, 128-cell LSTMs
Training on 184,530 character sequences.
Epoch 1/10
1662/1662 [==============================] - 37s 22ms/step - loss: 3.3772 - val_loss: 2.3540
Epoch 2/10
1662/1662 [==============================] - 36s 22ms/step - loss: 2.0871 - val_loss: 1.9189
####################
Temperature: 0.2
####################
Stronge (feat. Sean Brand)

Searter (feat. Ander Brand)

Wanter (feat. Aling Land)

####################
Temperature: 0.5
####################
Whe Somether

No Love (Life Strac Remix)

Comethan (Strive Remix)

####################
Temperature: 1.0
####################
Wanny

Night (Donk Coce Edit)

The Pennega

Epoch 3/10
1662/1662 [==============================] - 36s 21ms/step - loss: 1.8194 - val_loss: 1.7822
Epoch 4/10
1662/1662 [==============================] - 37s 22ms/step - loss: 1.6628 - val_loss: 1.6967
####################
Temperature: 0.2
####################
Sunnast (Prod. by Big Shain)

The Wild You (Solidis Sambi Remix)

Better Way (feat. Sam Suntant)

####################
Temperature: 0.5
####################
Better With You (feat. Miesy)

Say My Madion (Disco Bootleg)

The Wild (feat. Kelex Boys)

####################
Temperature: 1.0
####################
Nathing You Lover (Im Heud In My. Wanting 

Under Pratch (Benger Kelix Edit)

Somk Live You Feat. K.E.A. & Motthra Have)

Epoch 5/10
1662/1662 [==============================] - 37s 22ms/step - loss: 1.5423 - val_loss: 1.6513
Epoch 6/10
1662/1662 [==============================] - 36s 22ms/step - loss: 1.4369 - val_loss: 1.6237
####################
Temperature: 0.2
####################
Stay (Sam Selektah Remix)

The Starding (feat. Brasstree)

Stay (feat. Lil Yachty) (Cassion Remix)

####################
Temperature: 0.5
####################
Feel In The Cold

Come And Me (feat. Kele & Rae Mars)

Drive Me Look (Lash Remix)

####################
Temperature: 1.0
####################
Eugh feat. Blacks (produced by give Camman

BelieveSpate

Better Tois

Epoch 7/10
1662/1662 [==============================] - 37s 22ms/step - loss: 1.3404 - val_loss: 1.6078
Epoch 8/10
1662/1662 [==============================] - 36s 21ms/step - loss: 1.2503 - val_loss: 1.6263
####################
Temperature: 0.2
####################
Stay With Me

Something Bout You (Remix)

Sunshine

####################
Temperature: 0.5
####################
Hello (feat. Bright The Deep)

Can't Hear Me There (feat. Arternan Ellie 

Hold On We're Going Home (feat. Antis Bloo

####################
Temperature: 1.0
####################
Love Like Thus in Stills

Moving Closer

Cold Bob (Basswris Remix)

Epoch 9/10
1662/1662 [==============================] - 36s 21ms/step - loss: 1.1629 - val_loss: 1.6341
Epoch 10/10
1662/1662 [==============================] - 36s 21ms/step - loss: 1.0816 - val_loss: 1.6477
####################
Temperature: 0.2
####################
The One (Feat. Mathe Dayt)

Superfriends (feat. Kendrick Lamar)

Hold On We're Going Home (Dave Edwards Rem

####################
Temperature: 0.5
####################
The Startion feat. Zies Boy, Phonix Chorto

Bend On My Mind (Marce Remix)

Can't Help In Last (Like Sings Remix)

####################
Temperature: 1.0
####################
Gone Probes feat. Embreesty (Laou Kniss & 

Black Bubble)

Head Up (Loud Luxuse Remix)
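In this run the training loss keeps dropping while the validation loss bottoms out and then creeps back up, which looks like overfitting. A small sketch for picking the best epoch from the logged values. The numbers are copied from the log above, and the helper names are hypothetical.

```python
# Validation losses copied from the song-title training log above (epochs 1-10)
val_losses = [2.3540, 1.9189, 1.7822, 1.6967, 1.6513,
              1.6237, 1.6078, 1.6263, 1.6341, 1.6477]

def best_epoch(losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(losses)), key=losses.__getitem__) + 1

def epochs_without_improvement(losses):
    """Count the epochs trained after the best one."""
    return len(losses) - best_epoch(losses)

print(best_epoch(val_losses), epochs_without_improvement(val_losses))
```

Stopping at the best epoch, or saving the weights there, would probably have been a better starting point for the fine-tuning step than the epoch-10 weights.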

In [39]:

# load the trained song-title model, then fine-tune it on paper titles
textgen_paper = textgenrnn(
    weights_path=os.path.join(DATA_DIR, 'models', f'{song_model_name}_weights.hdf5'),
    vocab_path=os.path.join(DATA_DIR, 'models', f'{song_model_name}_vocab.json'),
    config_path=os.path.join(DATA_DIR, 'models', f'{song_model_name}_config.json'),
)
                          
textgen_paper.train_from_file(
    file_path=os.path.join(DATA_DIR, 'arxiv_titles.txt'),
    num_epochs=1,
    gen_epochs=1,
    batch_size=train_cfg['batch_size'],
    train_size=train_cfg['train_size'],
    dropout=0.9,
    max_gen_length=train_cfg['max_gen_length'],
    validation=train_cfg['validation'],
    is_csv=train_cfg['is_csv'],
    rnn_layers=model_cfg['rnn_layers'],
    rnn_size=model_cfg['rnn_size'],
    rnn_bidirectional=model_cfg['rnn_bidirectional'],
    max_length=model_cfg['max_length'],
    dim_embeddings=model_cfg['dim_embeddings'],
    word_level=model_cfg['word_level']
)

10,002 texts collected.
Training on 562,523 character sequences.
Epoch 1/1
5067/5067 [==============================] - 106s 21ms/step - loss: 1.3301 - val_loss: 1.1896
####################
Temperature: 0.2
####################
A Simultaneous Search for Structure Learni

A Neural Neural Networks for Multi-label D

A Statistical Search for Structure Models 

####################
Temperature: 0.5
####################
A Neural Networks for Real-time Cluster fo

Bein Low-Reduced Prediction in Multi-modal

Domain Function Using Deep Learning with B

####################
Temperature: 1.0
####################
Multi-Agents independence propagator seep 

Typeoring-Level Mundems

Intervesconronquent Variatorits for RalisN

In [40]:

textgen_paper.generate_samples(max_gen_length=100, temperatures=[0.5, 0.7, 1.0], n=5)

####################
Temperature: 0.5
####################
A Linear Learning Framework for Person Re-identification of Stochastic Sensors Using Model Predict

A Normalized Constrained End-to-End Linear Gradient Face Regression with Deep Learning and Sensor 

A Discovery of Finiter Grade for Learning for Resolution Models

A Deep Neural Networks for Fully Context Placking

Multi-labeling Large Sensing for Multi-view Constrained Residual Learning for Betweed Programs of 

####################
Temperature: 0.7
####################
Neural Networks and Function Processing

Exploining Deep Learning for Face extraction of Structure Detection

Dlatchion Detection for Learning Data for Unsupervised Learning in Visual Recognizing

A tro Constrained Neighbor Models

Trick Prediction with Minimum Band Features Model for Neural Language Modeling and Hate Tracking D

####################
Temperature: 1.0
####################
Multimor: The CLAM-IO Ranch Scoculus Tracking

Motion Large Veor Multillal Recognization Framework for Recognizing Large-scale Information for Fa

Neural Factorizits Prices

Groundded Robust Waras

Non-Seats and Pening Based Netwark Reduced Low-Rank Tabdling

Approach 2: Mixed

In [27]:

with open(os.path.join(DATA_DIR, 'mixed_titles.txt'), 'w', encoding='utf-8') as f:
    f.write(arxiv_titles + '\n' + song_titles)

In [28]:

model_cfg = {
    'rnn_size': 128,
    'rnn_layers': 4,
    'rnn_bidirectional': False, #True,
    'max_length': 5, #40,
    'max_words': 10000,
    'dim_embeddings': 100,
    'word_level': False,
}

train_cfg = {
    'num_epochs': 2, # 10,
    'gen_epochs': 1,
    'batch_size': 64, # 1024,
    'train_size': 0.8,
    'dropout': 0.5, # 0.0,
    'max_gen_length': 50, #300,
    'validation': True, # False,
    'is_csv': False
}

mixed_model_name = 'mixed'
textgen = textgenrnn(name=mixed_model_name)
textgen.train_from_file(
    file_path=os.path.join(DATA_DIR, 'mixed_titles.txt'),
    new_model=True,
    num_epochs=train_cfg['num_epochs'],
    gen_epochs=train_cfg['gen_epochs'],
    batch_size=train_cfg['batch_size'],
    train_size=train_cfg['train_size'],
    dropout=train_cfg['dropout'],
    max_gen_length=train_cfg['max_gen_length'],
    validation=train_cfg['validation'],
    is_csv=train_cfg['is_csv'],
    rnn_layers=model_cfg['rnn_layers'],
    rnn_size=model_cfg['rnn_size'],
    rnn_bidirectional=model_cfg['rnn_bidirectional'],
    max_length=model_cfg['max_length'],
    dim_embeddings=model_cfg['dim_embeddings'],
    word_level=model_cfg['word_level']
)

19,852 texts collected.
Training new model w/ 4-layer, 128-cell LSTMs
Training on 747,719 character sequences.
Epoch 1/2
11683/11683 [==============================] - 147s 13ms/step - loss: 1.7612 - val_loss: 1.5157
####################
Temperature: 0.2
####################
Based Recognition in Deep Neural Network for Spe

A State Convolutional Network for State Convolut

Deep Neural Networks

####################
Temperature: 0.5
####################
A Shode Linear Sequential Streaming Remix)

An Every of Localization Me Down (feat. A. Gradi

An Estimation of a Grammodition of Evolutional N

####################
Temperature: 1.0
####################
C-Neclarided Localized Dispricon Layeroa)

Arn't Frame Continuous intelliative for Video Re

Intacle Aggrix (Low Freed in Work

Epoch 2/2
11683/11683 [==============================] - 143s 12ms/step - loss: 1.4492 - val_loss: 1.3971
####################
Temperature: 0.2
####################
Multi-Task Learning for Sparse Remix)

Exploring with Recognition in Context for Neural

Recognition for Multi-modal Recognition for Spar

####################
Temperature: 0.5
####################
Improved Translations

Structural Networks

Multimodal Segmentation of End-to-End Cardand (T

####################
Temperature: 1.0
####################
Whard and Me

Line Cell (Prediction Networks for Supervised-Pe

UltTrims for Visual Loses Remix)

In [30]:

textgen.generate_samples(max_gen_length=100, n=5, temperatures=[0.5, 0.7, 1.0])

####################
Temperature: 0.5
####################
Experiment Learning Embedding for Prediction for Prediction with Media Segmentation

Metric Interpretable Generation

DeepMe (feat. Remix)

On Me (feat. Andreak (feat. Change Weight (Feat. The Minimal Recognition for Deep Neural networks 

Good (Feat. The Love (feat. Bundit Detection

####################
Temperature: 0.7
####################
Framework for Subspace (Dance of Resolutional Networks

Shot Caption

Bie Detection with Generation On (Feat Nonless Free Box Story Method for Parallel recurrent Neural

Everything

Survey

####################
Temperature: 1.0
####################
Gradients

Lesion Restarding with a Wild

Recognitional similarity

Identification Networks

Face Recognizing Label 1017Id2 ME Romix Remix)

Best Of

Default textgenrnn settings (2 epochs):

Flexing To The Study [temp 0.5]
Embedding feat. Ceyta Readm [temp 1.0]

Rebalancing songs and papers + max_length of 5 + not bi-directional + dropout of 0.5 + short max_gen_length (2 epochs):

Strong (feat. Salial Networks) [temp 0.2]
Moon (feat. Ligoning for Structured Remix) [temp 0.2]
Completion and Continuous remix [temp 0.5]

Same as last, but max_length of 10 + dropout of 0.1:

State (Feat. Kanna Alignment Algorithms) [temp 0.2]
Like You Way To (Original Learning from a Hadcer convex framework) [temp 0.9]
- reminds me of "(original mix)" or "(original score)" tracks

Same as last, but max_length of 5:

Space Camera & Anomaly [temp 1.0]
- could be a band name
Correction in Remix [temp 0.5]
Automatic Samples for sea Remix [temp 0.8]
Silver (Sving Recognitive Network prooft Remix) [temp 0.8]
- the easiest way to make something a song is to add 'remix' to the end of it
A Subspaces [temp 0.5]

Lost track of these:

Learning Theory (Live Self Model) [temp 0.2]
A Problems of Gents [temp 1.0]
Task2Quous Dreams [temp 1.0]
Drop Loud [temp 1.0]
DeepSEGK: A Reconstruction [temp 1.0]
- could be an album title

Same as 'space camera' one, but dropout of 0.5:

DeepMe (feat. Remix) [temp 0.5]
Framework for Subspace (Dance of Resolutional Networks) [temp 0.7]
An Every of Localization Me Down (feat. A. Gradi) [temp 0.5]

deep_paper_titles only: (not mixed with song titles at all... but they still seem musical)

Deep Sentiment of Character [temp 1.0]
Semantic Segmentation in Semantic Segmentation [temp 0.2]

Takeaways

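More epochs didn't help: in the runs above, validation loss flattened out (paper titles: between 1.02 and 1.03 from epoch 7 onward) or rose again (song titles: 1.61 at epoch 7, 1.65 at epoch 10) while training loss kept falling
Balancing the number of titles from each set helped
Shane mentioned that she (accidentally) pretrained on metal bands and then used transfer learning to add ice cream flavors. However, I couldn't get this approach to work better than just mixing the two sets together and training in one go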
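The notebook doesn't show how the two sets were balanced, so here is one possible way to do it, as a sketch: downsample the larger list to the size of the smaller one, then shuffle the combined list so the model doesn't see all papers first and all songs last. `balanced_mix` is a hypothetical helper, and the short lists are stand-ins built from titles shown earlier.

```python
import random

def balanced_mix(titles_a, titles_b, seed=0):
    """Downsample the larger list to the size of the smaller, then shuffle the union."""
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    n = min(len(titles_a), len(titles_b))
    mixed = rng.sample(titles_a, n) + rng.sample(titles_b, n)
    rng.shuffle(mixed)
    return mixed

papers = ['Deep Cuboid Detection: Beyond 2D Bounding Boxes',
          'Fast Supervised Discrete Hashing and its Analysis',
          'Fusion of EEG and Musical Features in Continuous Music-emotion Recognition']
songs = ['Silver Lining', 'Four Winds']

mixed = balanced_mix(papers, songs)
print('\n'.join(mixed))
```

The joined string could then be written to `mixed_titles.txt` in place of the plain concatenation used in Approach 2.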