This notebook contains the example code from Deep Learning with Python, Second Edition.
Fetching the data
!rm -rf aclImdb
!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
!tar -xf aclImdb_v1.tar.gz
!rm -r aclImdb/train/unsup
Preparing the data
import os, pathlib, shutil, random
from tensorflow import keras

batch_size = 32
base_dir = pathlib.Path("aclImdb")
val_dir = base_dir / "val"
train_dir = base_dir / "train"
for category in ("neg", "pos"):
    os.makedirs(val_dir / category)
    files = os.listdir(train_dir / category)
    random.Random(1337).shuffle(files)
    num_val_samples = int(0.2 * len(files))
    val_files = files[-num_val_samples:]
    for fname in val_files:
        shutil.move(train_dir / category / fname,
                    val_dir / category / fname)
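The split above moves the last 20% of a seeded shuffle into the validation set; because the shuffle uses `random.Random(1337)`, the split is reproducible across runs. A minimal sketch of the same logic on a toy file list (the filenames here are made up for illustration):

```python
import random

# Toy stand-in for os.listdir(train_dir / category); names are hypothetical.
files = [f"review_{i}.txt" for i in range(10)]

# Same recipe as above: seeded shuffle, then take the last 20% for validation.
random.Random(1337).shuffle(files)
num_val_samples = int(0.2 * len(files))
val_files = files[-num_val_samples:]
train_files = files[:-num_val_samples]

print(len(train_files), len(val_files))  # 8 2

# Re-running with the same seed yields the identical split.
files2 = [f"review_{i}.txt" for i in range(10)]
random.Random(1337).shuffle(files2)
assert files2[-num_val_samples:] == val_files
```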
train_ds = keras.utils.text_dataset_from_directory(
    "aclImdb/train", batch_size=batch_size
)
val_ds = keras.utils.text_dataset_from_directory(
    "aclImdb/val", batch_size=batch_size
)
test_ds = keras.utils.text_dataset_from_directory(
    "aclImdb/test", batch_size=batch_size
)
text_only_train_ds = train_ds.map(lambda x, y: x)
Found 20000 files belonging to 2 classes.
Found 5000 files belonging to 2 classes.
Found 25000 files belonging to 2 classes.
Vectorizing the data
from tensorflow.keras import layers
max_length = 600
max_tokens = 20000
text_vectorization = layers.TextVectorization(
    max_tokens=max_tokens,
    output_mode="int",
    output_sequence_length=max_length,
)
text_vectorization.adapt(text_only_train_ds)
int_train_ds = train_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
int_val_ds = val_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
int_test_ds = test_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
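With `output_mode="int"`, `TextVectorization` standardizes each string (by default lowercasing it and stripping punctuation), looks each token up in the vocabulary learned by `adapt()` (index 0 is reserved for padding, index 1 for out-of-vocabulary tokens), and pads or truncates to `output_sequence_length`. A rough pure-Python sketch of that mapping, using a tiny hand-built vocabulary rather than one learned from the data:

```python
import re

# Toy vocabulary standing in for the one adapt() learns; 0 = padding, 1 = [UNK].
vocab = {"": 0, "[UNK]": 1, "the": 2, "movie": 3, "was": 4, "great": 5}

def vectorize(text, max_length=8):
    # Default-style standardization: lowercase, strip punctuation, split on spaces.
    tokens = re.sub(r"[^\w\s]", "", text.lower()).split()
    ids = [vocab.get(t, 1) for t in tokens]      # unknown words map to 1
    ids = ids[:max_length]                       # truncate long sequences ...
    return ids + [0] * (max_length - len(ids))   # ... and pad short ones with 0

print(vectorize("The movie was GREAT!"))    # [2, 3, 4, 5, 0, 0, 0, 0]
print(vectorize("The movie was terrible"))  # [2, 3, 4, 1, 0, 0, 0, 0]
```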
A Transformer encoder implemented as a subclassed Layer
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
class TransformerEncoder(layers.Layer):
    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):
        super().__init__(**kwargs)
        self.embed_dim = embed_dim
        self.dense_dim = dense_dim
        self.num_heads = num_heads
        self.attention = layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embed_dim)
        self.dense_proj = keras.Sequential(
            [layers.Dense(dense_dim, activation="relu"),
             layers.Dense(embed_dim),]
        )
        self.layernorm_1 = layers.LayerNormalization()
        self.layernorm_2 = layers.LayerNormalization()

    def call(self, inputs, mask=None):
        if mask is not None:
            mask = mask[:, tf.newaxis, :]
        attention_output = self.attention(
            inputs, inputs, attention_mask=mask)
        proj_input = self.layernorm_1(inputs + attention_output)
        proj_output = self.dense_proj(proj_input)
        return self.layernorm_2(proj_input + proj_output)

    def get_config(self):
        config = super().get_config()
        config.update({
            "embed_dim": self.embed_dim,
            "num_heads": self.num_heads,
            "dense_dim": self.dense_dim,
        })
        return config
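At the heart of `call()` is `MultiHeadAttention`: each head computes scaled dot-product attention, softmax(QKᵀ/√d)V, and the encoder wraps the result with a residual connection and layer normalization. A dependency-free single-head sketch on tiny vectors may make the arithmetic concrete (unlike the Keras layer, this omits the learned Q/K/V projections):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(seq):
    # seq: list of d-dim vectors; queries = keys = values = seq (self-attention).
    d = len(seq[0])
    out = []
    for q in seq:
        # Scaled dot-product scores of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        # Each output is a convex combination of the value vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out

def layernorm(vec, eps=1e-6):
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]

seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(seq)
# Residual connection + layer normalization, as in the encoder's first sub-block.
normed = [layernorm([x + a for x, a in zip(tok, att)])
          for tok, att in zip(seq, attended)]
```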
Using the Transformer encoder for text classification
vocab_size = 20000
embed_dim = 256
num_heads = 2
dense_dim = 32
inputs = keras.Input(shape=(None,), dtype="int64")
x = layers.Embedding(vocab_size, embed_dim)(inputs)
x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
Model: "model"
_________________________________________________________________
 Layer (type)                     Output Shape              Param #
=================================================================
 input_1 (InputLayer)             [(None, None)]            0
 embedding (Embedding)            (None, None, 256)         5120000
 transformer_encoder              (None, None, 256)         543776
 (TransformerEncoder)
 global_max_pooling1d             (None, 256)               0
 (GlobalMaxPooling1D)
 dropout (Dropout)                (None, 256)               0
 dense_2 (Dense)                  (None, 1)                 257
=================================================================
Total params: 5664033 (21.61 MB)
Trainable params: 5664033 (21.61 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
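The parameter counts in the summary can be checked by hand: the embedding table is vocab_size × embed_dim, the attention layer has three input projections plus an output projection, and the sigmoid head has embed_dim weights plus one bias. A quick arithmetic check, assuming the standard Keras `MultiHeadAttention` parameter layout:

```python
vocab_size, embed_dim, dense_dim, num_heads = 20000, 256, 32, 2
key_dim = embed_dim  # as passed to MultiHeadAttention above

# Embedding table: one embed_dim vector per vocabulary entry.
embedding = vocab_size * embed_dim

# MultiHeadAttention: Q/K/V projections to num_heads*key_dim (+ biases),
# plus the output projection back to embed_dim (+ bias).
proj = embed_dim * num_heads * key_dim + num_heads * key_dim
mha = 3 * proj + (num_heads * key_dim * embed_dim + embed_dim)

# Two-layer Dense projection, and two LayerNorms (scale + offset each).
dense_proj = (embed_dim * dense_dim + dense_dim) + (dense_dim * embed_dim + embed_dim)
layernorms = 2 * 2 * embed_dim

encoder = mha + dense_proj + layernorms
head = embed_dim + 1  # final sigmoid Dense: weights + bias

print(embedding, encoder, head)    # 5120000 543776 257
print(embedding + encoder + head)  # 5664033
```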
Training and evaluating the Transformer-encoder model
callbacks = [
    keras.callbacks.ModelCheckpoint("transformer_encoder.h5",
                                    save_best_only=True)
]
model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)
model = keras.models.load_model(
    "transformer_encoder.h5",
    custom_objects={"TransformerEncoder": TransformerEncoder})
print(f"Test accuracy: {model.evaluate(int_test_ds)[1]:.3f}")
Epoch 1/20 - 49s 78ms/step - loss: 0.3364 - accuracy: 0.8550 - val_loss: 0.3118 - val_accuracy: 0.8732
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3079: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
Epoch 2/20 - 43s 69ms/step - loss: 0.3045 - accuracy: 0.8699 - val_loss: 0.2991 - val_accuracy: 0.8732
Epoch 3/20 - 44s 70ms/step - loss: 0.2724 - accuracy: 0.8887 - val_loss: 0.2877 - val_accuracy: 0.8802
Epoch 4/20 - 44s 70ms/step - loss: 0.2435 - accuracy: 0.9021 - val_loss: 0.2880 - val_accuracy: 0.8804
Epoch 5/20 - 43s 69ms/step - loss: 0.2141 - accuracy: 0.9158 - val_loss: 0.2930 - val_accuracy: 0.8814
Epoch 6/20 - 42s 66ms/step - loss: 0.1861 - accuracy: 0.9284 - val_loss: 0.3078 - val_accuracy: 0.8786
Epoch 7/20 - 43s 68ms/step - loss: 0.1538 - accuracy: 0.9419 - val_loss: 0.3307 - val_accuracy: 0.8772
Epoch 8/20 - 43s 69ms/step - loss: 0.1293 - accuracy: 0.9524 - val_loss: 0.3357 - val_accuracy: 0.8776
Epoch 9/20 - 41s 65ms/step - loss: 0.1035 - accuracy: 0.9615 - val_loss: 0.3685 - val_accuracy: 0.8768
Epoch 10/20 - 41s 66ms/step - loss: 0.0827 - accuracy: 0.9698 - val_loss: 0.3738 - val_accuracy: 0.8778
Epoch 11/20 - 41s 65ms/step - loss: 0.0657 - accuracy: 0.9765 - val_loss: 0.4341 - val_accuracy: 0.8772
Epoch 12/20 - 41s 65ms/step - loss: 0.0524 - accuracy: 0.9814 - val_loss: 0.4558 - val_accuracy: 0.8726
Epoch 13/20 - 42s 68ms/step - loss: 0.0408 - accuracy: 0.9854 - val_loss: 0.4878 - val_accuracy: 0.8640
Epoch 14/20 - 41s 65ms/step - loss: 0.0329 - accuracy: 0.9883 - val_loss: 0.5147 - val_accuracy: 0.8688
Epoch 15/20 - 42s 68ms/step - loss: 0.0296 - accuracy: 0.9897 - val_loss: 0.6139 - val_accuracy: 0.8692
Epoch 16/20 - 40s 65ms/step - loss: 0.0241 - accuracy: 0.9915 - val_loss: 0.6086 - val_accuracy: 0.8654
Epoch 17/20 - 42s 68ms/step - loss: 0.0185 - accuracy: 0.9936 - val_loss: 0.6920 - val_accuracy: 0.8594
Epoch 18/20 - 41s 65ms/step - loss: 0.0177 - accuracy: 0.9943 - val_loss: 0.6765 - val_accuracy: 0.8652
Epoch 19/20 - 42s 67ms/step - loss: 0.0160 - accuracy: 0.9942 - val_loss: 0.6938 - val_accuracy: 0.8624
Epoch 20/20 - 40s 64ms/step - loss: 0.0158 - accuracy: 0.9948 - val_loss: 0.7845 - val_accuracy: 0.8628
782/782 - 19s 24ms/step - loss: 0.2999 - accuracy: 0.8715
Test accuracy: 0.871
Implementing positional embedding as a subclassed layer
class PositionalEmbedding(layers.Layer):
    def __init__(self, sequence_length, input_dim, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_embeddings = layers.Embedding(
            input_dim=input_dim, output_dim=output_dim)
        self.position_embeddings = layers.Embedding(
            input_dim=sequence_length, output_dim=output_dim)
        self.sequence_length = sequence_length
        self.input_dim = input_dim
        self.output_dim = output_dim

    def call(self, inputs):
        length = tf.shape(inputs)[-1]
        positions = tf.range(start=0, limit=length, delta=1)
        embedded_tokens = self.token_embeddings(inputs)
        embedded_positions = self.position_embeddings(positions)
        return embedded_tokens + embedded_positions

    def compute_mask(self, inputs, mask=None):
        return tf.math.not_equal(inputs, 0)

    def get_config(self):
        config = super().get_config()
        config.update({
            "output_dim": self.output_dim,
            "sequence_length": self.sequence_length,
            "input_dim": self.input_dim,
        })
        return config
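`PositionalEmbedding` adds a learned position vector to each token vector, so that the otherwise order-agnostic attention layers can see word order, and `compute_mask` flags padding positions (token id 0). A toy sketch with tiny hand-set tables (real models learn both tables; the numbers here are arbitrary):

```python
# Toy embedding tables; real layers learn these. Rows are indexed by token id
# and by position, respectively.
token_table = {0: [0.0, 0.0], 1: [0.1, 0.1], 2: [0.5, -0.5], 3: [-0.2, 0.8]}
position_table = [[0.0, 0.1], [0.0, 0.2], [0.0, 0.3], [0.0, 0.4]]

def embed(ids):
    # Token embedding + position embedding, elementwise, as in call() above.
    return [[t + p for t, p in zip(token_table[i], position_table[pos])]
            for pos, i in enumerate(ids)]

def compute_mask(ids):
    # True where there is a real token, False at padding (id 0).
    return [i != 0 for i in ids]

ids = [2, 3, 1, 0]          # a short sequence of token ids, 0 = padding
print(embed(ids))           # position-aware vectors
print(compute_mask(ids))    # [True, True, True, False]
```

Note that the same token id produces different vectors at different positions, which is exactly the information plain token embeddings lack.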
Combining the Transformer encoder with positional embedding
vocab_size = 20000
sequence_length = 600
embed_dim = 256
num_heads = 2
dense_dim = 32
inputs = keras.Input(shape=(None,), dtype="int64")
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)
x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
callbacks = [
    keras.callbacks.ModelCheckpoint("full_transformer_encoder.h5",
                                    save_best_only=True)
]
model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)
model = keras.models.load_model(
    "full_transformer_encoder.h5",
    custom_objects={"TransformerEncoder": TransformerEncoder,
                    "PositionalEmbedding": PositionalEmbedding})
print(f"Test accuracy: {model.evaluate(int_test_ds)[1]:.3f}")
Model: "model_1"
_________________________________________________________________
 Layer (type)                     Output Shape              Param #
=================================================================
 input_2 (InputLayer)             [(None, None)]            0
 positional_embedding             (None, None, 256)         5273600
 (PositionalEmbedding)
 transformer_encoder_1            (None, None, 256)         543776
 (TransformerEncoder)
 global_max_pooling1d_1           (None, 256)               0
 (GlobalMaxPooling1D)
 dropout_1 (Dropout)              (None, 256)               0
 dense_7 (Dense)                  (None, 1)                 257
=================================================================
Total params: 5817633 (22.19 MB)
Trainable params: 5817633 (22.19 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/20 - 60s 93ms/step - loss: 0.5504 - accuracy: 0.7326 - val_loss: 0.3707 - val_accuracy: 0.8356
Epoch 2/20 - 50s 80ms/step - loss: 0.3100 - accuracy: 0.8709 - val_loss: 0.3869 - val_accuracy: 0.8262
Epoch 3/20 - 47s 74ms/step - loss: 0.2424 - accuracy: 0.9011 - val_loss: 0.3069 - val_accuracy: 0.8760
Epoch 4/20 - 46s 74ms/step - loss: 0.1992 - accuracy: 0.9220 - val_loss: 0.2885 - val_accuracy: 0.8918
Epoch 5/20 - 44s 71ms/step - loss: 0.1664 - accuracy: 0.9370 - val_loss: 0.3382 - val_accuracy: 0.8808
Epoch 6/20 - 44s 70ms/step - loss: 0.1395 - accuracy: 0.9460 - val_loss: 0.3723 - val_accuracy: 0.8836
Epoch 7/20 - 44s 70ms/step - loss: 0.1140 - accuracy: 0.9571 - val_loss: 0.4272 - val_accuracy: 0.8844
Epoch 8/20 - 45s 71ms/step - loss: 0.0952 - accuracy: 0.9646 - val_loss: 0.4487 - val_accuracy: 0.8862
Epoch 9/20 - 43s 69ms/step - loss: 0.0735 - accuracy: 0.9739 - val_loss: 0.4597 - val_accuracy: 0.8780
Epoch 10/20 - 45s 71ms/step - loss: 0.0556 - accuracy: 0.9793 - val_loss: 0.6001 - val_accuracy: 0.8814
Epoch 11/20 - 44s 71ms/step - loss: 0.0441 - accuracy: 0.9847 - val_loss: 0.6343 - val_accuracy: 0.8684
Epoch 12/20 - 44s 70ms/step - loss: 0.0355 - accuracy: 0.9886 - val_loss: 0.7697 - val_accuracy: 0.8670
Epoch 13/20 - 44s 70ms/step - loss: 0.0258 - accuracy: 0.9913 - val_loss: 0.7606 - val_accuracy: 0.8728
Epoch 14/20 - 43s 69ms/step - loss: 0.0227 - accuracy: 0.9924 - val_loss: 0.9193 - val_accuracy: 0.8730
Epoch 15/20 - 43s 69ms/step - loss: 0.0235 - accuracy: 0.9927 - val_loss: 0.9620 - val_accuracy: 0.8788
Epoch 16/20 - 43s 69ms/step - loss: 0.0180 - accuracy: 0.9944 - val_loss: 1.0631 - val_accuracy: 0.8780
Epoch 17/20 - 43s 69ms/step - loss: 0.0155 - accuracy: 0.9956 - val_loss: 0.9305 - val_accuracy: 0.8728
Epoch 18/20 - 43s 69ms/step - loss: 0.0124 - accuracy: 0.9961 - val_loss: 0.9398 - val_accuracy: 0.8710
Epoch 19/20 - 43s 69ms/step - loss: 0.0106 - accuracy: 0.9967 - val_loss: 0.9718 - val_accuracy: 0.8752
Epoch 20/20 - 43s 69ms/step - loss: 0.0112 - accuracy: 0.9963 - val_loss: 1.1590 - val_accuracy: 0.8770
782/782 - 22s 28ms/step - loss: 0.3193 - accuracy: 0.8801
Test accuracy: 0.880