WARNING:tensorflow:From /tensorflow-1.15.2/python3.6/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
Input-Token (InputLayer) (None, None) 0
__________________________________________________________________________________________________
Input-Segment (InputLayer) (None, None) 0
__________________________________________________________________________________________________
Embedding-Token (Embedding) (None, None, 768) 16226304 Input-Token[0][0]
__________________________________________________________________________________________________
Embedding-Segment (Embedding) (None, None, 768) 1536 Input-Segment[0][0]
__________________________________________________________________________________________________
Embedding-Token-Segment (Add) (None, None, 768) 0 Embedding-Token[0][0]
Embedding-Segment[0][0]
__________________________________________________________________________________________________
Embedding-Position (PositionEmb (None, None, 768) 393216 Embedding-Token-Segment[0][0]
__________________________________________________________________________________________________
Embedding-Norm (LayerNormalizat (None, None, 768) 1536 Embedding-Position[0][0]
__________________________________________________________________________________________________
Embedding-Dropout (Dropout) (None, None, 768) 0 Embedding-Norm[0][0]
__________________________________________________________________________________________________
Transformer-0-MultiHeadSelfAtte (None, None, 768) 2362368 Embedding-Dropout[0][0]
Embedding-Dropout[0][0]
Embedding-Dropout[0][0]
__________________________________________________________________________________________________
Transformer-0-MultiHeadSelfAtte (None, None, 768) 0 Transformer-0-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-0-MultiHeadSelfAtte (None, None, 768) 0 Embedding-Dropout[0][0]
Transformer-0-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-0-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-0-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-0-FeedForward (Feed (None, None, 768) 4722432 Transformer-0-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-0-FeedForward-Dropo (None, None, 768) 0 Transformer-0-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-0-FeedForward-Add ( (None, None, 768) 0 Transformer-0-MultiHeadSelfAttent
Transformer-0-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-0-FeedForward-Norm (None, None, 768) 1536 Transformer-0-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-1-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-0-FeedForward-Norm[0]
Transformer-0-FeedForward-Norm[0]
Transformer-0-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-1-MultiHeadSelfAtte (None, None, 768) 0 Transformer-1-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-1-MultiHeadSelfAtte (None, None, 768) 0 Transformer-0-FeedForward-Norm[0]
Transformer-1-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-1-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-1-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-1-FeedForward (Feed (None, None, 768) 4722432 Transformer-1-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-1-FeedForward-Dropo (None, None, 768) 0 Transformer-1-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-1-FeedForward-Add ( (None, None, 768) 0 Transformer-1-MultiHeadSelfAttent
Transformer-1-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-1-FeedForward-Norm (None, None, 768) 1536 Transformer-1-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-2-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-1-FeedForward-Norm[0]
Transformer-1-FeedForward-Norm[0]
Transformer-1-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-2-MultiHeadSelfAtte (None, None, 768) 0 Transformer-2-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-2-MultiHeadSelfAtte (None, None, 768) 0 Transformer-1-FeedForward-Norm[0]
Transformer-2-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-2-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-2-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-2-FeedForward (Feed (None, None, 768) 4722432 Transformer-2-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-2-FeedForward-Dropo (None, None, 768) 0 Transformer-2-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-2-FeedForward-Add ( (None, None, 768) 0 Transformer-2-MultiHeadSelfAttent
Transformer-2-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-2-FeedForward-Norm (None, None, 768) 1536 Transformer-2-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-3-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-2-FeedForward-Norm[0]
Transformer-2-FeedForward-Norm[0]
Transformer-2-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-3-MultiHeadSelfAtte (None, None, 768) 0 Transformer-3-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-3-MultiHeadSelfAtte (None, None, 768) 0 Transformer-2-FeedForward-Norm[0]
Transformer-3-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-3-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-3-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-3-FeedForward (Feed (None, None, 768) 4722432 Transformer-3-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-3-FeedForward-Dropo (None, None, 768) 0 Transformer-3-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-3-FeedForward-Add ( (None, None, 768) 0 Transformer-3-MultiHeadSelfAttent
Transformer-3-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-3-FeedForward-Norm (None, None, 768) 1536 Transformer-3-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-4-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-3-FeedForward-Norm[0]
Transformer-3-FeedForward-Norm[0]
Transformer-3-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-4-MultiHeadSelfAtte (None, None, 768) 0 Transformer-4-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-4-MultiHeadSelfAtte (None, None, 768) 0 Transformer-3-FeedForward-Norm[0]
Transformer-4-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-4-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-4-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-4-FeedForward (Feed (None, None, 768) 4722432 Transformer-4-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-4-FeedForward-Dropo (None, None, 768) 0 Transformer-4-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-4-FeedForward-Add ( (None, None, 768) 0 Transformer-4-MultiHeadSelfAttent
Transformer-4-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-4-FeedForward-Norm (None, None, 768) 1536 Transformer-4-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-5-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-4-FeedForward-Norm[0]
Transformer-4-FeedForward-Norm[0]
Transformer-4-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-5-MultiHeadSelfAtte (None, None, 768) 0 Transformer-5-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-5-MultiHeadSelfAtte (None, None, 768) 0 Transformer-4-FeedForward-Norm[0]
Transformer-5-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-5-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-5-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-5-FeedForward (Feed (None, None, 768) 4722432 Transformer-5-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-5-FeedForward-Dropo (None, None, 768) 0 Transformer-5-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-5-FeedForward-Add ( (None, None, 768) 0 Transformer-5-MultiHeadSelfAttent
Transformer-5-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-5-FeedForward-Norm (None, None, 768) 1536 Transformer-5-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-6-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-5-FeedForward-Norm[0]
Transformer-5-FeedForward-Norm[0]
Transformer-5-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-6-MultiHeadSelfAtte (None, None, 768) 0 Transformer-6-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-6-MultiHeadSelfAtte (None, None, 768) 0 Transformer-5-FeedForward-Norm[0]
Transformer-6-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-6-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-6-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-6-FeedForward (Feed (None, None, 768) 4722432 Transformer-6-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-6-FeedForward-Dropo (None, None, 768) 0 Transformer-6-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-6-FeedForward-Add ( (None, None, 768) 0 Transformer-6-MultiHeadSelfAttent
Transformer-6-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-6-FeedForward-Norm (None, None, 768) 1536 Transformer-6-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-7-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-6-FeedForward-Norm[0]
Transformer-6-FeedForward-Norm[0]
Transformer-6-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-7-MultiHeadSelfAtte (None, None, 768) 0 Transformer-7-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-7-MultiHeadSelfAtte (None, None, 768) 0 Transformer-6-FeedForward-Norm[0]
Transformer-7-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-7-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-7-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-7-FeedForward (Feed (None, None, 768) 4722432 Transformer-7-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-7-FeedForward-Dropo (None, None, 768) 0 Transformer-7-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-7-FeedForward-Add ( (None, None, 768) 0 Transformer-7-MultiHeadSelfAttent
Transformer-7-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-7-FeedForward-Norm (None, None, 768) 1536 Transformer-7-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-8-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-7-FeedForward-Norm[0]
Transformer-7-FeedForward-Norm[0]
Transformer-7-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-8-MultiHeadSelfAtte (None, None, 768) 0 Transformer-8-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-8-MultiHeadSelfAtte (None, None, 768) 0 Transformer-7-FeedForward-Norm[0]
Transformer-8-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-8-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-8-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-8-FeedForward (Feed (None, None, 768) 4722432 Transformer-8-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-8-FeedForward-Dropo (None, None, 768) 0 Transformer-8-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-8-FeedForward-Add ( (None, None, 768) 0 Transformer-8-MultiHeadSelfAttent
Transformer-8-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-8-FeedForward-Norm (None, None, 768) 1536 Transformer-8-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-9-MultiHeadSelfAtte (None, None, 768) 2362368 Transformer-8-FeedForward-Norm[0]
Transformer-8-FeedForward-Norm[0]
Transformer-8-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-9-MultiHeadSelfAtte (None, None, 768) 0 Transformer-9-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-9-MultiHeadSelfAtte (None, None, 768) 0 Transformer-8-FeedForward-Norm[0]
Transformer-9-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-9-MultiHeadSelfAtte (None, None, 768) 1536 Transformer-9-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-9-FeedForward (Feed (None, None, 768) 4722432 Transformer-9-MultiHeadSelfAttent
__________________________________________________________________________________________________
Transformer-9-FeedForward-Dropo (None, None, 768) 0 Transformer-9-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-9-FeedForward-Add ( (None, None, 768) 0 Transformer-9-MultiHeadSelfAttent
Transformer-9-FeedForward-Dropout
__________________________________________________________________________________________________
Transformer-9-FeedForward-Norm (None, None, 768) 1536 Transformer-9-FeedForward-Add[0][
__________________________________________________________________________________________________
Transformer-10-MultiHeadSelfAtt (None, None, 768) 2362368 Transformer-9-FeedForward-Norm[0]
Transformer-9-FeedForward-Norm[0]
Transformer-9-FeedForward-Norm[0]
__________________________________________________________________________________________________
Transformer-10-MultiHeadSelfAtt (None, None, 768) 0 Transformer-10-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-10-MultiHeadSelfAtt (None, None, 768) 0 Transformer-9-FeedForward-Norm[0]
Transformer-10-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-10-MultiHeadSelfAtt (None, None, 768) 1536 Transformer-10-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-10-FeedForward (Fee (None, None, 768) 4722432 Transformer-10-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-10-FeedForward-Drop (None, None, 768) 0 Transformer-10-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-10-FeedForward-Add (None, None, 768) 0 Transformer-10-MultiHeadSelfAtten
Transformer-10-FeedForward-Dropou
__________________________________________________________________________________________________
Transformer-10-FeedForward-Norm (None, None, 768) 1536 Transformer-10-FeedForward-Add[0]
__________________________________________________________________________________________________
Transformer-11-MultiHeadSelfAtt (None, None, 768) 2362368 Transformer-10-FeedForward-Norm[0
Transformer-10-FeedForward-Norm[0
Transformer-10-FeedForward-Norm[0
__________________________________________________________________________________________________
Transformer-11-MultiHeadSelfAtt (None, None, 768) 0 Transformer-11-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-11-MultiHeadSelfAtt (None, None, 768) 0 Transformer-10-FeedForward-Norm[0
Transformer-11-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-11-MultiHeadSelfAtt (None, None, 768) 1536 Transformer-11-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-11-FeedForward (Fee (None, None, 768) 4722432 Transformer-11-MultiHeadSelfAtten
__________________________________________________________________________________________________
Transformer-11-FeedForward-Drop (None, None, 768) 0 Transformer-11-FeedForward[0][0]
__________________________________________________________________________________________________
Transformer-11-FeedForward-Add (None, None, 768) 0 Transformer-11-MultiHeadSelfAtten
Transformer-11-FeedForward-Dropou
__________________________________________________________________________________________________
Transformer-11-FeedForward-Norm (None, None, 768) 1536 Transformer-11-FeedForward-Add[0]
__________________________________________________________________________________________________
Pooler (Lambda) (None, 768) 0 Transformer-11-FeedForward-Norm[0
__________________________________________________________________________________________________
Pooler-Dense (Dense) (None, 768) 590592 Pooler[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 768) 0 Pooler-Dense[0][0]
__________________________________________________________________________________________________
dense_73 (Dense) (None, 2) 1538 dropout_1[0][0]
==================================================================================================
Total params: 102,269,186
Trainable params: 102,269,186
Non-trainable params: 0
__________________________________________________________________________________________________
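Note: the per-layer parameter counts in the summary above can be reproduced with a little arithmetic. The sketch below is a sanity check only. The hyperparameters (vocabulary size 21128, hidden size 768, feed-forward size 3072, 512 positions, 2 segments, 12 blocks) are assumptions inferred from the printed counts; the log does not state them, and the helper names are hypothetical.

```python
# Assumed hyperparameters, inferred from the printed parameter counts
# (they are not stated anywhere in the log itself).
VOCAB = 21128      # 16226304 / 768
HIDDEN = 768
FFN = 3072         # feed-forward inner size
MAX_POS = 512      # 393216 / 768
SEGMENTS = 2       # 1536 / 768
LAYERS = 12        # Transformer-0 .. Transformer-11
CLASSES = 2        # output shape of the final Dense layer


def dense(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Parameters of layer normalization: one gain and one bias per feature."""
    return 2 * dim


# Embedding stack: token, segment and position tables plus Embedding-Norm.
embedding = (VOCAB + SEGMENTS + MAX_POS) * HIDDEN + layer_norm(HIDDEN)

# Self-attention: query, key, value and output projections.
attention = 4 * dense(HIDDEN, HIDDEN)

# Feed-forward: expand to FFN, then project back to HIDDEN.
feed_forward = dense(HIDDEN, FFN) + dense(FFN, HIDDEN)

# One Transformer block: attention + norm + feed-forward + norm.
block = attention + layer_norm(HIDDEN) + feed_forward + layer_norm(HIDDEN)

pooler_dense = dense(HIDDEN, HIDDEN)
classifier = dense(HIDDEN, CLASSES)

total = embedding + LAYERS * block + pooler_dense + classifier

print("attention   ", attention)
print("feed_forward", feed_forward)
print("total       ", total)
```

Under these assumptions the sum matches the "Total params" line, which suggests every layer in the summary is accounted for and that all of them are trainable.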
WARNING:tensorflow:From /tensorflow-1.15.2/python3.6/tensorflow_core/python/ops/math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
Epoch 1/20
3125/3125 [==============================] - 755s 242ms/step - loss: 0.3337 - accuracy: 0.8556
val_acc: 0.8417, best_val_acc: 0.8417
Epoch 2/20
3125/3125 [==============================] - 747s 239ms/step - loss: 0.1676 - accuracy: 0.9366
val_acc: 0.8401, best_val_acc: 0.8417
Epoch 3/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0996 - accuracy: 0.9636
val_acc: 0.8306, best_val_acc: 0.8417
Epoch 4/20
3125/3125 [==============================] - 746s 239ms/step - loss: 0.0684 - accuracy: 0.9758
val_acc: 0.8409, best_val_acc: 0.8417
Epoch 5/20
3125/3125 [==============================] - 745s 238ms/step - loss: 0.0521 - accuracy: 0.9819
val_acc: 0.8428, best_val_acc: 0.8428
Epoch 6/20
3125/3125 [==============================] - 746s 239ms/step - loss: 0.0423 - accuracy: 0.9854
val_acc: 0.8386, best_val_acc: 0.8428
Epoch 7/20
3125/3125 [==============================] - 746s 239ms/step - loss: 0.0368 - accuracy: 0.9874
val_acc: 0.8436, best_val_acc: 0.8436
Epoch 8/20
3125/3125 [==============================] - 746s 239ms/step - loss: 0.0315 - accuracy: 0.9893
val_acc: 0.8419, best_val_acc: 0.8436
Epoch 9/20
3125/3125 [==============================] - 746s 239ms/step - loss: 0.0288 - accuracy: 0.9904
val_acc: 0.8357, best_val_acc: 0.8436
Epoch 10/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0269 - accuracy: 0.9908
val_acc: 0.8373, best_val_acc: 0.8436
Epoch 11/20
3125/3125 [==============================] - 747s 239ms/step - loss: 0.0241 - accuracy: 0.9916
val_acc: 0.8448, best_val_acc: 0.8448
Epoch 12/20
3125/3125 [==============================] - 746s 239ms/step - loss: 0.0218 - accuracy: 0.9928
val_acc: 0.8414, best_val_acc: 0.8448
Epoch 13/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0212 - accuracy: 0.9932
val_acc: 0.8372, best_val_acc: 0.8448
Epoch 14/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0192 - accuracy: 0.9935
val_acc: 0.8411, best_val_acc: 0.8448
Epoch 15/20
3125/3125 [==============================] - 747s 239ms/step - loss: 0.0181 - accuracy: 0.9942
val_acc: 0.8399, best_val_acc: 0.8448
Epoch 16/20
3125/3125 [==============================] - 745s 239ms/step - loss: 0.0178 - accuracy: 0.9941
val_acc: 0.8475, best_val_acc: 0.8475
Epoch 17/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0160 - accuracy: 0.9948
val_acc: 0.8408, best_val_acc: 0.8475
Epoch 18/20
3125/3125 [==============================] - 745s 238ms/step - loss: 0.0172 - accuracy: 0.9943
val_acc: 0.8454, best_val_acc: 0.8475
Epoch 19/20
3125/3125 [==============================] - 747s 239ms/step - loss: 0.0152 - accuracy: 0.9952
val_acc: 0.8302, best_val_acc: 0.8475
Epoch 20/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0159 - accuracy: 0.9951
val_acc: 0.8410, best_val_acc: 0.8475
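Note: the epoch with the best validation accuracy can be extracted from a log in this format with a short script. This is a minimal sketch that assumes the three-line layout seen above ("Epoch i/n", a progress line with loss and accuracy, then a "val_acc" line). The function names are hypothetical, and SAMPLE is copied from epochs 15 to 17 of this log.

```python
import re

EPOCH_RE = re.compile(r"^Epoch (\d+)/(\d+)")
TRAIN_RE = re.compile(r"loss: ([\d.]+) - accuracy: ([\d.]+)")
VAL_RE = re.compile(r"^val_acc: ([\d.]+), best_val_acc: ([\d.]+)")


def parse_log(text):
    """Return one dict per epoch with epoch, loss, accuracy and val_acc."""
    records = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = EPOCH_RE.match(line)
        if m:
            # A new epoch header starts a fresh record.
            current = {"epoch": int(m.group(1))}
            continue
        m = TRAIN_RE.search(line)
        if m:
            current["loss"] = float(m.group(1))
            current["accuracy"] = float(m.group(2))
            continue
        m = VAL_RE.match(line)
        if m and "epoch" in current:
            # The validation line closes the record for this epoch.
            current["val_acc"] = float(m.group(1))
            records.append(current)
            current = {}
    return records


def best_epoch(records):
    """Record with the highest validation accuracy (first one wins on ties)."""
    return max(records, key=lambda r: r["val_acc"])


# Lines copied from epochs 15-17 of the log above.
SAMPLE = """\
Epoch 15/20
3125/3125 [==============================] - 747s 239ms/step - loss: 0.0181 - accuracy: 0.9942
val_acc: 0.8399, best_val_acc: 0.8448
Epoch 16/20
3125/3125 [==============================] - 745s 239ms/step - loss: 0.0178 - accuracy: 0.9941
val_acc: 0.8475, best_val_acc: 0.8475
Epoch 17/20
3125/3125 [==============================] - 744s 238ms/step - loss: 0.0160 - accuracy: 0.9948
val_acc: 0.8408, best_val_acc: 0.8475
"""

records = parse_log(SAMPLE)
best = best_epoch(records)
print(best["epoch"], best["val_acc"])
```

Because each record also carries the training accuracy, the same list can be used to compare training and validation accuracy epoch by epoch.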