Timeseries example not reproducible #2018

Open
@twoody2007

Description

Issue Type

Documentation Bug

Source

binary

Keras Version

3.7.0

Custom Code

No

OS Platform and Distribution

Ubuntu 22.04

Python version

3.12.8

GPU model and memory

RTX 5000 Ada

Current Behavior?

Running the code from this time series example does not produce the same number of parameters as the example output in the documentation, and the model does not achieve the stated accuracy.

The Colab notebook linked from the page has the same issue.

What is strange is that the Dense layer in the doc's MLP head has ~64K params, while running the code produces a Dense layer with only 2x the mlp_units setting (which is 128), i.e. 256 params. I tried increasing mlp_units to see if that fixed the problem, but it seems that something is structurally different between how this code runs on 2.4 vs 3.7.0.
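
My reading of the parameter arithmetic (an inference on my part, not confirmed): a Dense layer has input_dim * units + units parameters, so both summaries are explained if the pooling layer feeds a different number of features into the MLP head:

```python
# Dense parameter count = input_dim * units + units (kernel + bias).
doc_dense = 500 * 128 + 128   # 64,128 -- the ~64K in the doc, i.e. the
                              # pooled output had 500 features
my_dense = 1 * 2048 + 2048    # 4,096 -- my run below (with the MLP width
                              # raised to 2048), i.e. the pooled output
                              # had a single feature
```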

I expected output close to what is documented on the page.

Standalone code to reproduce the issue or tutorial link

You can run the colab example:
* https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/timeseries/ipynb/timeseries_classification_transformer.ipynb

or run the code located here:
* https://github.com/keras-team/keras-io/blob/master/examples/timeseries/timeseries_classification_transformer.py
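
For convenience, here is a condensed paraphrase of the model construction in that script (my transcription from the tutorial, using its stated hyperparameters; not guaranteed to match the current file byte-for-byte):

```python
import keras
from keras import layers


def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0):
    # Attention block with residual connection
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(inputs, inputs)
    x = layers.Dropout(dropout)(x)
    x = layers.LayerNormalization(epsilon=1e-6)(x)
    res = x + inputs

    # Position-wise feed-forward block with residual connection
    x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation="relu")(res)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    x = layers.LayerNormalization(epsilon=1e-6)(x)
    return x + res


def build_model(input_shape, head_size, num_heads, ff_dim,
                num_transformer_blocks, mlp_units, n_classes,
                dropout=0, mlp_dropout=0):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for _ in range(num_transformer_blocks):
        x = transformer_encoder(x, head_size, num_heads, ff_dim, dropout)
    # The tutorial pools here before the MLP classification head
    x = layers.GlobalAveragePooling1D(data_format="channels_first")(x)
    for dim in mlp_units:
        x = layers.Dense(dim, activation="relu")(x)
        x = layers.Dropout(mlp_dropout)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)


model = build_model(
    input_shape=(500, 1), head_size=256, num_heads=4, ff_dim=4,
    num_transformer_blocks=4, mlp_units=[128], n_classes=2,
    dropout=0.25, mlp_dropout=0.4,
)
model.summary()
```

Note that the log below shows a 2048-unit Dense rather than 128, from my attempt at increasing the MLP width; the per-block layer structure and attention parameter counts are otherwise as produced by the construction above.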

Relevant log output

(tcap) travis@travis-p1-g6:~/projects/tetra_capital$ python scripts/example_classifications.py 
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                  ┃ Output Shape              ┃         Param # ┃ Connected to               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ input_layer (InputLayer)      │ (None, 500, 1)            │               0 │ -                          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention          │ (None, 500, 1)            │           7,169 │ input_layer[0][0],         │
│ (MultiHeadAttention)          │                           │                 │ input_layer[0][0]          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_1 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention[0][0] │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization           │ (None, 500, 1)            │               2 │ dropout_1[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add (Add)                     │ (None, 500, 1)            │               0 │ layer_normalization[0][0], │
│                               │                           │                 │ input_layer[0][0]          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d (Conv1D)               │ (None, 500, 4)            │               8 │ add[0][0]                  │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_2 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d[0][0]               │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_1 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_2[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_1         │ (None, 500, 1)            │               2 │ conv1d_1[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_1 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_1[0][… │
│                               │                           │                 │ add[0][0]                  │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_1        │ (None, 500, 1)            │           7,169 │ add_1[0][0], add_1[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_4 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention_1[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_2         │ (None, 500, 1)            │               2 │ dropout_4[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_2 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_2[0][… │
│                               │                           │                 │ add_1[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_2 (Conv1D)             │ (None, 500, 4)            │               8 │ add_2[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_5 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d_2[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_3 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_5[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_3         │ (None, 500, 1)            │               2 │ conv1d_3[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_3 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_3[0][… │
│                               │                           │                 │ add_2[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_2        │ (None, 500, 1)            │           7,169 │ add_3[0][0], add_3[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_7 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention_2[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_4         │ (None, 500, 1)            │               2 │ dropout_7[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_4 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_4[0][… │
│                               │                           │                 │ add_3[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_4 (Conv1D)             │ (None, 500, 4)            │               8 │ add_4[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_8 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d_4[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_5 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_8[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_5         │ (None, 500, 1)            │               2 │ conv1d_5[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_5 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_5[0][… │
│                               │                           │                 │ add_4[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_3        │ (None, 500, 1)            │           7,169 │ add_5[0][0], add_5[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_10 (Dropout)          │ (None, 500, 1)            │               0 │ multi_head_attention_3[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_6         │ (None, 500, 1)            │               2 │ dropout_10[0][0]           │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_6 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_6[0][… │
│                               │                           │                 │ add_5[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_6 (Conv1D)             │ (None, 500, 4)            │               8 │ add_6[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_11 (Dropout)          │ (None, 500, 4)            │               0 │ conv1d_6[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_7 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_11[0][0]           │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_7         │ (None, 500, 1)            │               2 │ conv1d_7[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_7 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_7[0][… │
│                               │                           │                 │ add_6[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ global_average_pooling1d      │ (None, 1)                 │               0 │ add_7[0][0]                │
│ (GlobalAveragePooling1D)      │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dense (Dense)                 │ (None, 2048)              │           4,096 │ global_average_pooling1d[… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_12 (Dropout)          │ (None, 2048)              │               0 │ dense[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dense_1 (Dense)               │ (None, 2)                 │           4,098 │ dropout_12[0][0]           │
└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘
 Total params: 36,938 (144.29 KB)
 Trainable params: 36,938 (144.29 KB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 22s 284ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6927 - val_sparse_categorical_accuracy: 0.5354
Epoch 2/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 9s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5024 - val_loss: 0.6925 - val_sparse_categorical_accuracy: 0.5354
Epoch 3/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 84ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5005 - val_loss: 0.6926 - val_sparse_categorical_accuracy: 0.5354
Epoch 4/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5031 - val_loss: 0.6925 - val_sparse_categorical_accuracy: 0.5354
Epoch 5/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5155 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 6/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6933 - sparse_categorical_accuracy: 0.5004 - val_loss: 0.6924 - val_sparse_categorical_accuracy: 0.5354
Epoch 7/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5078 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 8/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5096 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 9/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5131 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 10/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6928 - sparse_categorical_accuracy: 0.5196 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 11/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5021 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 12/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6934 - sparse_categorical_accuracy: 0.4936 - val_loss: 0.6924 - val_sparse_categorical_accuracy: 0.5354
Epoch 13/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6928 - sparse_categorical_accuracy: 0.5176 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 14/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6933 - sparse_categorical_accuracy: 0.4975 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 15/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5098 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 16/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5078 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 17/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6927 - sparse_categorical_accuracy: 0.5171 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 18/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5118 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 19/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 20/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 84ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5029 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 21/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5075 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 22/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5145 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 23/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5101 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 24/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5090 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 25/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
42/42 ━━━━━━━━━━━━━━━━━━━━ 5s 58ms/step - loss: 0.6925 - sparse_categorical_accuracy: 0.5264 0
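
If it helps narrow this down: one unconfirmed hypothesis is that GlobalAveragePooling1D with data_format="channels_first" pools over a different axis in Keras 3 than it did in 2.x. A minimal check:

```python
import numpy as np
from keras import layers

# The tutorial pools a (batch, 500, 1) tensor with data_format="channels_first".
# An output of shape (2, 500) would give Dense(128) the doc's ~64K params;
# an output of shape (2, 1) matches the small counts in my run above.
x = np.zeros((2, 500, 1), dtype="float32")
pool = layers.GlobalAveragePooling1D(data_format="channels_first")
print(pool(x).shape)
```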
