
How do I fit a time-series multi-head model?

I'm trying to create a model by concatenating two models. The models should handle time series, and I'm experimenting with Conv1D layers. Since these expect a 3D input shape, batch_shape + (steps, input_dim), and the Keras TimeseriesGenerator provides exactly that, I'm happy to be able to use it for single-head models.

import pandas as pd
import numpy as np
import random
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import (Input, Dense, Conv1D, BatchNormalization, 
                                     Flatten, Dropout, MaxPooling1D,
                                     concatenate)
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.utils import plot_model

data = pd.DataFrame(index=pd.date_range(start='2020-01-01', periods=300, freq='D'))
data['featureA'] = [random.random() for _ in range(len(data))]
data['featureB'] = [random.random() for _ in range(len(data))]
data['featureC'] = [random.random() for _ in range(len(data))]
data['featureD'] = [random.random() for _ in range(len(data))]
data['target']   = [random.random() for _ in range(len(data))]

Xtrain_AB, Xtest_AB, yTrain_AB, yTest_AB = train_test_split(data[['featureA', 'featureB']], 
                                                            data['target'], test_size=0.2,
                                                            shuffle=False)
Xtrain_CD, Xtest_CD, yTrain_CD, yTest_CD = train_test_split(data[['featureC', 'featureD']], 
                                                            data['target'], test_size=0.2,
                                                            shuffle=False)

n_steps = 5 

train_gen_AB = TimeseriesGenerator(Xtrain_AB, yTrain_AB,
                                length=n_steps, 
                                sampling_rate=1,
                                batch_size=64,
                                shuffle=False)
test_gen_AB = TimeseriesGenerator(Xtest_AB, yTest_AB,
                                length=n_steps, 
                                sampling_rate=1,
                                batch_size=64,
                                shuffle=False)
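
To double-check the 3D shape the generator yields, the first batch can be inspected; the shapes shown assume the settings above (batch_size=64, n_steps=5, two features):

# Each batch is (features, targets); features have shape (batch, steps, features).
x_batch, y_batch = train_gen_AB[0]
print(x_batch.shape, y_batch.shape)  # (64, 5, 2) (64,)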

n_features_AB = len(Xtrain_AB.columns)
input_AB      = Input(shape=(n_steps, n_features_AB))
layer_AB      = Conv1D(filters=128, kernel_size=3, activation='relu')(input_AB)
layer_AB      = MaxPooling1D(pool_size=2)(layer_AB)
layer_AB      = Flatten()(layer_AB)
dense_AB      = Dense(50, activation='relu')(layer_AB)
output_AB     = Dense(1)(dense_AB)
model_AB      = Model(inputs=input_AB, outputs=output_AB)
model_AB.compile(optimizer='adam', loss='mse')
model_AB.summary()
model_AB.fit(train_gen_AB, epochs=1, verbose=1)
print(f'evaluation: {model_AB.evaluate(test_gen_AB)}')

#plot_model(model_AB)

train_gen_CD = TimeseriesGenerator(Xtrain_CD, yTrain_CD,
                                length=n_steps, 
                                sampling_rate=1,
                                batch_size=64,
                                shuffle=False)
test_gen_CD = TimeseriesGenerator(Xtest_CD, yTest_CD,
                                length=n_steps, 
                                sampling_rate=1,
                                batch_size=64,
                                shuffle=False)

n_features_CD = len(Xtrain_CD.columns)
input_CD      = Input(shape=(n_steps, n_features_CD))
layer_CD      = Conv1D(filters=128, kernel_size=3, activation='relu')(input_CD)
layer_CD      = MaxPooling1D(pool_size=2)(layer_CD)
layer_CD      = Flatten()(layer_CD)
dense_CD      = Dense(50, activation='relu')(layer_CD)
output_CD     = Dense(1)(dense_CD)
model_CD      = Model(inputs=input_CD, outputs=output_CD)
model_CD.compile(optimizer='adam', loss='mse')
model_CD.summary()
model_CD.fit(train_gen_CD, epochs=1, verbose=1)
print(f'evaluation: {model_CD.evaluate(test_gen_CD)}')

#plot_model(model_CD)

This works fine for each of the two models. :)

Now I would like to experiment with concatenating both models into one (I think this will make it easier to add further 'heads' later and train them in parallel, and one combined model should be easier to handle than many separate ones). Such a dual-head model can easily be created like this:

merge=concatenate(inputs=[layer_AB, layer_CD])
dense_merge     = Dense(50, activation='relu')(merge)
output_merge    = Dense(1)(dense_merge)
model_dual_head = Model(inputs=[input_AB, input_CD], outputs=output_merge)
model_dual_head.compile(optimizer='adam', loss='mse')
model_dual_head.summary()
print(f'dual head model input_shape:{model_dual_head.input_shape}')
plot_model(model_dual_head) 

This dual_head_model has an input_shape consisting of two 3D tensors, [(None, 5, 2), (None, 5, 2)], and the plot_model output shows the two convolutional branches merging into a single head:

[plot of the dual-head model]

Unfortunately I don't know how to fit it :( and hope you can show me how to generate data in the required shape. I tried passing the previously used generators as a list, model_dual_head.fit([train_gen_AB, train_gen_CD], epochs=1, verbose=1), and also lists of the raw input data frames, model_dual_head.fit(x=[Xtrain_AB, Xtrain_CD], y=[yTrain_AB, yTrain_CD], epochs=1, verbose=1), but neither is in the right shape.

Thanks in advance

Wasili


Based on Jack's comment I tried the following code:

def doubleGen(gen1, gen2):
  assert(len(gen1) == len(gen2))
  for feature1, label1, feature2, label2 in (train_gen_AB, train_gen_CD):
    yield (feature1, feature2), label1 

gen = doubleGen(train_gen_AB, train_gen_CD)
model_dual_head.fit(gen, epochs=1, verbose=1)

but unfortunately it does not work: the model receives four input tensors instead of the expected two.

ValueError: Layer model_2 expects 2 input(s), but it received 4 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None) dtype=float32>, <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'IteratorGetNext:2' shape=(None, None, None) dtype=float32>, <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=float32>]

I adapted the parentheses in the generator based on Jack's note and now get a different error:

def doubleGen(gen1, gen2):
  assert(len(gen1) == len(gen2))
  for (feature1, label1), (feature2, label2) in train_gen_AB, train_gen_CD:
    assert label1 == label2
    yield (feature1, feature2), label1 

gen = doubleGen(train_gen_AB, train_gen_CD)
model_dual_head.fit(gen, epochs=1, verbose=1)
ValueError                                Traceback (most recent call last)
<ipython-input-8-e8971cd0f287> in doubleGen(gen1, gen2)
      1 def doubleGen(gen1, gen2):
      2   assert(len(gen1) == len(gen2))
----> 3   for (feature1, label1), (feature2, label2) in train_gen_AB, train_gen_CD:
      4     assert label1 == label2
      5     yield (feature1, feature2), label1

ValueError: too many values to unpack (expected 2)

The unpacking fails because the loop iterates over the tuple (train_gen_AB, train_gen_CD) itself rather than over paired batches. I then thought about iterating the generators with a plain index, which does fix the shape issue, but it results in a NoneType error:

def doubleGen(gen1, gen2):
  assert(len(gen1) == len(gen2))
  for i in range(len(gen1)):
    feature1, label1 = gen1[i]
    feature2, label2 = gen2[i]
    #assert label1.all() == label2.all()
    yield (feature1, feature2), label1 

gen = doubleGen(train_gen_AB, train_gen_CD)
model_dual_head.fit(gen, epochs=1, verbose=1)
TypeError                                 Traceback (most recent call last)
<ipython-input-24-6abda48a58c7> in <module>()
      8 
      9 gen = doubleGen(train_gen_AB, train_gen_CD)
---> 10 model_dual_head.fit(gen, epochs=1, verbose=1)

2 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
    853       # In this case we have created variables on the first call, so we run the
    854       # defunned version which is guaranteed to never create variables.
--> 855       return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
    856     elif self._stateful_fn is not None:
    857       # Release the lock early so that multiple threads can perform the call

TypeError: 'NoneType' object is not callable

The Notebook can be reached here


Finally, Jack's input helped me find the solution. All that was missing was zipping the two generators so they can be iterated together. :)

def doubleGen(gen1, gen2):
  assert len(gen1) == len(gen2)
  # zip pairs the batches of both generators; the labels are identical,
  # so either one can serve as the target.
  for (feature1, label1), (feature2, label2) in zip(gen1, gen2):
    assert (label1 == label2).all()
    yield (feature1, feature2), label1

gen = doubleGen(train_gen_AB, train_gen_CD)
model_dual_head.fit(gen, epochs=1, verbose=1)
4/4 [==============================] - 0s 14ms/step - loss: 0.1119
<tensorflow.python.keras.callbacks.History at 0x7f0dfd4de5d0>
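
One caveat: a plain Python generator like this is exhausted after a single pass, so epochs=1 works, but more epochs would need a fresh generator each time. A minimal sketch, assuming the two generators stay in lockstep, that loops endlessly and tells Keras where an epoch ends via steps_per_epoch:

def doubleGenRepeating(gen1, gen2):
  assert len(gen1) == len(gen2)
  # Re-iterate the paired batches forever so fit() can run several epochs.
  while True:
    for (feature1, label1), (feature2, label2) in zip(gen1, gen2):
      yield (feature1, feature2), label1

gen = doubleGenRepeating(train_gen_AB, train_gen_CD)
model_dual_head.fit(gen, steps_per_epoch=len(train_gen_AB), epochs=3, verbose=1)

The finite doubleGen from above can still be used for evaluation, since evaluate simply runs the generator to exhaustion:

print(f'evaluation: {model_dual_head.evaluate(doubleGen(test_gen_AB, test_gen_CD))}')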


Answer

I do not know if something like this already exists, but you can create a new generator that merges the two datasets. Assuming the two generators run in lockstep, this should work:

def merged_generator(generator1, generator2):
    # zip walks both generators in lockstep, one batch from each per step.
    for (input1, label1), (input2, label2) in zip(generator1, generator2):
        assert (label1 == label2).all()
        yield (input1, input2), label1

This gives a generator that yields both inputs as a tuple and the common label as a single item. A small generator function like this saves the trouble of writing an entire generator class.
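
If an index-based class is preferred (for example so Keras can determine the number of batches and run multiple epochs without the generator being exhausted), a minimal keras.utils.Sequence sketch could look like this; MergedSequence is just an illustrative name:

from tensorflow.keras.utils import Sequence

class MergedSequence(Sequence):
    # Pairs two lockstep generators and serves batches by index.
    def __init__(self, gen1, gen2):
        assert len(gen1) == len(gen2)
        self.gen1, self.gen2 = gen1, gen2

    def __len__(self):
        return len(self.gen1)

    def __getitem__(self, idx):
        x1, y1 = self.gen1[idx]
        x2, y2 = self.gen2[idx]
        return (x1, x2), y1

model_dual_head.fit(MergedSequence(train_gen_AB, train_gen_CD), epochs=3, verbose=1)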

User contributions licensed under: CC BY-SA