I am new to this field and was reading the paper “Predicting citation counts based on deep neural network learning techniques”. The authors describe their implementation so that others can reproduce the results. I tried to do this, but I am not sure whether I succeeded.
Here is their description:
- RNN module: SimpleRNN
- Output dimension of the encoder: 512
- Output layer: Dense layer
- Activation function: ReLU
- Overfitting prevention technique: Dropout with 0.2 rate
- Epochs: 100
- Optimization algorithm: RMSProp
- Learning rate: 10^{-5}
- Batch size: 256
And here is my implementation. I am not sure if the model I created is sequence to sequence.
    from tensorflow import keras

    epochs = 100
    batch_size = 256

    # RMSProp with the learning rate given in the paper (10^-5)
    optimizer = keras.optimizers.RMSprop(learning_rate=0.00001)

    model = keras.models.Sequential([
        keras.layers.SimpleRNN(512,
                               input_shape=[X_train.shape[0], X_train.shape[1]],
                               activation='relu',
                               return_sequences=True,
                               dropout=0.2),
        keras.layers.Dense(9)
    ])

    model.compile(loss='mse', optimizer=optimizer,
                  metrics=[keras.metrics.RootMeanSquaredError()])
The summary of this model is:
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    simple_rnn (SimpleRNN)       (None, 154521, 512)       266240
    _________________________________________________________________
    dense (Dense)                (None, 154521, 9)         4617
    =================================================================
    Total params: 270,857
    Trainable params: 270,857
    Non-trainable params: 0
    _________________________________________________________________
Update: Is this maybe the correct way to formulate this?
    encoder = keras.layers.SimpleRNN(512,
                                     input_shape=[X_train.shape[0], X_train.shape[1]],
                                     activation='relu',
                                     return_sequences=False,
                                     dropout=0.2)
    decoder = keras.layers.SimpleRNN(512,
                                     input_shape=[X_train.shape[0], X_train.shape[1]],
                                     activation='relu',
                                     return_sequences=True,
                                     dropout=0.2)
    output = keras.layers.Dense(9)(decoder)
This is the dataset that I am using.
    year  venue  c1  c2  c3  c4  c5  c6  c7  c8  c9  c10  c11  c12  c13  c14
    1989  234    0   1   2   3   4   5   5   5   5   8    8    10   11   12
    1989  251    0   0   0   0   0   0   0   0   0   0    0    0    0    0
    1990  346    0   0   0   0   0   0   0   0   0   0    0    0    0    0
I need to give all the columns up to c5 as input and try to predict the remaining c columns (c6 to c14, which are the citation counts for the following years). Is this the right way to go forward?
Answer
Your model is a token classification model, not a sequence-to-sequence model: with return_sequences=True it emits one output vector for every input time step.
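The summary in the question supports this reading. Its parameter counts are consistent with 7 input features per step (presumably year, venue and c1 to c5), so `input_shape=[X_train.shape[0], X_train.shape[1]]` makes Keras treat all 154521 rows as the time steps of one sequence. Here is a small arithmetic check using the usual parameter formulas for SimpleRNN and Dense layers. The helper names are hypothetical and only for illustration.

```python
def simple_rnn_params(input_dim: int, units: int) -> int:
    """Input kernel + recurrent kernel + bias of a SimpleRNN layer."""
    return input_dim * units + units * units + units


def dense_params(input_dim: int, units: int) -> int:
    """Kernel + bias of a Dense layer."""
    return input_dim * units + units


# Assumption: 7 features per step (year, venue, c1..c5).
rnn = simple_rnn_params(input_dim=7, units=512)
dense = dense_params(input_dim=512, units=9)

print(rnn, dense, rnn + dense)  # 266240 4617 270857
```

These numbers match the `Param #` column of the summary, which suggests the feature dimension is right and only the time dimension is being misused.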
A seq2seq model consists of an encoder and a decoder (both are RNNs in your case). It cannot be built with the Sequential API, because the encoder and the decoder take separate inputs.
The encoder should be created with the argument return_sequences=False.
A Dense layer should follow the decoder.
It should look something like this:
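    from tensorflow import keras

    # Separate inputs for the encoder and the decoder
    encoder_input = keras.Input(shape=(None, 512))
    decoder_input = keras.Input(shape=(None, 512))

    # The encoder returns only its final output: shape (batch, 512)
    encoder_output = keras.layers.SimpleRNN(512, activation='relu',
                                            return_sequences=False,
                                            dropout=0.2)(encoder_input)

    # Add a time axis, (batch, 1, 512), and prepend it to the decoder input
    encoder_output = keras.layers.Reshape((1, 512))(encoder_output)
    decoder_inputs = keras.layers.Concatenate(axis=1)([encoder_output, decoder_input])

    decoder_output = keras.layers.SimpleRNN(512, activation='relu',
                                            return_sequences=True,
                                            dropout=0.2)(decoder_inputs)
    output = keras.layers.Dense(9)(decoder_output)

    model_att = keras.models.Model([encoder_input, decoder_input], output)
    # For predicting citation counts (regression), 'mse' as in the question
    # is the more usual loss than a classification loss.
    model_att.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    model_att.summary()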
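Whatever architecture you settle on, Keras RNN layers expect arrays shaped (samples, time steps, features), so each row of the table has to be cut into an input sequence and a target sequence. Below is a minimal NumPy sketch of one way to do that, using the three sample rows from the question. Repeating year and venue at every time step is my assumption, not something taken from the paper, and the variable names are illustrative.

```python
import numpy as np

# The three sample rows from the question: year, venue, c1..c14
rows = np.array([
    [1989, 234, 0, 1, 2, 3, 4, 5, 5, 5, 5, 8, 8, 10, 11, 12],
    [1989, 251, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1990, 346, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
], dtype="float32")

static = rows[:, :2]    # year, venue: one value per paper
history = rows[:, 2:7]  # c1..c5: known citation counts
future = rows[:, 7:]    # c6..c14: counts to predict

# Encoder input: 5 time steps, each holding [citation count, year, venue].
# Assumption: the static columns are simply repeated at every step.
repeated_static = np.repeat(static[:, None, :], history.shape[1], axis=1)
encoder_x = np.concatenate([history[:, :, None], repeated_static], axis=-1)

# Target: 9 time steps with one value each
target_y = future[:, :, None]

print(encoder_x.shape)  # (3, 5, 3)
print(target_y.shape)   # (3, 9, 1)
```

Note that the `Input(shape=(None, 512))` layers in the model above expect 512 features per step, so arrays like these would need either to be projected to that width first (for example with a Dense layer) or the input shapes to be adapted.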