Modeling Encoder-Decoder according to instructions from a paper [closed]

I am new to this field and I was reading the paper “Predicting citation counts based on deep neural network learning techniques”. The authors describe the settings of their implementation so that others can reproduce the results. I tried to do this, but I am not sure if I succeeded.

Here is their description:

-RNN module - SimpleRNN
-Output dimension of the encoder - 512
-The output layer - Dense layer
-Activation function - ReLU
-Overfitting prevention technique - Dropout with 0.2 rate
-Epochs - 100
-Optimization algorithm - RMSProp
-Learning rate - 10^{-5}
-Batch size - 256

And here is my implementation. I am not sure if the model I created is sequence to sequence.

from tensorflow import keras

epochs = 100
batch_size = 256
optimizer = keras.optimizers.RMSprop(learning_rate=0.00001)
model = keras.models.Sequential([
    keras.layers.SimpleRNN(512, input_shape=[X_train.shape[0], X_train.shape[1]],
                           activation='relu', return_sequences=True, dropout=0.2),
    keras.layers.Dense(9)
])

model.compile(loss='mse', optimizer=optimizer, metrics=[keras.metrics.RootMeanSquaredError()])

The summary of this model is:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn (SimpleRNN)       (None, 154521, 512)       266240    
_________________________________________________________________
dense (Dense)                (None, 154521, 9)         4617      
=================================================================
Total params: 270,857
Trainable params: 270,857
Non-trainable params: 0
_________________________________________________________________

Update: Is this maybe the correct way to formulate this?

encoder = keras.layers.SimpleRNN(512,
                                 input_shape=[X_train.shape[0], X_train.shape[1]],
                                 activation='relu',
                                 return_sequences=False,
                                 dropout=0.2)

decoder = keras.layers.SimpleRNN(512,
                                 input_shape=[X_train.shape[0], X_train.shape[1]],
                                 activation='relu',
                                 return_sequences=True,
                                 dropout=0.2)

output = keras.layers.Dense(9)(decoder)

This is the dataset that I am using.

year  venue  c1  c2  c3  c4  c5  c6  c7  c8  c9  c10  c11  c12  c13  c14
1989    234   0   1   2   3   4   5   5   5   5    8    8   10   11   12
1989    251   0   0   0   0   0   0   0   0   0    0    0    0    0    0
1990    346   0   0   0   0   0   0   0   0   0    0    0    0    0    0

I need to give all the columns up to c5 as input and try to predict the remaining c’s (the citation counts for the following years). Is this the right way to go forward?

Answer

Your model is a token-classification model, not a sequence-to-sequence model.
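
One way to see this from the summary in the question is to check the reported parameter counts. The sketch below uses the standard parameter formulas for a simple recurrent layer and a dense layer; the helper names are hypothetical. It suggests that the layer received 7 features per step over 154521 steps, i.e. that the number of rows was passed as the sequence length. That the 7 features are year, venue and c1 to c5 is an assumption.

```python
def simple_rnn_params(units, input_dim):
    # input kernel + recurrent kernel + bias
    return units * input_dim + units * units + units


def dense_params(units, input_dim):
    # kernel + bias
    return input_dim * units + units


units = 512
rnn_reported = 266240  # "Param #" of simple_rnn in the summary

# Solve the SimpleRNN formula for the feature dimension.
input_dim = (rnn_reported - units * units - units) // units

print(input_dim)                            # 7
print(simple_rnn_params(units, input_dim))  # 266240
print(dense_params(9, units))               # 4617
```

Both counts match the summary, so the model emits 9 values at each of 154521 steps instead of 9 future values per paper.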

A seq2seq model consists of an encoder and a decoder (both are RNNs in your case). It cannot be built with the Sequential API because the encoder and the decoder take separate inputs.

The encoder should be created with the argument return_sequences=False.

The Dense layer should follow the decoder.

It should look something like this:

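from tensorflow import keras

encoder_input = keras.Input(shape=(None, 512))
decoder_input = keras.Input(shape=(None, 512))

# Encoder: return only the final state as a summary of the input sequence.
encoder_output = keras.layers.SimpleRNN(512,
                                 activation='relu',
                                 return_sequences=False,
                                 dropout=0.2)(encoder_input)

# Add a time axis so the summary can be prepended to the decoder input.
encoder_output = keras.layers.Reshape((1, 512))(encoder_output)
decoder_inputs = keras.layers.Concatenate(axis=1)([encoder_output, decoder_input])

# Decoder: return one output per time step.
decoder_output = keras.layers.SimpleRNN(512,
                                 activation='relu',
                                 return_sequences=True,
                                 dropout=0.2)(decoder_inputs)

output = keras.layers.Dense(9)(decoder_output)
model_att = keras.Model([encoder_input, decoder_input], output)

# Dense(9) has no softmax, so the loss must treat its outputs as logits.
# For a regression target such as citation counts, loss='mse' as in the question would be used instead.
model_att.compile(optimizer='adam',
                  loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))

model_att.summary()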
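
For the data in the question, the arrays also need the (samples, timesteps, features) layout that Keras RNN layers expect. The following is a minimal sketch of one possible layout, not the paper's method. It assumes each year is one time step with a single feature, and that the decoder is fed the previous year's count (teacher forcing). It uses only the three sample rows from the question; the variable names are hypothetical.

```python
import numpy as np

# The three sample rows from the question: year, venue, c1..c14.
rows = np.array([
    [1989, 234, 0, 1, 2, 3, 4, 5, 5, 5, 5, 8, 8, 10, 11, 12],
    [1989, 251, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1990, 346, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
], dtype="float32")

static = rows[:, :2]    # year, venue: one value per paper
history = rows[:, 2:7]  # c1..c5: observed citation counts
future = rows[:, 7:]    # c6..c14: counts to predict

# Assumption: one time step per year, one feature per step.
encoder_in = history[:, :, np.newaxis]
targets = future[:, :, np.newaxis]

# Assumption (teacher forcing): the decoder sees the previous year's count,
# starting from the last observed value c5.
decoder_in = np.concatenate([history[:, -1:], future[:, :-1]], axis=1)[:, :, np.newaxis]

print(encoder_in.shape, decoder_in.shape, targets.shape)
```

With this layout the sequence lengths are 5 and 9 with 1 feature per step, so the feature dimension of the Input layers in the model above would have to be adapted accordingly. How to feed the static columns (year, venue) is left open here.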
