
GAN integration #12 (Open)

edorado93 wants to merge 23 commits into master
Conversation

edorado93 (Owner) commented Jul 4, 2018

Steps to run the code:

@shruthi0898 @ayushjaiswal

  • cd into your Writing-editing-Network folder
  • git fetch origin && git reset --hard origin/pr/12
  • Activate the conda writing-editing-network virtual environment.
  • python Writing-editing\ network/main.py --cuda --mode 0 --conf random

The sizes of the log probabilities, the discriminator input, and the sequence length are being printed, which shows that the generator is providing correctly shaped input.

A successful run would print "Generator Trained successfully".

@@ -183,7 +192,8 @@ def train_generator(input_variable, input_lengths, target_variable, topics, mode
    # this is not the eval mode.
    if not is_eval:
        """ Call Discriminator, Critic and get the ReINFORCE Loss Term"""
        reinforce_loss = None
        est_values = critic_model(input)
edorado93 (Owner, Author):
What is the input expected here?

@@ -183,7 +192,8 @@ def train_generator(input_variable, input_lengths, target_variable, topics, mode
    # this is not the eval mode.
    if not is_eval:
        """ Call Discriminator, Critic and get the ReINFORCE Loss Term"""
        reinforce_loss = None
        est_values = critic_model(input)
        reinforce_loss = reinforce(gen_log, dis_out, est_values, seq_length, CommonConfig())
edorado93 (Owner, Author):

I think gen_log is the generator's log probabilities. What is dis_out? If it is the output of the discriminator, then I cannot see a call to the discriminator here that would produce it. Please add that call.
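For reference, a sketch of what the requested call could look like, reusing the discrim_model signature that train_discriminator uses later in this PR; `generated` is a hypothetical name for the generator's sampled token sequence, and discrim_model, critic_model, reinforce, config, and CommonConfig are the names already used in this PR, so this fragment is not runnable standalone:

```python
if not is_eval:
    """ Call Discriminator, Critic and get the ReINFORCE Loss Term"""
    seq_length = generated.shape[1]
    # the missing discriminator call: score the generated sequence
    dis_out, dis_sig = discrim_model(generated, seq_length, config.batch_size)
    est_values = critic_model(generated)
    reinforce_loss = reinforce(gen_log, dis_out, est_values, seq_length, CommonConfig())
```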

shruthi0898 (Collaborator) commented Jul 9, 2018 via email

edorado93 (Owner, Author) commented Jul 9, 2018

[Screenshot: error output, 2018-07-09, 11:34 AM]

@shruthi0898
That's the error I see. It is most likely caused by incorrect use of the batch size: in the critic code you have hardcoded the value 2 in certain dimensions where it should be batch_size. Fix that and try running again. For now, a successful run will print "Generator Trained successfully" and then the model will exit. Once that works, we can test the discriminator's training.

shruthi0898 (Collaborator) commented Jul 9, 2018 via email

    output = output.squeeze(2)
    return output, hidden

def decoder_train(input_tensor, encoder_hidden, decoder, seq_length):
edorado93 (Owner, Author):

Combine this with the DecoderRNN's forward function; ideally these shouldn't be separate. Also, try to get rid of the explicit for loop.

edorado93 (Owner, Author):

@shruthi0898, the decoder_train function shouldn't be a separate one; you should just call the DecoderRNN's forward function.
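A sketch of what that merge could look like under teacher forcing: embed the whole (batch, seq_len) input and let one GRU call replace the Python loop over timesteps. The layer names and sizes are assumptions, not the PR's actual DecoderRNN:

```python
import torch.nn as nn

class DecoderRNNSketch(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim):
        super(DecoderRNNSketch, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.gru = nn.GRU(embedding_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, input_tensor, hidden):
        # input_tensor: (batch, seq_len); hidden: (1, batch, hidden_dim)
        embedded = self.embedding(input_tensor)      # (batch, seq_len, emb)
        output, hidden = self.gru(embedded, hidden)  # the RNN unrolls internally, no for loop
        return self.out(output), hidden              # (batch, seq_len, vocab)
```

Note that this only removes the loop when ground-truth tokens are fed in (teacher forcing); if the decoder must feed its own predictions back in, a step loop is unavoidable.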

    return output, hidden

def decoder_train(input_tensor, encoder_hidden, decoder, seq_length):
    decoder_input = torch.zeros((2,1), dtype=torch.long)
edorado93 (Owner, Author):

What is the 2 here?
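If the 2 is the batch size (which the hardcoding comments below suggest), a hedged fix is to derive it from the input tensor rather than hardcoding it:

```python
import torch

input_tensor = torch.zeros((2, 5), dtype=torch.long)  # illustrative (batch, seq_len) input
# follows whatever batch size the input actually has
decoder_input = torch.zeros((input_tensor.size(0), 1), dtype=torch.long)
```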

        self.hidden2tag = nn.Linear(hidden_dim, 1)
        self.hidden = self.init_hidden()

    def init_hidden(self):
edorado93 (Owner, Author):

All these 2s appear to be hardcoded batch sizes. Take the batch size as an input and use it in the code instead.

        self.batch_size = batch_size

    def forward(self, input, hidden):
        output = self.embeddingF(input).view(self.batch_size, 1, -1)
edorado93 (Owner, Author):

There's no variable named embeddingF in your code.
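A sketch of one possible fix, assuming the module was meant to own an embedding layer under that name; the surrounding class is a guess for illustration:

```python
import torch.nn as nn

class GeneratorSketch(nn.Module):
    def __init__(self, vocab_size, embedding_dim, batch_size):
        super(GeneratorSketch, self).__init__()
        self.batch_size = batch_size
        # define the layer forward() refers to, so embeddingF actually exists
        self.embeddingF = nn.Embedding(vocab_size, embedding_dim)

    def forward(self, input, hidden):
        output = self.embeddingF(input).view(self.batch_size, 1, -1)
        return output, hidden
```

The alternative is to rename embeddingF in forward to whatever embedding attribute the module actually defines.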

shruthi0898 (Collaborator) commented Jul 9, 2018 via email

shruthi0898 and others added 2 commits July 12, 2018 00:38

1. The optimizer takes in the generator's parameters; we need a single optimizer for both the discriminator and the generator.
2. Reinforce returns a tuple.
3. The critic returns a tuple.
4. Hidden and cell states should be reinitialized on every run of the critic, encoder, and decoder, not just once during object initialization.
5. The model was not working on CUDA, though it worked fine on CPU, because the hidden states needed to be moved with .cuda() as well. Fixed now (sketched below).
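A minimal sketch of points 4 and 5 in the commit message above: build fresh hidden/cell states on every call, on the same device as the input, rather than caching CPU tensors in __init__. The helper name is hypothetical, and device= is the modern equivalent of the explicit .cuda() fix described in the commit:

```python
import torch

def fresh_states(tokens, num_layers, hidden_dim):
    # fresh states each run, allocated wherever the input lives (CPU or CUDA)
    batch_size, device = tokens.size(0), tokens.device
    h0 = torch.zeros(num_layers, batch_size, hidden_dim, device=device)
    c0 = torch.zeros(num_layers, batch_size, hidden_dim, device=device)
    return h0, c0
```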
edorado93 (Owner, Author) commented:

@shruthi0898, please look at the latest commit. I fixed some bugs in your code; the description is in the commit message itself. Please verify those changes.

edorado93 (Owner, Author) commented:

@shruthi0898,
The generator is training successfully. As for the discriminator, dis_out and dis_sig have the wrong dimensions: right now they are of shape (20, 633), where 633 is the sequence length. I see that the discriminator makes a prediction for every word, but we need one prediction for the entire sentence. Please look into this.
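One hedged way to get a single per-sentence score is to classify from the final LSTM hidden state instead of every timestep. The class below is a sketch, not the PR's actual discriminator; only the hidden2tag name is borrowed from the snippets above:

```python
import torch
import torch.nn as nn

class DiscriminatorSketch(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim):
        super(DiscriminatorSketch, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.hidden2tag = nn.Linear(hidden_dim, 1)

    def forward(self, tokens):
        embedded = self.embedding(tokens)          # (batch, seq_len, emb)
        _, (h_n, _) = self.lstm(embedded)          # h_n: (1, batch, hidden)
        dis_out = self.hidden2tag(h_n.squeeze(0))  # (batch, 1): one score per sentence
        dis_sig = torch.sigmoid(dis_out)           # matches target_variable's (20, 1)
        return dis_out, dis_sig
```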

Also, I have pushed the data as a separate commit in this PR. Pulling the latest changes should give you the dataset.

def train_discriminator(input_variable, target_variable, is_eval=False):
    sequence_length = input_variable.shape[1]
    '''add other return values'''
    dis_out, dis_sig = discrim_model(input_variable, sequence_length, config.batch_size)
edorado93 (Owner, Author):

The dimensions coming out here are wrong; kindly check. The dimensions of dis_out and target_variable should match, and the target variable has shape (20, 1).

edorado93 (Owner, Author) commented:

@shruthi0898
I think your latest changes broke the Generator's training; I am getting the following error now. Please look into this and fix it. On a successful run, you will get the losses from both the Generator and the Discriminator.

[Screenshot: error output, 2018-07-27, 10:59 AM]
