About modeling/blocks.py #17

JJJYmmm · 2024-07-18T10:15:20Z

Lines 199 to 201 in 6479f83

    N, C, H, W = z_quantized.shape
    assert H == 1 and W == self.num_latent_tokens, f"{H}, {W}, {self.num_latent_tokens}"
    x = z_quantized.reshape(N, C*H, W).permute(0, 2, 1) # NLD

I think line 201 should be x = z_quantized.reshape(N, C, H*W).permute(0, 2, 1)
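
To make the difference between the two reshapes concrete, here is a minimal sketch. It uses NumPy as a stand-in for torch (an assumption made only so the snippet runs without a GPU stack; `transpose` plays the role of `permute`), and the helper names `flatten_current` and `flatten_proposed` are hypothetical, not from the repository.

```python
import numpy as np

def flatten_current(z):
    """Mirror of the quoted line: merge C with H, keep W as the token axis."""
    n, c, h, w = z.shape
    return z.reshape(n, c * h, w).transpose(0, 2, 1)

def flatten_proposed(z):
    """Suggested variant: keep C as the feature axis, merge H with W into tokens."""
    n, c, h, w = z.shape
    return z.reshape(n, c, h * w).transpose(0, 2, 1)

# Case 1: H == 1, which is what the assert in the quoted code enforces.
z1 = np.arange(2 * 12 * 1 * 32, dtype=np.float32).reshape(2, 12, 1, 32)
same_when_h_is_1 = np.array_equal(flatten_current(z1), flatten_proposed(z1))

# Case 2: H == 2 (hypothetical, would fail the assert) to show where they diverge.
z2 = np.arange(2 * 12 * 2 * 32, dtype=np.float32).reshape(2, 12, 2, 32)
shape_current = flatten_current(z2).shape
shape_proposed = flatten_proposed(z2).shape

print(same_when_h_is_1, shape_current, shape_proposed)
```

Because the assert guarantees H == 1, the two forms give the same tensor in practice. They only differ if H is ever allowed to be larger than 1, in which case the proposed form keeps C as the per-token feature size.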


JJJYmmm · 2024-07-18T10:15:34Z

Another related issue is #16.

JJJYmmm · 2024-07-18T10:34:36Z

I'm also confused about the two-stage training. Does the current code only contain stage 1? I ask because the TiTok decoder can't decode tokens without pixel_decoder (from MaskGIT):

1d-tokenizer/modeling/titok.py

Lines 80 to 86 in 6479f83

    def decode(self, z_quantized):
        decoded_latent = self.decoder(z_quantized)
        quantized_states = torch.einsum(
            'nchw,cd->ndhw', decoded_latent.softmax(1),
            self.pixel_quantize.embedding.weight)
        decoded = self.pixel_decoder(quantized_states)
        return decoded
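
For readers following along, the einsum in the quoted `decode` can be read as a soft codebook lookup: the softmax over axis 1 gives per-position weights over codebook entries, and the einsum mixes the codebook rows with those weights. A minimal sketch of that reading, with NumPy standing in for torch and all sizes chosen arbitrarily (the names `logits` and `codebook` are illustrative, not from the repository):

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

# Arbitrary sizes: batch, codebook size, spatial grid, embedding dim.
N, K, H, W, D = 2, 16, 4, 4, 8
rng = np.random.default_rng(0)

logits = rng.normal(size=(N, K, H, W))   # stands in for decoded_latent
codebook = rng.normal(size=(K, D))       # stands in for pixel_quantize.embedding.weight

probs = softmax(logits, axis=1)          # weights over the K codebook entries
quantized_states = np.einsum('nchw,cd->ndhw', probs, codebook)

# At any single position, the result is the probability-weighted mix of codebook rows.
one_position = probs[0, :, 0, 0] @ codebook
```

Under this reading, the output of `self.decoder` is only a distribution over the pixel codebook, so both `pixel_quantize` and `pixel_decoder` are needed to get back to an image.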

JJJYmmm · 2024-07-18T15:22:26Z

1d-tokenizer/modeling/quantizer.py

Lines 88 to 89 in 6479f83

    elif len(indices.shape) == 2:
        z_quantized = torch.einsum('bd,dn->bn', indices, self.embedding.weight)

The einsum here also looks odd: embedding.weight's shape should be nd.
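
A quick way to see what the subscripts actually require, sketched with NumPy in place of torch. The assumption here (not confirmed by the repository) is that the 2-D `indices` is a (batch, num_codes) matrix of one-hot or soft weights, and that the embedding table has the usual (num_codes, dim) layout:

```python
import numpy as np

num_codes, dim = 8, 4
rng = np.random.default_rng(0)

# Same layout as an embedding table: one row per code.
weight = rng.normal(size=(num_codes, dim))

# Hypothetical input: three integer ids turned into one-hot rows.
ids = np.array([1, 5, 2])
one_hot = np.eye(num_codes)[ids]          # shape (batch, num_codes)

# In 'bd,dn->bn' the shared letter d is contracted, so here d runs over
# the codes and n over the embedding dimension; the letters are only labels.
out = np.einsum('bd,dn->bn', one_hot, weight)

# Plain row lookup for comparison.
lookup = weight[ids]
```

With these shapes the einsum reduces to `one_hot @ weight` and matches the row lookup, so the expression is dimensionally consistent even though the letters d and n are swapped relative to the usual naming. If `indices` has some other layout, the concern stands.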

Sorry for so many issues :)
