conditioning_not_working#9

Draft
johannahom wants to merge 7 commits into main from conditioning

Conversation

@johannahom
Collaborator

Still need to fix single conditions

  parts = line.strip().split(split)
- if has_speakers:
+ #need to add option for all conditions
+ if has_speakers or has_conditions: #this might be wrong @Johannah
Owner

@evdv evdv May 2, 2022

True or True = True, so this branch also runs when both flags are set.
Is this to say that the last item is either the speaker or the condition?
Given they're both booleans, you could do:
has_speakers is not has_conditions
or put the has_speakers and has_conditions case first, so that it is already handled and therefore excluded from has_speakers or has_conditions.
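
A minimal sketch of the two options suggested above, in plain Python. The function names and the returned labels are hypothetical and only illustrate the branching; they are not the code in this branch.

```python
def trailing_fields_or(has_speakers: bool, has_conditions: bool) -> bool:
    """Condition as written in the diff: true if either flag is set."""
    return has_speakers or has_conditions


def trailing_fields_xor(has_speakers: bool, has_conditions: bool) -> bool:
    """Exclusive variant: true only if exactly one flag is set."""
    # `is not` acts as XOR only when both operands are real bools.
    return has_speakers is not has_conditions


def describe(has_speakers: bool, has_conditions: bool) -> str:
    """Handle the 'both' case first so later branches cannot swallow it."""
    if has_speakers and has_conditions:
        return "speaker+condition"
    if has_speakers:
        return "speaker"
    if has_conditions:
        return "condition"
    return "none"
```

With both flags set, the `or` version is true while the exclusive version is false, which is the difference the comment is pointing at.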

  if len(self.audiopaths_and_text[0]) < expected_columns:
  raise ValueError(f'Expected {expected_columns} columns in audiopaths file. '
- 'The format is <mel_or_wav>|[<pitch>|]<text>[|<speaker_id>]')
+ 'The format is <mel_or_wav>|[<pitch>|]<text>[|<speaker_id>|<condition_id>]')
Owner

oh, I guess we're checking it here?

  symbol_set='english_basic',
  p_arpabet=1.0,
  n_speakers=1,
+ n_conditions=1,
Owner

The default is 1, but to have conditions there should be more than 1? If 1 is a way of saying there are no conditions, why not 0?
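
One way to read the defaults, sketched under the assumption that `n_conditions` follows the same convention as `n_speakers` in this diff: a count, where 1 means a single implicit class and therefore no extra column. The function `expected_columns` and the `has_pitch` flag are hypothetical; the column names come from the format string in the error message.

```python
def expected_columns(n_speakers: int = 1, n_conditions: int = 1,
                     has_pitch: bool = False) -> list:
    """List the columns a file line would need under the 'count > 1' convention."""
    cols = ["mel_or_wav"]
    if has_pitch:
        cols.append("pitch")
    cols.append("text")
    # A count of 1 is treated as "no explicit column", mirroring n_speakers > 1.
    if n_speakers > 1:
        cols.append("speaker_id")
    if n_conditions > 1:
        cols.append("condition_id")
    return cols
```

Under that reading, a test such as `n_conditions < 1` would not be true for the default value, so using 0 to mean "no conditions" would need the default to change as well.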

  def __getitem__(self, index):
  # Separate filename and text
- if self.n_speakers > 1:
+ if self.n_speakers > 1 and self.n_conditions < 1:
Owner

This block is a little cumbersome, but I really appreciate how legible it is. (Appreciation comment.)


- if self.n_speakers > 1:
+ #specifying the fields
+ if self.n_speakers > 1 and self.n_conditions < 1:
Owner

what? my brain is tired and can't exactly figure out what's going on here


audiopaths = [batch[i][7] for i in ids_sorted_decreasing]

if batch[0][8] is not None:
Owner

I imagine this is the bit that would need updating once the other code is merged?


  (inputs, input_lens, mel_tgt, mel_lens, pitch_dense, energy_dense,
- speaker, attn_prior, audiopaths) = inputs
+ speaker, attn_prior, audiopaths, condition) = inputs
Owner

As a side note, I am wondering if we should make inputs/outputs some enum or datatype, so that we don't rely on indices to get different things out.
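
A rough sketch of that idea using `typing.NamedTuple`. The class name `Batch` is hypothetical; the field names are taken from the unpacking in the diff, and the `Any` types stand in for whatever tensors or lists the real code passes around.

```python
from typing import Any, NamedTuple, Optional


class Batch(NamedTuple):
    """Named container for one batch; fields follow the order used in the diff."""
    inputs: Any
    input_lens: Any
    mel_tgt: Any
    mel_lens: Any
    pitch_dense: Any
    energy_dense: Any
    speaker: Optional[Any]
    attn_prior: Optional[Any]
    audiopaths: Any
    condition: Optional[Any] = None  # new field, optional so old call sites still work


# Placeholder strings stand in for real tensors.
batch = Batch("x", "x_lens", "mel", "mel_lens", "pitch", "energy",
              None, None, ["a.wav"])
speaker = batch.speaker      # access by name instead of batch[6]
condition = batch.condition  # None when no condition was supplied
```

Because a `NamedTuple` is still a tuple, existing positional unpacking keeps working, while new code can use names and is unaffected when a field is appended.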


  # Predict pitch
- pitch_pred = self.pitch_predictor(enc_out, enc_mask).permute(0, 2, 1)
+ pitch_pred = self.pitch_predictor(enc_out, enc_mask).permute(0, 2, 1) #maybe we want to condition pitch prediction on the conditioning parameter.
Owner

cool idea, we should make a ticket for this

)

- def forward(self, dec_inp, seq_lens=None, conditioning=0):
+ def forward(self, dec_inp, seq_lens=None, conditioning=0, conditioning_2=0): #here when called we add speaker or other discrete condition
Owner

You could make condition a tuple, or rename the conditionings to conditioning_speaker and conditioning_other.
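
Both variants might look roughly like this. It is a sketch only: plain floats and lists stand in for tensors, the function names are hypothetical, and it assumes the conditioning terms are simply added to the input, which this page does not confirm.

```python
from typing import Sequence


def forward_named(dec_inp: Sequence[float],
                  conditioning_speaker: float = 0.0,
                  conditioning_other: float = 0.0) -> list:
    """Variant 1: one descriptive keyword per conditioning source."""
    return [x + conditioning_speaker + conditioning_other for x in dec_inp]


def forward_tuple(dec_inp: Sequence[float],
                  conditioning: Sequence[float] = ()) -> list:
    """Variant 2: any number of conditioning terms passed as one tuple."""
    total = sum(conditioning)  # an empty tuple sums to 0, i.e. no conditioning
    return [x + total for x in dec_inp]
```

The named version documents itself at the call site; the tuple version avoids changing the signature again if a third conditioning source is added later.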

prepare_tmp(args.pitch_online_dir)

- trainset = TTSDataset(audiopaths_and_text=args.training_files, **vars(args))
+ trainset = TTSDataset(audiopaths_and_text=args.training_files, **vars(args)) #making changes here ./fastpitch/data_function.py
Owner

I think this comment can be deleted?


  gen_kw = {'pace': args.pace,
  'speaker': args.speaker,
+ 'condition': args.condition, #@Johannah have to add condition here
Owner

I think this comment can be deleted?

  def infer(self, inputs, pace=1.0, dur_tgt=None, pitch_tgt=None,
  energy_tgt=None, pitch_transform=None, max_duration=75,
- speaker=0):
+ speaker=0, condition=0):
Owner

So because this is the condition index, the default is 0, even though "no condition" is expressed as n_conditions = 1 and so far only one condition is possible?
Just making sure I understand; once again, my brain is melting.
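
The count-versus-index distinction in that question can be sketched as follows. The helper `check_condition_index` is hypothetical; it assumes `condition` is a zero-based index and `n_conditions` is the number of available conditions, as the defaults in the diff suggest.

```python
def check_condition_index(condition: int = 0, n_conditions: int = 1) -> int:
    """Validate a zero-based condition index against the number of conditions."""
    if not 0 <= condition < n_conditions:
        raise ValueError(
            f"condition index {condition} is out of range for "
            f"n_conditions={n_conditions}")
    return condition
```

With the defaults, the only valid index is 0, which matches a count of 1; any larger index needs `n_conditions` to be raised first.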

@evdv
Owner

evdv commented May 2, 2022

@johannahom my apologies for making all of these separately; mostly, I need to train a model from this branch tomorrow to see how it goes

2 participants