In steering_vector.py, when taking the suffix activations, I noticed that the tokenized suffix length is computed from suffixes[0][0] only, even though the suffixes in the list may tokenize to different lengths. Is this the intended behavior?
elif accumulate_last_x_tokens == "suffix-only":
    if suffixes:
        # Tokenize the suffix
        suffix_tokens = tokenizer.encode(suffixes[0][0], add_special_tokens=False)
        # Get the hidden states for the suffix tokens
        suffix_hidden = batch_hidden[-len(suffix_tokens):, :]
        accumulated_hidden_state = torch.mean(suffix_hidden, dim=0)
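To make the question concrete, here is a rough sketch of what I would have expected if each example is meant to use its own suffix length. It is only an illustration, not the repository's code: ToyTokenizer, suffix_token_lengths, and mean_of_last_n are hypothetical names, the toy tokenizer just splits on whitespace, and plain Python lists stand in for the hidden-state tensor (with torch the slice would be batch_hidden[-n:, :] followed by torch.mean(..., dim=0)). I am also assuming that each entry of suffixes holds the suffix string as its first element, as suffixes[0][0] suggests.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer: one token per whitespace-separated word."""

    def encode(self, text, add_special_tokens=False):
        return text.split()


def suffix_token_lengths(tokenizer, suffixes):
    # Assumption: each entry's first element is the suffix string (as in suffixes[0][0]).
    return [len(tokenizer.encode(s[0], add_special_tokens=False)) for s in suffixes]


def mean_of_last_n(hidden, n):
    # hidden: list of per-token vectors, shape (seq_len, hidden_dim).
    # Guard against n == 0, because hidden[-0:] would select every token.
    if n <= 0:
        raise ValueError("suffix must contain at least one token")
    tail = hidden[-n:]
    # Average each hidden dimension over the selected tokens.
    return [sum(column) / len(tail) for column in zip(*tail)]


suffixes = [("Answer:",), ("The answer is:",)]
lengths = suffix_token_lengths(ToyTokenizer(), suffixes)

# Toy hidden states for one sequence: 4 tokens, hidden size 2.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# One mean per suffix, each using that suffix's own token count.
per_suffix_means = [mean_of_last_n(hidden, n) for n in lengths]
```

With the current code, every example would instead be sliced with the first suffix's length, so a longer suffix would be cut short and a shorter one would pull in tokens that precede it. I am not sure whether the per-suffix version matches how the hidden states are batched and padded, so this may not apply directly.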
Thanks in advance!