Suffix Activation Storing #5

@chuyishang

When taking the suffix activations in steering_vector.py, I noticed that the length of the tokenized suffix is computed from suffixes[0][0] only, even though the suffixes in the list may tokenize to different lengths. Is this intended behavior?

elif accumulate_last_x_tokens == "suffix-only":
    if suffixes:
        # Tokenize the suffix
        suffix_tokens = tokenizer.encode(suffixes[0][0], add_special_tokens=False)
        # Get the hidden states for the suffix tokens
        suffix_hidden = batch_hidden[-len(suffix_tokens):, :]
        accumulated_hidden_state = torch.mean(suffix_hidden, dim=0)
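
To illustrate the concern, here is a minimal sketch, not the repository's code. It assumes a toy whitespace tokenizer (`toy_tokenize`, hypothetical) in place of `tokenizer.encode`, plain nested lists in place of `batch_hidden`, a flat list of suffix strings, and that the suffix tokens are the last rows of each example's hidden states. It compares slicing every example with the first suffix's length against slicing each example with its own suffix length.

```python
from statistics import fmean

def toy_tokenize(text):
    # Hypothetical stand-in for tokenizer.encode(text, add_special_tokens=False)
    return text.split()

def mean_over_suffix(hidden, suffix_len):
    # Average the last `suffix_len` rows of a (seq_len x dim) nested list.
    # Note: suffix_len must be > 0, since hidden[-0:] would select every row.
    rows = hidden[-suffix_len:]
    return [fmean(col) for col in zip(*rows)]

suffixes = ["is good", "is very very bad"]  # 2 and 4 toy tokens
prompts = ["the movie " + s for s in suffixes]

# Fake hidden states: one row per token, row i holds the single value float(i)
hiddens = [[[float(i)] for i in range(len(toy_tokenize(p)))] for p in prompts]

# Behavior in question: one length, taken from the first suffix, for all examples
fixed_len = len(toy_tokenize(suffixes[0]))
fixed = [mean_over_suffix(h, fixed_len) for h in hiddens]

# Possible alternative: use each example's own suffix length
per_example = [
    mean_over_suffix(h, len(toy_tokenize(s)))
    for h, s in zip(hiddens, suffixes)
]

print(fixed)        # second example averages only its last 2 suffix tokens
print(per_example)  # second example averages all 4 of its suffix tokens
```

With the fixed length, the second example's mean covers only part of its suffix. If all suffixes are meant to have the same token length, the current code would be fine, but that is not obvious from the snippet.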

Thanks in advance!
