Stochastic GCG for optimizing prompts on datasets #6

Let's generalize the [easy_gcg]() code to optimize prompts over a dataset of `(x, y)` pairs, where each `x` is a `question` and `y` is its `answer`.

We want to find `u* = argmax_u E[P(y | u + x)]`, where `u + x` is the prompt `u` followed by the question `x`, and the expectation is taken over the dataset, `(x, y) ~ D`.
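Written out for a finite dataset, the objective and a minibatch estimate of it might look like the following. The symbols `N` (dataset size), `B` (a randomly sampled minibatch) and `\oplus` (concatenation, i.e. the `+` above) are notation introduced here for illustration, not taken from the code.

```math
u^{*} = \arg\max_{u} \; \mathbb{E}_{(x, y) \sim D}\left[ P(y \mid u \oplus x) \right]
      = \arg\max_{u} \; \frac{1}{N} \sum_{i=1}^{N} P(y_i \mid u \oplus x_i),
\qquad
\frac{1}{N} \sum_{i=1}^{N} P(y_i \mid u \oplus x_i) \;\approx\; \frac{1}{|B|} \sum_{i \in B} P(y_i \mid u \oplus x_i).
```

The "stochastic" part would then be that each GCG step only sees the minibatch estimate on the right rather than the full sum.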

We can start by aggregating the GCG token-swap gradients over the elements of a batch (https://github.com/amanb2000/Magic_Words/blob/32840cd867c83fc131205e5ff639a109f4e4f78c/magic_words/easy_gcg.py#L178).
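A minimal sketch of what aggregating the swap gradients could look like, assuming PyTorch. `ToyLM`, `aggregated_swap_gradient` and `top_k_swaps` are hypothetical names for illustration and are not the functions in `easy_gcg.py`; the toy model stands in for a real causal LM, and the loss is the mean negative log-likelihood of `y`, a common surrogate for maximizing `P(y | u + x)`.

```python
import torch
import torch.nn.functional as F

VOCAB, DIM = 50, 16  # toy sizes, chosen arbitrarily for the example


class ToyLM(torch.nn.Module):
    """Hypothetical stand-in for a causal LM that accepts input embeddings."""

    def __init__(self, vocab=VOCAB, dim=DIM):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)

    def forward_embeds(self, embeds):  # (B, T, D) -> (B, T, V)
        steps = torch.arange(1, embeds.shape[1] + 1).view(1, -1, 1)
        context = embeds.cumsum(dim=1) / steps  # causal running mean
        return self.head(torch.tanh(context))


def aggregated_swap_gradient(model, prompt_ids, batch):
    """Gradient of the batch-averaged loss w.r.t. the one-hot prompt, shape (P, V)."""
    one_hot = F.one_hot(prompt_ids, VOCAB).float().requires_grad_(True)
    prompt_embeds = one_hot @ model.embed.weight  # (P, D), differentiable in one_hot
    total_loss = 0.0
    for x_ids, y_ids in batch:
        tail = model.embed(torch.cat([x_ids, y_ids]))
        embeds = torch.cat([prompt_embeds, tail]).unsqueeze(0)
        logits = model.forward_embeds(embeds)[0]
        start = len(prompt_ids) + len(x_ids)  # index of the first answer token
        # The logits at position t predict the token at position t + 1.
        pred = logits[start - 1 : start - 1 + len(y_ids)]
        total_loss = total_loss + F.cross_entropy(pred, y_ids)
    (grad,) = torch.autograd.grad(total_loss / len(batch), one_hot)
    return grad


def top_k_swaps(grad, k):
    """Per prompt position, the k tokens whose swap most decreases the loss to first order."""
    return (-grad).topk(k, dim=1).indices  # (P, k)


torch.manual_seed(0)
model = ToyLM()
prompt = torch.tensor([1, 2, 3])
batch = [
    (torch.tensor([4, 5]), torch.tensor([6])),
    (torch.tensor([7]), torch.tensor([8, 9])),
]
grad = aggregated_swap_gradient(model, prompt, batch)
candidates = top_k_swaps(grad, k=4)
```

Because the loss is averaged before differentiating, the result equals the mean of the per-example gradients, so the rest of the GCG swap-selection step could stay unchanged.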

All that remains is to create an efficient `batch_compute_score_dataset()` function that computes the score of each candidate prompt w.r.t. the dataset (https://github.com/amanb2000/Magic_Words/blob/32840cd867c83fc131205e5ff639a109f4e4f78c/magic_words/easy_gcg.py#L263).
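One way such a function might be shaped, as a sketch under stated assumptions rather than the repo's implementation: the signature, `ToyLM`, and `pad_id` are hypothetical, the score is the mean log-likelihood `log P(y | u + x)` over the dataset, and sequences are right-padded so that every (candidate, example) pair goes through a single forward pass.

```python
import torch
import torch.nn.functional as F


class ToyLM(torch.nn.Module):
    """Hypothetical stand-in for a causal LM: token ids in, next-token logits out."""

    def __init__(self, vocab=50, dim=16):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)

    def forward(self, ids):  # (B, T) -> (B, T, V)
        steps = torch.arange(1, ids.shape[1] + 1).view(1, -1, 1)
        context = self.embed(ids).cumsum(dim=1) / steps  # causal running mean
        return self.head(torch.tanh(context))


def batch_compute_score_dataset(model, candidates, dataset, pad_id=0):
    """Return a (C,) tensor: mean over the dataset of log P(y | u + x) per candidate u."""
    C, P = candidates.shape
    N = len(dataset)
    max_len = P + max(len(x) + len(y) for x, y in dataset)
    ids = torch.full((C, N, max_len), pad_id, dtype=torch.long)
    answer_mask = torch.zeros(C, N, max_len, dtype=torch.bool)
    ids[:, :, :P] = candidates[:, None, :]  # every candidate paired with every example
    for n, (x, y) in enumerate(dataset):
        start = P + len(x)
        ids[:, n, P:start] = x
        ids[:, n, start : start + len(y)] = y
        answer_mask[:, n, start : start + len(y)] = True
    flat_ids = ids.view(C * N, max_len)
    with torch.no_grad():
        logits = model(flat_ids)  # one forward pass for all C * N sequences
    # The logits at position t predict the token at position t + 1.
    log_probs = F.log_softmax(logits[:, :-1], dim=-1)
    token_lp = log_probs.gather(-1, flat_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    mask = answer_mask.view(C * N, max_len)[:, 1:]
    seq_lp = (token_lp * mask).sum(dim=1)  # sum over answer tokens only
    return seq_lp.view(C, N).mean(dim=1)


torch.manual_seed(0)
model = ToyLM()
candidates = torch.tensor([[1, 2, 3], [4, 5, 6]])  # two candidate prompts of length 3
dataset = [
    (torch.tensor([7, 8]), torch.tensor([9])),
    (torch.tensor([10]), torch.tensor([11, 12])),
]
scores = batch_compute_score_dataset(model, candidates, dataset)
best_prompt = candidates[scores.argmax()]
```

Right padding is safe here because the model is causal, so padding after the answer cannot change the answer's logits. A real Hugging Face model would also need an attention mask, and for large candidate sets the `C * N` batch would likely have to be chunked to fit in memory.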


