Description
Thank you for the interesting work, and for making the code easily accessible. I am confused about the relationship between the ratio and iterative_size parameters.
In the case I am interested in, there is a single demonstration that I want to compress using only the token-level compression approach. I've noticed that, in general, the final ratio between the compressed and original lengths can vary quite a bit for large enough ratio values. However, when I make the iterative_size parameter small, e.g. 10, the final compression ratio is much closer to the value specified for the ratio parameter.
I'm confused as to why this is the case. From the paper, my understanding was that the threshold \gamma_j for segment s_j (whose length is defined by the iterative_size parameter) is based primarily on the ratio parameter. That would mean that, regardless of iterative_size, LLMLingua always prunes a ratio fraction of the tokens in each segment.
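To make my reading concrete, here is a minimal sketch of the behaviour I expected. It is not taken from the LLMLingua code: `prune_by_segment` is a hypothetical helper, the random scores are a stand-in for per-token perplexities, and taking \gamma_j as a per-segment quantile is my assumption about how the threshold is set.

```python
import numpy as np


def prune_by_segment(scores, ratio, iterative_size):
    """Hypothetical helper: drop roughly `ratio` of the tokens in each segment.

    Returns a boolean mask with True for the tokens that are kept.
    """
    scores = np.asarray(scores, dtype=float)
    keep = np.zeros(len(scores), dtype=bool)
    for start in range(0, len(scores), iterative_size):
        segment = scores[start:start + iterative_size]
        # Assumed definition of gamma_j: the `ratio`-quantile of this
        # segment's scores, so about `ratio` of its tokens fall below it.
        gamma_j = np.quantile(segment, ratio)
        keep[start:start + iterative_size] = segment >= gamma_j
    return keep


rng = np.random.default_rng(0)
scores = rng.random(1000)  # stand-in for per-token perplexities

for size in (10, 100, 200):
    kept_fraction = prune_by_segment(scores, ratio=0.5, iterative_size=size).mean()
    print(f"iterative_size={size}: kept fraction = {kept_fraction:.3f}")
```

Under this reading, the kept fraction should stay near 1 - ratio for any iterative_size, which is not what I observe with the library.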
Any clarification would be helpful, including where in the code \gamma_j is computed.