format_args!: only de-duplicate captured identifiers that refer to places#152480
format_args!: only de-duplicate captured identifiers that refer to places#152480dianne wants to merge 2 commits intorust-lang:mainfrom
format_args!: only de-duplicate captured identifiers that refer to places#152480Conversation
Co-authored-by: Shoyu Vanilla (Flint) <modulo641@gmail.com>
| fmt = flatten_format_args(fmt); | ||
| fmt = self.inline_literals(fmt); | ||
| } | ||
| fmt = self.dedup_captured_places(fmt); |
There was a problem hiding this comment.
The de-duplication happens here as a separate pass, heavily based on the unstable pass to flatten nested format_args! and inline literals.
| // Remove the arguments that were de-duplicated. | ||
| if deduped_anything { | ||
| let fmt = fmt.to_mut(); | ||
|
|
||
| // Drop all the arguments that are marked for removal. | ||
| let mut remove_it = remove.iter(); | ||
| fmt.arguments.all_args_mut().retain(|_| remove_it.next() != Some(&true)); | ||
|
|
||
| // Calculate the mapping of old to new indexes for the remaining arguments. | ||
| let index_map: Vec<usize> = remove | ||
| .into_iter() | ||
| .scan(0, |i, remove| { | ||
| let mapped = *i; | ||
| *i += !remove as usize; | ||
| Some(mapped) | ||
| }) | ||
| .collect(); | ||
|
|
||
| // Correct the indexes that refer to arguments that have shifted position. | ||
| for_all_argument_indexes(&mut fmt.template, |index| *index = index_map[*index]); | ||
| } |
There was a problem hiding this comment.
This was all copy-pasted from inline_literals. It should either be factored out or made more specialized. E.g. here it doesn't have to iterate through all arguments, since captured arguments should always be last, I think? Probably there's plenty of other things it could do to be more clever.
There was a problem hiding this comment.
@dianne Captured args aren't always last:
let x = 42;
println!("{x} {} {x}", "==");There was a problem hiding this comment.
By last, I mean in fmt.arguments's Vec. To my understanding, positional arguments come first, then named arguments, and then captured arguments are last. Since only captured arguments would be removed by the de-duplication pass, we could do something like (ignoring privacy, readability, and other practical concerns)
fmt
.arguments
.all_args_mut()
.extract_if(fmt.arguments.num_explicit_args.., |_| /* ... */)
.for_each(drop)It feels excessive, and it would probably need a helper on rustc_ast::FormatArguments to avoid exposing to the rest of rustc that arguments are sorted in that way, but in theory it would avoid some iteration.
| // We don't de-duplicate widths or precisions since de-duplication can be observed. | ||
| let _ = | ||
| { | ||
| super let args = (&x, &x, &x); | ||
| super let args = | ||
| [format_argument::new_display(args.0), | ||
| format_argument::from_usize(args.1), | ||
| format_argument::from_usize(args.2)]; |
There was a problem hiding this comment.
I think we could get this down to only capturing &x once, but it'd still need two separate from_usize calls to always be the same as providing the width two separate times. This is similar to my understanding of why {a.b} is tricky to fully de-duplicate: if x has an impure/pathological impl Deref<Target = usize>, the deref coercion of &x to a &usize could return different values each time (or modify global state, etc.).
There was a problem hiding this comment.
I didn't try to do that optimization in this PR since I think de-duplicating the from_usize calls would mean adding hacks to more central parts of the format_args! expansion.
| Entry::Vacant(vacant_entry) => { | ||
| // This is the first time we've seen a captured identifier. If it's a local | ||
| // or static, note the argument index so other occurrences can be deduped. | ||
| if let Some(partial_res) = self.resolver.partial_res_map.get(&arg.expr.id) |
There was a problem hiding this comment.
This would be the first use of self.resolver in format_args. I understand that's the entire point of this PR, but that's exactly the thing that I think is problematic. This is what would make format_args too magical for me.
There was a problem hiding this comment.
It's a separate optimization pass, which is what makes it seem reasonable to me. That said, it would be helpful to understand better what aspect of this optimization pass you see as a maintenance issue.
There was a problem hiding this comment.
In particular, fomat_args already has magic optimization passes, doesn't it? If format arguments are constants, they can get "flattened".
There was a problem hiding this comment.
Yes, and these optimization passes already has user-visible effects: https://doc.rust-lang.org/1.93.1/reference/expressions.html#r-expr.super-macros.format_args.super-temporaries
There was a problem hiding this comment.
Ouch, that's messy... though at least it seems to only affect whether a program compiles, not how it behaves?
Proof-of-concept PR to help resolve the lets-see-the-optimization concern on #149926. I tried to aim for something as decoupled as possible from the rest of the
format_args!logic, rather than a clever or efficient implementation. This doesn't do anything to optimize captured width/precision arguments. I've left more details inline.