diff --git a/src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v b/src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v index a7acd93..1107a17 100644 --- a/src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v +++ b/src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v @@ -9,7 +9,7 @@ making it easy to build decision procedures or solvers. There are three primitives to match terms or goals [lazy_match!], [match!], and [multi_match!], which only differ by their backtracking abilities. - (Not to confuse with [match] which is for Ltac2 inductive types.) + (Not to be confused with [match] which is for Ltac2 inductive types.) Consequently, we first explain how matching terms and goals work using [lazy_match!], then discuss the backtracking differences in a third section. @@ -18,6 +18,8 @@ - 1. Matching Terms - 1.1 Basics - 1.2 Non-Linear Matching + - 1.3 Matching Bound Variables + - 1.4 Using the Unsafe API to Access the Term Structure - 2. Matching Goals - 1.1 Basics - 1.2 Non-Linear Matching @@ -38,7 +40,7 @@ *) (** Let us first import Ltac2, and write a small function printing the goal under focus. *) -From Ltac2 Require Import Ltac2 Printf. +From Ltac2 Require Import Ltac2 Constr Printf. Import Bool.BoolNotations. Ltac2 print_goals0 () := @@ -61,7 +63,7 @@ Ltac2 Notation print_goals := print_goals0 (). To match a term, the syntax is [lazy_match! t with] with one clause of the form [ ... => ...] per pattern to match. As an example, let's write a small function proving the goal provided - it is given an argument of type [True -> ... True -> False]. + it is given an argument of type [True -> ... -> True -> False]. We write a recursive function that matches the type of [t] obtained with [Constr.type : constr -> constr] against [False] and [True -> _]. @@ -118,13 +120,13 @@ Abort. In such cases, [x] is of type [constr], the type of Rocq terms in Ltac2. This is normal as [x] corresponds to a subterm of [t]. - Note that as in Ocaml, variable names cannot start with an upper case letter as this + Note that as in OCaml, variable names cannot start with an upper case letter as this is reserved for constructors of inductive types. As an example consider writing a boolean (in Ltac2, not in Rocq) test to check if a type is a proposition written out of [True], [False], [~] ,[/\] ,[\/]. To check a term is a proposition, we need to match it to inspect its head symbol, - and if it is one of the allowed one check its subterms also are propositions. + and if it is one of the allowed ones check its subterms also are propositions. Consequently, we need to remember subterms, and to match [/\] we need to use the pattern [?a /\ ?b]. This gives the following recursive function: *) @@ -138,7 +140,7 @@ Ltac2 rec is_proposition (t : constr) := | _ => false end. -(** To test it, lets us write a small printing function. *) +(** To test it, let us write a small printing function. *) Ltac2 check_is_proposition (t : constr) := if is_proposition t then printf "%t is a proposition" t @@ -158,7 +160,7 @@ Abort. For instance, [fun x => x] and [fun y => y] are equal syntactically as they only differ by their names of bound variables (α-conversion), but [1 + 4] - and [5] are not equal syntactically as they only are equal up to reduction. + and [5] are not equal syntactically as they are only equal up to reduction. As a consequence, a small variation around a term that would require a reduction step like [(fun _ => False) 0] will not be matched by [False]. *) @@ -221,7 +223,7 @@ Abort. to constrain the shape of the term matched, like in [is_refl_eq], but no further. In this case, the warning can be easily disable by naming unused variables starting with [?_] rather than with [?]. - For instance, by matching for [?_x = ?_x] rather than [?x = ?x] is [is_refl_eq]. + For instance, by matching for [?_x = ?_x] rather than [?x = ?x] in [is_refl_eq]. *) Ltac2 is_refl_eq' (t : constr) := @@ -234,11 +236,139 @@ Goal (1 = 1) -> False. intros H. is_refl_eq (Constr.type 'H). Abort. +(** ** 1.3 Matching Bound Variables + + When a term contains a binder, like [let var := expr in body] or [fun var => body], + the bound variable [var] only exists inside the binder's scope. + This raises a difficulty: if we match [let a := ?expr in ?body], then [body] + is the raw body term, in which [var] appears as a free variable. + We cannot apply [body] to a different term to substitute for [var], because + [var] is not a Ltac2 value we can pass around — it is just a name for the binder. + + The solution is to capture the body as a function abstracted over the bound + variable, using the special pattern [@?body var]. + In a pattern like [let var := ?expr in @?body var]: + - [?expr] is matched as usual and captures the definition, + - [var] names the bound variable of the let-in, + - [@?body var] captures the body as a function: [body] is bound to + [fun var => ], i.e. the body with [var] abstracted out as a lambda. + + After the match, [body] is a [constr] representing a function, and we can + substitute any term [t] for the former bound variable by applying [$body $t]. + The same pattern works for matching under [fun], [forall], and other binders. + + As a concrete example, let us write a tactic [simplify_let] that takes a + hypothesis [h] whose type is a let-in [let var := expr in body[var]], and turns + it into [body[x']], where [x' := expr] is a new shared definition introduced + in the whole context and goal. + + To do so, we recover the type of [h], and match it against the pattern [let var := ?expr in @?body var]. + After matching, [expr] is the definition and [body] represents [fun var => body]. + We then: + 1. Create a new shared definition [x' := expr] in the whole context and goal + with [set (x' := $expr) in *]. + 2. Retrieve the Ltac2 term for the new hypothesis [x'] with [Control.hyp @x']. + 3. Replace the type of [h] with [$body $x] using [change ($body $x) in h]. + This creates a beta redex that we then reduce with [lazy head beta in h]. +*) + +Ltac2 simplify_let0 (h : ident) : unit := + let type_h := type (Control.hyp h) in + lazy_match! type_h with + | let var := ?expr in @?body var => + printf "the body is :%t" body; + set (x' := $expr) in *; + let x := Control.hyp @x' in + change ($body $x) in h; + lazy head beta in h + | _ => Control.zero (Tactic_failure (Some (fprintf "the type %t of %I is not a letin" type_h h))) + end. + +Ltac2 Notation "simplify_let" h(ident) := simplify_let0 h. + +Goal forall x y : nat, (let a := x + 2 in let b := y + 1 in a = b) -> True. + intros x y h. + Fail simplify_let x. + simplify_let h. +Abort. + +(** ** 1.4 Using the Unsafe API to Access the Term Structure + + An alternative to [lazy_match!] for inspecting terms is the [Unsafe] submodule + of [Constr], which exposes the kernel-level representation of terms through the + function [Unsafe.kind : constr -> kind]. + The type [kind] is an Ltac2 inductive type whose constructors correspond to the + kernel term formers: [App], [Lambda], [Prod], [LetIn], [Rel], [Var], [Constant], etc. + The full list is available in the Ltac2 standard library at: + https://github.com/rocq-prover/rocq/blob/master/theories/Ltac2/Constr.v#L65 + + This API is called [Unsafe] because it exposes De Bruijn indices directly. + In Rocq's kernel, bound variables are represented not by names but by De Bruijn + indices — natural numbers counting the number of enclosing binders between the + variable occurrence and the binder that introduced it counting from 1. + For instance, in [fun x => x], the body [x] is stored as [Rel 1], meaning + "the variable bound by the immediately enclosing binder". + If these indices are managed incorrectly — for example by shifting them by the + wrong amount when going under a binder — the result is a not well-scoped term + in which an index refers to the wrong binder or to no binder at all, producing + an ill-scoped term. Hence the [Unsafe] qualifier. + + Because [Unsafe.kind] returns an ordinary Ltac2 value of type [kind], one matches + on it with a regular Ltac2 [match], not with [lazy_match!], which is specific to + matching Rocq terms against surface-syntax patterns. + + The main difference with using [lazy_match!] is that: + 1. [lazy_match!] (and its siblings [match!] and [multi_match!]) match + Rocq terms against patterns. Any number of branches can succeed, or none. + Moreover, variants of [lazy_match!] discussed in the next section can do + backtracking to try the next pattern. + 2. A plain Ltac2 [match] is a standard ML-style pattern match on an inductive value: + only one branch is taken, and no backtracking is involved. +*) + +Import Unsafe. + +(** In the case of [simplify_let], the main interest of using the [Unsafe] API, + is that we can directly access the structure of the let in, and do the + substitution ourselves instead of relying on higher-level tactics like + [lazy beta iota] which may not always fit perfectly our purpose. + + Here, we can reimplement [simplify_let_bis] by applying [Unsafe.kind] to the + type of [h] and matching the result against [LetIn _ expr body]. + This gives us [expr] — the bound expression — and [body] — the raw kernel + term in which the bound variable appears as [Rel 1]. + We then introduce [x' := expr] into the context with [set], recover the + corresponding hypothesis, and call [substnl [x'] 0 body] to replace every + occurrence of [Rel 1] in [body] by [x'] in one step, producing [new_body]. + Finally, [change ($new_body) in h] rewrites the type of [h] to [new_body], + with no intermediate beta redex to reduce. +*) +Ltac2 simplify_let_bis0 (h : ident) : unit := + let type_h := type (Control.hyp h) in + match Unsafe.kind type_h with + | LetIn _ expr body => + set (x' := $expr) in *; + let x' := Control.hyp @x' in + let new_body := substnl [x'] 0 body in + change ($new_body) in h + | _ => Control.zero (Tactic_failure (Some (fprintf "the type %t of %I is not a letin" type_h h))) + + end. + +Ltac2 Notation "simplify_let_bis" h(ident) := simplify_let_bis0 h. + +Goal forall x y : nat, (let a := x + 2 in let b := y + 1 in a = b) -> True. + intros x y h. + simplify_let_bis h. +Abort. + + + (** 2. Matching Goals 2.1 Basics - Ltac2 also offers the possibility to match the goals to inspect the form the goal + Ltac2 also offers the possibility to match the goal to inspect the form of the goal to prove, and/or the existence of particular hypotheses like a proof of [False]. The syntax to match goals is a bit different from matching terms. In this case, the patterns are of the form: @@ -247,15 +377,15 @@ Abort. where: - [x1] ... [xn] are [ident], i.e. Ltac2 values corresponding to the names - of the hypotheses. Opposite to variables, they should not start with [?]. + of the hypotheses. Unlike variables, they should not start with [?]. - [t1] ... [tn] are the types of the hypotheses, and [g] the type of the goal we want to prove. All are of type [constr], the type of Rocq terms in Ltac2. - As usual [ident] --- [x1], ..., [xn] --- and any variables that could appear in [t1], ..., [tn], and [g], cannot start with an upper case letter as it is reserved for constructors. - Such a pattern will match the conclusion of the goal of type [G], and such that - can be found [n] different hypotheses of types [T1], ..., [Tn]. + Such a pattern will match a goal with conclusion of type [G] where + [n] different hypotheses of types [T1], ..., [Tn] can be found in the context. Each clause must match a different hypothesis for the pattern to succeed. Consequently, there must be at least [n] different hypotheses / assumptions in the context for such a branch to have a chance to succeed. @@ -263,7 +393,7 @@ Abort. now [ |- _] as this matches any goals. As an example, let us write a small function starting with [lazy_match! goal with] - to inspect if the conlusion of the goal is either [_ /\ _] or [_ \/ _]. + to inspect whether the conclusion of the goal is either [_ /\ _] or [_ \/ _]. As we want to inspect the conclusion of the goal but not the hypotheses, our pattern is of the form [ |- g]. To match for [_ /\ _] and [_ \/ _], @@ -361,13 +491,13 @@ Abort. (** As for matching terms, matching goals is by default syntactic. However, matching for non-linear variables is a bit more involved. - In the non-linear case, variable are matched up to conversions when they appear - in different clause, and up to syntax when they appear in the same clause. + In the non-linear case, variables are matched up to conversion when they appear + in different clauses, and up to syntax when they appear in the same clause. To understand this better, let us look at an example. We could write a version of [eassumption] by matching for the pattern [_ : ?t |- ?t]. In this case, as [?t] appears in different clauses, one of hypotheses and - in then conclusion, it will hence be matched up to conversion. + in the conclusion, it will hence be matched up to conversion. We can check it works by supposing [0 + 1 = 0] and trying to prove [1 = 0], as [0 + 1] is equal to [1] up to conversion but not syntactically. *) @@ -382,7 +512,7 @@ Abort. (** However, if a variable appears several times in a same clause, then it is compared syntactically. For instance, in [ |- ?t = ?t], [?t] is compared - syntactically as the it appears twice in the goal which forms one clause. + syntactically as it appears twice in the goal which forms one clause. *) Goal 1 + 1 = 2. @@ -395,7 +525,7 @@ Abort. (** A subtlety to understand is that only the hole associated to a variable will be compared up to conversion, other symbols are matched syntactically as usual. For instance, if we match for [?t, ~(?t)], the symbol [~] will matched, - if one if found then inside the clause ~(X) that it will be compared + if one is found, then inside the clause ~(X) it will be compared up to conversion with [?t]. Consequently, if we have [P], it will fail to match for [P -> False] as @@ -423,13 +553,13 @@ Goal forall P, P -> ~((fun _ => P) 0) -> False. Abort. (** It often happens that non-linear matching is not precise enough for our - purpose, either because we would like to match both term in one clause - up to conversion, or because we would like to use a different notation of - equality of non-linear occurencess like syntactic equality, equality up to + purpose, either because we would like to match both terms in one clause + up to conversion, or because we would like to use a different notion of + equality for non-linear occurrences, like syntactic equality, equality up to some reduction, or even up to unification. In this case, the best approach is to use different variables for each - occurences we want to compared, then call the comparaison we wish. + occurrence we want to compare, then call the comparison we wish. For instance, to match for [P] and [~P] up to unification, we would: match for a pair of variables [t1], [t2] then call a unification function to check if one of the hypotheses is indeed a negation of the other. @@ -445,8 +575,8 @@ Goal forall P, P -> (P -> False) -> False. end. Abort. -(** The advantage of this method is that it provides the user with a fine grained - control to the user when and how reduction and equality tests are performed. +(** The advantage of this method is that it provides the user with fine-grained + control over when and how reduction and equality tests are performed. *) (* 3. Backtracking and [lazy_match!], [match!], [multi_match!] *) @@ -454,13 +584,13 @@ Abort. (** 3.1 [lazy_match!] [lazy_match!] is the easiest command to understand and to use. - [lazy_match!] picks a branch, and sticks to it to even if the code excuted + [lazy_match!] picks a branch, and sticks to it even if the code executed after picking this branch (the body of the branch) leads to a failure. It will not backtrack to pick another branch if a choice leads to a failure. For instance, in the example below, it picks the first branch as everything matches [[ |- _]]. It prints "branch 1", then fails. As no backtracking is - allowed, it stick to this choice and fails. + allowed, it sticks to this choice and fails. *) Goal False. @@ -473,7 +603,7 @@ Abort. (** [lazy_match!] should be considered as the default, as it is easy to understand (no backtracking) which prevents unexpected behaviour, and yet sufficient - for all applications where matching the syntax is a sufficient to decide what to do. + for all applications where matching the syntax is sufficient to decide what to do. A common use of [lazy_match!] is to make a decision based on the shape of the goal or the shape of a term or type. @@ -482,7 +612,7 @@ Abort. variables and hypotheses with [intros ?] if the goal is of the form [A -> B], splits the goal with [split] if it is a conjunction [A /\ B], and recursively simplify the new goals. - In both case, the syntactic equality test is sufficient to decide what to do + In both cases, the syntactic equality test is sufficient to decide what to do as if it is of the required form, then the branch will succeed. One should hence use [lazy_match!], which leads to the following simple function: *) @@ -526,15 +656,14 @@ Goal False. | [ |- _ ] => printf "branch 3" end. Abort. -(** match!] is useful as soon as matching the syntax is not enough, and we - need additional tests to see if we have picked the good branch or not. +(** [match!] is useful as soon as matching the syntax is not enough, and we + need additional tests to see if we have picked the right branch or not. Indeed, if such a test fails raising an exception (or we make it so), then [match!] will backtrack, and look for the next branch matching the pattern. A common application of [match!] is to match the goal for hypotheses, then - do extra-test to decide what to do, or ensure we have picked the good ones. - If we have not, failing or triggering failure, then enables to backtrack and - to try the next possible hypotheses. + do extra tests to decide what to do, or ensure we have picked the right ones. + If we have not, failing triggers backtracking to the next candidate hypothesis. A basic example is to recode a simple [eassumption] tactic, that tries to solve the goal with [exact P] for all hypotheses [P]. @@ -582,7 +711,7 @@ Goal False. end; fail. Abort. -(** In contrast, when the composition fails [multi_match!] will further bracktrack +(** In contrast, when the composition fails [multi_match!] will further backtrack to its choice of branch, in this case the second one, and try the next matching branch. The idea is that picking a different branch could have led to the subsequent tactic to succeed, as it can happen when using [constructor]. @@ -608,7 +737,7 @@ Abort. form [A \/ B]. We can only know if we have picked the good one, once chained with another tactic that tries to solve the goal. We hence need to use [multi_match!] as if we have picked the wrong side - to prove, we want to backtrack to pick the otherwiside. + to prove, we want to backtrack to pick the other side. This leads to the following small script, improved with printing to display the backtracking structure: