rocq-prover · thomas-lamiaux · Apr 28, 2026 · Apr 28, 2026 · Apr 28, 2026
diff --git a/src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v b/src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v
@@ -9,7 +9,7 @@
   making it easy to build decision procedures or solvers.
   There are three primitives to match terms or goals [lazy_match!], [match!],
   and [multi_match!], which only differ by their backtracking abilities.
-  (Not to confuse with [match] which is for Ltac2 inductive types.)
+  (Not to be confused with [match] which is for Ltac2 inductive types.)
   Consequently, we first explain how matching terms and goals work using
   [lazy_match!], then discuss the backtracking differences in a third section.
 
@@ -18,6 +18,8 @@
   - 1. Matching Terms
     - 1.1 Basics
     - 1.2 Non-Linear Matching
+    - 1.3 Matching Bound Variables
+    - 1.4 Using the Unsafe API to Access the Term Structure
   - 2. Matching Goals
     - 1.1 Basics
     - 1.2 Non-Linear Matching
@@ -38,7 +40,7 @@
 *)
 
 (** Let us first import Ltac2, and write a small function printing the goal under focus. *)
-From Ltac2 Require Import Ltac2 Printf.
+From Ltac2 Require Import Ltac2 Constr Printf.
 Import Bool.BoolNotations.
 
 Ltac2 print_goals0 () :=
@@ -61,7 +63,7 @@ Ltac2 Notation print_goals := print_goals0 ().
     To match a term, the syntax is [lazy_match! t with] with one clause of
     the form [ ... => ...] per pattern to match.
     As an example, let's write a small function proving the goal provided
-    it is given an argument of type [True -> ... True -> False].
+    it is given an argument of type [True -> ... -> True -> False].
 
     We write a recursive function that matches the type of [t] obtained with
     [Constr.type : constr -> constr] against [False] and [True -> _].
@@ -118,13 +120,13 @@ Abort.
     In such cases, [x] is of type [constr], the type of Rocq terms in Ltac2.
     This is normal as [x] corresponds to a subterm of [t].
 
-    Note that as in Ocaml, variable names cannot start with an upper case letter as this
+    Note that as in OCaml, variable names cannot start with an upper case letter as this
     is reserved for constructors of inductive types.
 
     As an example consider writing a boolean (in Ltac2, not in Rocq) test to check if a type is a
     proposition written out of [True], [False], [~] ,[/\] ,[\/].
     To check a term is a proposition, we need to match it to inspect its head symbol,
-    and if it is one of the allowed one check its subterms also are propositions.
+    and if it is one of the allowed ones check its subterms also are propositions.
     Consequently, we need to remember subterms, and to match [/\] we need to
     use the pattern [?a /\ ?b]. This gives the following recursive function:
 *)
@@ -138,7 +140,7 @@ Ltac2 rec is_proposition (t : constr) :=
   | _ => false
   end.
 
-(** To test it, lets us write a small printing function. *)
+(** To test it, let us write a small printing function. *)
 Ltac2 check_is_proposition (t : constr) :=
   if is_proposition t
   then printf "%t is a proposition" t
@@ -158,7 +160,7 @@ Abort.
 
     For instance, [fun x => x] and [fun y => y] are equal syntactically as they
     only differ by their names of bound variables (α-conversion), but [1 + 4]
-    and [5] are not equal syntactically as they only are equal up to reduction.
+    and [5] are not equal syntactically as they are only equal up to reduction.
     As a consequence, a small variation around a term that would require a
     reduction step like [(fun _ => False) 0] will not be matched by [False].
 *)
@@ -221,7 +223,7 @@ Abort.
     to constrain the shape of the term matched, like in [is_refl_eq], but no further.
     In this case, the warning can be easily disable by naming unused variables
     starting with [?_] rather than with [?].
-    For instance, by matching for [?_x = ?_x] rather than [?x = ?x] is [is_refl_eq].
+    For instance, by matching for [?_x = ?_x] rather than [?x = ?x] in [is_refl_eq].
 *)
 
 Ltac2 is_refl_eq' (t : constr) :=
@@ -234,11 +236,139 @@ Goal (1 = 1) -> False.
   intros H. is_refl_eq (Constr.type 'H).
 Abort.
 
+(** ** 1.3 Matching Bound Variables
+
+    When a term contains a binder, like [let var := expr in body] or [fun var => body],
+    the bound variable [var] only exists inside the binder's scope.
+    This raises a difficulty: if we match [let a := ?expr in ?body], then [body]
+    is the raw body term, in which [var] appears as a free variable.
+    We cannot apply [body] to a different term to substitute for [var], because
+    [var] is not a Ltac2 value we can pass around — it is just a name for the binder.
+
+    The solution is to capture the body as a function abstracted over the bound
+    variable, using the special pattern [@?body var].
+    In a pattern like [let var := ?expr in @?body var]:
+    - [?expr] is matched as usual and captures the definition,
+    - [var] names the bound variable of the let-in,
+    - [@?body var] captures the body as a function: [body] is bound to
+      [fun var => <body>], i.e. the body with [var] abstracted out as a lambda.
+
+    After the match, [body] is a [constr] representing a function, and we can
+    substitute any term [t] for the former bound variable by applying [$body $t].
+    The same pattern works for matching under [fun], [forall], and other binders.
+
+    As a concrete example, let us write a tactic [simplify_let] that takes a
+    hypothesis [h] whose type is a let-in [let var := expr in body[var]], and turns
+    it into [body[x']], where [x' := expr] is a new shared definition introduced
+    in the whole context and goal.
+
+    To do so, we recover the type of [h], and match it against the pattern [let var := ?expr in @?body var].
+    After matching, [expr] is the definition and [body] represents [fun var => body].
+    We then:
+    1. Create a new shared definition [x' := expr] in the whole context and goal
+       with [set (x' := $expr) in *].
+    2. Retrieve the Ltac2 term for the new hypothesis [x'] with [Control.hyp @x'].
+    3. Replace the type of [h] with [$body $x] using [change ($body $x) in h].
+      This creates a beta redex that we then reduce with [lazy head beta in h].
+*)
+
+Ltac2 simplify_let0 (h : ident) : unit :=
+  let type_h := type (Control.hyp h) in
+  lazy_match! type_h with
+  | let var := ?expr in @?body var =>
+      printf "the body is :%t" body;
+      set (x' := $expr) in *;
+      let x := Control.hyp @x' in
+      change ($body $x) in h;
+      lazy head beta in h
+  | _ => Control.zero (Tactic_failure (Some (fprintf "the type %t of %I is not a letin" type_h h)))
+  end.
+
+Ltac2 Notation "simplify_let" h(ident) := simplify_let0 h.
+
+Goal forall x y : nat, (let a := x + 2 in let b := y + 1 in a = b) -> True.
+  intros x y h.
+  Fail simplify_let x.
+  simplify_let h.
+Abort.
+
+(** ** 1.4 Using the Unsafe API to Access the Term Structure
+
+    An alternative to [lazy_match!] for inspecting terms is the [Unsafe] submodule
+    of [Constr], which exposes the kernel-level representation of terms through the
+    function [Unsafe.kind : constr -> kind].
+    The type [kind] is an Ltac2 inductive type whose constructors correspond to the
+    kernel term formers: [App], [Lambda], [Prod], [LetIn], [Rel], [Var], [Constant], etc.
+    The full list is available in the Ltac2 standard library at:
+    https://github.com/rocq-prover/rocq/blob/master/theories/Ltac2/Constr.v#L65
+
+    This API is called [Unsafe] because it exposes De Bruijn indices directly.
+    In Rocq's kernel, bound variables are represented not by names but by De Bruijn
+    indices — natural numbers counting the number of enclosing binders between the
+    variable occurrence and the binder that introduced it counting from 1.
+    For instance, in [fun x => x], the body [x] is stored as [Rel 1], meaning
+    "the variable bound by the immediately enclosing binder".
+    If these indices are managed incorrectly — for example by shifting them by the
+    wrong amount when going under a binder — the result is a not well-scoped term
+    in which an index refers to the wrong binder or to no binder at all, producing
+    an ill-scoped term. Hence the [Unsafe] qualifier.
+
+    Because [Unsafe.kind] returns an ordinary Ltac2 value of type [kind], one matches
+    on it with a regular Ltac2 [match], not with [lazy_match!], which is specific to
+    matching Rocq terms against surface-syntax patterns.
+
+    The main difference with using [lazy_match!] is that:
+    1. [lazy_match!] (and its siblings [match!] and [multi_match!]) match
+       Rocq terms against patterns. Any number of branches can succeed, or none.
+       Moreover, variants of [lazy_match!] discussed in the next section can do
+       backtracking to try the next pattern.
+    2. A plain Ltac2 [match] is a standard ML-style pattern match on an inductive value:
+       only one branch is taken, and no backtracking is involved.
+*)
+
+Import Unsafe.
+
+(** In the case of [simplify_let], the main interest of using the [Unsafe] API,
+    is that we can directly access the structure of the let in, and do the
+    substitution ourselves instead of relying on higher-level tactics like
+    [lazy beta iota] which may not always fit perfectly our purpose.
+
+    Here, we can reimplement [simplify_let_bis] by applying [Unsafe.kind] to the
+    type of [h] and matching the result against [LetIn _ expr body].
+    This gives us [expr] — the bound expression — and [body] — the raw kernel
+    term in which the bound variable appears as [Rel 1].
+    We then introduce [x' := expr] into the context with [set], recover the
+    corresponding hypothesis, and call [substnl [x'] 0 body] to replace every
+    occurrence of [Rel 1] in [body] by [x'] in one step, producing [new_body].
+    Finally, [change ($new_body) in h] rewrites the type of [h] to [new_body],
+    with no intermediate beta redex to reduce.
+*)
+Ltac2 simplify_let_bis0 (h : ident) : unit :=
+  let type_h := type (Control.hyp h) in
+  match Unsafe.kind type_h with
+  | LetIn _ expr body =>
+      set (x' := $expr) in *;
+      let x' := Control.hyp @x' in
+      let new_body := substnl [x'] 0 body in
+      change ($new_body) in h
+  | _ => Control.zero (Tactic_failure (Some (fprintf "the type %t of %I is not a letin" type_h h)))
+
+  end.
+
+Ltac2 Notation "simplify_let_bis" h(ident) := simplify_let_bis0 h.
+
+Goal forall x y : nat, (let a := x + 2 in let b := y + 1 in a = b) -> True.
+  intros x y h.
+  simplify_let_bis h.
+Abort.
+
+
+
 (** 2. Matching Goals
 
     2.1 Basics
 
-    Ltac2 also offers the possibility to match the goals to inspect the form the goal
+    Ltac2 also offers the possibility to match the goal to inspect the form of the goal
     to prove, and/or the existence of particular hypotheses like a proof of [False].
     The syntax to match goals is a bit different from matching terms.
     In this case, the patterns are of the form:
@@ -247,23 +377,23 @@ Abort.
 
     where:
     - [x1] ... [xn] are [ident], i.e. Ltac2 values corresponding to the names
-      of the hypotheses. Opposite to variables, they should not start with [?].
+      of the hypotheses. Unlike variables, they should not start with [?].
     - [t1] ... [tn] are the types of the hypotheses, and [g] the type of the goal
       we want to prove. All are of type [constr], the type of Rocq terms in Ltac2.
     - As usual [ident] --- [x1], ..., [xn] --- and any variables that could appear
       in [t1], ..., [tn], and [g], cannot start with an upper case letter as
       it is reserved for constructors.
 
-    Such a pattern will match the conclusion of the goal of type [G], and such that
-    can be found [n] different hypotheses of types [T1], ..., [Tn].
+    Such a pattern will match a goal with conclusion of type [G] where
+    [n] different hypotheses of types [T1], ..., [Tn] can be found in the context.
     Each clause must match a different hypothesis for the pattern to succeed.
     Consequently, there must be at least [n] different hypotheses / assumptions
     in the context for such a branch to have a chance to succeed.
     Moreover, the fallback wildcard pattern [_] we used to match terms is
     now [ |- _] as this matches any goals.
 
     As an example, let us write a small function starting with [lazy_match! goal with]
-    to inspect if the conlusion of the goal is either [_ /\ _] or [_ \/ _].
+    to inspect whether the conclusion of the goal is either [_ /\ _] or [_ \/ _].
 
     As we want to inspect the conclusion of the goal but not the hypotheses,
     our pattern is of the form [ |- g]. To match for [_ /\ _] and [_ \/ _],
@@ -361,13 +491,13 @@ Abort.
 
 (** As for matching terms, matching goals is by default syntactic.
     However, matching for non-linear variables is a bit more involved.
-    In the non-linear case, variable are matched up to conversions when they appear
-    in different clause, and up to syntax when they appear in the same clause.
+    In the non-linear case, variables are matched up to conversion when they appear
+    in different clauses, and up to syntax when they appear in the same clause.
 
     To understand this better, let us look at an example.
     We could write a version of [eassumption] by matching for the pattern [_ : ?t |- ?t].
     In this case, as [?t] appears in different clauses, one of hypotheses and
-    in then conclusion, it will hence be matched up to conversion.
+    in the conclusion, it will hence be matched up to conversion.
     We can check it works by supposing [0 + 1 = 0] and trying to prove [1 = 0],
     as [0 + 1] is equal to [1] up to conversion but not syntactically.
 *)
@@ -382,7 +512,7 @@ Abort.
 
 (** However, if a variable appears several times in a same clause, then it is
     compared syntactically. For instance, in [ |- ?t = ?t], [?t] is compared
-    syntactically as the it appears twice in the goal which forms one clause.
+    syntactically as it appears twice in the goal which forms one clause.
 *)
 
 Goal 1 + 1 = 2.
@@ -395,7 +525,7 @@ Abort.
 (** A subtlety to understand is that only the hole associated to a variable will
     be compared up to conversion, other symbols are matched syntactically as usual.
     For instance, if we match for [?t, ~(?t)], the symbol [~] will matched,
-    if one if found then inside the clause ~(X) that it will be compared
+    if one is found, then inside the clause ~(X) it will be compared
     up to conversion with [?t].
 
     Consequently, if we have [P], it will fail to match for [P -> False] as
@@ -423,13 +553,13 @@ Goal forall P, P -> ~((fun _ => P) 0) -> False.
 Abort.
 
 (** It often happens that non-linear matching is not precise enough for our
-    purpose, either because we would like to match both term in one clause
-    up to conversion, or because we would like to use a different notation of
-    equality of non-linear occurencess like syntactic equality, equality up to
+    purpose, either because we would like to match both terms in one clause
+    up to conversion, or because we would like to use a different notion of
+    equality for non-linear occurrences, like syntactic equality, equality up to
     some reduction, or even up to unification.
 
     In this case, the best approach is to use different variables for each
-    occurences we want to compared, then call the comparaison we wish.
+    occurrence we want to compare, then call the comparison we wish.
     For instance, to match for [P] and [~P] up to unification, we would:
     match for a pair of variables [t1], [t2] then call a unification function
     to check if one of the hypotheses is indeed a negation of the other.
@@ -445,22 +575,22 @@ Goal forall P, P -> (P -> False) -> False.
   end.
 Abort.
 
-(** The advantage of this method is that it provides the user with a fine grained
-    control to the user when and how reduction and equality tests are performed.
+(** The advantage of this method is that it provides the user with fine-grained
+    control over when and how reduction and equality tests are performed.
 *)
 
 (* 3. Backtracking and [lazy_match!], [match!], [multi_match!] *)
 
 (** 3.1 [lazy_match!]
 
     [lazy_match!] is the easiest command to understand and to use.
-    [lazy_match!] picks a branch, and sticks to it to even if the code excuted
+    [lazy_match!] picks a branch, and sticks to it even if the code executed
     after picking this branch (the body of the branch) leads to a failure.
     It will not backtrack to pick another branch if a choice leads to a failure.
 
     For instance, in the example below, it picks the first branch as everything
     matches [[ |- _]]. It prints "branch 1", then fails. As no backtracking is
-    allowed, it stick to this choice and fails.
+    allowed, it sticks to this choice and fails.
 *)
 
 Goal False.
@@ -473,7 +603,7 @@ Abort.
 
 (** [lazy_match!] should be considered as the default, as it is easy to understand
     (no backtracking) which prevents unexpected behaviour, and yet sufficient
-    for all applications where matching the syntax is a sufficient to decide what to do.
+    for all applications where matching the syntax is sufficient to decide what to do.
 
     A common use of [lazy_match!] is to make a decision based on the shape of the
     goal or the shape of a term or type.
@@ -482,7 +612,7 @@ Abort.
     variables and hypotheses with [intros ?] if the goal is of the form [A -> B],
     splits the goal with [split] if it is a conjunction [A /\ B], and recursively
     simplify the new goals.
-    In both case, the syntactic equality test is sufficient to decide what to do
+    In both cases, the syntactic equality test is sufficient to decide what to do
     as if it is of the required form, then the branch will succeed.
     One should hence use [lazy_match!], which leads to the following simple function:
 *)
@@ -526,15 +656,14 @@ Goal False.
   | [ |- _ ] => printf "branch 3"
   end.
 Abort.
-(** match!] is useful as soon as matching the syntax is not enough, and we
-    need additional tests to see if we have picked the good branch or not.
+(** [match!] is useful as soon as matching the syntax is not enough, and we
+    need additional tests to see if we have picked the right branch or not.
     Indeed, if such a test fails raising an exception (or we make it so), then
     [match!] will backtrack, and look for the next branch matching the pattern.
 
     A common application of [match!] is to match the goal for hypotheses, then
-    do extra-test to decide what to do, or ensure we have picked the good ones.
-    If we have not, failing or triggering failure, then enables to backtrack and
-    to try the next possible hypotheses.
+    do extra tests to decide what to do, or ensure we have picked the right ones.
+    If we have not, failing triggers backtracking to the next candidate hypothesis.
 
     A basic example is to recode a simple [eassumption] tactic, that tries
     to solve the goal with [exact P] for all hypotheses [P].
@@ -582,7 +711,7 @@ Goal False.
   end; fail.
 Abort.
 
-(** In contrast, when the composition fails [multi_match!] will further bracktrack
+(** In contrast, when the composition fails [multi_match!] will further backtrack
     to its choice of branch, in this case the second one, and try the next matching branch.
     The idea is that picking a different branch could have led to the subsequent
     tactic to succeed, as it can happen when using [constructor].
@@ -608,7 +737,7 @@ Abort.
     form [A \/ B]. We can only know if we have picked the good one, once chained
     with another tactic that tries to solve the goal.
     We hence need to use [multi_match!] as if we have picked the wrong side
-    to prove, we want to backtrack to pick the otherwiside.
+    to prove, we want to backtrack to pick the other side.
 
     This leads to the following small script, improved with printing to display
     the backtracking structure: