Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
199 changes: 164 additions & 35 deletions src/metaprogramming/ltac2/tutorial_matching_terms_and_goals.v
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
making it easy to build decision procedures or solvers.
There are three primitives to match terms or goals [lazy_match!], [match!],
and [multi_match!], which only differ by their backtracking abilities.
(Not to confuse with [match] which is for Ltac2 inductive types.)
(Not to be confused with [match] which is for Ltac2 inductive types.)
Consequently, we first explain how matching terms and goals work using
[lazy_match!], then discuss the backtracking differences in a third section.

Expand All @@ -18,6 +18,8 @@
- 1. Matching Terms
- 1.1 Basics
- 1.2 Non-Linear Matching
- 1.3 Matching Bound Variables
- 1.4 Using the Unsafe API to Access the Term Structure
- 2. Matching Goals
- 1.1 Basics
- 1.2 Non-Linear Matching
Expand All @@ -38,7 +40,7 @@
*)

(** Let us first import Ltac2, and write a small function printing the goal under focus. *)
From Ltac2 Require Import Ltac2 Printf.
From Ltac2 Require Import Ltac2 Constr Printf.
Import Bool.BoolNotations.

Ltac2 print_goals0 () :=
Expand All @@ -61,7 +63,7 @@ Ltac2 Notation print_goals := print_goals0 ().
To match a term, the syntax is [lazy_match! t with] with one clause of
the form [ ... => ...] per pattern to match.
As an example, let's write a small function proving the goal provided
it is given an argument of type [True -> ... True -> False].
it is given an argument of type [True -> ... -> True -> False].

We write a recursive function that matches the type of [t] obtained with
[Constr.type : constr -> constr] against [False] and [True -> _].
Expand Down Expand Up @@ -118,13 +120,13 @@ Abort.
In such cases, [x] is of type [constr], the type of Rocq terms in Ltac2.
This is normal as [x] corresponds to a subterm of [t].

Note that as in Ocaml, variable names cannot start with an upper case letter as this
Note that as in OCaml, variable names cannot start with an upper case letter as this
is reserved for constructors of inductive types.

As an example consider writing a boolean (in Ltac2, not in Rocq) test to check if a type is a
proposition written out of [True], [False], [~] ,[/\] ,[\/].
To check a term is a proposition, we need to match it to inspect its head symbol,
and if it is one of the allowed one check its subterms also are propositions.
and if it is one of the allowed ones check its subterms also are propositions.
Consequently, we need to remember subterms, and to match [/\] we need to
use the pattern [?a /\ ?b]. This gives the following recursive function:
*)
Expand All @@ -138,7 +140,7 @@ Ltac2 rec is_proposition (t : constr) :=
| _ => false
end.

(** To test it, lets us write a small printing function. *)
(** To test it, let us write a small printing function. *)
Ltac2 check_is_proposition (t : constr) :=
if is_proposition t
then printf "%t is a proposition" t
Expand All @@ -158,7 +160,7 @@ Abort.

For instance, [fun x => x] and [fun y => y] are equal syntactically as they
only differ by their names of bound variables (α-conversion), but [1 + 4]
and [5] are not equal syntactically as they only are equal up to reduction.
and [5] are not equal syntactically as they are only equal up to reduction.
As a consequence, a small variation around a term that would require a
reduction step like [(fun _ => False) 0] will not be matched by [False].
*)
Expand Down Expand Up @@ -221,7 +223,7 @@ Abort.
to constrain the shape of the term matched, like in [is_refl_eq], but no further.
In this case, the warning can be easily disable by naming unused variables
starting with [?_] rather than with [?].
For instance, by matching for [?_x = ?_x] rather than [?x = ?x] is [is_refl_eq].
For instance, by matching for [?_x = ?_x] rather than [?x = ?x] in [is_refl_eq].
*)

Ltac2 is_refl_eq' (t : constr) :=
Expand All @@ -234,11 +236,139 @@ Goal (1 = 1) -> False.
intros H. is_refl_eq (Constr.type 'H).
Abort.

(** ** 1.3 Matching Bound Variables

When a term contains a binder, like [let var := expr in body] or [fun var => body],
the bound variable [var] only exists inside the binder's scope.
This raises a difficulty: if we match [let a := ?expr in ?body], then [body]
is the raw body term, in which [var] appears as a free variable.
We cannot apply [body] to a different term to substitute for [var], because
[var] is not a Ltac2 value we can pass around — it is just a name for the binder.

The solution is to capture the body as a function abstracted over the bound
variable, using the special pattern [@?body var].
In a pattern like [let var := ?expr in @?body var]:
- [?expr] is matched as usual and captures the definition,
- [var] names the bound variable of the let-in,
- [@?body var] captures the body as a function: [body] is bound to
[fun var => <body>], i.e. the body with [var] abstracted out as a lambda.

After the match, [body] is a [constr] representing a function, and we can
substitute any term [t] for the former bound variable by applying [$body $t].
The same pattern works for matching under [fun], [forall], and other binders.

As a concrete example, let us write a tactic [simplify_let] that takes a
hypothesis [h] whose type is a let-in [let var := expr in body[var]], and turns
it into [body[x']], where [x' := expr] is a new shared definition introduced
in the whole context and goal.

To do so, we recover the type of [h], and match it against the pattern [let var := ?expr in @?body var].
After matching, [expr] is the definition and [body] represents [fun var => body].
We then:
1. Create a new shared definition [x' := expr] in the whole context and goal
with [set (x' := $expr) in *].
2. Retrieve the Ltac2 term for the new hypothesis [x'] with [Control.hyp @x'].
3. Replace the type of [h] with [$body $x] using [change ($body $x) in h].
This creates a beta redex that we then reduce with [lazy head beta in h].
*)

Ltac2 simplify_let0 (h : ident) : unit :=
let type_h := type (Control.hyp h) in
lazy_match! type_h with
| let var := ?expr in @?body var =>
printf "the body is :%t" body;
set (x' := $expr) in *;
let x := Control.hyp @x' in
change ($body $x) in h;
lazy head beta in h
| _ => Control.zero (Tactic_failure (Some (fprintf "the type %t of %I is not a letin" type_h h)))
end.

Ltac2 Notation "simplify_let" h(ident) := simplify_let0 h.

Goal forall x y : nat, (let a := x + 2 in let b := y + 1 in a = b) -> True.
intros x y h.
Fail simplify_let x.
simplify_let h.
Abort.

(** ** 1.4 Using the Unsafe API to Access the Term Structure

An alternative to [lazy_match!] for inspecting terms is the [Unsafe] submodule
of [Constr], which exposes the kernel-level representation of terms through the
function [Unsafe.kind : constr -> kind].
The type [kind] is an Ltac2 inductive type whose constructors correspond to the
kernel term formers: [App], [Lambda], [Prod], [LetIn], [Rel], [Var], [Constant], etc.
The full list is available in the Ltac2 standard library at:
https://github.com/rocq-prover/rocq/blob/master/theories/Ltac2/Constr.v#L65

This API is called [Unsafe] because it exposes De Bruijn indices directly.
In Rocq's kernel, bound variables are represented not by names but by De Bruijn
indices — natural numbers counting the number of enclosing binders between the
variable occurrence and the binder that introduced it counting from 1.
For instance, in [fun x => x], the body [x] is stored as [Rel 1], meaning
"the variable bound by the immediately enclosing binder".
If these indices are managed incorrectly — for example by shifting them by the
wrong amount when going under a binder — the result is a not well-scoped term
in which an index refers to the wrong binder or to no binder at all, producing
an ill-scoped term. Hence the [Unsafe] qualifier.

Because [Unsafe.kind] returns an ordinary Ltac2 value of type [kind], one matches
on it with a regular Ltac2 [match], not with [lazy_match!], which is specific to
matching Rocq terms against surface-syntax patterns.

The main difference with using [lazy_match!] is that:
1. [lazy_match!] (and its siblings [match!] and [multi_match!]) match
Rocq terms against patterns. Any number of branches can succeed, or none.
Moreover, variants of [lazy_match!] discussed in the next section can do
backtracking to try the next pattern.
2. A plain Ltac2 [match] is a standard ML-style pattern match on an inductive value:
only one branch is taken, and no backtracking is involved.
*)

Import Unsafe.

(** In the case of [simplify_let], the main interest of using the [Unsafe] API,
is that we can directly access the structure of the let in, and do the
substitution ourselves instead of relying on higher-level tactics like
[lazy beta iota] which may not always fit perfectly our purpose.

Here, we can reimplement [simplify_let_bis] by applying [Unsafe.kind] to the
type of [h] and matching the result against [LetIn _ expr body].
This gives us [expr] — the bound expression — and [body] — the raw kernel
term in which the bound variable appears as [Rel 1].
We then introduce [x' := expr] into the context with [set], recover the
corresponding hypothesis, and call [substnl [x'] 0 body] to replace every
occurrence of [Rel 1] in [body] by [x'] in one step, producing [new_body].
Finally, [change ($new_body) in h] rewrites the type of [h] to [new_body],
with no intermediate beta redex to reduce.
*)
Ltac2 simplify_let_bis0 (h : ident) : unit :=
let type_h := type (Control.hyp h) in
match Unsafe.kind type_h with
| LetIn _ expr body =>
set (x' := $expr) in *;
let x' := Control.hyp @x' in
let new_body := substnl [x'] 0 body in
change ($new_body) in h
| _ => Control.zero (Tactic_failure (Some (fprintf "the type %t of %I is not a letin" type_h h)))

end.

Ltac2 Notation "simplify_let_bis" h(ident) := simplify_let_bis0 h.

Goal forall x y : nat, (let a := x + 2 in let b := y + 1 in a = b) -> True.
intros x y h.
simplify_let_bis h.
Abort.



(** 2. Matching Goals

2.1 Basics

Ltac2 also offers the possibility to match the goals to inspect the form the goal
Ltac2 also offers the possibility to match the goal to inspect the form of the goal
to prove, and/or the existence of particular hypotheses like a proof of [False].
The syntax to match goals is a bit different from matching terms.
In this case, the patterns are of the form:
Expand All @@ -247,23 +377,23 @@ Abort.

where:
- [x1] ... [xn] are [ident], i.e. Ltac2 values corresponding to the names
of the hypotheses. Opposite to variables, they should not start with [?].
of the hypotheses. Unlike variables, they should not start with [?].
- [t1] ... [tn] are the types of the hypotheses, and [g] the type of the goal
we want to prove. All are of type [constr], the type of Rocq terms in Ltac2.
- As usual [ident] --- [x1], ..., [xn] --- and any variables that could appear
in [t1], ..., [tn], and [g], cannot start with an upper case letter as
it is reserved for constructors.

Such a pattern will match the conclusion of the goal of type [G], and such that
can be found [n] different hypotheses of types [T1], ..., [Tn].
Such a pattern will match a goal with conclusion of type [G] where
[n] different hypotheses of types [T1], ..., [Tn] can be found in the context.
Each clause must match a different hypothesis for the pattern to succeed.
Consequently, there must be at least [n] different hypotheses / assumptions
in the context for such a branch to have a chance to succeed.
Moreover, the fallback wildcard pattern [_] we used to match terms is
now [ |- _] as this matches any goals.

As an example, let us write a small function starting with [lazy_match! goal with]
to inspect if the conlusion of the goal is either [_ /\ _] or [_ \/ _].
to inspect whether the conclusion of the goal is either [_ /\ _] or [_ \/ _].

As we want to inspect the conclusion of the goal but not the hypotheses,
our pattern is of the form [ |- g]. To match for [_ /\ _] and [_ \/ _],
Expand Down Expand Up @@ -361,13 +491,13 @@ Abort.

(** As for matching terms, matching goals is by default syntactic.
However, matching for non-linear variables is a bit more involved.
In the non-linear case, variable are matched up to conversions when they appear
in different clause, and up to syntax when they appear in the same clause.
In the non-linear case, variables are matched up to conversion when they appear
in different clauses, and up to syntax when they appear in the same clause.

To understand this better, let us look at an example.
We could write a version of [eassumption] by matching for the pattern [_ : ?t |- ?t].
In this case, as [?t] appears in different clauses, one of hypotheses and
in then conclusion, it will hence be matched up to conversion.
in the conclusion, it will hence be matched up to conversion.
We can check it works by supposing [0 + 1 = 0] and trying to prove [1 = 0],
as [0 + 1] is equal to [1] up to conversion but not syntactically.
*)
Expand All @@ -382,7 +512,7 @@ Abort.

(** However, if a variable appears several times in a same clause, then it is
compared syntactically. For instance, in [ |- ?t = ?t], [?t] is compared
syntactically as the it appears twice in the goal which forms one clause.
syntactically as it appears twice in the goal which forms one clause.
*)

Goal 1 + 1 = 2.
Expand All @@ -395,7 +525,7 @@ Abort.
(** A subtlety to understand is that only the hole associated to a variable will
be compared up to conversion, other symbols are matched syntactically as usual.
For instance, if we match for [?t, ~(?t)], the symbol [~] will matched,
if one if found then inside the clause ~(X) that it will be compared
if one is found, then inside the clause ~(X) it will be compared
up to conversion with [?t].

Consequently, if we have [P], it will fail to match for [P -> False] as
Expand Down Expand Up @@ -423,13 +553,13 @@ Goal forall P, P -> ~((fun _ => P) 0) -> False.
Abort.

(** It often happens that non-linear matching is not precise enough for our
purpose, either because we would like to match both term in one clause
up to conversion, or because we would like to use a different notation of
equality of non-linear occurencess like syntactic equality, equality up to
purpose, either because we would like to match both terms in one clause
up to conversion, or because we would like to use a different notion of
equality for non-linear occurrences, like syntactic equality, equality up to
some reduction, or even up to unification.

In this case, the best approach is to use different variables for each
occurences we want to compared, then call the comparaison we wish.
occurrence we want to compare, then call the comparison we wish.
For instance, to match for [P] and [~P] up to unification, we would:
match for a pair of variables [t1], [t2] then call a unification function
to check if one of the hypotheses is indeed a negation of the other.
Expand All @@ -445,22 +575,22 @@ Goal forall P, P -> (P -> False) -> False.
end.
Abort.

(** The advantage of this method is that it provides the user with a fine grained
control to the user when and how reduction and equality tests are performed.
(** The advantage of this method is that it provides the user with fine-grained
control over when and how reduction and equality tests are performed.
*)

(* 3. Backtracking and [lazy_match!], [match!], [multi_match!] *)

(** 3.1 [lazy_match!]

[lazy_match!] is the easiest command to understand and to use.
[lazy_match!] picks a branch, and sticks to it to even if the code excuted
[lazy_match!] picks a branch, and sticks to it even if the code executed
after picking this branch (the body of the branch) leads to a failure.
It will not backtrack to pick another branch if a choice leads to a failure.

For instance, in the example below, it picks the first branch as everything
matches [[ |- _]]. It prints "branch 1", then fails. As no backtracking is
allowed, it stick to this choice and fails.
allowed, it sticks to this choice and fails.
*)

Goal False.
Expand All @@ -473,7 +603,7 @@ Abort.

(** [lazy_match!] should be considered as the default, as it is easy to understand
(no backtracking) which prevents unexpected behaviour, and yet sufficient
for all applications where matching the syntax is a sufficient to decide what to do.
for all applications where matching the syntax is sufficient to decide what to do.

A common use of [lazy_match!] is to make a decision based on the shape of the
goal or the shape of a term or type.
Expand All @@ -482,7 +612,7 @@ Abort.
variables and hypotheses with [intros ?] if the goal is of the form [A -> B],
splits the goal with [split] if it is a conjunction [A /\ B], and recursively
simplify the new goals.
In both case, the syntactic equality test is sufficient to decide what to do
In both cases, the syntactic equality test is sufficient to decide what to do
as if it is of the required form, then the branch will succeed.
One should hence use [lazy_match!], which leads to the following simple function:
*)
Expand Down Expand Up @@ -526,15 +656,14 @@ Goal False.
| [ |- _ ] => printf "branch 3"
end.
Abort.
(** match!] is useful as soon as matching the syntax is not enough, and we
need additional tests to see if we have picked the good branch or not.
(** [match!] is useful as soon as matching the syntax is not enough, and we
need additional tests to see if we have picked the right branch or not.
Indeed, if such a test fails raising an exception (or we make it so), then
[match!] will backtrack, and look for the next branch matching the pattern.

A common application of [match!] is to match the goal for hypotheses, then
do extra-test to decide what to do, or ensure we have picked the good ones.
If we have not, failing or triggering failure, then enables to backtrack and
to try the next possible hypotheses.
do extra tests to decide what to do, or ensure we have picked the right ones.
If we have not, failing triggers backtracking to the next candidate hypothesis.

A basic example is to recode a simple [eassumption] tactic, that tries
to solve the goal with [exact P] for all hypotheses [P].
Expand Down Expand Up @@ -582,7 +711,7 @@ Goal False.
end; fail.
Abort.

(** In contrast, when the composition fails [multi_match!] will further bracktrack
(** In contrast, when the composition fails [multi_match!] will further backtrack
to its choice of branch, in this case the second one, and try the next matching branch.
The idea is that picking a different branch could have led to the subsequent
tactic to succeed, as it can happen when using [constructor].
Expand All @@ -608,7 +737,7 @@ Abort.
form [A \/ B]. We can only know if we have picked the good one, once chained
with another tactic that tries to solve the goal.
We hence need to use [multi_match!] as if we have picked the wrong side
to prove, we want to backtrack to pick the otherwiside.
to prove, we want to backtrack to pick the other side.

This leads to the following small script, improved with printing to display
the backtracking structure:
Expand Down
Loading