An exploration of well-founded recursion
Table of Contents
1 Introduction
Using well-founded relations to assist the termination checker of Agda is something I have always meant to investigate, but never quite gotten around to. This post attempts to rectify that.
First, I must give credit to the example code posted
by Sergei Romanenko,
which explores the technique using a log₂
algorithm;
my own developments were written by using that as an example.
2 Agda header
We use several facts about natural numbers, specfically
regarding <
and ≤
(from which <
is derived) below.
module WellFoundedRecursion where open import Data.Empty using (⊥) open import Data.Unit using (⊤) open import Data.Nat using ( ℕ ; zero ; suc ; _+_ ; _∸_ ; _⊔_ ; _≤_ ; _<_ ; z≤n ; s≤s ; _≤′_ ; _<′_ ; ≤′-refl ; ≤′-step) open import Data.Nat.Properties using ( <-cmp ; ≤-step ; ≤-refl ; <-transˡ ; ≤⇒≤′ ; m∸n≤m ; m<n⇒m<n⊔o ; m<n⇒m<o⊔n ; ⊔-pres-<m) open import Data.Product using (_×_ ; _,_) open import Data.Sum using (_⊎_ ; inj₁ ; inj₂) open import Relation.Binary.Definitions using (tri< ; tri≈ ; tri>) open import Relation.Binary.PropositionalEquality using (_≡_ ; refl ; sym) open import Function using (case_of_)
3 Problem 1 – The Euclidean algorithm
3.1 Preamble
Euclid's GCD is a well known and elegantly simple algorithm.
It hinges upon the fact that
given natural numbers m
and n
with m < n
, GCD(m,n) = GCD(m,n-m)
.
Note also that if m = n
, then GCD(m,n) = m = n
,
and if n = 0
, then GCD(m,n) = m
.
Making use of the three-way comparison function <-cmp
,
we can easily transcribe this algorithm into Agda.
module WithoutWFR where {-# TERMINATING #-} EuclidGCD : ℕ → ℕ → ℕ EuclidGCD zero m = m EuclidGCD n@(suc _) zero = n EuclidGCD n@(suc _) m@(suc _) = case <-cmp n m of λ { (tri< _ _ _) → EuclidGCD n (m ∸ n) ; (tri≈ _ _ _) → m ; (tri> _ _ _) → EuclidGCD (n ∸ m) m }
But Agda will not believe this is terminating, because it fails to see
in the recursive case that m ∸ n~/~n ∸ m
are
structurally smaller than m
or n
.
Note that the use of the TERMINATING
pragma does allow us to
use EuclidGCD
; unlike the NONTERMINATING
pragma, which
would prevent Agda from normalising uses of EuclidGCD
.
_ : EuclidGCD 12 20 ≡ 4 _ = refl
But it is ultimately against the spirit of verified programming in Agda to use this “just believe me” approach.
This is understandable; we must apply the proofs
of m < n
or n < m
in order to establish those facts.
If we try to state that using _<_
, we get stuck in the base cases
during the proof.
m∸n<m : {m n : ℕ} → m ∸ n < m m∸n<m {m} {zero} = ??? -- Stuck; m ≮ n m∸n<m {zero} {suc n} = ??? -- Stuck ; zero ≮ zero m∸n<m {suc m} {suc n} = <-transˡ (m∸n<m {m} {n}) -- suc m ∸ suc n = m ∸ n < m, by I.H. (≤-step ≤-refl) -- and m ≤ suc m.
To actually prove this,
we must know that both m
and n
are non-zero
—then we may use the fact
that m ∸ n
is less than or equal to to m
,
the proof of which is in the standard library and is very close
to our attempted proof above for _<_
.
Sm∸Sn<Sm : (m n : ℕ) → suc m ∸ suc n < suc m Sm∸Sn<Sm m n = s≤s (m∸n≤m m n)
With this proof in hand, we are equipped to use a well-founded relation
to rework our EuclidGCD
function so that it passes the termination checker.
We start by defining the concepts ourselves,
then we will utilise the (level-agnostic) versions from the standard library.
3.2 Solution version 1 – Defining our own notion of well-foundedness
module definingWFR where
We begin with the concept of accessibility. An element x
of A
is
accessible by relation _<_
iff every y
such that y < x
is accessible.
data Acc {A : Set} (_<_ : A → A → Set) (x : A) : Set where acc : (rs : (y : A) → y < x → Acc _<_ y) → Acc _<_ x
I'm honestly not quite certain of the reasoning behind
the name rs
; I interpret it as “Recurse on Smaller”.
This is motivated by the standard library description of
the type of rs
:
When using well-founded recursion you can recurse arbitrarily, as long as the arguments become smaller, and “smaller” is well-founded.
The standard library “extracts” that type. It is the type of “well-founded recursion”; we do the same here.
WfRec : {A : Set} → (A → A → Set) → (A → Set) → A → Set WfRec {A} _<_ P x = {y : A} → y < x → P y
Then the above definition is shortened to
data Acc {A : Set} (_<_ : A → A → Set) (x : A) : Set where acc : (rs : WfRec _<_ (Acc _<_) x) → Acc _<_ x
A relation is well-founded if all elements
of the underlying set are accessible;
that is, we will need to prove that for every x : A
,
every y < x
is accessible to
be able to call _<_
well founded.
Well-founded : {A : Set} → (A → A → Set) → Set Well-founded {A} _<_ = (x : A) → Acc _<_ x
Let us prove that less than on the naturals is well-founded. We use an alternate version of less from the standard library defined using reflexivity and “step” as constructors, which the standard library describes as
more suitable for well-founded induction (see Induction.Nat).
I did not manage to complete the proof using _<_
before turning
to this _<′_
.
<′-acc : {x y : ℕ} → y <′ x → Acc _<′_ y <′-acc .{suc y} {y} ≤′-refl = acc <′-acc <′-acc (≤′-step {_} y<′x) = <′-acc y<′x <′-well-founded : Well-founded _<′_ <′-well-founded n = acc <′-acc
Since the right side of the base case of <′-acc
is
exactly <′-well-founded
, I've seen these presented as
mutual definitions; I find it more illuminating to
detangle them.
Now, to define the Euclidean algorithm function,
we must decide upon a value upon which to base our well-founded recursion
with respect to _<_
. That is, we must decide upon a value
which is strictly decreasing at each call.
However, we are only decreasing one of the two arguments at each call,
and it is not necessarily the same argument each time.
So neither m
or n
is an adequate choice.
Instead, we may consider the maximum of the two, m ⊔ n
,
which does always decrease, since we monus a non-zero value
from the greater of the two arguments at each call.
This does require some lemmas regarding _⊔_
.
EuclidGCD′ : (m n : ℕ) → Acc _<′_ (m ⊔ n) → ℕ EuclidGCD′ zero m _ = m EuclidGCD′ n@(suc _) zero _ = n EuclidGCD′ m@(suc m₀) n@(suc n₀) (acc rs) = case <-cmp m n of λ { (tri< m<n _ _) → EuclidGCD′ m (n ∸ m) (rs {suc m₀ ⊔ (n₀ ∸ m₀)} (≤⇒≤′ (m<n⇒m<o⊔n m (⊔-pres-<m m<n (Sm∸Sn<Sm n₀ m₀))))) ; (tri≈ _ _ _) → m ; (tri> _ _ n<m) → EuclidGCD′ (m ∸ n) n (rs {m₀ ∸ n₀ ⊔ suc n₀} (≤⇒≤′ (m<n⇒m<n⊔o n (⊔-pres-<m (Sm∸Sn<Sm m₀ n₀) n<m)))) }
In the recursive cases, rs
has the type
{y : ℕ} → suc y ≤′ suc (m₀ ⊔ n₀) → Acc (λ m₁ n₁ → suc m₁ ≤′ n₁) y
We've written the implicit argument in in each use (although Agda doesn't need it to be given) for clarity.
Now, we can define a “clean” version of the Euclidean algorithm function
by utilising our proof that _<′_
is well-founded.
EuclidGCD : (m n : ℕ) → ℕ EuclidGCD m n = EuclidGCD′ m n (<′-well-founded (m ⊔ n))
3.3 Solution version 2 – Using the standard library's definitions
The standard library defines its notions of well-foundedness
and accessibility in Induction.WellFounded
,
and proves that _<′_
is well-founded
in Data.Nat.Induction
—previously it was Induction.Nat
,
which does still exist, but is deprecated.
open import Data.Nat.Induction using (<′-wellFounded) open import Induction.WellFounded using (Acc ; acc)
Aside from the fact that rs
takes an explicit argument
where we used an implicit one, and a slightly different name
for the well-foundedness of _<′_
, we are able
to echo our definitions above using the standard library.
EuclidGCD′ : (m n : ℕ) → Acc _<′_ (m ⊔ n) → ℕ EuclidGCD′ zero m _ = m EuclidGCD′ n@(suc _) zero _ = n EuclidGCD′ m@(suc m₀) n@(suc n₀) (acc rs) = case <-cmp m n of λ { (tri< m<n _ _) → EuclidGCD′ m (n ∸ m) (rs (suc m₀ ⊔ (n₀ ∸ m₀)) (≤⇒≤′ (m<n⇒m<o⊔n m (⊔-pres-<m m<n (Sm∸Sn<Sm n₀ m₀))))) ; (tri≈ _ _ _) → m ; (tri> _ _ n<m) → EuclidGCD′ (m ∸ n) n (rs (m₀ ∸ n₀ ⊔ suc n₀) (≤⇒≤′ (m<n⇒m<n⊔o n (⊔-pres-<m (Sm∸Sn<Sm m₀ n₀) n<m)))) } EuclidGCD : (m n : ℕ) → ℕ EuclidGCD m n = EuclidGCD′ m n (<′-wellFounded (m ⊔ n))
As a next step, let's move on to a more complicated domain than the natural numbers.
4 Problem 2 Draft 1 – Bintree's with union
At the time I began writing this post, I had been considering implementations of order-preserving functions on unordered binary trees in Agda as I was following along with the course CAS 763, Certified Programming with Dependent Types.
Specifically, when I began writing this, I had been working on the union function. At first, I thought a direct approach was not obviously terminating to Agda. I had seen an implementation using in-order traversals of the two trees in the course. But it turned out my direct algorithm was obviously terminating, just a bit tricker to define (I had made a mistake which made it non-terminating).
I've preserved the code here, for documentation. For an actual problem on well-founded recursion, skip to the next section, on tree balancing.
We start with a definition of (not necessarily ordered)
binary trees.
We require an ordering on the Elem
set
by way of a injective function from Elem
to ℕ
.
module BinTree₁ (Elem : Set) (# : Elem → ℕ) (#-injective : {e₁ e₂ : Elem} → # e₁ ≡ # e₂ → e₁ ≡ e₂) where data BinTree : Set where nilT : BinTree branch : BinTree → Elem → BinTree → BinTree
along with some basic operations.
[_] : Elem → BinTree [ e ] = branch nilT e nilT insert : Elem → BinTree → BinTree insert e nilT = [ e ] insert e (branch t₁ x t₂) = case <-cmp (# e) (# x) of λ { (tri< e<x ¬e≡x ¬e>x) → branch (insert e t₁) x t₂ ; (tri≈ ¬e<x e≡x ¬e>x) → branch t₁ e t₂ ; (tri> ¬e<x ¬e≡x e>x) → branch t₁ x (insert e t₂) } _∈_ : Elem → BinTree → Set e ∈ nilT = ⊥ e ∈ branch t₁ f t₂ = e ∈ t₁ ⊎ e ≡ f ⊎ e ∈ t₂
This union operation seemed at my first attempt to be
not obviously terminating; however, that was due to a mistake.
We must be careful all the ₁
's and ₂
's are correct!
This correct code does in fact terminate.
module DirectUnion where _∪_ : BinTree → BinTree → BinTree nilT ∪ t = t (branch l₁ e₁ r₁) ∪ (branch l₂ e₂ r₂) = case <-cmp (# e₁) (# e₂) of λ { (tri< _ _ _) → (insert e₂ (branch l₁ e₁ (r₁ ∪ r₂))) ∪ l₂ ; (tri≈ _ _ _) → branch (l₁ ∪ l₂) e₁ (r₁ ∪ r₂) ; (tri> _ _ _) → (insert e₂ (branch (l₁ ∪ l₂) e₁ r₁)) ∪ r₂ } t@(branch _ _ _) ∪ nilT = t
We can prove that this is the union of the two trees.
We first need a lemma about _∈_
and insert
.
In both the lemma and the theorem we must
split on a call to <-cmp
, since both insert
and <-cmp
are
defined based on a call to <-cmp
.
This does result in a large number of cases,
but they are all trivial.
insert-preserves-∈ : {a : Elem} → {t : BinTree} → a ∈ t → (e : Elem) → a ∈ (insert e t) insert-preserves-∈ {a} t@{branch l x _} (inj₁ a∈l) e with <-cmp (# e) (# x) … | tri< _ _ _ = inj₁ (insert-preserves-∈ a∈l e) … | tri≈ _ _ _ = inj₁ a∈l … | tri> _ _ _ = inj₁ a∈l insert-preserves-∈ {a} {branch _ .a _} (inj₂ (inj₁ refl)) e with <-cmp (# e) (# a) … | tri< _ _ _ = inj₂ (inj₁ refl) … | tri≈ _ #e≡#a _ = inj₂ (inj₁ (#-injective (sym #e≡#a))) … | tri> _ _ _ = inj₂ (inj₁ refl) insert-preserves-∈ {a} {branch _ x r} (inj₂ (inj₂ a∈r)) e with <-cmp (# e) (# x) … | tri< _ _ _ = inj₂ (inj₂ a∈r) … | tri≈ _ _ _ = inj₂ (inj₂ a∈r) … | tri> _ _ _ = inj₂ (inj₂ (insert-preserves-∈ a∈r e)) e∈t₁⇒e∈t₁∪t₂ : {e : Elem} → {t₁ : BinTree} → (t₂ : BinTree) → e ∈ t₁ → e ∈ (t₁ ∪ t₂) e∈t₁⇒e∈t₁∪t₂ {e} {branch l₁ _ _} nilT (inj₁ e∈l₁) = inj₁ e∈l₁ e∈t₁⇒e∈t₁∪t₂ {.e} {branch _ e _} nilT (inj₂ (inj₁ refl)) = inj₂ (inj₁ refl) e∈t₁⇒e∈t₁∪t₂ {e} {branch _ _ r₁} nilT (inj₂ (inj₂ e∈r₁)) = inj₂ (inj₂ e∈r₁) e∈t₁⇒e∈t₁∪t₂ {e} {branch l₁ f₁ r₁} (branch l₂ f₂ r₂) (inj₁ e∈l₁) with <-cmp (# f₁) (# f₂) … | tri< _ _ _ = e∈t₁⇒e∈t₁∪t₂ l₂ (insert-preserves-∈ (inj₁ e∈l₁) f₂) … | tri≈ _ _ _ = inj₁ (e∈t₁⇒e∈t₁∪t₂ l₂ e∈l₁) … | tri> _ _ _ = e∈t₁⇒e∈t₁∪t₂ r₂ (insert-preserves-∈ (inj₁ (e∈t₁⇒e∈t₁∪t₂ l₂ e∈l₁)) f₂) e∈t₁⇒e∈t₁∪t₂ {e} {branch l₁ .e r₁} (branch l₂ f₂ r₂) (inj₂ (inj₁ refl)) with <-cmp (# e) (# f₂) … | tri< _ _ _ = e∈t₁⇒e∈t₁∪t₂ l₂ (insert-preserves-∈ (inj₂ (inj₁ refl)) f₂) … | tri≈ _ _ _ = inj₂ (inj₁ refl) … | tri> _ _ _ = e∈t₁⇒e∈t₁∪t₂ r₂ (insert-preserves-∈ (inj₂ (inj₁ refl)) f₂) e∈t₁⇒e∈t₁∪t₂ {e} {branch l₁ f₁ r₁} (branch l₂ f₂ r₂) (inj₂ (inj₂ e∈r₁)) with <-cmp (# f₁) (# f₂) … | tri< _ _ _ = e∈t₁⇒e∈t₁∪t₂ l₂ (insert-preserves-∈ (inj₂ (inj₂ (e∈t₁⇒e∈t₁∪t₂ r₂ e∈r₁))) f₂) … | tri≈ _ _ _ = inj₂ (inj₂ (e∈t₁⇒e∈t₁∪t₂ r₂ e∈r₁)) … | tri> _ _ _ = e∈t₁⇒e∈t₁∪t₂ r₂ (insert-preserves-∈ (inj₂ (inj₂ e∈r₁)) f₂)
All recursive calls in our union function are made on subtrees, and a sensible decreasing value is the height of the tree.
height : BinTree → ℕ height nilT = 0 height (branch t₁ _ t₂) = suc (height t₁ ⊔ height t₂)
We quickly prove some lemmas regarding <′
, for convenience.
open import Data.Nat.Properties using (n≤m⊔n ; m≤m⊔n) n<′suc[m⊔n] : {m n : ℕ} → n <′ suc (m ⊔ n) n<′suc[m⊔n] = ≤⇒≤′ (s≤s (n≤m⊔n _ _)) m<′suc[m⊔n] : {m n : ℕ} → m <′ suc (m ⊔ n) m<′suc[m⊔n] = ≤⇒≤′ (s≤s (m≤m⊔n _ _))
And then we are ready to explicitely prove that union′
is terminating.
union′ : (t₁ t₂ : BinTree) → Acc _<′_ (height t₂) → BinTree union′ nilT t _ = t union′ (branch l₁ e₁ r₁) (branch l₂ e₂ r₂) (acc rs) = let hʳ₁ = height r₁ hʳ₂ = height r₂ hˡ₁ = height l₁ hˡ₂ = height l₂ in case <-cmp (# e₁) (# e₂) of λ { (tri< _ _ _) → union′ (insert e₂ (branch l₁ e₁ (union′ r₁ r₂ (rs _ n<′suc[m⊔n])))) l₂ (rs _ m<′suc[m⊔n]) ; (tri≈ _ _ _) → branch (union′ l₁ l₂ (rs _ m<′suc[m⊔n])) e₁ (union′ r₁ r₂ (rs _ n<′suc[m⊔n])) ; (tri> _ _ _) → union′ (insert e₂ (branch (union′ l₁ l₂ (rs _ m<′suc[m⊔n])) e₁ r₁)) r₂ (rs _ n<′suc[m⊔n]) } union′ t@(branch _ _ _) nilT _ = t
And define a clean version of union.
_∪_ : (t₁ t₂ : BinTree) → BinTree t₁ ∪ t₂ = union′ t₁ t₂ (<′-wellFounded (height t₂))
We could again prove this is the union; I believe it should be only a little more tedious. Instead of repeating the proof though, we settle for some quick unit tests.
module Test where open BinTree₁ ℕ (λ x → x) (λ x≡x → x≡x) tree₁ : BinTree tree₁ = branch (branch (branch nilT 1 nilT) 3 nilT) 5 (branch (branch nilT 7 nilT) 8 nilT) tree₂ : BinTree tree₂ = branch (branch nilT 2 nilT) 4 (branch (branch nilT 5 nilT) 6 (branch nilT 9 nilT)) _ : tree₁ ∪ tree₂ ≡ branch (branch (branch nilT 1 (branch nilT 2 nilT)) 3 (branch nilT 4 nilT)) 5 (branch (branch (branch nilT 6 nilT) 7 nilT) 8 (branch nilT 9 nilT)) _ = refl _ : tree₂ ∪ tree₁ ≡ branch (branch (branch nilT 1 nilT) 2 (branch nilT 3 nilT)) 4 (branch (branch nilT 5 nilT) 6 (branch (branch (branch nilT 7 nilT) 8 nilT) 9 nilT)) _ = refl
5 Problem 2 – Tree balancing
As in the above problem about union, consider a simple (unbalanced, unordered) binary tree type.
module BinTree (Elem : Set) (# : Elem → ℕ) where data BinTree : Set where nilT : BinTree branch : BinTree → Elem → BinTree → BinTree
Once again, define some basic operations on these trees.
[_] : Elem → BinTree [ e ] = branch nilT e nilT insert : Elem → BinTree → BinTree insert e nilT = [ e ] insert e (branch t₁ x t₂) = case <-cmp (# e) (# x) of λ { (tri< e<x ¬e≡x ¬e>x) → branch (insert e t₁) x t₂ ; (tri≈ ¬e<x e≡x ¬e>x) → branch t₁ e t₂ ; (tri> ¬e<x ¬e≡x e>x) → branch t₁ x (insert e t₂) }
Our goal here will be to create a balancing function which uses pivots, rather than a balancing insert routine which creates “a brand new tree”.
To talk about balanced trees, we need a height function.
height : BinTree → ℕ height nilT = zero height (branch t₁ _ t₂) = suc (height t₁ ⊔ height t₂)
We also need to know how a tree is unbalanced;
we talk about a tree leaning to the left or right
by a certain disparity (difference in height).
Note that a disparity of 0 (left 0
or right 0
) means
the heights are off by just 1.
Trees with such a disparity still count as balanced.
data Lean : Set where left : ℕ → Lean right : ℕ → Lean none : Lean open import Data.Nat using (pred) open import Relation.Binary.Definitions using (Tri) -- Originally a pattern matching lambda, but extracted -- to help with proofs lower down. disparity : (t₁ t₂ : BinTree) → Tri (height t₁ < height t₂) (height t₁ ≡ height t₂) (height t₂ < height t₁) → Lean disparity t₁ t₂ (tri< _ _ _) = right (pred (height t₂ ∸ height t₁)) disparity t₁ t₂ (tri≈ _ _ _) = none disparity t₁ t₂ (tri> _ _ _) = left (pred (height t₁ ∸ height t₂)) lean : BinTree → Lean lean nilT = none lean (branch t₁ _ t₂) = disparity t₁ t₂ (<-cmp (height t₁) (height t₂))
The definition of “balanced” I am striving for requires not only the root of the tree to be balanced; all subtrees must also be balanced. But first we must define the notion of the root being balanced.
rootBalanced : BinTree → Set rootBalanced t = (lean t ≡ none) ⊎ (lean t ≡ left 0) ⊎ (lean t ≡ right 0)
Then we apply that predicate to all subtrees to check if the tree is balanced.
balanced : BinTree → Set balanced nilT = ⊤ balanced t@(branch t₁ _ t₂) = balanced t₁ × rootBalanced t × balanced t₂
With the notion of balanced
defined, we can turn to our balancing algorithm.
As mentioned above, we will use pivots to balance the tree.
pivotLeft : BinTree → BinTree pivotLeft (branch l root (branch rl root′ rr)) = branch (branch l root rl) root′ rr {-# CATCHALL #-} pivotLeft t = t pivotRight : BinTree → BinTree pivotRight (branch (branch ll root′ lr) root r) = branch ll root′ (branch lr root r) {-# CATCHALL #-} pivotRight t = t
Now we need to decide in what order we apply pivots. In order to assist with this, we define the notion of a balance factor, in order to find the “worst part” of the tree. The balance factor is zero if the left and right subtrees are the same height, or if they are off by at most one (lean of 0).
leanAmount : Lean → ℕ leanAmount (left factor) = factor leanAmount (right factor) = factor leanAmount none = zero rootBalanceFactor : BinTree → ℕ rootBalanceFactor nilT = zero rootBalanceFactor t@(branch t₁ _ t₂) = leanAmount (lean t)
For the balance algorithm, we need to check if there is any unbalanced node in a subtree. Checking for the maximum imbalance suffices.
open import Data.Nat using (_⊓_) maxImbalance : BinTree → ℕ maxImbalance nilT = zero maxImbalance t@(branch t₁ _ t₂) = maxImbalance t₁ ⊓ rootBalanceFactor t ⊓ maxImbalance t₂
Now we can define our naive, not obviously terminating balance function. It starts by balancing the left subtree, then the right, then applying one pivot to the root and starting again. To avoid nested pattern matching lambdas, we use a number of mutually defined functions.
BinTreeNE : Set BinTreeNE = BinTree × Elem × BinTree module Bad where {-# TERMINATING #-} balance : BinTree → BinTree balanceLeft : BinTreeNE → ℕ → BinTree balanceRight : BinTreeNE → ℕ → BinTree balanceRoot : BinTreeNE → Lean → BinTree balanceLeft (t₁ , e , t₂) (suc _) = balance (branch (balance t₁) e t₂) balanceLeft t@(t₁ , e , t₂) zero = balanceRight t (maxImbalance t₂) balanceRight (t₁ , e , t₂) (suc _) = balance (branch t₁ e (balance t₂)) balanceRight t@(t₁ , e , t₂) zero = balanceRoot t (lean (branch t₁ e t₂)) balanceRoot (t₁ , e , t₂) (left (suc _)) = balance (pivotRight (branch t₁ e t₂)) balanceRoot (t₁ , e , t₂) (right (suc _)) = balance (pivotLeft (branch t₁ e t₂)) balanceRoot (t₁ , e , t₂) none = branch t₁ e t₂ balanceRoot (t₁ , e , t₂) (left zero) = branch t₁ e t₂ balanceRoot (t₁ , e , t₂) (right zero) = branch t₁ e t₂ balance nilT = nilT balance (branch t₁ e t₂) = balanceLeft (t₁ , e , t₂) (maxImbalance t₁)
Here's the version using pattern matching lambdas, which may be easier to read.
{-# TERMINATING #-} balance nilT = nilT balance t@(branch t₁ e t₂) = case maxImbalance t₁ of λ { (suc _) → balance (branch (balance t₁) e t₂) ; zero → case maxImbalance t₂ of λ { (suc _) → balance (branch t₁ e (balance t₂)) ; zero → case lean t of λ { (left (suc _)) → balance (pivotRight t) ; (right (suc _)) → balance (pivotLeft t) ; none → t ; (left zero) → t ; (right zero) → t } } }
Now what is the decreasing value here we can use for well-founded recursion? There's no guarantee that the maximum imbalance decreases at each step. But after most steps of our algorithm, the sum of the balance factors should have decreased, because we've reduced the imbalance of one subtree, without making any others worse.
totalImbalance : BinTree → ℕ totalImbalance nilT = 0 totalImbalance t@(branch t₁ _ t₂) = totalImbalance t₁ + rootBalanceFactor t + totalImbalance t₂
And we only fail to decrease the sum of the balance factors
is when the recursive calls are on structurally smaller arguments
(specifically, in the calls of the form balance t₁
and balance t₂
).
So we define a new less than relation that accounts for two cases.
- the height of the first tree is strictly less than that of the second, or
- the total imbalance of the first tree is strictly less than that of the second.
In each case, we must ensure the other metric does not increase.
_⪡ʰ_ : BinTree → BinTree → Set t₁ ⪡ʰ t₂ = height t₁ <′ height t₂ × totalImbalance t₁ ≤ totalImbalance t₂ _⪡ⁱ_ : BinTree → BinTree → Set t₁ ⪡ⁱ t₂ = height t₁ ≤ height t₂ × totalImbalance t₁ <′ totalImbalance t₂ _⪡_ : BinTree → BinTree → Set t₁ ⪡ t₂ = (t₁ ⪡ʰ t₂) ⊎ (t₁ ⪡ⁱ t₂)
We prove that subtrees precede their “super” tree.
open import Data.Nat.Properties using (m≤m⊔n ; n≤m⊔n ; m≤m+n ; m≤n+m ; ≤-stepsʳ) branch-⪡ˡ : (t₁ : BinTree) (e : Elem) (t₂ : BinTree) → t₁ ⪡ branch t₁ e t₂ branch-⪡ˡ t₁ _ t₂ = inj₁ (≤⇒≤′ (s≤s (m≤m⊔n (height t₁) (height t₂))) , ≤-stepsʳ (totalImbalance t₂) (m≤m+n (totalImbalance t₁) (leanAmount (disparity t₁ t₂ (<-cmp (height t₁) (height t₂)))))) branch-⪡ʳ : (t₁ : BinTree) (e : Elem) (t₂ : BinTree) → t₂ ⪡ branch t₁ e t₂ branch-⪡ʳ t₁ _ t₂ = inj₁ (≤⇒≤′ (s≤s (n≤m⊔n (height t₁) (height t₂))) , m≤n+m (totalImbalance t₂) ( totalImbalance t₁ + leanAmount (disparity t₁ t₂ (<-cmp (height t₁) (height t₂)))))
:TODO: Temporary import location.
open import Data.Nat.Properties using (+-monoˡ-≤ ; +-mono-< ; ⊔-monoˡ-≤ ; ≤′⇒≤ ; <⇒≤)
Here's the state of things as I left them in April.
Seems like the first thing
cannot be proven to me now.
In particular, t₁
and t₁′
seem reversed.
I'll start from scratch after tackling the accessibility
and well-foundedness proofs.
thing : (t₁′ t₁ t₂ : BinTree) → t₁′ ⪡ⁱ t₁ → (leanAmount (disparity t₁′ t₂ (<-cmp (height t₁′) (height t₂))) < leanAmount (disparity t₁ t₂ (<-cmp (height t₁) (height t₂)))) thing nilT (branch nilT _ (branch _ _ _)) nilT _ = s≤s z≤n thing nilT (branch (branch _ _ _) _ nilT) nilT _ = s≤s z≤n thing nilT (branch (branch _ _ _) _ (branch _ _ _)) nilT _ = s≤s z≤n thing nilT (branch nilT x₁ (branch t₄ x₂ t₅)) (branch t₂ x t₃) _ = ??? thing nilT (branch (branch t₁ x₂ t₅) x₁ nilT) (branch t₂ x t₃) _ = ??? thing nilT (branch (branch t₁ x₂ t₅) x₁ (branch t₄ x₃ t₆)) (branch t₂ x t₃) _ = ??? thing (branch t₁′ x t₁′₁) (branch t₁ x₁ t₃) t₂ _ = ??? lemma : (t₁′ t₁ : BinTree) (e : Elem) (t₂ : BinTree) → t₁′ ⪡ⁱ t₁ → branch t₁′ e t₂ ⪡ⁱ branch t₁ e t₂ lemma t₁′ t₁ e t₂ (ht₁≤ht₁′ , mit₁<mit₁′) = s≤s (⊔-monoˡ-≤ (height t₂) ht₁≤ht₁′) , ≤⇒≤′ (+-monoˡ-≤ (totalImbalance t₂) (+-mono-< (≤′⇒≤ mit₁<mit₁′) (thing t₁′ t₁ t₂ ???))) lemma′ : (t₁ : BinTree) (e : Elem) (t₂′ t₂ : BinTree) → t₂′ ⪡ⁱ t₂ → branch t₁ e t₂′ ⪡ⁱ branch t₁ e t₂ lemma′ = ??? balance′ : (t : BinTree) → Acc _⪡_ t → BinTree balance′ nilT _ = nilT balance′ t@(branch t₁ e t₂) (acc rs) = case maxImbalance t₁ of λ { (suc _) → let t₁′ = balance′ t₁ (rs _ (branch-⪡ˡ t₁ e t₂)) in balance′ (branch t₁′ e t₂) (rs _ (inj₂ (lemma t₁′ t₁ e t₂ ???))) ; zero → case maxImbalance t₂ of λ { (suc _) → let t₂′ = balance′ t₂ (rs _ (branch-⪡ʳ t₁ e t₂)) in balance′ (branch t₁ e t₂′) (rs _ (inj₂ (lemma′ t₁ e t₂′ t₂ ???))) ; zero → case lean t of λ { (left (suc _)) → balance′ (pivotRight t) (rs _ ???) ; (right (suc _)) → balance′ (pivotLeft t) (rs _ ???) ; none → t ; (left zero) → t ; (right zero) → t } } }
open import Induction.WellFounded using (WellFounded) ⪡ʰ-acc : {x y : BinTree} → y ⪡ʰ x → Acc _⪡ʰ_ y ⪡ʰ-WellFounded : WellFounded _⪡ʰ_ ⪡ʰ-acc {branch x x₁ x₂} {nilT} (fst , z≤n) = ⪡ʰ-WellFounded nilT ⪡ʰ-acc {branch x x₁ x₂} {branch y x₃ y₁} (fst , snd) = {!!} ⪡ʰ-WellFounded nilT = acc λ { _ () } ⪡ʰ-WellFounded (branch t x t₁) = acc λ y x₁ → ⪡ʰ-acc {{!!}} {{!!}} x₁ ⪡ⁱ-acc : {x y : BinTree} → y ⪡ⁱ x → Acc _⪡ⁱ_ y ⪡ⁱ-WellFounded : WellFounded _⪡ⁱ_ ⪡ⁱ-acc = {!!} ⪡ⁱ-WellFounded = {!!} ⪡-acc : {x y : BinTree} → y ⪡ x → Acc _⪡_ y ⪡-WellFounded : WellFounded _⪡_ ⪡-acc {branch _ _ _} {nilT} (inj₁ _) = ⪡-WellFounded nilT ⪡-acc {branch _ _ _} {nilT} (inj₂ _) = ⪡-WellFounded nilT ⪡-acc {branch _ _ _} {branch _ _ _} (inj₁ (fst , snd)) = {!!} ⪡-acc {branch _ _ _} {branch _ _ _} (inj₂ (fst , snd)) = {!!} -- No trees less than the empty tree, in either sense. ⪡-WellFounded nilT = acc (λ { _ (inj₁ ()) ; _ (inj₂ ()) }) -- … ⪡-WellFounded (branch _ _ _) = acc {!⪡-acc!} --(λ { y (inj₁ (fst , snd)) → acc {!!} -- ; y (inj₂ (fst , snd)) → acc (λ y₁ x₃ → {!⪡-WellFounded ?!}) } )
These were some tests on the bad version.
module BadBalanceTest where open BinTree ℕ (λ x → x) open Bad renaming (balance to badBalance) t₂′ : BinTree t₂′ = branch nilT 3 nilT t₂ : BinTree t₂ = branch nilT 2 t₂′ tree : BinTree tree = branch nilT 1 t₂ {- Walk through the algorithm manually _ : balanceFactor t₂ ≡ 0 _ = refl _ : balanceFactor nilT ≡ 0 _ = refl _ : lean tree ≡ right 2 _ = refl tree′ : BinTree tree′ = pivotLeft tree _ : tree′ ≡ {!nilT!} _ = refl -} _ : badBalance tree ≡ branch (branch nilT 1 nilT) 2 (branch nilT 3 nilT) _ = refl