An exploration of well-founded recursion

Table of Contents

1 Introduction

Using well-founded relations to assist the termination checker of Agda is something I have always meant to investigate, but never quite gotten around to. This post attempts to rectify that.

First, I must give credit to the example code posted by Sergei Romanenko, which explores the technique using a log₂ algorithm; my own developments were written by using that as an example.

2 Agda header

We use several facts about natural numbers, specfically regarding < and (from which < is derived) below.

module WellFoundedRecursion where

open import Data.Empty using ()
open import Data.Unit using ()
open import Data.Nat
  using ( ℕ ; zero ; suc
        ; _+_ ; _∸_ ; _⊔_
        ; _≤_ ; _<_ ; z≤n ; s≤s
        ; _≤′_ ; _<′_ ; ≤′-refl ; ≤′-step)
open import Data.Nat.Properties
  using ( <-cmp ; ≤-step ; ≤-refl ; <-transˡ ; ≤⇒≤′
        ; m∸n≤m
        ; m<n⇒m<n⊔o ; m<n⇒m<o⊔n ; ⊔-pres-<m)
open import Data.Product using (_×_ ; _,_)
open import Data.Sum using (_⊎_ ; inj₁ ; inj₂)

open import Relation.Binary.Definitions using (tri< ; tri≈ ; tri>)
open import Relation.Binary.PropositionalEquality using (_≡_ ; refl ; sym)

open import Function using (case_of_)

3 Problem 1 – The Euclidean algorithm

3.1 Preamble

Euclid's GCD is a well known and elegantly simple algorithm. It hinges upon the fact that given natural numbers m and n with m < n, GCD(m,n) = GCD(m,n-m). Note also that if m = n, then GCD(m,n) = m = n, and if n = 0, then GCD(m,n) = m.

Making use of the three-way comparison function <-cmp, we can easily transcribe this algorithm into Agda.

module WithoutWFR where
  
  {-# TERMINATING #-}
  EuclidGCD : ℕ → ℕ → ℕ
  EuclidGCD zero      m         = m
  EuclidGCD n@(suc _) zero      = n
  EuclidGCD n@(suc _) m@(suc _) =
    case <-cmp n m of
      λ { (tri< _ _ _) → EuclidGCD n (m ∸ n)
        ; (tri≈ _ _ _) → m
        ; (tri> _ _ _) → EuclidGCD (n ∸ m) m
        }

But Agda will not believe this is terminating, because it fails to see in the recursive case that m ∸ n~/~n ∸ m are structurally smaller than m or n.

Note that the use of the TERMINATING pragma does allow us to use EuclidGCD; unlike the NONTERMINATING pragma, which would prevent Agda from normalising uses of EuclidGCD.

  _ : EuclidGCD 12 20 ≡ 4
  _ = refl

But it is ultimately against the spirit of verified programming in Agda to use this “just believe me” approach.

This is understandable; we must apply the proofs of m < n or n < m in order to establish those facts. If we try to state that using _<_, we get stuck in the base cases during the proof.

m∸n<m : {m n : ℕ} → m ∸ n < m
m∸n<m {m} {zero} = ??? -- Stuck; m ≮ n
m∸n<m {zero} {suc n} = ??? -- Stuck ; zero ≮ zero
m∸n<m {suc m} {suc n} =
  <-transˡ (m∸n<m {m} {n}) -- suc m ∸ suc n = m ∸ n < m, by I.H.
          (≤-step ≤-refl) -- and m ≤ suc m.

To actually prove this, we must know that both m and n are non-zero —then we may use the fact that m ∸ n is less than or equal to to m, the proof of which is in the standard library and is very close to our attempted proof above for _<_.

Sm∸Sn<Sm : (m n : ℕ) → suc m ∸ suc n < suc m
Sm∸Sn<Sm m n = s≤s (m∸n≤m m n)

With this proof in hand, we are equipped to use a well-founded relation to rework our EuclidGCD function so that it passes the termination checker. We start by defining the concepts ourselves, then we will utilise the (level-agnostic) versions from the standard library.

3.2 Solution version 1 – Defining our own notion of well-foundedness

module definingWFR where

We begin with the concept of accessibility. An element x of A is accessible by relation _<_ iff every y such that y < x is accessible.

    data Acc {A : Set} (_<_ : A → A → Set) (x : A) : Set where
      acc : (rs : (y : A) → y < x → Acc _<_ y) → Acc _<_ x

I'm honestly not quite certain of the reasoning behind the name rs; I interpret it as “Recurse on Smaller”. This is motivated by the standard library description of the type of rs:

When using well-founded recursion you can recurse arbitrarily, as long as the arguments become smaller, and “smaller” is well-founded.

The standard library “extracts” that type. It is the type of “well-founded recursion”; we do the same here.

  WfRec : {A : Set}(A → A → Set)(A → Set) → A → Set
  WfRec {A} _<_ P x = {y : A} → y < x → P y

Then the above definition is shortened to

  data Acc {A : Set} (_<_ : A → A → Set) (x : A) : Set where
    acc : (rs : WfRec _<_ (Acc _<_) x) → Acc _<_ x

A relation is well-founded if all elements of the underlying set are accessible; that is, we will need to prove that for every x : A, every y < x is accessible to be able to call _<_ well founded.

  Well-founded : {A : Set}(A → A → Set) → Set
  Well-founded {A} _<_ = (x : A) → Acc _<_ x

Let us prove that less than on the naturals is well-founded. We use an alternate version of less from the standard library defined using reflexivity and “step” as constructors, which the standard library describes as

more suitable for well-founded induction (see Induction.Nat).

I did not manage to complete the proof using _<_ before turning to this _<′_.

  <′-acc : {x y : ℕ} → y <′ x → Acc _<′_ y
  <′-acc .{suc y} {y} ≤′-refl = acc <′-acc
  <′-acc (≤′-step {_} y<′x) = <′-acc y<′x

  <′-well-founded : Well-founded _<′_
  <′-well-founded n = acc <′-acc

Since the right side of the base case of <′-acc is exactly <′-well-founded, I've seen these presented as mutual definitions; I find it more illuminating to detangle them.

Now, to define the Euclidean algorithm function, we must decide upon a value upon which to base our well-founded recursion with respect to _<_. That is, we must decide upon a value which is strictly decreasing at each call. However, we are only decreasing one of the two arguments at each call, and it is not necessarily the same argument each time. So neither m or n is an adequate choice. Instead, we may consider the maximum of the two, m ⊔ n, which does always decrease, since we monus a non-zero value from the greater of the two arguments at each call. This does require some lemmas regarding _⊔_.

  EuclidGCD′ : (m n : ℕ) → Acc _<′_ (m ⊔ n) → ℕ
  EuclidGCD′ zero m _ = m
  EuclidGCD′ n@(suc _) zero _ = n
  EuclidGCD′ m@(suc m₀) n@(suc n₀) (acc rs) =
    case <-cmp m n of
      λ { (tri< m<n _ _) →
          EuclidGCD′ m (n ∸ m)
                     (rs {suc m₀ ⊔ (n₀ ∸ m₀)}
                         (≤⇒≤′ (m<n⇒m<o⊔n m (⊔-pres-<m m<n (Sm∸Sn<Sm n₀ m₀)))))
        ; (tri≈ _ _ _)   → m
        ; (tri> _ _ n<m) →
          EuclidGCD′ (m ∸ n) n
                     (rs {m₀ ∸ n₀ ⊔ suc n₀}
                         (≤⇒≤′ (m<n⇒m<n⊔o n (⊔-pres-<m (Sm∸Sn<Sm m₀ n₀) n<m))))
        }

In the recursive cases, rs has the type

{y : ℕ} → suc y ≤′ suc (m₀ ⊔ n₀) → Acc (λ m₁ n₁ → suc m₁ ≤′ n₁) y

We've written the implicit argument in in each use (although Agda doesn't need it to be given) for clarity.

Now, we can define a “clean” version of the Euclidean algorithm function by utilising our proof that _<′_ is well-founded.

  EuclidGCD : (m n : ℕ) → ℕ
  EuclidGCD m n = EuclidGCD′ m n (<′-well-founded (m ⊔ n))

3.3 Solution version 2 – Using the standard library's definitions

The standard library defines its notions of well-foundedness and accessibility in Induction.WellFounded, and proves that _<′_ is well-founded in Data.Nat.Induction —previously it was Induction.Nat, which does still exist, but is deprecated.

open import Data.Nat.Induction using (<′-wellFounded)
open import Induction.WellFounded using (Acc ; acc)

Aside from the fact that rs takes an explicit argument where we used an implicit one, and a slightly different name for the well-foundedness of _<′_, we are able to echo our definitions above using the standard library.

EuclidGCD′ : (m n : ℕ) → Acc _<′_ (m ⊔ n) → ℕ
EuclidGCD′ zero m _ = m
EuclidGCD′ n@(suc _) zero _ = n
EuclidGCD′ m@(suc m₀) n@(suc n₀) (acc rs) =
  case <-cmp m n of
    λ { (tri< m<n _ _) →
        EuclidGCD′ m (n ∸ m)
                   (rs (suc m₀ ⊔ (n₀ ∸ m₀))
                       (≤⇒≤′ (m<n⇒m<o⊔n m (⊔-pres-<m m<n (Sm∸Sn<Sm n₀ m₀)))))
      ; (tri≈ _ _ _)   → m
      ; (tri> _ _ n<m) →
        EuclidGCD′ (m ∸ n) n
                   (rs (m₀ ∸ n₀ ⊔ suc n₀)
                       (≤⇒≤′ (m<n⇒m<n⊔o n (⊔-pres-<m (Sm∸Sn<Sm m₀ n₀) n<m))))
      }

EuclidGCD : (m n : ℕ) → ℕ
EuclidGCD m n = EuclidGCD′ m n (<′-wellFounded (m ⊔ n))

As a next step, let's move on to a more complicated domain than the natural numbers.

4 Problem 2 Draft 1 – Bintree's with union

At the time I began writing this post, I had been considering implementations of order-preserving functions on unordered binary trees in Agda as I was following along with the course CAS 763, Certified Programming with Dependent Types.

Specifically, when I began writing this, I had been working on the union function. At first, I thought a direct approach was not obviously terminating to Agda. I had seen an implementation using in-order traversals of the two trees in the course. But it turned out my direct algorithm was obviously terminating, just a bit tricker to define (I had made a mistake which made it non-terminating).

I've preserved the code here, for documentation. For an actual problem on well-founded recursion, skip to the next section, on tree balancing.

We start with a definition of (not necessarily ordered) binary trees. We require an ordering on the Elem set by way of a injective function from Elem to .

module BinTree₁ (Elem : Set)
                (# : Elem → ℕ)
                (#-injective : {e₁ e₂ : Elem} → # e₁ ≡ # e₂ → e₁ ≡ e₂)
              where

  data BinTree : Set where
    nilT : BinTree
    branch : BinTree → Elem → BinTree → BinTree

along with some basic operations.

  [_] : Elem → BinTree
  [ e ] = branch nilT e nilT

  insert : Elem → BinTree → BinTree
  insert e nilT = [ e ]
  insert e (branch t₁ x t₂) = case <-cmp (# e) (# x) of λ
    { (tri< e<x ¬e≡x ¬e>x) → branch (insert e t₁) x t₂
    ; (tri≈ ¬e<x e≡x ¬e>x) → branch t₁ e t₂
    ; (tri> ¬e<x ¬e≡x e>x) → branch t₁ x (insert e t₂)
    }

  _∈_ : Elem → BinTree → Set
  e ∈ nilT = ⊥
  e ∈ branch t₁ f t₂ = e ∈ t₁  ⊎  e ≡ f  ⊎  e ∈ t₂

This union operation seemed at my first attempt to be not obviously terminating; however, that was due to a mistake. We must be careful all the 's and 's are correct! This correct code does in fact terminate.

  module DirectUnion where

    _∪_ : BinTree → BinTree → BinTree
    nilT ∪ t = t
    (branch l₁ e₁ r₁)(branch l₂ e₂ r₂) =
      case <-cmp (# e₁) (# e₂) of λ
        { (tri< _ _ _)(insert e₂ (branch l₁ e₁ (r₁ ∪ r₂))) ∪ l₂
        ; (tri≈ _ _ _) → branch (l₁ ∪ l₂) e₁ (r₁ ∪ r₂)
        ; (tri> _ _ _)(insert e₂ (branch (l₁ ∪ l₂) e₁ r₁)) ∪ r₂
        }
    t@(branch _ _ _) ∪ nilT = t

We can prove that this is the union of the two trees. We first need a lemma about _∈_ and insert. In both the lemma and the theorem we must split on a call to <-cmp, since both insert and <-cmp are defined based on a call to <-cmp. This does result in a large number of cases, but they are all trivial.

    insert-preserves-∈ : {a : Elem}{t : BinTree} → a ∈ t → (e : Elem) → a ∈ (insert e t)
    insert-preserves-∈ {a} t@{branch l x _} (inj₁ a∈l) e with <-cmp (# e) (# x)
    … | tri< _ _ _ = inj₁ (insert-preserves-∈ a∈l e)
    … | tri≈ _ _ _ = inj₁ a∈l
    … | tri> _ _ _ = inj₁ a∈l
    insert-preserves-∈ {a} {branch _ .a _} (inj₂ (inj₁ refl)) e with <-cmp (# e) (# a)
    … | tri< _ _ _ = inj₂ (inj₁ refl)
    … | tri≈ _ #e≡#a _ = inj₂ (inj₁ (#-injective (sym #e≡#a)))
    … | tri> _ _ _ = inj₂ (inj₁ refl)
    insert-preserves-∈ {a} {branch _ x r} (inj₂ (inj₂ a∈r)) e with <-cmp (# e) (# x)
    … | tri< _ _ _ = inj₂ (inj₂ a∈r)
    … | tri≈ _ _ _ = inj₂ (inj₂ a∈r)
    … | tri> _ _ _ = inj₂ (inj₂ (insert-preserves-∈ a∈r e))

    e∈t₁⇒e∈t₁∪t₂ : {e : Elem}{t₁ : BinTree}(t₂ : BinTree) → e ∈ t₁ → e ∈ (t₁ ∪ t₂)
    e∈t₁⇒e∈t₁∪t₂ {e}  {branch l₁ _ _} nilT (inj₁ e∈l₁) = inj₁ e∈l₁
    e∈t₁⇒e∈t₁∪t₂ {.e} {branch _ e _} nilT (inj₂ (inj₁ refl)) = inj₂ (inj₁ refl)
    e∈t₁⇒e∈t₁∪t₂ {e}  {branch _ _ r₁} nilT (inj₂ (inj₂ e∈r₁)) = inj₂ (inj₂ e∈r₁)
    e∈t₁⇒e∈t₁∪t₂ {e}  {branch l₁ f₁ r₁} (branch l₂ f₂ r₂) (inj₁ e∈l₁) with <-cmp (# f₁) (# f₂)
    … | tri< _ _ _ = e∈t₁⇒e∈t₁∪t₂ l₂ (insert-preserves-∈ (inj₁ e∈l₁) f₂)
    … | tri≈ _ _ _ = inj₁ (e∈t₁⇒e∈t₁∪t₂ l₂ e∈l₁)
    … | tri> _ _ _ = e∈t₁⇒e∈t₁∪t₂ r₂ (insert-preserves-∈ (inj₁ (e∈t₁⇒e∈t₁∪t₂ l₂ e∈l₁)) f₂)
    e∈t₁⇒e∈t₁∪t₂ {e} {branch l₁ .e r₁} (branch l₂ f₂ r₂) (inj₂ (inj₁ refl)) with <-cmp (# e) (# f₂)
    … | tri< _ _ _ = e∈t₁⇒e∈t₁∪t₂ l₂ (insert-preserves-∈ (inj₂ (inj₁ refl)) f₂)
    … | tri≈ _ _ _ = inj₂ (inj₁ refl)
    … | tri> _ _ _ = e∈t₁⇒e∈t₁∪t₂ r₂ (insert-preserves-∈ (inj₂ (inj₁ refl)) f₂)
    e∈t₁⇒e∈t₁∪t₂ {e} {branch l₁ f₁ r₁} (branch l₂ f₂ r₂) (inj₂ (inj₂ e∈r₁)) with <-cmp (# f₁) (# f₂)
    … | tri< _ _ _ = e∈t₁⇒e∈t₁∪t₂ l₂ (insert-preserves-∈ (inj₂ (inj₂ (e∈t₁⇒e∈t₁∪t₂ r₂ e∈r₁))) f₂)
    … | tri≈ _ _ _ = inj₂ (inj₂ (e∈t₁⇒e∈t₁∪t₂ r₂ e∈r₁))
    … | tri> _ _ _ = e∈t₁⇒e∈t₁∪t₂ r₂ (insert-preserves-∈ (inj₂ (inj₂ e∈r₁)) f₂)

All recursive calls in our union function are made on subtrees, and a sensible decreasing value is the height of the tree.

  height : BinTree → ℕ
  height nilT = 0
  height (branch t₁ _ t₂) = suc (height t₁ ⊔ height t₂)

We quickly prove some lemmas regarding <′, for convenience.

  open import Data.Nat.Properties using (n≤m⊔n ; m≤m⊔n)
  
  n<′suc[m⊔n] : {m n : ℕ} → n <′ suc (m ⊔ n)
  n<′suc[m⊔n] = ≤⇒≤′ (s≤s (n≤m⊔n _ _))

  m<′suc[m⊔n] : {m n : ℕ} → m <′ suc (m ⊔ n)
  m<′suc[m⊔n] = ≤⇒≤′ (s≤s (m≤m⊔n _ _))

And then we are ready to explicitely prove that union′ is terminating.

  union′ : (t₁ t₂ : BinTree) → Acc _<′_ (height t₂) → BinTree
  union′ nilT t _ = t
  union′ (branch l₁ e₁ r₁) (branch l₂ e₂ r₂) (acc rs) =
    let hʳ₁ = height r₁
        hʳ₂ = height r₂
        hˡ₁ = height l₁
        hˡ₂ = height l₂
    in
    case <-cmp (# e₁) (# e₂) of λ
        { (tri< _ _ _) →
            union′ (insert e₂ (branch l₁ e₁ (union′ r₁ r₂ (rs _ n<′suc[m⊔n]))))
                   l₂
                   (rs _ m<′suc[m⊔n])
        ; (tri≈ _ _ _) → branch (union′ l₁ l₂ (rs _ m<′suc[m⊔n]))
                                e₁
                                (union′ r₁ r₂ (rs _ n<′suc[m⊔n]))
        ; (tri> _ _ _) →
            union′ (insert e₂ (branch (union′ l₁ l₂ (rs _ m<′suc[m⊔n])) e₁ r₁))
                   r₂
                   (rs _ n<′suc[m⊔n])
        } 
  union′ t@(branch _ _ _) nilT _ = t

And define a clean version of union.

  _∪_ : (t₁ t₂ : BinTree) → BinTree
  t₁ ∪ t₂ = union′ t₁ t₂ (<′-wellFounded (height t₂))

We could again prove this is the union; I believe it should be only a little more tedious. Instead of repeating the proof though, we settle for some quick unit tests.

module Test where

  open BinTree₁ ℕ (λ x → x) (λ x≡x → x≡x)

  tree₁ : BinTree
  tree₁ = branch (branch (branch nilT
                                 1
                                 nilT)
                         3
                         nilT)
                 5
                 (branch (branch nilT
                                 7
                                 nilT)
                         8
                         nilT)

  tree₂ : BinTree
  tree₂ = branch (branch nilT
                         2
                         nilT)
                 4 (branch (branch nilT
                                   5
                                   nilT)
                           6
                           (branch nilT
                                   9
                                   nilT))

  _ : tree₁ ∪ tree₂ ≡
      branch (branch (branch nilT
                             1
                             (branch nilT
                                     2
                                     nilT))
                     3
                     (branch nilT
                             4
                             nilT))
             5
             (branch (branch (branch nilT
                                     6
                                     nilT)
                             7
                             nilT)
                     8
                     (branch nilT
                             9
                             nilT))
  _ = refl

  _ : tree₂ ∪ tree₁ ≡
      branch (branch (branch nilT
                             1
                             nilT)
                     2
                     (branch nilT
                             3
                             nilT))
             4
             (branch (branch nilT
                             5
                             nilT)
                     6
                     (branch (branch (branch nilT
                                             7
                                             nilT)
                                     8
                                     nilT)
                             9
                             nilT))
  _ = refl

5 Problem 2 – Tree balancing

As in the above problem about union, consider a simple (unbalanced, unordered) binary tree type.

module BinTree (Elem : Set) (# : Elem → ℕ) where

  data BinTree : Set where
    nilT : BinTree
    branch : BinTree → Elem → BinTree → BinTree

Once again, define some basic operations on these trees.

  [_] : Elem → BinTree
  [ e ] = branch nilT e nilT

  insert : Elem → BinTree → BinTree
  insert e nilT = [ e ]
  insert e (branch t₁ x t₂) = case <-cmp (# e) (# x) of λ
    { (tri< e<x ¬e≡x ¬e>x) → branch (insert e t₁) x t₂
    ; (tri≈ ¬e<x e≡x ¬e>x) → branch t₁ e t₂
    ; (tri> ¬e<x ¬e≡x e>x) → branch t₁ x (insert e t₂)
    }

Our goal here will be to create a balancing function which uses pivots, rather than a balancing insert routine which creates “a brand new tree”.

To talk about balanced trees, we need a height function.

  height : BinTree → ℕ
  height nilT = zero
  height (branch t₁ _ t₂) = suc (height t₁ ⊔ height t₂)

We also need to know how a tree is unbalanced; we talk about a tree leaning to the left or right by a certain disparity (difference in height). Note that a disparity of 0 (left 0 or right 0) means the heights are off by just 1. Trees with such a disparity still count as balanced.

  data Lean : Set where
    left : ℕ → Lean
    right : ℕ → Lean
    none : Lean

  open import Data.Nat using (pred)
  open import Relation.Binary.Definitions using (Tri)

  -- Originally a pattern matching lambda, but extracted
  -- to help with proofs lower down.
  disparity : (t₁ t₂ : BinTree) → Tri (height t₁ < height t₂) (height t₁ ≡ height t₂) (height t₂ < height t₁) → Lean
  disparity t₁ t₂ (tri< _ _ _) = right (pred (height t₂ ∸ height t₁))
  disparity t₁ t₂ (tri≈ _ _ _) = none
  disparity t₁ t₂ (tri> _ _ _) = left (pred (height t₁ ∸ height t₂))

  lean : BinTree → Lean
  lean nilT = none
  lean (branch t₁ _ t₂) = disparity t₁ t₂ (<-cmp (height t₁) (height t₂))

The definition of “balanced” I am striving for requires not only the root of the tree to be balanced; all subtrees must also be balanced. But first we must define the notion of the root being balanced.

  rootBalanced : BinTree → Set
  rootBalanced t =
    (lean t ≡ none)(lean t ≡ left 0)(lean t ≡ right 0)

Then we apply that predicate to all subtrees to check if the tree is balanced.

  balanced : BinTree → Set
  balanced nilT = ⊤
  balanced t@(branch t₁ _ t₂) =
    balanced t₁ × rootBalanced t × balanced t₂ 

With the notion of balanced defined, we can turn to our balancing algorithm. As mentioned above, we will use pivots to balance the tree.

  pivotLeft : BinTree → BinTree
  pivotLeft (branch l root (branch rl root′ rr)) =
    branch (branch l root rl) root′ rr
  {-# CATCHALL #-}
  pivotLeft t = t

  pivotRight : BinTree → BinTree
  pivotRight (branch (branch ll root′ lr) root r) =
    branch ll root′ (branch lr root r)
  {-# CATCHALL #-}
  pivotRight t = t

Now we need to decide in what order we apply pivots. In order to assist with this, we define the notion of a balance factor, in order to find the “worst part” of the tree. The balance factor is zero if the left and right subtrees are the same height, or if they are off by at most one (lean of 0).

  leanAmount : Lean → ℕ
  leanAmount (left factor) = factor
  leanAmount (right factor) = factor
  leanAmount none = zero

  rootBalanceFactor : BinTree → ℕ
  rootBalanceFactor nilT = zero
  rootBalanceFactor t@(branch t₁ _ t₂) = leanAmount (lean t)

For the balance algorithm, we need to check if there is any unbalanced node in a subtree. Checking for the maximum imbalance suffices.

  open import Data.Nat using (_⊓_)

  maxImbalance : BinTree → ℕ
  maxImbalance nilT = zero
  maxImbalance t@(branch t₁ _ t₂) =
    maxImbalance t₁ ⊓ rootBalanceFactor t ⊓ maxImbalance t₂

Now we can define our naive, not obviously terminating balance function. It starts by balancing the left subtree, then the right, then applying one pivot to the root and starting again. To avoid nested pattern matching lambdas, we use a number of mutually defined functions.

  BinTreeNE : Set
  BinTreeNE = BinTree × Elem × BinTree

  module Bad where

    {-# TERMINATING #-}
    balance      : BinTree          → BinTree
    balanceLeft  : BinTreeNE → ℕ    → BinTree
    balanceRight : BinTreeNE → ℕ    → BinTree
    balanceRoot  : BinTreeNE → Lean → BinTree

    balanceLeft   (t₁ , e , t₂) (suc _) = balance (branch (balance t₁) e t₂)
    balanceLeft t@(t₁ , e , t₂) zero    = balanceRight t (maxImbalance t₂)

    balanceRight   (t₁ , e , t₂) (suc _) = balance (branch t₁ e (balance t₂))
    balanceRight t@(t₁ , e , t₂) zero    = balanceRoot t (lean (branch t₁ e t₂))

    balanceRoot (t₁ , e , t₂) (left (suc _))  = balance (pivotRight (branch t₁ e t₂))
    balanceRoot (t₁ , e , t₂) (right (suc _)) = balance (pivotLeft  (branch t₁ e t₂))
    balanceRoot (t₁ , e , t₂) none         = branch t₁ e t₂
    balanceRoot (t₁ , e , t₂) (left zero)  = branch t₁ e t₂
    balanceRoot (t₁ , e , t₂) (right zero) = branch t₁ e t₂

    balance nilT = nilT
    balance (branch t₁ e t₂) = balanceLeft (t₁ , e , t₂) (maxImbalance t₁)

Here's the version using pattern matching lambdas, which may be easier to read.

    {-# TERMINATING #-}
    balance nilT = nilT
    balance t@(branch t₁ e t₂) = case maxImbalance t₁ of λ
      { (suc _) → balance (branch (balance t₁) e t₂)
      ; zero    → case maxImbalance t₂ of λ
                  { (suc _) → balance (branch t₁ e (balance t₂))
                  ; zero    → case lean t of λ
                              { (left (suc _))  → balance (pivotRight t)
                              ; (right (suc _)) → balance (pivotLeft t)
                              ; none            → t
                              ; (left zero)     → t
                              ; (right zero)    → t
                              }
                  }
      }

Now what is the decreasing value here we can use for well-founded recursion? There's no guarantee that the maximum imbalance decreases at each step. But after most steps of our algorithm, the sum of the balance factors should have decreased, because we've reduced the imbalance of one subtree, without making any others worse.

  totalImbalance : BinTree → ℕ
  totalImbalance nilT = 0
  totalImbalance t@(branch t₁ _ t₂) =
    totalImbalance t₁ + rootBalanceFactor t + totalImbalance t₂

And we only fail to decrease the sum of the balance factors is when the recursive calls are on structurally smaller arguments (specifically, in the calls of the form balance t₁ and balance t₂).

So we define a new less than relation that accounts for two cases.

  1. the height of the first tree is strictly less than that of the second, or
  2. the total imbalance of the first tree is strictly less than that of the second.

In each case, we must ensure the other metric does not increase.

  _⪡ʰ_ : BinTree → BinTree → Set
  t₁ ⪡ʰ t₂ = height t₁ <′ height t₂ × totalImbalance t₁ ≤ totalImbalance t₂ 

  _⪡ⁱ_ : BinTree → BinTree → Set
  t₁ ⪡ⁱ t₂ = height t₁ ≤ height t₂  × totalImbalance t₁ <′ totalImbalance t₂

  _⪡_ : BinTree → BinTree → Set
  t₁ ⪡ t₂ = (t₁ ⪡ʰ t₂)(t₁ ⪡ⁱ t₂)

We prove that subtrees precede their “super” tree.

  open import Data.Nat.Properties using (m≤m⊔n ; n≤m⊔n ; m≤m+n ; m≤n+m ; ≤-stepsʳ)

  branch-⪡ˡ : (t₁ : BinTree) (e : Elem) (t₂ : BinTree) → t₁ ⪡ branch t₁ e t₂
  branch-⪡ˡ t₁ _ t₂ = inj₁ (≤⇒≤′ (s≤s (m≤m⊔n (height t₁) (height t₂)))
                          , ≤-stepsʳ (totalImbalance t₂)
                                     (m≤m+n (totalImbalance t₁)
                                            (leanAmount (disparity t₁ t₂ (<-cmp (height t₁) (height t₂))))))

  branch-⪡ʳ : (t₁ : BinTree) (e : Elem) (t₂ : BinTree) → t₂ ⪡ branch t₁ e t₂
  branch-⪡ʳ t₁ _ t₂ = inj₁ (≤⇒≤′ (s≤s (n≤m⊔n (height t₁) (height t₂)))
                          , m≤n+m (totalImbalance t₂)
                                  (  totalImbalance t₁
                                   + leanAmount (disparity t₁ t₂ (<-cmp (height t₁) (height t₂)))))

:TODO: Temporary import location.

  open import Data.Nat.Properties using (+-monoˡ-≤ ; +-mono-< ; ⊔-monoˡ-≤ ; ≤′⇒≤ ; <⇒≤)

Here's the state of things as I left them in April. Seems like the first thing cannot be proven to me now. In particular, t₁ and t₁′ seem reversed. I'll start from scratch after tackling the accessibility and well-foundedness proofs.

  thing : (t₁′ t₁ t₂ : BinTree)
        → t₁′ ⪡ⁱ t₁
        →  (leanAmount (disparity t₁′ t₂ (<-cmp (height t₁′) (height t₂)))
          < leanAmount (disparity t₁  t₂ (<-cmp (height t₁)  (height t₂))))
  thing nilT (branch nilT           _ (branch _ _ _)) nilT            _ = s≤s z≤n
  thing nilT (branch (branch _ _ _) _ nilT)           nilT            _ = s≤s z≤n
  thing nilT (branch (branch _ _ _) _ (branch _ _ _)) nilT            _ = s≤s z≤n
  thing nilT (branch nilT x₁ (branch t₄ x₂ t₅)) (branch t₂ x t₃) _ = ???
  thing nilT (branch (branch t₁ x₂ t₅) x₁ nilT) (branch t₂ x t₃) _ = ???
  thing nilT (branch (branch t₁ x₂ t₅) x₁ (branch t₄ x₃ t₆)) (branch t₂ x t₃) _ = ???
  thing (branch t₁′ x t₁′₁) (branch t₁ x₁ t₃) t₂ _ = ???

  lemma : (t₁′ t₁ : BinTree) (e : Elem) (t₂ : BinTree)
        → t₁′ ⪡ⁱ t₁
        → branch t₁′ e t₂ ⪡ⁱ branch t₁ e t₂
  lemma t₁′ t₁ e t₂ (ht₁≤ht₁′ , mit₁<mit₁′) =
      s≤s (⊔-monoˡ-≤ (height t₂) ht₁≤ht₁′)
    , ≤⇒≤′ (+-monoˡ-≤ (totalImbalance t₂) (+-mono-< (≤′⇒≤ mit₁<mit₁′) (thing t₁′ t₁ t₂ ???)))
  lemma′ : (t₁ : BinTree) (e : Elem) (t₂′ t₂ : BinTree)
         → t₂′ ⪡ⁱ t₂
         → branch t₁ e t₂′ ⪡ⁱ branch t₁ e t₂
  lemma′ = ???

  balance′ : (t : BinTree) → Acc _⪡_ t  → BinTree
  balance′ nilT _ = nilT
  balance′ t@(branch t₁ e t₂) (acc rs) = case maxImbalance t₁ of λ
    { (suc _) → let t₁′ = balance′ t₁ (rs _ (branch-⪡ˡ t₁ e t₂)) in
                balance′ (branch t₁′ e t₂) (rs _ (inj₂ (lemma t₁′ t₁ e t₂ ???)))
    ; zero    → case maxImbalance t₂ of λ
                { (suc _) → let t₂′ = balance′ t₂ (rs _ (branch-⪡ʳ t₁ e t₂)) in
                            balance′ (branch t₁ e t₂′) (rs _ (inj₂ (lemma′ t₁ e t₂′ t₂ ???)))
                 ; zero    → case lean t of λ
                             { (left (suc _))  → balance′ (pivotRight t) (rs _ ???)
                             ; (right (suc _)) → balance′ (pivotLeft t) (rs _ ???)
                             ; none            → t
                             ; (left zero)     → t
                             ; (right zero)    → t
                             }
                 }
    }
  open import Induction.WellFounded using (WellFounded)

  ⪡ʰ-acc : {x y : BinTree} → y ⪡ʰ x → Acc _⪡ʰ_ y
  ⪡ʰ-WellFounded : WellFounded _⪡ʰ_

  ⪡ʰ-acc {branch x x₁ x₂} {nilT} (fst , z≤n) = ⪡ʰ-WellFounded nilT
  ⪡ʰ-acc {branch x x₁ x₂} {branch y x₃ y₁} (fst , snd) = {!!}
  ⪡ʰ-WellFounded nilT = acc λ { _ () }
  ⪡ʰ-WellFounded (branch t x t₁) = acc λ y x₁ → ⪡ʰ-acc {{!!}} {{!!}} x₁

  ⪡ⁱ-acc : {x y : BinTree} → y ⪡ⁱ x → Acc _⪡ⁱ_ y
  ⪡ⁱ-WellFounded : WellFounded _⪡ⁱ_

  ⪡ⁱ-acc = {!!}
  ⪡ⁱ-WellFounded = {!!}

  ⪡-acc : {x y : BinTree} → y ⪡ x → Acc _⪡_ y
  ⪡-WellFounded : WellFounded _⪡_
  
  ⪡-acc {branch _ _ _} {nilT} (inj₁ _) = ⪡-WellFounded nilT
  ⪡-acc {branch _ _ _} {nilT} (inj₂ _) = ⪡-WellFounded  nilT
  ⪡-acc {branch _ _ _} {branch _ _ _} (inj₁ (fst , snd)) = {!!}
  ⪡-acc {branch _ _ _} {branch _ _ _} (inj₂ (fst , snd)) = {!!}

  -- No trees less than the empty tree, in either sense.
  ⪡-WellFounded nilT = acc (λ { _ (inj₁ ())
                              ; _ (inj₂ ()) })
  -- …
  ⪡-WellFounded (branch _ _ _) = acc {!⪡-acc!}

--(λ { y (inj₁ (fst , snd)) → acc {!!}
--   ; y (inj₂ (fst , snd)) → acc (λ y₁ x₃ → {!⪡-WellFounded ?!}) } )

These were some tests on the bad version.

module BadBalanceTest where

  open BinTree ℕ (λ x → x)
  open Bad renaming (balance to badBalance)

  t₂′ : BinTree
  t₂′ = branch nilT 3 nilT

  t₂ : BinTree
  t₂ = branch nilT 2 t₂′

  tree : BinTree
  tree = branch nilT 1 t₂

  {- Walk through the algorithm manually
  _ : balanceFactor t₂ ≡ 0
  _ = refl

  _ : balanceFactor nilT ≡ 0
  _ = refl

  _ : lean tree ≡ right 2
  _ = refl

  tree′ : BinTree
  tree′ = pivotLeft tree

  _ : tree′ ≡ {!nilT!}
  _ = refl
  -}

  _ : badBalance tree ≡ branch (branch nilT 1 nilT) 2 (branch nilT 3 nilT)
  _ = refl

Author: Mark Armstrong markarmstrong.jpg

Contact: markparmstrong@gmail.com

Original date: April 1st, 2020

Last updated: 2021-01-27 Wed 02:12

Created using Emacs 27.0.90 (Org mode 9.4.4)

Validate