Langlib

Langlib.Classes.Linear.Pumping.Spine

Spines of linear derivations #

A linear grammar derives a terminal word from a single nonterminal along a single "spine": every sentential form has the shape map terminal p ++ [nonterminal C] ++ map terminal s, until a terminal rule is applied. This file reifies that spine as an inductive Spine, proves it sound and complete with respect to grammar_derives, and equips it with the bookkeeping (length, list of visited nonterminals, accumulated terminal counts) needed for the linear pumping lemma.

Main declarations #

Spine — the caterpillar derivation from a nonterminal to a word.
Spine.derives — soundness: a spine yields a grammar_derives.
exists_spine — completeness: a linear derivation yields a spine.

inductive Spine {T : Type} (g : grammar T) :

g.nt → List T → Type

A Spine g B w witnesses, for a linear grammar g, a derivation of the terminal word w from the single nonterminal B, recorded as the chain of rules applied along the unique nonterminal.

last {T : Type} {g : grammar T} (B : g.nt) (m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal m } ∈ g.rules) : Spine g B m
The final step: a purely terminal rule B → m.
cons {T : Type} {g : grammar T} (B C : g.nt) (u v m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal u ++ [symbol.nonterminal C] ++ List.map symbol.terminal v } ∈ g.rules) (rest : Spine g C m) : Spine g B (u ++ m ++ v)
A spine step: a rule B → u C v with u, v terminal, then continue from C.

Instances For

def Spine.derives {T : Type} {g : grammar T} {B : g.nt} {w : List T} :

∀ (a : Spine g B w), grammar_derives g [symbol.nonterminal B] (List.map symbol.terminal w)

Soundness: a spine gives an actual derivation B ⇒* w.

Equations

⋯ = ⋯

Instances For

def Spine.length {T : Type} {g : grammar T} {B : g.nt} {w : List T} :

Spine g B w → ℕ

The number of rule applications recorded by the spine.

Equations

(Spine.last x✝¹ x✝ h).length = 1
(Spine.cons x✝ C u v m h rest).length = rest.length + 1

Instances For

def Spine.nts {T : Type} {g : grammar T} {B : g.nt} {w : List T} :

Spine g B w → List g.nt

The list of nonterminals visited along the spine, from top to bottom.

Equations

(Spine.last x✝¹ x✝ h).nts = [x✝¹]
(Spine.cons x✝ C u v m h rest).nts = x✝ :: rest.nts

Instances For

Completeness: every linear derivation is a spine #

theorem not_transforms_map_terminal {T : Type} {g : grammar T} {a : List T} {z : List (symbol T g.nt)} :

¬grammar_transforms g (List.map symbol.terminal a) z

A terminal-only sentential form admits no transformation step.

theorem eq_of_deri_from_terminals {T : Type} {g : grammar T} {a : List T} {z : List (symbol T g.nt)} (h : grammar_derives g (List.map symbol.terminal a) z) :

z = List.map symbol.terminal a

From a terminal-only form, any derivation reaches only that same form.

theorem eq_map_terminal_of_all_terminal {T : Type} {g : grammar T} {l : List (symbol T g.nt)} (h : ∀ x ∈ l, ∃ (t : T), x = symbol.terminal t) :

∃ (o : List T), l = List.map symbol.terminal o

A symbol list whose every element is a terminal is the image of a terminal word.

theorem eq_of_terminal_sandwich {T : Type} {g : grammar T} {p s : List T} {C N' : g.nt} {u v : List (symbol T g.nt)} (h : List.map symbol.terminal p ++ [symbol.nonterminal C] ++ List.map symbol.terminal s = u ++ [symbol.nonterminal N'] ++ v) :

u = List.map symbol.terminal p ∧ N' = C ∧ v = List.map symbol.terminal s

In a terminal sandwich map p ++ [nt C] ++ map s = u ++ [nt N'] ++ v, the nonterminal is forced to line up: u = map p, N' = C, v = map s.

theorem linear_transforms_at_nt {T : Type} {g : grammar T} (hg : grammar_linear g) {p s : List T} {C : g.nt} {x' : List (symbol T g.nt)} (h : grammar_transforms g (List.map symbol.terminal p ++ [symbol.nonterminal C] ++ List.map symbol.terminal s) x') :

∃ (out : List (symbol T g.nt)), { input_L := [], input_N := C, input_R := [], output_string := out } ∈ g.rules ∧ x' = List.map symbol.terminal p ++ out ++ List.map symbol.terminal s

In a linear grammar, a transformation step from a terminal sandwich map p ++ [nt C] ++ map s rewrites exactly the nonterminal C in place.

theorem spine_of_deri {T : Type} {g : grammar T} (hg : grammar_linear g) {w : List T} {x : List (symbol T g.nt)} :

grammar_derives g x (List.map symbol.terminal w) → ∀ {p s : List T} {C : g.nt}, x = List.map symbol.terminal p ++ [symbol.nonterminal C] ++ List.map symbol.terminal s → ∃ (m : List T), w = p ++ m ++ s ∧ Nonempty (Spine g C m)

Completeness (general form): a linear derivation from a terminal sandwich map p ++ [nt C] ++ map s to a terminal word map w factors through C.

theorem exists_spine {T : Type} {g : grammar T} (hg : grammar_linear g) {B : g.nt} {w : List T} (h : grammar_derives g [symbol.nonterminal B] (List.map symbol.terminal w)) :

Nonempty (Spine g B w)

Completeness: every linear derivation of a terminal word from a single nonterminal B is witnessed by a spine.

Splitting a spine #

def maxRuleLen {T : Type} (g : grammar T) :

An upper bound on the number of terminals a single rule application produces.

Equations

maxRuleLen g = List.foldr max 0 (List.map (fun (r : grule T g.nt) => r.output_string.length) g.rules)

Instances For

theorem output_length_le_maxRuleLen {T : Type} {g : grammar T} {r : grule T g.nt} (hr : r ∈ g.rules) :

r.output_string.length ≤ maxRuleLen g

Every rule's output is no longer than maxRuleLen.

structure SplitResult {T : Type} (g : grammar T) (B : g.nt) (w : List T) :

The result of splitting a spine B ⇒* w at a given depth: a top derivation B ⇒* u·C·y and an inner spine C ⇒* wc, with w = u ++ wc ++ y.

C : g.nt
u : List T
y : List T
wc : List T
inner : Spine g self.C self.wc
hword : w = self.u ++ self.wc ++ self.y
hderiv : grammar_derives g [symbol.nonterminal B] (List.map symbol.terminal self.u ++ [symbol.nonterminal self.C] ++ List.map symbol.terminal self.y)

Instances For

def Spine.splitAt {T : Type} {g : grammar T} {B : g.nt} {w : List T} :

Spine g B w → ℕ → SplitResult g B w

Split a spine at depth k, peeling off the first k rule applications.

Equations

One or more equations did not get rendered due to their size.
x✝.splitAt 0 = { C := x✝², u := [], y := [], wc := x✝¹, inner := x✝, hword := ⋯, hderiv := ⋯ }
(Spine.last x✝¹ x✝ h).splitAt n.succ = { C := x✝¹, u := [], y := [], wc := x✝, inner := Spine.last x✝¹ x✝ h, hword := ⋯, hderiv := ⋯ }

Instances For

@[simp]

theorem Spine.splitAt_zero_C {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

(s.splitAt 0).C = B

@[simp]

theorem Spine.splitAt_zero_u {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

(s.splitAt 0).u = []

@[simp]

theorem Spine.splitAt_zero_y {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

(s.splitAt 0).y = []

@[simp]

theorem Spine.splitAt_zero_wc {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

(s.splitAt 0).wc = w

@[simp]

theorem Spine.splitAt_last_C {T : Type} {g : grammar T} (B : g.nt) (m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal m } ∈ g.rules) (k : ℕ) :

((last B m h).splitAt k).C = B

@[simp]

theorem Spine.splitAt_last_u {T : Type} {g : grammar T} (B : g.nt) (m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal m } ∈ g.rules) (k : ℕ) :

((last B m h).splitAt k).u = []

@[simp]

theorem Spine.splitAt_last_y {T : Type} {g : grammar T} (B : g.nt) (m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal m } ∈ g.rules) (k : ℕ) :

((last B m h).splitAt k).y = []

@[simp]

theorem Spine.splitAt_last_wc {T : Type} {g : grammar T} (B : g.nt) (m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal m } ∈ g.rules) (k : ℕ) :

((last B m h).splitAt k).wc = m

@[simp]

theorem Spine.splitAt_cons_succ_C {T : Type} {g : grammar T} (B C : g.nt) (u0 v0 m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal u0 ++ [symbol.nonterminal C] ++ List.map symbol.terminal v0 } ∈ g.rules) (rest : Spine g C m) (k : ℕ) :

((cons B C u0 v0 m h rest).splitAt (k + 1)).C = (rest.splitAt k).C

@[simp]

theorem Spine.splitAt_cons_succ_u {T : Type} {g : grammar T} (B C : g.nt) (u0 v0 m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal u0 ++ [symbol.nonterminal C] ++ List.map symbol.terminal v0 } ∈ g.rules) (rest : Spine g C m) (k : ℕ) :

((cons B C u0 v0 m h rest).splitAt (k + 1)).u = u0 ++ (rest.splitAt k).u

@[simp]

theorem Spine.splitAt_cons_succ_y {T : Type} {g : grammar T} (B C : g.nt) (u0 v0 m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal u0 ++ [symbol.nonterminal C] ++ List.map symbol.terminal v0 } ∈ g.rules) (rest : Spine g C m) (k : ℕ) :

((cons B C u0 v0 m h rest).splitAt (k + 1)).y = (rest.splitAt k).y ++ v0

@[simp]

theorem Spine.splitAt_cons_succ_wc {T : Type} {g : grammar T} (B C : g.nt) (u0 v0 m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal u0 ++ [symbol.nonterminal C] ++ List.map symbol.terminal v0 } ∈ g.rules) (rest : Spine g C m) (k : ℕ) :

((cons B C u0 v0 m h rest).splitAt (k + 1)).wc = (rest.splitAt k).wc

@[simp]

theorem Spine.splitAt_cons_succ_inner {T : Type} {g : grammar T} (B C : g.nt) (u0 v0 m : List T) (h : { input_L := [], input_N := B, input_R := [], output_string := List.map symbol.terminal u0 ++ [symbol.nonterminal C] ++ List.map symbol.terminal v0 } ∈ g.rules) (rest : Spine g C m) (k : ℕ) :

((cons B C u0 v0 m h rest).splitAt (k + 1)).inner = (rest.splitAt k).inner

theorem Spine.splitAt_add {T : Type} {g : grammar T} (i d : ℕ) {B : g.nt} {w : List T} (s : Spine g B w) :

(s.splitAt (i + d)).C = ((s.splitAt i).inner.splitAt d).C ∧ (s.splitAt (i + d)).u = (s.splitAt i).u ++ ((s.splitAt i).inner.splitAt d).u ∧ (s.splitAt (i + d)).y = ((s.splitAt i).inner.splitAt d).y ++ (s.splitAt i).y

Composition of splits: splitting at depth i + d is splitting at i, then splitting the inner spine at depth d.

theorem Spine.root_mem {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

B ∈ List.map (fun (x : grule T g.nt) => x.input_N) g.rules

The root nonterminal of any spine is the input nonterminal of one of its rules.

def Spine.cnt {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) (k : ℕ) :

The number of terminals produced in the first k rule applications of the spine.

Equations

s.cnt k = (s.splitAt k).u.length + (s.splitAt k).y.length

Instances For

theorem Spine.cnt_add {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) (i d : ℕ) :

s.cnt (i + d) = s.cnt i + (s.splitAt i).inner.cnt d

cnt decomposes additively along splitAt_add.

@[simp]

theorem Spine.cnt_zero {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

s.cnt 0 = 0

theorem Spine.splitAt_add_C {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) (i d : ℕ) :

(s.splitAt (i + d)).C = ((s.splitAt i).inner.splitAt d).C

The root nonterminal reached at depth i + d is the one reached by splitting the inner spine at depth d.

theorem Spine.cnt_one_le {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

s.cnt 1 ≤ maxRuleLen g

One rule application produces at most maxRuleLen terminals.

theorem Spine.cnt_step {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) (k : ℕ) :

s.cnt (k + 1) ≤ s.cnt k + maxRuleLen g

Each additional step adds at most maxRuleLen terminals.

theorem Spine.cnt_mono {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

cnt is monotone in the depth.

theorem Spine.cnt_add_wc {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) (k : ℕ) :

s.cnt k + (s.splitAt k).wc.length = w.length

The terminals produced plus the remaining inner word recover the whole word.

theorem Spine.splitAt_length_wc_le {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

(s.splitAt s.length).wc.length ≤ maxRuleLen g

The remaining inner word after fully splitting is a single rule output, hence short.

theorem Spine.cnt_length_ge {T : Type} {g : grammar T} {B : g.nt} {w : List T} (s : Spine g B w) :

w.length - maxRuleLen g ≤ s.cnt s.length

At full depth, cnt accounts for all but at most maxRuleLen symbols of the word.