Context-Free Closure Under Inverse String Homomorphism

This file proves that context-free languages are closed under inverse string homomorphism. Given a context-free language L over β, a finite alphabet type α, and a string homomorphism h : α → List β, the preimage h⁻¹(L) = {w : List α | h(w) ∈ L} is also context-free.

Strategy

We decompose h⁻¹(L) as π(D ∩ f⁻¹(L)) where the intermediate alphabet is Γ = α ⊕ β:

fHom : Γ → List β (written f) erases Sum.inl symbols and unwraps Sum.inr symbols.
piHom : Γ → List α (written π) keeps Sum.inl values and erases Sum.inr symbols.
dLang h (written D) is a regular "well-formedness" language over Γ consisting of the valid encodings, i.e. concatenations of blocks embedWord h a.
fInv L (written f⁻¹(L)) is the preimage of L under fHom: the words of L with Sum.inl symbols freely inserted.

Then:

fInv L is CFL via substitution (CF_of_subst_CF).
dLang h is regular via DFA construction.
dLang h ∩ fInv L is CFL (CF_of_CF_inter_regular).
π(dLang h ∩ fInv L) is CFL (CF_closed_under_homomorphism).
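
To make the decomposition concrete, the following Lean sketch traces one word through the encoding and both projections. The definitions are copied from the equations documented on this page into a hypothetical Sketch namespace. toyHom is an invented example homomorphism, and the block assumes a recent Lean 4 toolchain in which List.flatten and List.flatMap exist. It is an untested illustration, not part of the Langlib sources.

```lean
namespace Sketch

/-- Word-level extension of `h`, copied from the `extendHom` equation. -/
def extendHom {α β : Type} (h : α → List β) (w : List α) : List β :=
  (List.map h w).flatten

/-- One encoded block: the symbol `a` followed by its image `h a`. -/
def embedWord {α β : Type} (h : α → List β) (a : α) : List (α ⊕ β) :=
  [Sum.inl a] ++ List.map Sum.inr (h a)

/-- Projection to `β`: erase `inl`, unwrap `inr`. -/
def fHom {α β : Type} : α ⊕ β → List β
  | Sum.inl _ => []
  | Sum.inr b => [b]

/-- Projection to `α`: keep `inl`, erase `inr`. -/
def piHom {α β : Type} : α ⊕ β → List α
  | Sum.inl a => [a]
  | Sum.inr _ => []

/-- Hypothetical toy homomorphism: `true ↦ [1, 2]`, `false ↦ []`. -/
def toyHom : Bool → List Nat
  | true => [1, 2]
  | false => []

-- The encoding of w = [true, false, true]: every symbol is followed by its image.
example :
    extendHom (embedWord toyHom) [true, false, true] =
      [Sum.inl true, Sum.inr 1, Sum.inr 2, Sum.inl false,
       Sum.inl true, Sum.inr 1, Sum.inr 2] := rfl

-- Applying f to the encoding recovers h(w).
example :
    extendHom toyHom [true, false, true] =
      (extendHom (embedWord toyHom) [true, false, true]).flatMap fHom := rfl

-- Applying π to the encoding recovers w itself.
example :
    (extendHom (embedWord toyHom) [true, false, true]).flatMap piHom =
      [true, false, true] := rfl

end Sketch
```

The last two examples are concrete instances of extendHom_eq_flatMap_embedWord_fHom and extendHom_embedWord_piHom.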

Main declarations

Language.inverseHomomorphicImage : the preimage of a language under a string homomorphism.
CF_closed_under_inverse_homomorphism : CFLs are closed under inverse homomorphism.

def extendHom {α β : Type} (h : α → List β) (w : List α) :

List β

Extend a homomorphism to words by concatenation.

Equations

extendHom h w = (List.map h w).flatten

def Language.inverseHomomorphicImage {α β : Type} (L : Language β) (h : α → List β) :

Language α

The preimage of a language L under a string homomorphism h. This is the set of words w over α such that h(w) ∈ L, where h extends to words by concatenation: h(a₁⋯aₙ) = h(a₁) ++ ⋯ ++ h(aₙ).

Equations

L.inverseHomomorphicImage h = {w : List α | extendHom h w ∈ L}
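
As a small illustration of the word-level extension and of preimage membership, here is a hedged Lean sketch. It models a language as a plain predicate on words (Mathlib's Language α unfolds to Set (List α), so this is a simplification rather than the file's definition). toyHom and preimage are hypothetical names, and the block is untested.

```lean
namespace Sketch

/-- Word-level extension of `h`, copied from the `extendHom` equation. -/
def extendHom {α β : Type} (h : α → List β) (w : List α) : List β :=
  (List.map h w).flatten

/-- Hypothetical toy homomorphism: `true ↦ [1, 2]`, `false ↦ []`. -/
def toyHom : Bool → List Nat
  | true => [1, 2]
  | false => []

/-- Simplified preimage: a language is modelled as a predicate on words. -/
def preimage {α β : Type} (L : List β → Prop) (h : α → List β) : List α → Prop :=
  fun w => L (extendHom h w)

-- h(true · false · true) = [1, 2] ++ [] ++ [1, 2]
example : extendHom toyHom [true, false, true] = [1, 2, 1, 2] := rfl

-- Hence the word lies in the preimage of the singleton language {[1, 2, 1, 2]}.
example : preimage (fun u => u = [1, 2, 1, 2]) toyHom [true, false, true] := by
  show extendHom toyHom [true, false, true] = [1, 2, 1, 2]
  rfl

end Sketch
```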

def embedWord {α β : Type} (h : α → List β) (a : α) :

List (α ⊕ β)

Embed a symbol a into Γ = α ⊕ β as [inl a] ++ (h a).map inr.

Equations

embedWord h a = [Sum.inl a] ++ List.map Sum.inr (h a)

def fHom {α β : Type} :

α ⊕ β → List β

Project Γ to β: erase inl, unwrap inr.

Equations

fHom (Sum.inl val) = []
fHom (Sum.inr b) = [b]

def piHom {α β : Type} :

α ⊕ β → List α

Project Γ to α: keep inl, erase inr.

Equations

piHom (Sum.inl val) = [val]
piHom (Sum.inr b) = []

theorem embedWord_flatMap_fHom {α β : Type} (h : α → List β) (a : α) :

List.flatMap fHom (embedWord h a) = h a

theorem extendHom_eq_flatMap_embedWord_fHom {α β : Type} (h : α → List β) (w : List α) :

extendHom h w = List.flatMap fHom (extendHom (embedWord h) w)

theorem embedWord_flatMap_piHom {α β : Type} (h : α → List β) (a : α) :

List.flatMap piHom (embedWord h a) = [a]

theorem extendHom_embedWord_piHom {α β : Type} (h : α → List β) (w : List α) :

List.flatMap piHom (extendHom (embedWord h) w) = w

def sLang (α β : Type) :

Language (α ⊕ β)

S = {[inl a] | a : α}, the set of single-symbol inl words.

Equations

sLang α β = {w : List (α ⊕ β) | ∃ (a : α), w = [Sum.inl a]}

theorem is_CF_sLang {α β : Type} [Fintype α] :

is_CF (sLang α β)

def fInv {α β : Type} (L : Language β) :

Language (α ⊕ β)

f⁻¹(L) = {v : List Γ | v.flatMap fHom ∈ L}.

Equations

fInv L = {v : List (α ⊕ β) | List.flatMap fHom v ∈ L}
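
Membership in fInv L depends only on the β-projection of a word, so fInv L also contains words that are not valid encodings. The untested Lean sketch below shows two such words, with fHom copied from its equations into a hypothetical Sketch namespace and Bool ⊕ Nat chosen as an example alphabet.

```lean
namespace Sketch

/-- Projection to `β`: erase `inl`, unwrap `inr`. -/
def fHom {α β : Type} : α ⊕ β → List β
  | Sum.inl _ => []
  | Sum.inr b => [b]

-- Starts with `inr`, so it cannot be a concatenation of `embedWord` blocks.
example : ([Sum.inr 1, Sum.inl true] : List (Bool ⊕ Nat)).flatMap fHom = [1] := rfl

-- A valid encoding would need `h true` to be both `[]` and `[1]`.
example :
    ([Sum.inl true, Sum.inl true, Sum.inr 1] : List (Bool ⊕ Nat)).flatMap fHom = [1] := rfl

end Sketch
```

Both words project to [1], so both belong to fInv L whenever [1] ∈ L. Intersecting with dLang h is what discards them.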

def substFn (α β : Type) (b : β) :

Language (α ⊕ β)

σ(b) = S* · {[inr b]}.

Equations

substFn α β b = KStar.kstar (sLang α β) * {[Sum.inr b]}

theorem allInl_mem_kstar_sLang {α β : Type} (v : List (α ⊕ β)) (hv : ∀ x ∈ v, x.isLeft = true) :

v ∈ KStar.kstar (sLang α β)

theorem flatMap_fHom_kstar_sLang {α β : Type} (v : List (α ⊕ β)) (hv : v ∈ KStar.kstar (sLang α β)) :

List.flatMap fHom v = []

theorem flatMap_fHom_substFn {α β : Type} (v : List (α ⊕ β)) (b : β) (hv : v ∈ substFn α β b) :

List.flatMap fHom v = [b]

theorem decompose_sum_list {α β : Type} (v : List (α ⊕ β)) :

∃ (bs : List β) (ss : List (List (α ⊕ β))), ss.length = bs.length + 1 ∧ (∀ s ∈ ss, ∀ x ∈ s, x.isLeft = true) ∧ v = (List.zipWith (fun (s : List (α ⊕ β)) (b : β) => s ++ [Sum.inr b]) ss.dropLast bs).flatten ++ ss.getLast! ∧ List.flatMap fHom v = bs

theorem fInv_eq {α β : Type} (L : Language β) :

fInv L = L.subst (substFn α β) * KStar.kstar (sLang α β)
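
The equation above rests on cutting any word over α ⊕ β into blocks that each end in a Sum.inr symbol, followed by a trailing run of Sum.inl symbols, which is what decompose_sum_list provides. The Lean sketch below performs that cut on one concrete word. splitStep and splitBlocks are hypothetical helpers introduced only for this illustration, and the block is untested.

```lean
namespace Sketch

/-- One scan step: extend the pending `inl` run, or close a block at an `inr`. -/
def splitStep {α β : Type}
    (st : List (List (α ⊕ β)) × List (α ⊕ β)) (x : α ⊕ β) :
    List (List (α ⊕ β)) × List (α ⊕ β) :=
  match x with
  | Sum.inl _ => (st.1, st.2 ++ [x])
  | Sum.inr _ => (st.1 ++ [st.2 ++ [x]], [])

/-- Closed blocks (each of the shape `S* · [inr b]`) and the trailing `inl` run. -/
def splitBlocks {α β : Type} (v : List (α ⊕ β)) :
    List (List (α ⊕ β)) × List (α ⊕ β) :=
  v.foldl splitStep ([], [])

example :
    splitBlocks
      ([Sum.inl true, Sum.inr 1, Sum.inl false, Sum.inl true, Sum.inr 2, Sum.inl false]
        : List (Bool ⊕ Nat)) =
      ([[Sum.inl true, Sum.inr 1], [Sum.inl false, Sum.inl true, Sum.inr 2]],
       [Sum.inl false]) := rfl

end Sketch
```

Here the first block lies in substFn Bool Nat 1, the second in substFn Bool Nat 2, and the trailing run in the Kleene star of sLang Bool Nat. The β-projection of the whole word is [1, 2], so the word belongs to fInv L exactly when [1, 2] ∈ L.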

theorem is_CF_fInv {α β : Type} [Fintype α] (L : Language β) (hL : is_CF L) :

is_CF (fInv L)

f⁻¹(L) is CFL when L is CFL and α is finite.

def dLang {α β : Type} (h : α → List β) :

Language (α ⊕ β)

D = (⋃_a {embedWord h a})*, the set of valid encodings.

Equations

dLang h = KStar.kstar {w : List (α ⊕ β) | ∃ (a : α), w = embedWord h a}

@[reducible, inline]

abbrev DFAState (α β : Type) (h : α → List β) :

Type

DFA state type for recognizing D.

none: dead state
some none: ready/accepting state (every block read so far is complete)
some (some ⟨a, k⟩): inside the block for a, with Sum.inr ((h a).get k) as the next expected symbol

Equations

DFAState α β h = Option (Option ((a : α) × Fin (h a).length))

def dStep {α β : Type} (h : α → List β) :

DFAState α β h → α ⊕ β → DFAState α β h

DFA transition function for recognizing D.

Equations

dStep h (some none) (Sum.inl a) = if hl : 0 < (h a).length then some (some ⟨a, ⟨0, hl⟩⟩) else some none
dStep h (some (some ⟨a, k⟩)) (Sum.inr b) = if (h a).get k = b then if hlast : ↑k + 1 < (h a).length then some (some ⟨a, ⟨↑k + 1, hlast⟩⟩) else some none else none
dStep h (some none) (Sum.inr val) = none
dStep h (some (some val)) (Sum.inl val_1) = none
dStep h none x✝ = none

def invHomDFA {α β : Type} (h : α → List β) :

DFA (α ⊕ β) (DFAState α β h)

The DFA recognizing D.

Equations

invHomDFA h = { step := dStep h, start := some none, accept := {some none} }
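
The behaviour of this automaton can be explored with a simplified, untested Lean sketch. Instead of the index-based state ⟨a, k⟩ it stores the list of symbols still expected, so its state type Option (List β) is not the one used by the file. stepSketch, acceptsSketch and toyHom are hypothetical names.

```lean
namespace Sketch

/-- Simplified transition: `none` is dead, `some []` is ready/accepting,
    `some (b :: rest)` means the symbols `b :: rest` are still expected. -/
def stepSketch {α β : Type} [DecidableEq β] (h : α → List β) :
    Option (List β) → α ⊕ β → Option (List β)
  | some [], Sum.inl a => some (h a)
  | some (b :: rest), Sum.inr b' => if b = b' then some rest else none
  | _, _ => none

/-- Run from the ready state and accept exactly when the run ends ready. -/
def acceptsSketch {α β : Type} [DecidableEq β]
    (h : α → List β) (v : List (α ⊕ β)) : Bool :=
  match v.foldl (stepSketch h) (some []) with
  | some [] => true
  | _ => false

/-- Hypothetical toy homomorphism: `true ↦ [1, 2]`, `false ↦ []`. -/
def toyHom : Bool → List Nat
  | true => [1, 2]
  | false => []

-- A valid encoding of the word [true, false].
example :
    acceptsSketch toyHom [Sum.inl true, Sum.inr 1, Sum.inr 2, Sum.inl false] = true := rfl

-- The block for `true` is cut short.
example : acceptsSketch toyHom [Sum.inl true, Sum.inr 1] = false := rfl

-- An `inr` symbol arrives while no block is open.
example : acceptsSketch toyHom [Sum.inr 1, Sum.inl true] = false := rfl

end Sketch
```

The sketch trades finiteness for brevity: List β has infinitely many values, whereas the file's DFAState α β h pairs each a : α with an index in Fin (h a).length, which presumably is what lets isRegular_dLang work from a Fintype α instance.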

theorem dStep_embedWord {α β : Type} (h : α → List β) (a : α) :

List.foldl (dStep h) (some none) (embedWord h a) = some none

theorem foldl_dStep_none {α β : Type} (h : α → List β) (v : List (α ⊕ β)) :

List.foldl (dStep h) none v = none

theorem dfa_accepts_of_dLang {α β : Type} (h : α → List β) (v : List (α ⊕ β)) (hv : v ∈ dLang h) :

List.foldl (dStep h) (some none) v = some none

theorem dfa_mid_consume {α β : Type} (h : α → List β) (a : α) (k : Fin (h a).length) (v : List (α ⊕ β)) (hv : List.foldl (dStep h) (some (some ⟨a, k⟩)) v = some none) :

∃ (rest : List (α ⊕ β)), v = List.map Sum.inr (List.drop (↑k) (h a)) ++ rest ∧ List.foldl (dStep h) (some none) rest = some none

theorem dLang_of_dfa_accepts {α β : Type} (h : α → List β) (v : List (α ⊕ β)) (hv : List.foldl (dStep h) (some none) v = some none) :

v ∈ dLang h

theorem invHomDFA_correct {α β : Type} (h : α → List β) :

(invHomDFA h).accepts = dLang h

Correctness: the DFA accepts exactly D.

theorem isRegular_dLang {α β : Type} [Fintype α] (h : α → List β) :

(dLang h).IsRegular

D is a regular language when α is finite.

theorem inverseHomomorphicImage_eq {α β : Type} (L : Language β) (h : α → List β) :

L.inverseHomomorphicImage h = (fInv L ⊓ dLang h).homomorphicImage piHom

theorem CF_closed_under_inverse_homomorphism {α β : Type} [Fintype α] (L : Language β) (h : α → List β) (hL : is_CF L) :

is_CF (L.inverseHomomorphicImage h)

The class of context-free languages is closed under inverse string homomorphism.

Given a context-free language L over alphabet β and a string homomorphism h : α → List β (with α a finite type), the preimage h⁻¹(L) = {w | h(w) ∈ L} is also context-free.

theorem CF_closedUnderInverseHomomorphism :

ClosedUnderInverseHomomorphism fun {α : Type} [Fintype α] => is_CF

The class of context-free languages is closed under inverse string homomorphism.

Langlib

Langlib.Classes.ContextFree.Closure.InverseHomomorphism
