Langlib

Langlib.Automata.LinearBounded.Equivalence.LBAToCSG.Completeness

LBA → CS: configuration encoding and the run-induction skeleton #

This file develops the completeness half of myhill_language_eq (Automata/LinearBounded/Equivalence/ContextSensitive.lean): an accepting LBA run on w is mirrored by a derivation start ⇒* map terminal w of the Myhill grammar.

The plan factors the derivation as

start ⇒* encode w₀ (initial cfg) -- start_setup ⇒* encode w₀ cfg' -- CS_derives_of_reaches + step_sim ⇒* map terminal w -- cleanup

where encode reifies an LBA configuration as a list of cell nonterminals (defined here), and CS_derives_of_reaches (proved here) lifts the single-step simulation over an entire reachable run by induction on Relation.ReflTransGen.

This file pins down the encoding and the glue, and proves the three content lemmas start_setup, step_sim, cleanup that assemble into completeness (the soundness direction is in LBAToCSG/Soundness.lean).

def MyhillConstruction.encode {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) :
List (symbol T (MyhillNT T Γ Λ))

Encoding of an LBA configuration as a Myhill sentential form: one cell per tape position, recording the boundary flags (i = 0, i = n), whether the head — and its state — sits at that cell, the current tape symbol, and the frozen original terminal worig i.

Equations
  • One or more equations did not get rendered due to their size.
Instances For
    @[simp]
    theorem MyhillConstruction.encode_length {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) :
    (encode worig cfg).length = n + 1
    theorem MyhillConstruction.encode_getElem {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (i : ) (hi : i < n + 1) :
    (encode worig cfg)[i] = cellSym (decide (i = 0)) (decide (i = n)) (if i, hi = cfg.tape.head then some cfg.state else none) (cfg.tape.contents i, hi) (worig i, hi)

    The i-th symbol of encode is the cell at position i.

    theorem MyhillConstruction.encode_set_head {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) :
    (encode worig cfg).set (↑cfg.tape.head) (cellSym (decide (cfg.tape.head = 0)) (decide (cfg.tape.head = n)) (some q') a' (worig cfg.tape.head)) = encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.stay }

    A head-stationary step (write a', set state q', head unchanged) updates exactly the head cell of the encoding.

    theorem MyhillConstruction.CS_derives_of_reaches {T Γ Λ : Type} {n : } (g : CS_grammar T) (M : LBA.Machine Γ Λ) (E : DLBA.Cfg Γ Λ nList (symbol T g.nt)) (hstep : ∀ (c c' : DLBA.Cfg Γ Λ n), LBA.Step M c c'CS_derives g (E c) (E c')) {c c' : DLBA.Cfg Γ Λ n} (h : LBA.Reaches M c c') :
    CS_derives g (E c) (E c')

    Run induction. If a single LBA step is mirrored by a context-sensitive derivation of the encodings, then so is any reachable run. Pure ReflTransGen induction — the engine that turns the per-step simulation step_sim into the full start ⇒* accepting derivation.

    theorem MyhillConstruction.CS_transforms_of_set {T : Type} (g : CS_grammar T) (L : List (symbol T g.nt)) (k : ) (hk : k < L.length) (A : g.nt) (out : symbol T g.nt) (hLk : L[k] = symbol.nonterminal A) (hrule : { context_left := [], input_nonterminal := A, context_right := [], output_string := [out] } g.rules) :
    CS_transforms g L (L.set k out)

    Rewriting a single nonterminal A at position k to the output [out] of a context-free-shaped rule A → out is one context-sensitive step (List.set). The engine for the head-stationary cases of the step simulation.

    theorem MyhillConstruction.list_two_split {α : Type u_1} (L : List α) (k : ) (hk : k + 1 < L.length) :
    L = List.take k L ++ L[k] :: L[k + 1] :: List.drop (k + 2) L

    Split a list around two adjacent positions k, k+1.

    theorem MyhillConstruction.CS_transforms_of_set_ctxR {T : Type} (g : CS_grammar T) (L : List (symbol T g.nt)) (k : ) (hk : k + 1 < L.length) (A : g.nt) (out : symbol T g.nt) (hLk : L[k] = symbol.nonterminal A) (hrule : { context_left := [], input_nonterminal := A, context_right := [L[k + 1]], output_string := [out] } g.rules) :
    CS_transforms g L (L.set k out)

    Rewriting nonterminal A at position k to [out] with the right neighbour L[k+1] as context is one context-sensitive step (= L.set k out). Drives interior step 1.

    theorem MyhillConstruction.CS_transforms_of_set_ctxL {T : Type} (g : CS_grammar T) (L : List (symbol T g.nt)) (k : ) (hk : k + 1 < L.length) (A : g.nt) (out : symbol T g.nt) (hLk1 : L[k + 1] = symbol.nonterminal A) (hrule : { context_left := [L[k]], input_nonterminal := A, context_right := [], output_string := [out] } g.rules) :
    CS_transforms g L (L.set (k + 1) out)

    Rewriting nonterminal A at position k+1 to [out] with the left neighbour L[k] as context is one context-sensitive step (= L.set (k+1) out). Drives interior step 2.

    theorem MyhillConstruction.CS_transforms_of_set_ctxLR {T : Type} (g : CS_grammar T) (L : List (symbol T g.nt)) (j : ) (hj : j + 2 < L.length) (A : g.nt) (out : symbol T g.nt) (hLk : L[j + 1] = symbol.nonterminal A) (hrule : { context_left := [L[j]], input_nonterminal := A, context_right := [L[j + 2]], output_string := [out] } g.rules) :
    CS_transforms g L (L.set (j + 1) out)

    Two-sided context rewrite at position j+1 (= L.set (j+1) out).

    theorem MyhillConstruction.CS_transforms_of_set_ctxLR_k {T : Type} (g : CS_grammar T) (L : List (symbol T g.nt)) (k : ) (hk0 : 0 < k) (hk : k + 1 < L.length) (A : g.nt) (out : symbol T g.nt) (hLk : L[k] = symbol.nonterminal A) (hrule : { context_left := [L[k - 1]], input_nonterminal := A, context_right := [L[k + 1]], output_string := [out] } g.rules) :
    CS_transforms g L (L.set k out)

    Two-sided context rewrite at position k ≥ 1 (= L.set k out); avoids j+1 mismatch.

    theorem MyhillConstruction.CS_tran_with_prefix {T : Type} {g : CS_grammar T} {p w₁ w₂ : List (symbol T g.nt)} (h : CS_transforms g w₁ w₂) :
    CS_transforms g (p ++ w₁) (p ++ w₂)
    theorem MyhillConstruction.CS_deri_with_prefix {T : Type} {g : CS_grammar T} {p w₁ w₂ : List (symbol T g.nt)} (h : CS_derives g w₁ w₂) :
    CS_derives g (p ++ w₁) (p ++ w₂)
    theorem MyhillConstruction.CS_tran_with_postfix {T : Type} {g : CS_grammar T} {s w₁ w₂ : List (symbol T g.nt)} (h : CS_transforms g w₁ w₂) :
    CS_transforms g (w₁ ++ s) (w₂ ++ s)
    theorem MyhillConstruction.CS_deri_with_postfix {T : Type} {g : CS_grammar T} {s w₁ w₂ : List (symbol T g.nt)} (h : CS_derives g w₁ w₂) :
    CS_derives g (w₁ ++ s) (w₂ ++ s)

    Start phase: the startAux nonterminal lays down the non-head cells #

    def MyhillConstruction.auxCells {T Γ Λ : Type} (embed : TΓ) :
    List TList (symbol T (MyhillNT T Γ Λ))

    The interior/last cells produced by the startAux rules for an input list: every cell carries lb = false and q = none; only the final cell carries rb = true.

    Equations
    Instances For
      @[simp]
      theorem MyhillConstruction.auxCells_length {T Γ Λ : Type} (embed : TΓ) (ts : List T) :
      (auxCells embed ts).length = ts.length
      theorem MyhillConstruction.auxCells_cons {T Γ Λ : Type} (embed : TΓ) (a : T) (l : List T) (hl : l []) :
      auxCells embed (a :: l) = cellSym false false none (embed a) a :: auxCells embed l

      Unfolding auxCells on a cons whose tail is non-empty (a :: l matches clause 3).

      theorem MyhillConstruction.auxCells_ofFn {T Γ Λ : Type} (embed : TΓ) {m : } (worig : Fin (m + 1)T) :
      auxCells embed (List.ofFn worig) = List.ofFn fun (i : Fin (m + 1)) => cellSym false (decide (i = m)) none (embed (worig i)) (worig i)

      auxCells of an ofFn list is the positional ofFn of the corresponding cells: lb = false, q = none, rb = (i = m).

      theorem MyhillConstruction.preRow {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (mids : List T) (tail : List (symbol T (MyhillNT T Γ Λ))) :

      Right-to-left middle build. Starting from startAux :: tail, repeatedly applying the middle-cell rule (startAux → [startAux, cell]) lays the none cells for mids immediately to the right of startAux, in order. Applying the inductive hypothesis first (building ts to the right) and the head step last keeps the cells in position order with no reversal.

      theorem MyhillConstruction.auxCells_split {T Γ Λ : Type} (embed : TΓ) (tl : List T) (h : tl []) :
      auxCells embed tl = List.map (fun (t : T) => cellSym false false none (embed t) t) tl.dropLast ++ [cellSym false true none (embed (tl.getLast h)) (tl.getLast h)]

      auxCells splits as the none-cells for all but the last input, followed by the single rb = true cell for the last input. This is the shape produced by the right-to-left build: the rightmost cell comes from the start rule, the middles from preRow.

      def MyhillConstruction.startCells {T Γ Λ : Type} (M : LBA.Machine Γ Λ) (embed : TΓ) :
      List TList (symbol T (MyhillNT T Γ Λ))

      The full initial sentential form: the head cell (lb = true, state M.initial) at position 0, followed by auxCells for the rest of the input.

      Equations
      Instances For
        @[simp]
        theorem MyhillConstruction.startCells_length {T Γ Λ : Type} (M : LBA.Machine Γ Λ) (embed : TΓ) (ts : List T) :
        (startCells M embed ts).length = ts.length
        theorem MyhillConstruction.start_phase {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (ts : List T) (h : ts []) :

        Start phase. start ⇒* startCells ts for any non-empty input list ts; this is the initial sentential form encoding the LBA's initial configuration on ts.

        theorem MyhillConstruction.startCells_cons {T Γ Λ : Type} (M : LBA.Machine Γ Λ) (embed : TΓ) (a : T) (l : List T) (hl : l []) :
        startCells M embed (a :: l) = cellSym true false (some M.initial) (embed a) a :: auxCells embed l

        Unfolding startCells on a cons whose tail is non-empty (a :: l matches clause 3).

        theorem MyhillConstruction.startCells_eq_encode {T Γ Λ : Type} (M : LBA.Machine Γ Λ) (embed : TΓ) {m : } (worig : Fin (m + 1)T) :
        startCells M embed (List.ofFn worig) = encode worig { state := M.initial, tape := { contents := fun (i : Fin (m + 1)) => embed (worig i), head := 0, } }

        Start-phase bridge. The cons-list startCells equals the positional encode of the initial configuration (head at 0, state M.initial, tape embed ∘ worig). This connects start_phase to the encode-based run simulation.

        Step simulation: the stay case #

        theorem MyhillConstruction.step_sim_stay {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (htrans : (q', a', DLBA.Dir.stay) M.transition cfg.state cfg.tape.read) :
        CS_derives (myhillGrammar M embed) (encode worig cfg) (encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.stay })

        step_sim, stay case. A stay transition of the LBA is mirrored by a single context-sensitive derivation step on the encodings (rewrite the head cell in place).

        Moving right at the right boundary is clamped (no movement).

        Moving left at the left boundary is clamped (no movement).

        A stay move leaves the tape unchanged.

        moveHead never changes the tape contents.

        theorem MyhillConstruction.moveHead_right_head_lt {Γ : Type} {n : } (t : DLBA.BoundedTape Γ n) (h : t.head < n) :

        An interior right move increments the head position.

        theorem MyhillConstruction.moveHead_left_head_pos {Γ : Type} {n : } (t : DLBA.BoundedTape Γ n) (h : 0 < t.head) :

        An interior left move decrements the head position.

        theorem MyhillConstruction.encode_right_interior {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (hlt : cfg.tape.head < n) :
        encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.right } = ((encode worig cfg).set (↑cfg.tape.head) (cellSym (decide (cfg.tape.head = 0)) false none a' (worig cfg.tape.head))).set (cfg.tape.head + 1) (cellSym false (decide (cfg.tape.head + 1 = n)) (some q') (cfg.tape.contents cfg.tape.head + 1, ) (worig cfg.tape.head + 1, ))

        An interior right step changes exactly the head cell (head leaves, symbol a' written) and its right neighbour (head arrives with state q').

        theorem MyhillConstruction.encode_left_interior {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (m : ) (hm : cfg.tape.head = m + 1) :
        encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.left } = ((encode worig cfg).set (m + 1) (cellSym false (decide (m + 1 = n)) none a' (worig cfg.tape.head))).set m (cellSym (decide (m = 0)) false (some q') (cfg.tape.contents m, ) (worig m, ))

        An interior left step changes exactly the head cell (at m+1; head leaves, symbol a' written) and its left neighbour (at m; head arrives with state q'). Stated with head = m + 1 so the indices m, m+1 match the three-step assembly.

        theorem MyhillConstruction.step_sim_right_interior {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (hlt : cfg.tape.head < n) (htrans : (q', a', DLBA.Dir.right) M.transition cfg.state cfg.tape.read) :
        CS_derives (myhillGrammar M embed) (encode worig cfg) (encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.right })

        step_sim, right-interior case. An interior right transition is mirrored by the three-step cellPending protocol: park the head as a cellPending, place the new state on the right neighbour, then resolve the pending cell.

        theorem MyhillConstruction.step_sim_left_interior {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (hpos : 0 < cfg.tape.head) (htrans : (q', a', DLBA.Dir.left) M.transition cfg.state cfg.tape.read) :
        CS_derives (myhillGrammar M embed) (encode worig cfg) (encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.left })

        step_sim, left-interior case. Mirror of the right-interior case.

        theorem MyhillConstruction.step_sim_right_boundary {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (hb : cfg.tape.head = n) (htrans : (q', a', DLBA.Dir.right) M.transition cfg.state cfg.tape.read) :
        CS_derives (myhillGrammar M embed) (encode worig cfg) (encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.right })

        step_sim, right-boundary case. A right transition with the head at the right end is clamped, so it is mirrored by a single in-place rewrite (using the boundary rule).

        theorem MyhillConstruction.step_sim_left_boundary {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (q' : Λ) (a' : Γ) (hb : cfg.tape.head = 0) (htrans : (q', a', DLBA.Dir.left) M.transition cfg.state cfg.tape.read) :
        CS_derives (myhillGrammar M embed) (encode worig cfg) (encode worig { state := q', tape := (cfg.tape.write a').moveHead DLBA.Dir.left })

        step_sim, left-boundary case. A left transition with the head at the left end is clamped, so it is mirrored by a single in-place rewrite (using the boundary rule).

        theorem MyhillConstruction.step_sim {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) {cfg cfg' : DLBA.Cfg Γ Λ n} (h : LBA.Step M cfg cfg') :
        CS_derives (myhillGrammar M embed) (encode worig cfg) (encode worig cfg')

        Step simulation. Every single LBA step is mirrored by a context-sensitive derivation of the cell encodings — combining the stay, boundary, and interior cases.

        Cleanup: an accepting configuration's encoding derives the terminal word #

        def MyhillConstruction.cleanupCells {T Γ Λ : Type} (cs : List (Bool × Bool × Γ × T)) :
        List (symbol T (MyhillNT T Γ Λ))

        A list of (head-free) cells, as grammar symbols, given their (lb, rb, a, t) data.

        Equations
        Instances For
          def MyhillConstruction.cleanupTerms {T Γ Λ : Type} (cs : List (Bool × Bool × Γ × T)) :
          List (symbol T (MyhillNT T Γ Λ))

          The corresponding list of terminals.

          Equations
          Instances For
            theorem MyhillConstruction.cleanup_right {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (t₀ : T) (cs : List (Bool × Bool × Γ × T)) :

            Rightward propagation. A terminal followed by head-free cells derives to all terminals (each cell is converted by the left-propagation rule using the terminal on its left).

            theorem MyhillConstruction.cleanup_left {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (t₀ : T) (cs : List (Bool × Bool × Γ × T)) :

            Leftward propagation. Head-free cells followed by a terminal derive to all terminals (each cell is converted by the right-propagation rule using the terminal on its right).

            theorem MyhillConstruction.cleanup_mid {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (before after : List (Bool × Bool × Γ × T)) (t₀ : T) :

            Two-sided propagation. A terminal flanked by head-free cells derives to all terminals.

            theorem MyhillConstruction.cleanup_full {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (cs : List (Bool × Bool × Γ × T)) (k : ) (hk : k < cs.length) (q : Λ) (hq : M.accept q = true) :
            CS_derives (myhillGrammar M embed) ((cleanupCells cs).set k (cellSym cs[k].1 cs[k].2.1 (some q) cs[k].2.2.1 cs[k].2.2.2)) (cleanupTerms cs)

            Cleanup (list form). A list of head-free cells with one accepting (some q) cell at position k derives to the list of terminals.

            def MyhillConstruction.dataOf {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) :
            List (Bool × Bool × Γ × T)

            The per-position cell data (lb, rb, a, t) of a configuration.

            Equations
            Instances For
              @[simp]
              theorem MyhillConstruction.dataOf_length {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) :
              (dataOf worig cfg).length = n + 1
              theorem MyhillConstruction.dataOf_getElem {T Γ Λ : Type} {n : } (worig : Fin (n + 1)T) (cfg : DLBA.Cfg Γ Λ n) (i : ) (hi : i < n + 1) :
              (dataOf worig cfg)[i] = (decide (i = 0), decide (i = n), cfg.tape.contents i, hi, worig i, hi)
              theorem MyhillConstruction.cleanup {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg' : DLBA.Cfg Γ Λ n) (hq : M.accept cfg'.state = true) :
              CS_derives (myhillGrammar M embed) (encode worig cfg') (List.ofFn fun (i : Fin (n + 1)) => symbol.terminal (worig i))

              Cleanup. The encoding of an accepting configuration derives the terminal word.

              theorem MyhillConstruction.myhill_complete_fin {T Γ Λ : Type} {n : } [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (worig : Fin (n + 1)T) (cfg' : DLBA.Cfg Γ Λ n) (hreach : LBA.Reaches M { state := M.initial, tape := { contents := fun (i : Fin (n + 1)) => embed (worig i), head := 0, } } cfg') (hacc : M.accept cfg'.state = true) :

              Completeness (Fin-indexed). If the LBA, started on the configuration encoding worig, reaches an accepting configuration, then the Myhill grammar derives the terminal word map terminal worig. Assembles the start phase, the run simulation, and the cleanup.

              theorem MyhillConstruction.myhill_complete {T Γ Λ : Type} [Fintype T] [Fintype Γ] [Fintype Λ] (M : LBA.Machine Γ Λ) (embed : T Γ) (w : List T) (h : LBA.LanguageViaEmbed M (⇑embed) w) :

              Completeness. Every word accepted by the LBA is generated by the Myhill grammar.