Langlib

Langlib.Automata.Turing.Equivalence.TMToGrammar.Soundness

Soundness of the TM → Grammar Construction #

This file proves that the grammar tmToGrammar constructed from a TM0 machine is sound: it only generates words on which the TM halts.

Proof strategy #

We define a "terminal content" function terminalContent that extracts, from each symbol in a sentential form, the original input character it represents. This function is invariant under all simulation and cleanup grammar rules (but increases during generation).

We then define a forward invariant GI with five constructors:

The key insight is that cleanup uses terminalContent instead of tracking a derivation to a terminal form. This avoids the non-confluence issue in the cleanup phase.

Terminal content #

def symbolOriginal {T Λ : Type} :
symbol T (TMtoGrammarNT T Λ)Option T

Extract the "original input character" from a grammar symbol. Terminals contribute their value. Nonterminals contribute their orig field (if some). Boundary markers, start, genMore, and none-orig nonterminals contribute nothing.

Equations
Instances For
    def terminalContent {T Λ : Type} (sf : List (symbol T (TMtoGrammarNT T Λ))) :

    The "terminal content" of a sentential form: the list of original input characters. This is invariant under simulation and cleanup rules.

    Equations
    Instances For
      theorem terminalContent_cons {T Λ : Type} (s : symbol T (TMtoGrammarNT T Λ)) (sf : List (symbol T (TMtoGrammarNT T Λ))) :
      terminalContent (s :: sf) = (match symbolOriginal s with | some t => [t] | none => []) ++ terminalContent sf
      theorem terminalContent_append {T Λ : Type} (sf₁ sf₂ : List (symbol T (TMtoGrammarNT T Λ))) :
      theorem terminalContent_preserved {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] {M : Turing.TM0.Machine (Option T) Λ} (sf sf' : List (symbol T (TMtoGrammarNT T Λ))) (htrans : grammar_transforms (tmToGrammar T Λ M) sf sf') (hns : ssf, s symbol.nonterminal TMtoGrammarNT.start) (hng : ssf, s symbol.nonterminal TMtoGrammarNT.genMore) :

      Terminal-only forms can't be transformed #

      theorem no_transform_terminal {T : Type} (g : grammar T) (w : List T) (sf' : List (symbol T g.nt)) :

      Terminal-only forms can't be transformed.

      theorem derives_terminal_id {T : Type} (g : grammar T) (w : List T) (sf : List (symbol T g.nt)) (h : grammar_derives g (List.map symbol.terminal w) sf) :

      Derivation from terminal form is trivial.

      theorem no_cell_rule {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (r : grule T (TMtoGrammarNT T Λ)) (hr : r (tmToGrammar T Λ M).rules) (orig cur : Option T) (hN : r.input_N = TMtoGrammarNT.cell orig cur) :

      No rule has cell as input_N.

      The corrected forward invariant #

      def isNonterminal {T N : Type} :
      symbol T NProp

      Whether a symbol is a nonterminal.

      Equations
      Instances For
        inductive GI {T Λ : Type} [Inhabited Λ] (M : Turing.TM0.Machine (Option T) Λ) :
        List (symbol T (TMtoGrammarNT T Λ))Prop

        The forward invariant for the tmToGrammar construction.

        This invariant tracks five phases:

        1. initial: The start symbol [S].
        2. generating: Building the initial configuration [LB, genMore, cells..., RB].
        3. simulating: Simulating TM steps. Tracks correspondence with actual TM configs.
        4. cleanup: Post-halt phase. Tracks terminalContent = w where TM halts on w.
        5. done: Terminal-only form w.map terminal where TM halts on w.
        Instances For

          Terminal halts #

          Preservation lemmas #

          theorem headCell_rule_classification {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (r : grule T (TMtoGrammarNT T Λ)) (hr : r (tmToGrammar T Λ M).rules) (q : Λ) (orig cur : Option T) (hN : r.input_N = TMtoGrammarNT.headCell q orig cur) :
          (∃ (q' : Λ) (action : Turing.TM0.Stmt (Option T)), M q cur = some (q', action)) M q cur = none r.input_L = [] r.input_R = [] r.output_string = [symbol.nonterminal (TMtoGrammarNT.haltCell orig)]
          theorem sim_rule_detailed_classification {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (r : grule T (TMtoGrammarNT T Λ)) (hr : r (tmToGrammar T Λ M).rules) (q : Λ) (orig cur : Option T) (hN : r.input_N = TMtoGrammarNT.headCell q orig cur) (q' : Λ) (action : Turing.TM0.Stmt (Option T)) (hMqa : M q cur = some (q', action)) :

          Detailed classification of simulation rules matching headCell q orig cur when M q cur = some (q', action).

          theorem sim_rule_gives_encodeTwoTrack {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (tc : TwoTrackConfig) (r : grule T (TMtoGrammarNT T Λ)) (hr : r (tmToGrammar T Λ M).rules) (hN : r.input_N = TMtoGrammarNT.headCell tc.headState tc.headOrig tc.headCur) (hM : ∃ (q' : Λ) (action : Turing.TM0.Stmt (Option T)), M tc.headState tc.headCur = some (q', action)) (u v : List (symbol T (TMtoGrammarNT T Λ))) (heq : encodeTwoTrack tc = u ++ r.input_L ++ [symbol.nonterminal r.input_N] ++ r.input_R ++ v) :

          When a simulation rule fires on encodeTwoTrack tc, the result is encodeTwoTrack tc' for some tc' with stepTwoTrack M tc = some tc'.

          theorem tm_eval_dom_of_reaches_halt {T Λ : Type} [Inhabited Λ] (M : Turing.TM0.Machine (Option T) Λ) (l : List (Option T)) (tmCfg : Turing.TM0.Cfg (Option T) Λ) (hreach : Turing.Reaches (Turing.TM0.step M) (Turing.TM0.init l) tmCfg) (hhalt : Turing.TM0.step M tmCfg = none) :
          theorem tm_step_of_corresponds {T Λ : Type} [Inhabited Λ] (M : Turing.TM0.Machine (Option T) Λ) (tc : TwoTrackConfig) (tmCfg : Turing.TM0.Cfg (Option T) Λ) (hcorr : TMCorresponds tc tmCfg) (h_Mqc : M tc.headState tc.headCur = none) :
          theorem tm_step_some_of_corresponds {T Λ : Type} [Inhabited Λ] (M : Turing.TM0.Machine (Option T) Λ) (tc : TwoTrackConfig) (tmCfg : Turing.TM0.Cfg (Option T) Λ) (hcorr : TMCorresponds tc tmCfg) (q' : Λ) (action : Turing.TM0.Stmt (Option T)) (h_Mqc : M tc.headState tc.headCur = some (q', action)) :
          ∃ (tmCfg' : Turing.TM0.Cfg (Option T) Λ), Turing.TM0.step M tmCfg = some tmCfg'
          theorem GI_preserved_simulating {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (tc : TwoTrackConfig) (tmCfg : Turing.TM0.Cfg (Option T) Λ) (hcorr : TMCorresponds tc tmCfg) (hreach : Turing.Reaches (Turing.TM0.step M) (Turing.TM0.init (List.map some (extractInput (twoTrackOriginals tc)))) tmCfg) (sf' : List (symbol T (TMtoGrammarNT T Λ))) (htrans : grammar_transforms (tmToGrammar T Λ M) (encodeTwoTrack tc) sf') :
          GI M sf'

          Invariant preserved from simulating state.

          theorem GI_preserved_cleanup {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (sf sf' : List (symbol T (TMtoGrammarNT T Λ))) (w : List T) (hhalt : (Turing.TM0.eval M (List.map some w)).Dom) (hcontent : terminalContent sf = w) (hhas_nt : ssf, isNonterminal s) (hno_start : ssf, s symbol.nonterminal TMtoGrammarNT.start) (hno_genMore : ssf, s symbol.nonterminal TMtoGrammarNT.genMore) (hno_headCell : ssf, ∀ (q : Λ) (orig cur : Option T), s symbol.nonterminal (TMtoGrammarNT.headCell q orig cur)) (htrans : grammar_transforms (tmToGrammar T Λ M) sf sf') :
          GI M sf'
          theorem GI_preserved_done {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (w : List T) (_hhalt : (Turing.TM0.eval M (List.map some w)).Dom) (sf' : List (symbol T (TMtoGrammarNT T Λ))) (htrans : grammar_transforms (tmToGrammar T Λ M) (List.map symbol.terminal w) sf') :
          GI M sf'

          Invariant preserved from done state (vacuously true).

          theorem GI_preserved {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (sf sf' : List (symbol T (TMtoGrammarNT T Λ))) (hinv : GI M sf) (htrans : grammar_transforms (tmToGrammar T Λ M) sf sf') :
          GI M sf'

          The invariant is preserved by grammar transforms.

          theorem GI_reachable {T : Type} [Fintype T] {Λ : Type} [Inhabited Λ] [Fintype Λ] (M : Turing.TM0.Machine (Option T) Λ) (sf : List (symbol T (TMtoGrammarNT T Λ))) (h : grammar_derives (tmToGrammar T Λ M) [symbol.nonterminal TMtoGrammarNT.start] sf) :
          GI M sf

          Every reachable form satisfies the invariant.

          Main soundness theorem #

          Soundness: If the grammar generates w, then TM halts on w.

          The grammar constructed from a TM generates exactly the TM's language.