Parse Tables

Understanding and debugging LR parse tables

Parse Tables

Parse tables are the core data structure used by LR parsers. They encode the state machine that drives the parsing process.

Table Structure

An LR parse table consists of two parts:

ACTION Table

Maps (state, terminal) pairs to actions:

ActionNotationMeaning
Shift S5 Push terminal, go to state 5
Reduce R3 Reduce by rule 3
Accept Acc Parsing complete (success)
Error (empty) No action = syntax error

GOTO Table

Maps (state, non-terminal) pairs to states. After a reduction creates a non-terminal, GOTO tells the parser which state to enter.

Viewing Parse Tables

Galore provides utilities to visualize parse tables:

import { newParser, Printers } from "galore";

const [parser, tokenFunc, itemGraph] = newParser(grammar, { type: "lalr" });

// Get HTML representation with LR item hints
const html = Printers.parseTableToHtml(parser.parseTable, { itemGraph });

// Get text representation
const text = Printers.parseTableToText(parser.parseTable);

Using the Playground

The Interactive Playground shows parse tables with:

  • LR item hints - Each state shows its first LR item (e.g., "0: $accept -> . Program")
  • Full item sets on hover - See all items in a state by hovering
  • Conflict highlighting - Cells with conflicts appear in amber

Understanding Conflicts

A conflict occurs when a parse table cell has multiple actions. This means the grammar is ambiguous - the parser can't decide what to do.

Shift-Reduce Conflict

The parser can't decide whether to shift the next token or reduce the current production.

Classic Example: Dangling Else

Stmt -> "if" Expr "then" Stmt
      | "if" Expr "then" Stmt "else" Stmt
      | OTHER
      ;

When parsing if E then if E then S else S, after seeing if E then if E then S and looking at else:

  • Shift: Attach else to inner if (most languages do this)
  • Reduce: Complete inner if, attach else to outer if

Reduce-Reduce Conflict

The parser can't decide which rule to use for reduction.

Example: Ambiguous Expression

Expr -> Expr "+" Expr
      | Expr "*" Expr
      | NUMBER
      ;

When parsing 1 + 2 * 3, after 1 + 2 with * lookahead, should we:

  • Reduce 1 + 2 to Expr (then multiply by 3)?
  • Shift * (then add 1 to 2*3)?

Resolving Conflicts

1. Grammar Restructuring

The cleanest solution is to rewrite the grammar to eliminate ambiguity:

// Ambiguous
Expr -> Expr "+" Expr | Expr "*" Expr | NUMBER ;

// Unambiguous (precedence via grammar structure)
Expr -> Expr "+" Term | Term ;
Term -> Term "*" Factor | Factor ;
Factor -> NUMBER ;

This encodes precedence directly: multiplication binds tighter because it's resolved at a "deeper" level.

2. Left Factoring

When multiple rules share a common prefix, factor it out:

// Conflict-prone
Stmt -> "if" Expr "then" Stmt
      | "if" Expr "then" Stmt "else" Stmt
      ;

// Left-factored
Stmt -> "if" Expr "then" Stmt ElseOpt ;
ElseOpt -> "else" Stmt | ;

3. Precedence Directives (Not Yet Implemented)

Yacc/Bison-style %left, %right, %nonassoc directives are not yet implemented in Galore's DSL. Use grammar restructuring instead.

4. Runtime Conflict Resolution

Handle conflicts at parse time with an actionResolver:

parser.parse(input, {
  ruleHandlers: {},
  actionResolver: (actions, stack, tokenbuffer) => {
    // Prefer shift over reduce (common default)
    const shift = actions.find(a => a.tag === LRActionType.SHIFT);
    if (shift) return shift;

    // For reduce-reduce, prefer longer rule
    const reduces = actions.filter(a => a.tag === LRActionType.REDUCE);
    if (reduces.length > 1) {
      reduces.sort((a, b) => b.rule!.rhs.length - a.rule!.rhs.length);
    }

    return reduces[0] || actions[0];
  }
});

Detecting Conflicts

Check if a grammar has conflicts programmatically:

const [parser, tokenFunc, itemGraph] = newParser(grammar, { type: "lalr" });

if (parser.parseTable.hasConflicts) {
  console.log("Conflicts found:");

  // conflictActions[stateId][symbolLabel] = true
  for (const [stateId, symbols] of Object.entries(parser.parseTable.conflictActions)) {
    for (const symbol of Object.keys(symbols)) {
      console.log(`  State ${stateId}, symbol "${symbol}"`);
    }
  }
}

LR Item Sets

Each parse state corresponds to an LR item set. Understanding items helps debug conflicts.

LR Item Notation

An LR item shows how much of a rule has been seen:

Expr -> Expr . "+" Term    // Dot shows current position
        ↑
        We've seen "Expr", expecting "+"

Accessing Item Sets

const [parser, tokenFunc, itemGraph] = newParser(grammar, { type: "lalr" });

// Iterate over states
for (const itemSet of itemGraph.itemSets) {
  console.log(`State ${itemSet.id}:`);

  for (const item of itemSet.items.values()) {
    console.log(`  ${item.debugString}`);
  }
}

Parser Types and Conflicts

Different parser types handle conflicts differently:

TypeConflict Behavior
SLR Uses FOLLOW sets for lookahead; may report false conflicts
LALR More precise lookahead; fewer false conflicts
LR(1) Most precise; no false conflicts, but larger tables

If SLR reports conflicts, try LALR or LR(1) - the conflict may be a false positive.

Example: Debugging a Conflict

Here's a workflow for debugging a shift-reduce conflict:

  1. Open the grammar in the Playground
  2. Look for amber-highlighted cells in the parse table
  3. Hover over the conflicting state to see its LR items
  4. Identify which rules are competing
  5. Restructure the grammar or add an actionResolver
// Example: This grammar has a shift-reduce conflict
Expr -> Expr "+" Expr
      | NUMBER
      ;

// The conflict is in the state after seeing "Expr + Expr":
//   Expr -> Expr "+" Expr .     (reduce?)
//   Expr -> Expr . "+" Expr     (shift "+"?)

// Fix: Use grammar structure for precedence
Expr -> Expr "+" Term | Term ;
Term -> NUMBER ;