Parse Tables
Understanding and debugging LR parse tables
Parse Tables
Parse tables are the core data structure used by LR parsers. They encode the state machine that drives the parsing process.
Table Structure
An LR parse table consists of two parts:
ACTION Table
Maps (state, terminal) pairs to actions:
| Action | Notation | Meaning |
|---|---|---|
| Shift | S5 |
Push terminal, go to state 5 |
| Reduce | R3 |
Reduce by rule 3 |
| Accept | Acc |
Parsing complete (success) |
| Error | (empty) | No action = syntax error |
GOTO Table
Maps (state, non-terminal) pairs to states. After a reduction creates a non-terminal, GOTO tells the parser which state to enter.
Viewing Parse Tables
Galore provides utilities to visualize parse tables:
import { newParser, Printers } from "galore";
const [parser, tokenFunc, itemGraph] = newParser(grammar, { type: "lalr" });
// Get HTML representation with LR item hints
const html = Printers.parseTableToHtml(parser.parseTable, { itemGraph });
// Get text representation
const text = Printers.parseTableToText(parser.parseTable);
Using the Playground
The Interactive Playground shows parse tables with:
- LR item hints - Each state shows its first LR item (e.g., "0: $accept -> . Program")
- Full item sets on hover - See all items in a state by hovering
- Conflict highlighting - Cells with conflicts appear in amber
Understanding Conflicts
A conflict occurs when a parse table cell has multiple actions. This means the grammar is ambiguous - the parser can't decide what to do.
Shift-Reduce Conflict
The parser can't decide whether to shift the next token or reduce the current production.
Classic Example: Dangling Else
Stmt -> "if" Expr "then" Stmt
| "if" Expr "then" Stmt "else" Stmt
| OTHER
;
When parsing if E then if E then S else S, after seeing if E then if E then S and looking at else:
- Shift: Attach
elseto innerif(most languages do this) - Reduce: Complete inner
if, attachelseto outerif
Reduce-Reduce Conflict
The parser can't decide which rule to use for reduction.
Example: Ambiguous Expression
Expr -> Expr "+" Expr
| Expr "*" Expr
| NUMBER
;
When parsing 1 + 2 * 3, after 1 + 2 with * lookahead, should we:
- Reduce
1 + 2to Expr (then multiply by 3)? - Shift
*(then add 1 to 2*3)?
Resolving Conflicts
1. Grammar Restructuring
The cleanest solution is to rewrite the grammar to eliminate ambiguity:
// Ambiguous
Expr -> Expr "+" Expr | Expr "*" Expr | NUMBER ;
// Unambiguous (precedence via grammar structure)
Expr -> Expr "+" Term | Term ;
Term -> Term "*" Factor | Factor ;
Factor -> NUMBER ;
This encodes precedence directly: multiplication binds tighter because it's resolved at a "deeper" level.
2. Left Factoring
When multiple rules share a common prefix, factor it out:
// Conflict-prone
Stmt -> "if" Expr "then" Stmt
| "if" Expr "then" Stmt "else" Stmt
;
// Left-factored
Stmt -> "if" Expr "then" Stmt ElseOpt ;
ElseOpt -> "else" Stmt | ;
3. Precedence Directives (Not Yet Implemented)
Yacc/Bison-style %left, %right, %nonassoc directives are not yet implemented in Galore's DSL. Use grammar restructuring instead.
4. Runtime Conflict Resolution
Handle conflicts at parse time with an actionResolver:
parser.parse(input, {
ruleHandlers: {},
actionResolver: (actions, stack, tokenbuffer) => {
// Prefer shift over reduce (common default)
const shift = actions.find(a => a.tag === LRActionType.SHIFT);
if (shift) return shift;
// For reduce-reduce, prefer longer rule
const reduces = actions.filter(a => a.tag === LRActionType.REDUCE);
if (reduces.length > 1) {
reduces.sort((a, b) => b.rule!.rhs.length - a.rule!.rhs.length);
}
return reduces[0] || actions[0];
}
});
Detecting Conflicts
Check if a grammar has conflicts programmatically:
const [parser, tokenFunc, itemGraph] = newParser(grammar, { type: "lalr" });
if (parser.parseTable.hasConflicts) {
console.log("Conflicts found:");
// conflictActions[stateId][symbolLabel] = true
for (const [stateId, symbols] of Object.entries(parser.parseTable.conflictActions)) {
for (const symbol of Object.keys(symbols)) {
console.log(` State ${stateId}, symbol "${symbol}"`);
}
}
}
LR Item Sets
Each parse state corresponds to an LR item set. Understanding items helps debug conflicts.
LR Item Notation
An LR item shows how much of a rule has been seen:
Expr -> Expr . "+" Term // Dot shows current position
↑
We've seen "Expr", expecting "+"
Accessing Item Sets
const [parser, tokenFunc, itemGraph] = newParser(grammar, { type: "lalr" });
// Iterate over states
for (const itemSet of itemGraph.itemSets) {
console.log(`State ${itemSet.id}:`);
for (const item of itemSet.items.values()) {
console.log(` ${item.debugString}`);
}
}
Parser Types and Conflicts
Different parser types handle conflicts differently:
| Type | Conflict Behavior |
|---|---|
| SLR | Uses FOLLOW sets for lookahead; may report false conflicts |
| LALR | More precise lookahead; fewer false conflicts |
| LR(1) | Most precise; no false conflicts, but larger tables |
If SLR reports conflicts, try LALR or LR(1) - the conflict may be a false positive.
Example: Debugging a Conflict
Here's a workflow for debugging a shift-reduce conflict:
- Open the grammar in the Playground
- Look for amber-highlighted cells in the parse table
- Hover over the conflicting state to see its LR items
- Identify which rules are competing
- Restructure the grammar or add an
actionResolver
// Example: This grammar has a shift-reduce conflict
Expr -> Expr "+" Expr
| NUMBER
;
// The conflict is in the state after seeing "Expr + Expr":
// Expr -> Expr "+" Expr . (reduce?)
// Expr -> Expr . "+" Expr (shift "+"?)
// Fix: Use grammar structure for precedence
Expr -> Expr "+" Term | Term ;
Term -> NUMBER ;