Semantic Actions

Building ASTs and evaluating expressions with rule handlers

Semantic actions allow you to build abstract syntax trees (ASTs), evaluate expressions, or perform custom logic as the parser reduces rules.

Overview

There are two ways to attach actions to grammar rules:

  1. Grammar-level actions - Declared in the grammar DSL with { handler }
  2. Context callbacks - Provided via ParserContext when calling parse()

Grammar-Level Actions

Attach actions directly to productions in your grammar:

Expr -> Expr "+" Term { add }
      | Expr "-" Term { subtract }
      | Term { $1 }
      ;

Child References

Use { $N } to return the value of the Nth child (1-indexed):

// Return the expression inside parentheses
Factor -> "(" Expr ")" { $2 } ;

// Pass through the first child's value
Term -> Factor { $1 } ;
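
As a rough mental model, { $N } behaves like a named handler that returns the value of children[N - 1]. The sketch below illustrates only that 1-indexed to 0-indexed mapping. MiniNode and childRef are hypothetical stand-ins written for this example, not part of Galore's API.

```typescript
// Hypothetical stand-in for PTNode; only the field used here.
interface MiniNode {
  value: unknown;
}

// Hypothetical helper: builds a handler equivalent to the `{ $N }` shorthand.
// N is 1-indexed in the grammar, so it maps to children[N - 1].
function childRef(n: number) {
  return (_rule: unknown, _parent: MiniNode, ...children: MiniNode[]): unknown => {
    const child = children[n - 1];
    if (n < 1 || child === undefined) {
      throw new RangeError(`$${n} is out of range for ${children.length} children`);
    }
    return child.value;
  };
}

// Factor -> "(" Expr ")" { $2 }
const parenChildren: MiniNode[] = [{ value: "(" }, { value: 42 }, { value: ")" }];
const parenValue = childRef(2)(null, { value: undefined }, ...parenChildren);
console.log(parenValue); // 42
```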

Named Handlers

Use { handlerName } to call a function you provide:

Expr -> Expr "+" Term { add } ;
Expr -> NUMBER { num } ;

Provide the handlers via ruleHandlers:

const result = parser.parse("1 + 2", {
  ruleHandlers: {
    add: (rule, parent, ...children) => {
      return children[0].value + children[2].value;
    },
    num: (rule, parent, ...children) => {
      return parseInt(children[0].value, 10);
    }
  }
});

RuleActionHandler Signature

Named handlers receive the following arguments:

type RuleActionHandler = (
  rule: Rule,           // The grammar rule being reduced
  parent: PTNode,       // The node being created for this reduction
  ...children: PTNode[] // All child nodes from the production
) => any;               // Return value becomes parent.value
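
To make the "return value becomes parent.value" comment concrete, here is a minimal sketch of what a reduction step might do with a handler. MiniRule, MiniNode, MiniHandler, and applyHandler are hypothetical stand-ins, not Galore's internals. The only behavior taken from the signature above is that the handler's return value ends up on parent.value.

```typescript
// Hypothetical stand-ins; Galore's real Rule and PTNode carry more fields.
interface MiniRule { label: string }
interface MiniNode { value: unknown; children: MiniNode[] }
type MiniHandler = (rule: MiniRule, parent: MiniNode, ...children: MiniNode[]) => unknown;

// Hypothetical reduction step: create the parent node, call the handler,
// and store whatever it returns on the parent's value.
function applyHandler(handler: MiniHandler, rule: MiniRule, children: MiniNode[]): MiniNode {
  const parentNode: MiniNode = { value: undefined, children };
  parentNode.value = handler(rule, parentNode, ...children);
  return parentNode;
}

const leaf = (value: unknown): MiniNode => ({ value, children: [] });

// Same shape as the `add` handler shown earlier: children[1] is the "+" token.
const addHandler: MiniHandler = (_rule, _parent, ...children) =>
  Number(children[0].value) + Number(children[2].value);

const sum = applyHandler(addHandler, { label: "Expr" }, [leaf(1), leaf("+"), leaf(2)]);
console.log(sum.value); // 3
```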

Accessing Child Values

const handlers = {
  binop: (rule, parent, ...children) => {
    const left = children[0].value;    // First child's semantic value
    const op = children[1].value;      // Operator token value
    const right = children[2].value;   // Third child's semantic value

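    switch (op) {
      case '+': return left + right;
      case '-': return left - right;
      case '*': return left * right;
      case '/': return left / right;
      // Fail loudly instead of silently returning undefined
      default: throw new Error(`Unknown operator: ${op}`);
    }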
  }
};

PTNode Structure

Parse tree nodes (PTNode) contain:

Property   Type            Description
sym        Sym             The grammar symbol (terminal or non-terminal)
value      any             Semantic value (token value or handler result)
children   PTNode[]        Child nodes (empty for terminals)
parent     PTNode | null   Parent node reference
id         number          Unique node identifier

PTNode Methods

// Get child at index (negative indexes from end)
node.childAt(0)      // First child
node.childAt(-1)     // Last child

// Check if terminal
node.isTerminal      // true for leaf nodes

// Debug output
node.debugValue()    // Returns tree structure for debugging
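
The negative-index behavior of childAt can be pictured with a small stand-in class. This is a sketch under assumptions, not Galore's PTNode: the MiniNode class is hypothetical, it treats any node without children as a terminal, and it returns undefined for an out-of-range index, which the real library may handle differently.

```typescript
// Hypothetical stand-in for PTNode, covering only childAt and isTerminal.
class MiniNode {
  readonly label: string;
  readonly children: MiniNode[];

  constructor(label: string, children: MiniNode[] = []) {
    this.label = label;
    this.children = children;
  }

  // Assumption: a node with no children counts as a terminal (leaf).
  get isTerminal(): boolean {
    return this.children.length === 0;
  }

  // Negative indexes count back from the end, so -1 is the last child.
  childAt(index: number): MiniNode | undefined {
    const i = index < 0 ? this.children.length + index : index;
    return this.children[i];
  }
}

const factorNode = new MiniNode("Factor", [
  new MiniNode("("),
  new MiniNode("Expr", [new MiniNode("NUMBER")]),
  new MiniNode(")"),
]);

console.log(factorNode.childAt(0)?.label);  // (
console.log(factorNode.childAt(-1)?.label); // )
console.log(factorNode.isTerminal);         // false
```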

ParserContext Callbacks

The ParserContext interface provides hooks for more advanced control:

interface ParserContext {
  // Named handlers for grammar actions
  ruleHandlers: { [name: string]: RuleActionHandler };

  // Called before each child is added to parent
  beforeAddingChildNode?: (parent: PTNode, child: PTNode) => PTNode[];

  // Called after a rule is reduced (fallback if no action)
  onReduction?: (node: PTNode, rule: Rule) => PTNode;

  // Called for each token before parsing
  onNextToken?: (token: Token) => Token | null;

  // Resolve conflicts between multiple actions
  actionResolver?: ActionResolverCallback;

  // Handle tokenizer errors
  onTokenError?: TokenErrorCallback;

  // Whether to build the parse tree (default: true)
  buildParseTree?: boolean;

  // Copy single child's value to parent (default: true)
  copySingleChild?: boolean;
}

beforeAddingChildNode

Filter or transform nodes before they're added to the parse tree:

const result = parser.parse(input, {
  ruleHandlers: {},
  beforeAddingChildNode: (parent, child) => {
    // Filter out whitespace tokens
    if (child.sym.label === 'WS') {
      return [];  // Don't add this child
    }

    // Flatten single-child nodes
    if (child.children.length === 1) {
      return [child.children[0]];
    }

    return [child];  // Add as-is
  }
});
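
To see what the callback above does to a node's children, it can be exercised outside the parser. The sketch below reuses the same filtering logic against hypothetical MiniNode objects and a hypothetical attachChildren driver. It assumes the parser appends every node returned by the callback, in order; that is an assumption for illustration, not documented behavior.

```typescript
// Hypothetical stand-ins for Sym and PTNode; only the fields the hook reads.
interface MiniSym { label: string }
interface MiniNode { sym: MiniSym; children: MiniNode[] }
type ChildHook = (parent: MiniNode, child: MiniNode) => MiniNode[];

const mk = (label: string, children: MiniNode[] = []): MiniNode =>
  ({ sym: { label }, children });

// Same logic as the beforeAddingChildNode example above.
const hook: ChildHook = (_parent, child) => {
  if (child.sym.label === "WS") return [];                     // drop whitespace
  if (child.children.length === 1) return [child.children[0]]; // flatten
  return [child];                                              // keep as-is
};

// Hypothetical driver: appends whatever the hook returns for each candidate.
function attachChildren(parentNode: MiniNode, candidates: MiniNode[], h: ChildHook): MiniNode {
  for (const candidate of candidates) {
    parentNode.children.push(...h(parentNode, candidate));
  }
  return parentNode;
}

const exprNode = attachChildren(mk("Expr"), [
  mk("Term", [mk("NUMBER")]), // flattened to NUMBER
  mk("WS"),                   // dropped
  mk("PLUS"),
  mk("WS"),                   // dropped
  mk("Term", [mk("NUMBER")]), // flattened to NUMBER
], hook);

console.log(exprNode.children.map((c) => c.sym.label).join(" ")); // NUMBER PLUS NUMBER
```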

onReduction

Called after a reduction as the fallback when the rule has no explicit action:

const result = parser.parse(input, {
  ruleHandlers: {},
  onReduction: (node, rule) => {
    console.log(`Reduced: ${rule.nt.label} -> ${rule.rhs.syms.map(s => s.label).join(' ')}`);

    // Build AST nodes
    if (rule.nt.label === 'BinExpr') {
      node.value = {
        type: 'BinaryExpression',
        left: node.children[0].value,
        operator: node.children[1].value,
        right: node.children[2].value
      };
    }

    return node;
  }
});

onNextToken

Process or filter tokens as they're read:

const result = parser.parse(input, {
  ruleHandlers: {},
  onNextToken: (token) => {
    // Convert string token values to numbers
    if (token.tag === 'NUMBER') {
      token.value = parseFloat(token.value);
    }

    // Filter out comments (return null to skip)
    if (token.tag === 'COMMENT') {
      return null;
    }

    return token;
  }
});
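
The effect of the callback on a token stream can be sketched without a tokenizer. MiniToken and applyTokenHook below are hypothetical stand-ins. The only behavior taken from the text above is that returning null skips the token and returning a token passes it on.

```typescript
// Hypothetical stand-in for Token; only the fields the hook touches.
interface MiniToken { tag: string; value: string | number }
type TokenHook = (token: MiniToken) => MiniToken | null;

// Same logic as the onNextToken example above.
const tokenHook: TokenHook = (token) => {
  if (token.tag === "NUMBER") {
    token.value = parseFloat(String(token.value));
  }
  if (token.tag === "COMMENT") {
    return null; // skip this token
  }
  return token;
};

// Hypothetical driver: feeds each token through the hook and keeps non-null results.
function applyTokenHook(tokens: MiniToken[], h: TokenHook): MiniToken[] {
  const kept: MiniToken[] = [];
  for (const t of tokens) {
    const result = h({ ...t }); // copy so the input array is left untouched
    if (result !== null) kept.push(result);
  }
  return kept;
}

const rawTokens: MiniToken[] = [
  { tag: "NUMBER", value: "3.5" },
  { tag: "COMMENT", value: "// ignored" },
  { tag: "PLUS", value: "+" },
  { tag: "NUMBER", value: "2" },
];

const seenByParser = applyTokenHook(rawTokens, tokenHook);
console.log(seenByParser.map((t) => t.tag).join(" ")); // NUMBER PLUS NUMBER
```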

Value Propagation

By default, Galore automatically propagates values up the tree:

  1. Terminal nodes - Value is the token's matched text
  2. Single-child rules - Child's value is copied to parent (if copySingleChild: true)
  3. Explicit actions - Handler's return value becomes node.value

// With copySingleChild: true (default)
// The value bubbles up automatically through single-child productions

Factor -> NUMBER ;  // Factor.value = NUMBER.value
Term -> Factor ;    // Term.value = Factor.value
Expr -> Term ;      // Expr.value = Term.value
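
The chain above can be simulated in a few lines. MiniNode and reduceRule are hypothetical stand-ins for illustration, and leaving the value undefined when copySingleChild is off is an assumption of this sketch, not a statement about Galore's behavior.

```typescript
// Hypothetical stand-in for PTNode.
interface MiniNode { label: string; value: unknown; children: MiniNode[] }

// Hypothetical reduction without an explicit action: with copySingleChild on,
// a single child's value is copied to the parent; otherwise it stays undefined.
function reduceRule(label: string, children: MiniNode[], copySingleChild = true): MiniNode {
  const parentNode: MiniNode = { label, value: undefined, children };
  if (copySingleChild && children.length === 1) {
    parentNode.value = children[0].value;
  }
  return parentNode;
}

// Terminal node: its value is the matched text.
const numberLeaf: MiniNode = { label: "NUMBER", value: "7", children: [] };

const factorNode = reduceRule("Factor", [numberLeaf]); // Factor -> NUMBER
const termNode = reduceRule("Term", [factorNode]);     // Term -> Factor
const exprNode = reduceRule("Expr", [termNode]);       // Expr -> Term

console.log(exprNode.value); // 7 (the token's text, still a string)
```

Note that the propagated value is still the token text. Converting it to a number needs an explicit handler such as num, or an onNextToken callback.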

Complete Example

import { newParser } from "galore";

const grammar = `
  %token NUMBER /[0-9]+/
  %skip /[ \\t\\n]+/

  Expr -> Expr "+" Term { add }
        | Expr "-" Term { sub }
        | Term { $1 }
        ;

  Term -> Term "*" Factor { mul }
        | Term "/" Factor { div }
        | Factor { $1 }
        ;

  Factor -> "(" Expr ")" { $2 }
          | NUMBER { num }
          ;
`;

const [parser] = newParser(grammar, { type: "lalr" });

const result = parser.parse("2 + 3 * 4", {
  ruleHandlers: {
    num: (rule, parent, ...children) => parseInt(children[0].value, 10),
    add: (rule, parent, ...children) => children[0].value + children[2].value,
    sub: (rule, parent, ...children) => children[0].value - children[2].value,
    mul: (rule, parent, ...children) => children[0].value * children[2].value,
    div: (rule, parent, ...children) => children[0].value / children[2].value,
  }
});

console.log(result.value);  // 14  (2 + (3 * 4))

Building ASTs

Use handlers to build a proper AST instead of a concrete parse tree:

const handlers = {
  binop: (rule, parent, ...children) => ({
    type: 'BinaryExpression',
    operator: children[1].value,
    left: children[0].value,
    right: children[2].value
  }),

  num: (rule, parent, ...children) => ({
    type: 'NumericLiteral',
    value: parseInt(children[0].value, 10)
  }),

  ident: (rule, parent, ...children) => ({
    type: 'Identifier',
    name: children[0].value
  })
};

// Parsing "x + 1" produces:
// {
//   type: 'BinaryExpression',
//   operator: '+',
//   left: { type: 'Identifier', name: 'x' },
//   right: { type: 'NumericLiteral', value: 1 }
// }
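
Once the handlers return plain objects like these, later passes can work on the AST without touching the parser. As one possible follow-up, the sketch below evaluates such a tree. The AstNode type and the evaluate function are hypothetical names introduced here; they mirror the three node shapes produced above and assume the four arithmetic operators used elsewhere on this page.

```typescript
// Node shapes matching the objects returned by binop, num, and ident above.
type AstNode =
  | { type: "BinaryExpression"; operator: string; left: AstNode; right: AstNode }
  | { type: "NumericLiteral"; value: number }
  | { type: "Identifier"; name: string };

// Hypothetical evaluator: walks the AST, looking identifiers up in `env`.
function evaluate(ast: AstNode, env: Record<string, number>): number {
  switch (ast.type) {
    case "NumericLiteral":
      return ast.value;
    case "Identifier": {
      const bound = env[ast.name];
      if (bound === undefined) {
        throw new ReferenceError(`Unbound identifier: ${ast.name}`);
      }
      return bound;
    }
    case "BinaryExpression": {
      const left = evaluate(ast.left, env);
      const right = evaluate(ast.right, env);
      switch (ast.operator) {
        case "+": return left + right;
        case "-": return left - right;
        case "*": return left * right;
        case "/": return left / right;
        default: throw new Error(`Unknown operator: ${ast.operator}`);
      }
    }
  }
}

// The AST shown above for "x + 1".
const xPlusOne: AstNode = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "x" },
  right: { type: "NumericLiteral", value: 1 },
};

console.log(evaluate(xPlusOne, { x: 41 })); // 42
```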