Datalog: Rules & Queries
Declare WHAT you want, not HOW to compute it. Rayforce's Datalog layer compiles rules and queries down to the same DAG-based vectorized executor that powers select and update.
What is Datalog?
Datalog is a rule-based declarative query language. You define facts and rules, then ask questions. The engine figures out all possible answers automatically by applying rules until no new facts can be derived.
Unlike SQL, Datalog handles recursive queries naturally — transitive closure, reachability, and graph traversal are first-class operations, not awkward CTEs bolted on as an afterthought.
EAV Triple Storage
Datalog in Rayforce uses Entity-Attribute-Value (EAV) triples as its storage model. Every fact is a triple (entity, attribute, value) stored in a columnar datoms table.
Creating a datoms database
Use (datoms) to create an empty EAV database, then (assert-fact db entity attribute value) to add triples:
;; Create an empty EAV database
(set db (datoms))
;; Assert facts: entity 1 has name, dept, salary
(set db (assert-fact db 1 'name 'Alice))
(set db (assert-fact db 1 'dept 'Engineering))
(set db (assert-fact db 1 'salary 80000))
Each call to assert-fact returns a new database with the triple added. The underlying storage is a three-column table [e, a, v] backed by Rayforce's columnar vectors.
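To make the "returns a new database" behavior concrete, here is a minimal Python model of a three-column EAV table. It is a sketch of the semantics described above, not Rayforce's implementation; `Datoms` and `assert_fact` are hypothetical Python stand-ins for the Rayfall functions of similar name.

```python
from typing import NamedTuple

class Datoms(NamedTuple):
    """Hypothetical columnar EAV table: three parallel columns e, a, v."""
    e: tuple = ()
    a: tuple = ()
    v: tuple = ()

def assert_fact(db, entity, attr, value):
    """Return a new Datoms with one triple appended; the input is unchanged."""
    return Datoms(db.e + (entity,), db.a + (attr,), db.v + (value,))

db0 = Datoms()
db1 = assert_fact(db0, 1, "name", "Alice")
db2 = assert_fact(db1, 1, "salary", 80000)

print(len(db0.e), len(db1.e), len(db2.e))  # 0 1 2
print(list(zip(db2.e, db2.a, db2.v)))      # [(1, 'name', 'Alice'), (1, 'salary', 80000)]
```

Because every call produces a new value, the result has to be rebound, which is why the Rayfall snippets wrap each assert-fact in (set db ...).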
Scanning by attribute
Use (scan-eav db attribute) to query all entities with a given attribute, or (scan-eav db entity attribute) to get a specific value:
;; All salaries: returns a table of [entity, value]
(show (scan-eav db 'salary))
;; Specific entity's salary: returns the scalar value
(println (scan-eav db 3 'salary)) ;; => 90000
Verified output (from scan-eav db 'salary):
All salaries:
┌─────┬───────────────────────────────┐
│ e │ v │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 80000 │
│ 2 │ 60000 │
│ 3 │ 90000 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
Entity 3 salary:
90000
Rules
Rules define derived relations. A rule has a head (the relation being defined) and a body (the conditions that must hold):
(rule (head ?vars...)
;; body clauses — all must be satisfied
(?e :attr ?v)
(?e :other-attr ?w))
Variables start with ?. When the same variable appears in multiple clauses, it acts as a join condition — the values must be equal.
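The join behavior of shared variables can be illustrated with a tiny pattern matcher. The following Python sketch is a naive nested-loop model written for this page (the helper names `match` and `solve` are hypothetical); Rayforce compiles patterns to joins instead of matching triple by triple.

```python
TRIPLES = [
    (1, "name", "Alice"),   (1, "dept", "Engineering"),
    (2, "name", "Bob"),     (2, "dept", "Sales"),
    (3, "name", "Charlie"), (3, "dept", "Engineering"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple; return None on conflict."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            # A variable binds on first use and must agree on every later use.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def solve(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pat in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pat, t, b)) is not None]
    return bindings

rows = solve([("?e", "name", "?n"), ("?e", "dept", "?d")], TRIPLES)
for row in rows:
    print(row["?e"], row["?n"], row["?d"])
```

The second pattern only matches triples whose entity equals the value already bound to ?e, which is exactly an equality join on the entity column.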
Simple rule example
;; Define "employee" as entities with both name and dept
(rule (employee ?e ?n ?d)
(?e :name ?n)
(?e :dept ?d))
;; Query using the rule
(show (query db (find ?n ?d) (where (employee ?e ?n ?d))))
This rule says: "an employee is any entity ?e that has both a :name attribute (bound to ?n) and a :dept attribute (bound to ?d)." The shared variable ?e across the two clauses produces a join on entity ID.
Verified output:
Employees (via rule):
┌─────┬───────────────────────────────┐
│ ?n │ ?d │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 158 │ 160 │
│ 162 │ 163 │
│ 164 │ 160 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
OR semantics with multiple clauses
Define multiple rules with the same head to express disjunction (OR). Each rule contributes its own set of results, and the sets are unioned and deduplicated:
;; Two ways to be "reachable"
(rule (reachable ?x ?y) (?x :edge ?y)) ;; base: direct edge
(rule (reachable ?x ?z) (?x :edge ?y) (reachable ?y ?z)) ;; recursive
Queries
Queries retrieve data from the datoms database using pattern matching. The syntax is:
(query db (find ?vars...) (where clauses...))
The find clause specifies which variables to return. The where clause contains triple patterns and rule invocations that constrain the results.
Simple pattern query
;; Find all entities and their names
(show (query db (find ?e ?n) (where (?e :name ?n))))
Verified output:
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 2 │ 159 │
│ 3 │ 160 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
The ?n column contains symbol intern IDs. Use (sym-name id) to convert them to readable names (see sym-name below).
Join query
When multiple patterns share a variable, Rayforce compiles them into a join:
;; Find name + department (join on entity ?e)
(show (query db (find ?n ?d) (where (?e :name ?n) (?e :dept ?d))))
Verified output:
┌─────┬───────────────────────────────┐
│ ?n │ ?d │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 158 │ 162 │
│ 159 │ 163 │
│ 160 │ 162 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
Constant value patterns
Use a literal value instead of a variable to filter by a specific attribute value:
;; Only entities in the Engineering department
(show (query db (find ?e ?n) (where (?e :name ?n) (?e :dept 'Engineering))))
Verified output (filters to entities 1 and 3):
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 3 │ 160 │
├─────┴───────────────────────────────┤
│ 2 rows (2 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
Wildcard patterns
Use _ to match any value without binding it to a variable:
;; All entities that have any dept attribute
(show (query db (find ?e) (where (?e :dept _))))
Verified output:
┌─────────────────────────────────────┐
│ ?e │
│ i64 │
├─────────────────────────────────────┤
│ 1 │
│ 2 │
│ 3 │
├─────────────────────────────────────┤
│ 3 rows (3 shown) 1 columns (1 shown)│
└─────────────────────────────────────┘
How queries compile to the DAG
Under the hood, each query compiles to Rayforce's DAG execution pipeline:
- Each triple pattern (?e :attr ?v) becomes a ray_scan + ray_filter on the datoms table
- Shared variables across patterns become ray_join operations
- The find clause becomes a final projection selecting the requested columns
- The optimizer applies predicate pushdown, filter reorder, and fusion, the same passes used for select
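The stages listed above can be sketched as three small relational operators. This Python model is illustrative only: `scan_filter`, `hash_join`, and `project` are hypothetical names that mirror the roles of ray_scan + ray_filter, ray_join, and the final projection, and the code makes no claim about how Rayforce implements them.

```python
DATOMS = [
    (1, "name", "Alice"),   (1, "dept", "Engineering"),
    (2, "name", "Bob"),     (2, "dept", "Sales"),
    (3, "name", "Charlie"), (3, "dept", "Engineering"),
]

def scan_filter(datoms, attr, value=None):
    """Scan the table, keeping (e, v) rows for one attribute and optional value."""
    return [(e, v) for (e, a, v) in datoms
            if a == attr and (value is None or v == value)]

def hash_join(left, right):
    """Join two (entity, value) row lists on the entity column."""
    index = {}
    for e, v in right:
        index.setdefault(e, []).append(v)
    return [(e, lv, rv) for e, lv in left for rv in index.get(e, [])]

def project(rows, cols):
    """Keep only the requested column positions."""
    return [tuple(row[c] for c in cols) for row in rows]

# Model of: (find ?e ?n) (where (?e :name ?n) (?e :dept 'Engineering))
names = scan_filter(DATOMS, "name")
engineers = scan_filter(DATOMS, "dept", "Engineering")
joined = hash_join(names, engineers)   # rows of (?e, ?n, dept)
result = project(joined, (0, 1))
print(result)  # [(1, 'Alice'), (3, 'Charlie')]
```

Applying the constant filter before the join, as done here, is the effect predicate pushdown aims for: the join sees two department rows instead of three.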
Negation
Use (not (pattern)) in a where clause to exclude entities matching a pattern. Negation compiles to OP_ANTIJOIN — keeping rows from the positive clauses that have no match in the negated pattern.
;; Mark some employees as managers
(set db (assert-fact db 1 'manager 'true))
(set db (assert-fact db 3 'manager 'true))
;; Find employees who are NOT managers
(show (query db (find ?e ?n)
(where (?e :name ?n)
(not (?e :manager ?m)))))
Verified output (only Bob, entity 2, is not a manager):
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 2 │ 159 │
├─────┴───────────────────────────────┤
│ 1 rows (1 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
The negated pattern must share at least one variable with a positive clause (here ?e) so the antijoin knows which column to match on.
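An antijoin is easy to state in a few lines. The Python sketch below models the non-manager query on plain tuples; the `antijoin` helper is a hypothetical illustration of the operation, not the Rayforce operator itself.

```python
def antijoin(left, right_keys, key=lambda row: row[0]):
    """Keep the rows of left whose join key has no match in right_keys."""
    right = set(right_keys)
    return [row for row in left if key(row) not in right]

names = [(1, "Alice"), (2, "Bob"), (3, "Charlie")]  # rows for (?e :name ?n)
managers = [(1, "true"), (3, "true")]               # rows for (?e :manager ?m)

# The shared variable ?e is the join key; ?m is never returned.
non_managers = antijoin(names, (e for e, _ in managers))
print(non_managers)  # [(2, 'Bob')]
```

Without a shared variable there would be no key to compare, which is why the negated pattern must reuse a variable bound by a positive clause.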
Recursive Rules & Fixpoint
The real power of Datalog is recursive rules. Define a rule that references itself, and the engine automatically computes the transitive closure by iterating until no new facts are produced (the "fixpoint").
Transitive closure example
;; Build a directed graph: 1->2->3->4
(set gdb (datoms))
(set gdb (assert-fact gdb 1 'edge 2))
(set gdb (assert-fact gdb 2 'edge 3))
(set gdb (assert-fact gdb 3 'edge 4))
;; Base case: direct edge
(rule (reachable ?x ?y) (?x :edge ?y))
;; Recursive case: edge + reachability
(rule (reachable ?x ?z) (?x :edge ?y) (reachable ?y ?z))
;; Query: all reachable pairs
(show (query gdb (find ?x ?y) (where (reachable ?x ?y))))
Verified output (6 reachable pairs from 3 edges: direct + transitive):
┌─────┬───────────────────────────────┐
│ ?x │ ?y │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 2 │
│ 1 │ 3 │
│ 2 │ 3 │
│ 2 │ 4 │
│ 3 │ 4 │
│ 1 │ 4 │
├─────┴───────────────────────────────┤
│ 6 rows (6 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
Semi-naive evaluation
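Rayforce uses semi-naive evaluation for fixpoint computation. Naive evaluation re-applies every rule to all known facts in each round, whereas semi-naive evaluation feeds only the newly derived facts back into the rule bodies: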
- Base pass: Apply all non-recursive rule clauses to produce initial facts
- Delta iteration: In each round, only use newly derived facts (the "delta") as input to rule bodies
- Antijoin: Remove facts already known from the delta to keep only truly new derivations
- Terminate: When the delta is empty, the fixpoint has been reached
Each iteration compiles to a fresh DAG and calls ray_execute. The antijoin step uses ray_antijoin to efficiently filter out previously known facts.
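The four steps above can be sketched in Python for the reachable rules used on this page. This is a set-based model of the algorithm under the assumption of a single recursive rule; it does not build DAGs or call ray_execute, and the function name `reachable` is chosen only for the example.

```python
def reachable(edges):
    """Semi-naive transitive closure of a set of (x, y) edges."""
    known = set(edges)   # base pass: reachable(x, y) :- edge(x, y)
    delta = set(edges)   # facts derived in the most recent round
    rounds = 0
    while delta:         # terminate when the delta is empty
        rounds += 1
        # Recursive rule fed only with the delta:
        # reachable(x, z) :- edge(x, y), reachable(y, z)
        derived = {(x, z) for (x, y) in edges for (y2, z) in delta if y == y2}
        delta = derived - known   # antijoin: drop facts already known
        known |= delta
    return known, rounds

pairs, rounds = reachable({(1, 2), (2, 3), (3, 4)})
print(sorted(pairs))  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(rounds)         # 3
```

Round 1 derives (1, 3) and (2, 4), round 2 derives (1, 4), and round 3 derives nothing new, so the loop stops. Because the delta shrinks to the empty set once every pair is known, the loop also terminates on cyclic graphs.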
Stratification
When rules use negation, evaluation order matters. A negated predicate must be fully computed before it can be used in an antijoin. Rayforce handles this automatically.
The engine builds a dependency graph among predicates, detects which ones depend on negated versions of others, and partitions them into strata (layers). Each stratum is evaluated to fixpoint before the next begins.
- Strata are computed via topological sort on the dependency graph
- If a negation cycle is detected (predicate A negates B, and B negates A), the engine rejects the program with an "unstratifiable" error
- No user action is needed: stratification is handled internally during query evaluation
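To illustrate what stratification computes, here is a small Python sketch. Note that it uses iterative stratum-number relaxation, a common textbook formulation, rather than the topological sort mentioned above, and the predicates `node` and `unreachable` are hypothetical additions for the example.

```python
def stratify(deps):
    """deps: list of (head, body_pred, negated). Returns {predicate: stratum}."""
    preds = {p for head, body, _ in deps for p in (head, body)}
    stratum = {p: 0 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, body, negated in deps:
            # A head sits at least as high as a positive dependency
            # and strictly higher than a negated one.
            need = stratum[body] + (1 if negated else 0)
            if stratum[head] < need:
                if need >= len(preds):
                    # Strata can only grow without bound on a negation cycle.
                    raise ValueError("unstratifiable: negation cycle through " + head)
                stratum[head] = need
                changed = True
    return stratum

deps = [
    ("reachable", "edge", False),
    ("reachable", "reachable", False),   # positive recursion is allowed
    ("unreachable", "node", False),
    ("unreachable", "reachable", True),  # negation forces a later stratum
]
strata = stratify(deps)
print(sorted(strata.items()))
# [('edge', 0), ('node', 0), ('reachable', 0), ('unreachable', 1)]
```

Calling `stratify([("a", "b", True), ("b", "a", True)])` raises ValueError, the counterpart of the "unstratifiable" error described above.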
Pull Queries
Pull queries provide entity-centric retrieval — given an entity ID, return all (or selected) attributes as a dictionary:
;; Pull all attributes for entity 1
(println (pull db 1))
;; Pull only specific attributes
(println (pull db 2 [name salary]))
Verified output:
Pull all attributes for entity 1:
['name 158 'dept 160 'salary 80000]
Pull name + salary for entity 2:
['name 162 'salary 60000]
Pull returns a dictionary (key-value list). Symbol attribute values appear as intern IDs — use (sym-name id) to convert them to readable names.
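Conceptually, a pull is a filter on the entity column followed by collecting attribute and value pairs. The Python sketch below models that idea; it assumes one value per attribute and uses readable strings where Rayforce would return intern IDs. The `pull` function here is a hypothetical stand-in, not the Rayfall builtin.

```python
TRIPLES = [
    (1, "name", "Alice"), (1, "dept", "Engineering"), (1, "salary", 80000),
    (2, "name", "Bob"),   (2, "dept", "Sales"),       (2, "salary", 60000),
]

def pull(triples, entity, attrs=None):
    """Collect one entity's attributes into a dict, optionally restricted to attrs."""
    return {a: v for (e, a, v) in triples
            if e == entity and (attrs is None or a in attrs)}

print(pull(TRIPLES, 1))
# {'name': 'Alice', 'dept': 'Engineering', 'salary': 80000}
print(pull(TRIPLES, 2, {"name", "salary"}))
# {'name': 'Bob', 'salary': 60000}
```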
sym-name
EAV queries and pull results store symbol values as integer intern IDs for efficient joining and comparison. Use (sym-name id) to convert an intern ID back to a readable symbol atom.
;; Get the intern ID from a pull result
(set p (pull db 1))
(set name-id (get p 1)) ;; value at index 1 in the dict
(println name-id) ;; => 158 (intern ID)
(println (sym-name name-id)) ;; => 'Alice
Verified output:
['name 158 'dept 160]
158
'Alice
sym-name accepts both scalar i64 values and i64 vectors. On a vector, it returns a SYM vector with each ID resolved. This is useful for converting entire columns of intern IDs to readable names.
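The idea behind intern IDs can be sketched with a small symbol table. This Python model is an assumption-laden illustration: the `SymbolTable` class is hypothetical, and its IDs start at 0, whereas Rayforce assigns its own runtime IDs (such as 158 in the output above).

```python
class SymbolTable:
    """Hypothetical intern table: each distinct symbol gets one integer ID."""

    def __init__(self):
        self._ids = {}     # name -> ID
        self._names = []   # ID -> name

    def intern(self, name):
        """Return the ID for name, assigning the next free ID on first sight."""
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def sym_name(self, ids):
        """Resolve a single ID or a list of IDs back to names."""
        if isinstance(ids, int):
            return self._names[ids]
        return [self._names[i] for i in ids]

syms = SymbolTable()
alice = syms.intern("Alice")
eng = syms.intern("Engineering")

print(alice == syms.intern("Alice"))        # True
print(syms.sym_name(alice))                 # Alice
print(syms.sym_name([alice, eng, alice]))   # ['Alice', 'Engineering', 'Alice']
```

Comparing two integers is cheaper than comparing two strings, which is the motivation for joining on IDs and resolving names only for display.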
Programmatic API
For advanced use cases, Rayforce provides a programmatic Datalog API that gives fine-grained control over the evaluation process:
| Function | Description |
|---|---|
| (dl-program) | Create a new empty Datalog program |
| (dl-add-edb prog 'name table arity) | Register an extensional database (base facts) table |
| (dl-stratify prog) | Compute evaluation strata for the program |
| (dl-eval prog) | Evaluate the program to fixpoint |
| (dl-query prog 'pred) | Retrieve the result table for a predicate |
| (dl-provenance prog 'pred) | Get provenance information for derived facts |
Example
;; Create a program and register base facts
(set prog (dl-program))
(set edges (table ['from 'to] (list [1 2 3] [2 3 4])))
(dl-add-edb prog 'edge edges 2)
;; Stratify and evaluate
(dl-stratify prog)
(dl-eval prog)
;; Query results
(show (dl-query prog 'edge))
Verified output:
┌──────────┬──────────────────────────┐
│ edge__c0 │ edge__c1 │
│ i64 │ i64 │
├──────────┼──────────────────────────┤
│ 1 │ 2 │
│ 2 │ 3 │
│ 3 │ 4 │
├──────────┴──────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
retract-fact
Use (retract-fact db entity attr value) to remove a triple from the datoms database:
;; Remove Bob's name fact
(set db (retract-fact db 2 'name 'Bob))
;; Verify: only Alice and Charlie remain
(show (query db (find ?e ?n) (where (?e :name ?n))))
Verified output (2 rows after retraction):
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 3 │ 164 │
├─────┴───────────────────────────────┤
│ 2 rows (2 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
How Datalog Maps to Rayforce
Every Datalog concept compiles down to existing Rayforce DAG operations. The Datalog layer is purely a compilation frontend — the engine does all the heavy lifting.
| Datalog Concept | Rayforce Operation | Description |
|---|---|---|
| Triple pattern (?e :attr ?v) | ray_scan + ray_filter | Indexed column scan filtered by attribute |
| Shared variable (join) | ray_join | Hash join on shared variable column |
| Constant value (?e :dept 'Engineering) | ray_filter | Equality filter on the value column |
| Wildcard (?e :dept _) | ray_scan | Attribute scan without value constraint |
| Negation (not ...) | ray_antijoin | Keep rows not present in another result |
| OR rules (same head) | union-all + distinct | Combine results and deduplicate |
| Fixpoint (recursion) | Loop with ray_execute | Iterate until delta is empty |
| Stratification | Topological sort | Compute evaluation order from dependency graph |
| (find ?a ?n) | Projection | Select output columns from the result |
| (pull db entity) | ray_scan on EAV index | Entity attribute scan and collection |
| Rule invocation | Subgraph expansion | Inline the rule's compiled DAG as a subplan |
Because Datalog compiles to the same DAG as select/update, queries benefit from all optimizer passes: predicate pushdown, filter reorder, fusion, and morsel-parallel execution with SIMD.
Complete Example
The following example demonstrates the full Datalog workflow: EAV storage, queries with constants and wildcards, rules, negation, transitive closure, pull queries, sym-name, and retract-fact. This is the content of examples/rfl/datalog.rfl.
Source
; Datalog Example
; Demonstrates: datoms, assert-fact, retract-fact, rules, queries,
; negation, fixpoint (transitive closure), pull, sym-name,
; constant values, and wildcard patterns
; -- 1. EAV storage --
(set db (datoms))
(set db (assert-fact db 1 'name 'Alice))
(set db (assert-fact db 1 'dept 'Engineering))
(set db (assert-fact db 1 'salary 80000))
(set db (assert-fact db 2 'name 'Bob))
(set db (assert-fact db 2 'dept 'Sales))
(set db (assert-fact db 2 'salary 60000))
(set db (assert-fact db 3 'name 'Charlie))
(set db (assert-fact db 3 'dept 'Engineering))
(set db (assert-fact db 3 'salary 90000))
; -- 2. Simple query --
(println "--- All names ---")
(set r (query db (find ?e ?n) (where (?e :name ?n))))
(show r)
; -- 3. Join query --
(println "--- Name + Department ---")
(set r (query db (find ?n ?d) (where (?e :name ?n) (?e :dept ?d))))
(show r)
; -- 4. Constant value pattern --
(println "--- Engineers only (constant pattern) ---")
(set r (query db (find ?e ?n) (where (?e :name ?n) (?e :dept 'Engineering))))
(show r)
; -- 5. Wildcard pattern --
(println "--- Entities with any dept (wildcard _) ---")
(set r (query db (find ?e) (where (?e :dept _))))
(show r)
; -- 6. Rule: define "employee" relation --
(rule (employee ?e ?n ?d)
(?e :name ?n)
(?e :dept ?d))
(println "--- Employees (via rule) ---")
(set r (query db (find ?n ?d) (where (employee ?e ?n ?d))))
(show r)
; -- 7. Negation --
(set db (assert-fact db 1 'manager 'true))
(set db (assert-fact db 3 'manager 'true))
(println "--- Non-managers (negation) ---")
(set r (query db (find ?e ?n) (where (?e :name ?n) (not (?e :manager ?m)))))
(show r)
; -- 8. Transitive closure (recursive rules + fixpoint) --
(set gdb (datoms))
(set gdb (assert-fact gdb 1 'edge 2))
(set gdb (assert-fact gdb 2 'edge 3))
(set gdb (assert-fact gdb 3 'edge 4))
(rule (reachable ?x ?y) (?x :edge ?y))
(rule (reachable ?x ?z) (?x :edge ?y) (reachable ?y ?z))
(println "--- Reachable pairs (transitive closure) ---")
(set r (query gdb (find ?x ?y) (where (reachable ?x ?y))))
(show r)
; -- 9. Pull queries --
(println "--- Pull entity 1 (all attributes) ---")
(println (pull db 1))
(println "--- Pull entity 2 (name + salary only) ---")
(println (pull db 2 [name salary]))
; -- 10. sym-name: readable output --
(println "--- sym-name: convert intern IDs ---")
(set p (pull db 1))
(set name-id (get p 1))
(println name-id)
(println (sym-name name-id))
; -- 11. retract-fact --
(println "--- Before retract ---")
(set r (query db (find ?e ?n) (where (?e :name ?n))))
(show r)
(set db (retract-fact db 2 'name 'Bob))
(println "--- After retract (Bob removed) ---")
(set r (query db (find ?e ?n) (where (?e :name ?n))))
(show r)
; -- 12. Scan-eav: low-level attribute lookup --
(println "--- All salaries (scan-eav) ---")
(show (scan-eav db 'salary))
(println "--- Entity 3 salary ---")
(println (scan-eav db 3 'salary))
Output
Running ./rayforce examples/rfl/datalog.rfl produces:
--- All names ---
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 2 │ 162 │
│ 3 │ 164 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Name + Department ---
┌─────┬───────────────────────────────┐
│ ?n │ ?d │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 158 │ 160 │
│ 162 │ 163 │
│ 164 │ 160 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Engineers only (constant pattern) ---
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 3 │ 164 │
├─────┴───────────────────────────────┤
│ 2 rows (2 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Entities with any dept (wildcard _) ---
┌─────────────────────────────────────┐
│ ?e │
│ i64 │
├─────────────────────────────────────┤
│ 1 │
│ 2 │
│ 3 │
├─────────────────────────────────────┤
│ 3 rows (3 shown) 1 columns (1 shown)│
└─────────────────────────────────────┘
--- Employees (via rule) ---
┌─────┬───────────────────────────────┐
│ ?n │ ?d │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 158 │ 160 │
│ 162 │ 163 │
│ 164 │ 160 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Non-managers (negation) ---
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 2 │ 162 │
├─────┴───────────────────────────────┤
│ 1 rows (1 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Reachable pairs (transitive closure) ---
┌─────┬───────────────────────────────┐
│ ?x │ ?y │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 2 │
│ 1 │ 3 │
│ 2 │ 3 │
│ 2 │ 4 │
│ 3 │ 4 │
│ 1 │ 4 │
├─────┴───────────────────────────────┤
│ 6 rows (6 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Pull entity 1 (all attributes) ---
['name 158 'dept 160 'salary 80000 'manager 172]
--- Pull entity 2 (name + salary only) ---
['name 162 'salary 60000]
--- sym-name: convert intern IDs ---
158
'Alice
--- Before retract ---
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 2 │ 162 │
│ 3 │ 164 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- After retract (Bob removed) ---
┌─────┬───────────────────────────────┐
│ ?e │ ?n │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 158 │
│ 3 │ 164 │
├─────┴───────────────────────────────┤
│ 2 rows (2 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- All salaries (scan-eav) ---
┌─────┬───────────────────────────────┐
│ e │ v │
│ i64 │ i64 │
├─────┼───────────────────────────────┤
│ 1 │ 80000 │
│ 2 │ 60000 │
│ 3 │ 90000 │
├─────┴───────────────────────────────┤
│ 3 rows (3 shown) 2 columns (2 shown)│
└─────────────────────────────────────┘
--- Entity 3 salary ---
90000
Symbol intern IDs (such as 158 for 'Alice) are assigned at runtime and may differ between sessions. The row counts and structure are stable. Use (sym-name id) to convert any intern ID back to its readable symbol name.