rusty-razor

Razor is a first-order model-finder. Given an input first-order theory, corresponding to the design specification of a system, written in Razor's input syntax, Razor constructs a set of models as examples of the system's execution. The output set of models may contain examples that are consistent with the user's expectation and support the system's design. They may also contain surprising examples of unexpected behavior, which reveal potential logical flaws in the specification. Razor produces no models if the input theory is logically inconsistent.

==TODO: point to examples==

As a first-order model-finder, Razor is comparable to tools like Kodkod, MACE, and Paradox. Kodkod, is the underlying model-finding engine of Alloy Analyzer, a modeling tool that has gained increasing recognition in the software industry, thanks to its intuitive language and well-crafted analysis tool. Inspired by Kodkod and Alloy, the goal of Razor is to make "lightweight formal methods" accessible to the general software development community, serving as a model-finding engine for domain specific analysis tools.

Note: First-order model-finding is often confused with temporal model-checking, also known as property-checking. Both model-finding and model-checking are reasoning techniques that exhaustively verify properties of a system in a given search space; however, model-checking is the practice of verifying temporal properties in a finite-state model of the system. In contrast, model-finding refers to finding examples of the system's behavior by constructing models for its corresponding first-order specification.

It's a well-known result that first-order model-finding is undecidable (see Gödel's incompleteness theorem). As a consequence, any practical model-finding algorithm must resort to some notion of bound on the search space of models to guarantee termination. Conventional model-finders, such as Kodkod, MACE, and Paradox, assume a bound on the size of the domain of models to be found. This assumption makes it possible to translate the input first-order theory 𝓣 to a propositional formula 𝓟 up to the give bound. This process is known as propositionalization. Next, the model-finder utilizes a SAT solver to find solutions for 𝓟. Intuitively speaking, one might think of the propositionalization step as the process of enumerating all potential models of the initial first-order formula where each propositional variable in 𝓟 represents a fact in first-order models of 𝓣. The job of the SAT solver is to find solutions to 𝓟, corresponding to images of first-order models for 𝓣. Finally, the model-finder maps the solutions to the SAT problem back to models of the original first-order theory 𝓣.

For example, consider the following specification of a conventional list in functional programming languages:

// Every list `x` is `'nil` or points to a `next` list:
∀ x . List(x) → x = 'nil ∨ ∃ y. next(x) = y and List(y);

// The `next` of a list is its sublist:
∀ x, y. next(x) = y → Sublist(x, y);
// The `next` of a sublist is itself a sublist:
∀ x, y, z. Sublist(x, y) ∧ next(y) = z → Sublist(x, z);

// `'nil` never point to a `next` list:
~∃ x. next('nil) = x;
// A list cannot be its own sublist (no cycles):
~∃ x. Sublist(x, x);

// `'my_list` is a list:
List('my_list);

The first five formulae of this theory describe the list data structure and the last formula asks for models with a list, namely 'my_list. Given a bound of 4 on the size of the models, a conventional model-finder queries the underlying SAT solver for solutions consisting of 1, 2, 3, or 4 elements (e.g., lists) that satisfy the theory. Consequently, the model-finder may spit out solutions such as the following (○, ●, and ⟶ respectively denote the 'nil list, a non-empty list, and the next function):

'my_list of length 0 (i.e., the 'nil list):

'my_list: ◯
'my_list of length 1 (a node pointing to 'nil):

'my_list: ⬤➝◯
'my_list of length 2:

'my_list: ⬤➝⬤➝◯

Two lists ('my_list and an unnamed list) of length 1:

'my_list: ⬤➝◯
(unnamed): ⬤➝◯
'my_list of length 3 and a (unnamed) list of length 0:

'my_list: ⬤➝⬤➝⬤➝◯
(unnamed): ◯

A key take-away is that the input bound on the model size does not only guarantee termination, but also is necessary to make propositionalization generally possible.

Note: Certain classes of first-order formulae including the Bernays–Schönfinkel–Ramsey class, also known as the effectively propositional (EPR) class, may be translated into propositional logic without an explicit search bound.

Razor takes a direct approach to first-order model-finding without propositionalization and SAT solving. The essence of Razor's algorithm is the chase from database theory. Our approach to model-finding has a couple of primary advantages: first, the model-finding algorithm doesn't inherently require a bound for propositionalization. Second, the algorithm can leverage information in first-order logic to guide the search for models, unlike the SAT-based approach where the underlying SAT solver is put in charge of the search. However, these features usually come at a performance cost as Razor doesn't make use of the blazingly fast modern SAT-solvers.

Unbounded Algorithm

As discussed earlier, SAT-based model-finders translate the input first-order theory to a propositional formula up to a bound, which is often provided by the user. Providing a search bound that is sufficient for a complete analysis yet small enough to make the search tractable poses a challenge to using SAT-based model-finders. Inferring a sufficient search bound to help the user has been an ongoing research in the literature.

In contrast, Razor's model-finding algorithm is unbounded: starting from an empty model 𝕞 (i.e., a model over an empty domain), Razor constructively augments 𝕞 with necessary elements and facts until 𝕞 satisfies the input theory or a contradiction occurs. When there is more than one way to continue, the algorithm augments clones of 𝕞 in different search branches.

As a direct result of undecidability of first-order logic, a run of Razor on an arbitrary theory may fail to terminate. To guarantee termination, we would need to devise a termination condition such an execution timeout, a count of generated models, a maximum search depth, or possibly a bound on the domain size of models. Because of the inherent unboundedness of Razor's algorithm, the user wouldn't have to provide the search bound upfront; in fact, the user may let the search continue for an arbitrary length of time and kill the Razor's process on demand.

Note: Some SAT-based model-finders like Paradox implement clever algorithms to reuse some components of the intermediate propositionalized formula for a bound of size n to generate a propositional formula for a bound at n + 1. This approach enables the solver to perform incremental model-finding as it expands the search bound. Nevertheless, at each step, a considerable portion of the previously generated propositional formulae and the earlier work of the SAT-solver must get discarded.

Conventional model-finders put their (often off-the-shelf) SAT-solvers fully in charge of constructing structures that are then translated to models of the input theory; therefore, the model-finding tool has little control over the search algorithm. In particular, relying on a SAT-solver would limit the opportunities to guide the search and optimize the quality of models.

Razor constructs models that are minimal with respect to their information content within a framework for exploring the space of models of the input theory. In addition, employing a first-order search algorithm enables Razor to trivially parallelize and distribute the search, give access to the intermediate partial models, and employ heuristics to guide the search.

Minimality

SAT-based model-finders tend to output noisy models: these are non-minimal models which hold facts that aren't necessary for satisfying the input theory. Noisy models can be hard to understand as they don't offer provenance for their content. In contrast, Razor's algorithm is guided towards producing minimal models, which contain nothing but facts that must be present for satisfying the theory. Formally speaking, Razor returns models that are minimal with respect to a homomorphism ordering on all models of the theory. The homomorphism ordering serves as a useful measure of information content, enabling Razor to offer provenance to justify the content of its models by the sentences of the input theory.

Consider the list specification from the previous section. A run of Razor bounded by a domain size of 4 elements outputs models such as the ones below:

'my_list of length 0 (i.e., the 'nil list):

'my_list: ◯
'my_list of length 1:

'my_list: ⬤➝◯
'my_list of length 0 (an alias for 'nil) and an unnamed list of of length 1:

'my_list: ◯
(unnamed): ⬤➝◯
'my_list of length 2:

'my_list: ⬤➝⬤➝◯
'my_list of length 0 and an unnamed list of of length 2:

'my_list: ◯
(unnamed): ⬤➝⬤➝◯
'my_list of length 3:

'my_list: ⬤➝⬤➝⬤➝◯

All these models are minimal in the sense that unlike some examples from the previous section, every peice of information in these models is required by the input theory and cannot be removed.

Note: For every model 𝕟 of the input theory 𝓣, Razor returns a model 𝕞 with a homomorphism from 𝕞 to 𝕟. Informally, it's always possible to construct any model of 𝓣 by adding more information to a model returned by Razor. Therefore, we refer to the models returned by Razor as a set of support for 𝓣.

First-order model-finding is often applied to uncover inconsistencies and logical flaws where the user verifies if an assertion follows from a specification. The assertion, which is often a desirable property of the system, holds if its negation is inconsistent with the specification, having no models. I will refer to this mode of interaction with a model-finder as verification. When working with a model-finder in the verification mode, only one (unexpected) model is sufficient to reveal a bug in the specification.

In a different mode of analysis, the user understands models of the theory as instances of the behavior of a system specification. I will call this mode of interaction with a model-finder exploration. Unlike the verification mode, where the model-finder helps the user to prove properties of a system, the user of the exploration mode is interested in investigating examples of a system's execution without necessarily having specific properties in mind. The goal of exploration is to help the user develop a better understanding of the specified system.

Note: check and run commands in Alloy operate in verification and exploration modes respectively.

SAT-based model-finders commonly offer models in no particular order as the order of output is dictated by the underlying SAT-solver. This behavior is often acceptable in the verification mode, where any counterexample suffices to disprove a hypothesis or invalidate a property. In contrast, effective model-finding for exploration greatly benefits from a systematic approach to presenting models, an exploration method that enables the user to effectively navigate through various classes of (possibly many) models of the theory and does not solely rely on the serendipity of SAT-solving. Razor's approach to model-finding offers a solution!

As I mentioned in the previous section, Razor produces models that are minimal with no noise, i.e., unnecessary facts. More accurately, these are models that are minimal with respect to a preorder ≺ induced by the homomorphism relation between the models of the theory. For two models 𝕞 and 𝕟 of a theory 𝓣, 𝕞 ≺ 𝕟 if and only if there is a homomorphism from 𝕞 to 𝕟 and it's easy to show that 𝕞 contains less noise than 𝕟. It is, though, always possible to augment 𝕞 with additional (unnecessary) facts to obtain 𝕟. This is the core principle that enables a framework for model exploration with respect to a measure of information content.

Note: Aluminum is a preliminary realization of this approach to exploring models, which was implemented as a modification of Alloy.

Model-finders could potentially incorporate heuristic measures to prioritize specific search branches or adopt various search strategies. However, delegating the search to a SAT-solver dramatically limits the opportunity for taking advantage of heuristics and contextual information.

Earlier, I brought up minimality as a quality measure but the user might be interested in other attributes and criteria for model exploration. For example, the user could ask for a new model that is less similar to the ones previously returned or one that would be "surprising". Also, the model-finder may keep track of various heuristics to deprioritize search branches that are more likely to fail or to switch between different scheduling algorithms.

Razor implements a variant of the chase algorithm to construct models for theories in geometric logic, a syntactic variation of first-order logic. Geometric theories are comprised of geometric sequents in the following form:

𝙰₁ ∧ ... ∧ 𝙰_m ⊢_𝘅 (∃ 𝚢₁₁, ..., 𝚢_1j₁ . 𝙰₁₁ ∧ ... ∧ 𝙰_1n₁)  
              ∨ (∃ 𝚢₂₁, ..., 𝚢_2j₂ . 𝙰₂₁ ∧ ... ∧ 𝙰_2n₂)  
              ∨ ...  
              ∨ (∃ 𝚢_i1, ..., 𝚢_{ij_i} . 𝙰_i1 ∧ ... ∧ 𝙰_{in_i})

where ⊢ denotes logical entailment, 𝙰_ks are atomic formulae (including equality), and 𝘅 is the set of free variables (assumed to be universally quantified) in the sequent. The premise (left side) and the consequence (right side) of ⊢ are respectively said to be the body and the head of the sequent. A body of empty conjunctions (m = 0) and a head of empty disjunctions (i = 0) are equivalent to truth (⊤) and falsehood (⟘) respectively.

It turns out that a first-order theory 𝓣 can be transformed to a geometric theory 𝓖 that is equisatisfiable to 𝓣 via standard syntactic manipulation. That is, 𝓣 is satisfiable (i.e., has a model) if and only if 𝓖 is satisfiable. But 𝓖 has other desirable properties that goes beyond equisatisfiability with 𝓣: in fact, for every model 𝕟 of 𝓣, a model 𝕞 of 𝓖 exists such that there is a homomorphism from 𝕞 to 𝕟. This is an important result that enables Razor to apply the chase to construct homomorphically minimal models of 𝓣.

In the context of a model-finding application, the chase constructs special models that contain minimum amount of information required for satisfying the input theory. Equisatisfiability of geometric and first-order logic also implies that satisfiability of geometirc theories is undecidable (see Gödel's incompleteness theorems).

Note: Chapter 4 of my doctoral dissertation presents a comprehensive discussion about geometric logic and the chase.

The chase is a well-known algorithm in database theory that constructs models for geometric theories. Given a geometric theory 𝓖 and starting with an empty model 𝕞, a run of the chase consists of repeated applications of the chase-step to augment 𝕞 with necessary information until there is a contradiction or 𝕞 satisfies 𝓖. Inspired by Steven Vickers, we refer to the units of information for augmenting models as observations.

Chase-Step

Given a geometric theory 𝓖 and a model 𝕞, a chase-step proceeds as below:

A sequent 𝜑 ⊢_𝘅 𝜓 from 𝓖 is selected for evaluation.
Given an assignment from the variables in 𝘅 to the elements of 𝕞: if 𝜑 is true and 𝜓 is false in 𝕞, new observations are added to 𝕞 to make 𝜓 true, such that:
- If 𝜓 is equal to ⟘ (i.e., an empty disjunction), the chase fails on 𝕞.
- If 𝜓 contains more than one disjunct, namely 𝜓₁ ∨ ... ∨ 𝜓_i (i > 1), the chase branches and augments clones of 𝕞 to make 𝜓_i true in each branch.
- If there is no sequent such that 𝜑 is true and 𝜓 is false in 𝕞, the model 𝕞 already satisfies 𝓖 and is returned as an output.

It can be shown a run of the chase always terminates on an unsatisfiable theory, although it may take a very very long time. However, when the theory is satisfiable, a run of the chase may not terminate, producing infinitely large and/or infinitely many models. This is consistent with semi-decidability of the satisfiability problem for first-order theories.

Note: As discussed earlier, Razor accepts a search bound to guarantee termination of the chase.

Razor's model-finding algorithm is a variant of the chase, an algorithm for constructing models for geometric theories. Given an input first-order theory 𝓣, Razor applies a standard syntactic transformation to convert 𝓣 to a Geometric Normal Form (GNF). The result of this transformation is a geometric theory 𝓖 that is equisatisfiable with 𝓣.

Note: Transformation to GNF consists of transforming the input theory to a Conjunctive Normal Form (CNF) followed by rearranging the conjuncts of the resulting CNF into an implication, where negative and positive conjuncts form the implication's premise and consequence respectively. The implication is semantically equivalent to a geometric sequence and is treated as such.

It's noteworthy that conversion to CNF involves Skolemization, a transformation by which existential quantifiers of the input theory are replaced with fresh "Skolem functions". Therefore, the resulting GNF is a theory over an extension of the original vocabulary over which the input theory is defined.

Let 𝓖 be an input geometric theory. The essense of a chase-step on a model 𝕞 is to find a sequent 𝜑 ⊢_𝘅 𝜓 of 𝓖 with an assignment from the variables in 𝘅 to the domain of 𝕞 such that 𝜑 is true and 𝜓 is false in 𝕞. A naive implementation of the chase would iterate over all sequents of 𝓖 and examine every assignment from 𝘅 to the domain of 𝕞. This would lead to |𝘅|^|𝕞| valuations of 𝜑 and 𝜓 in 𝕞 where |𝘅| and |𝕞| denote the size of 𝘅 and the domain size of 𝕞 respectively. Such a naive implementation is extremely inefficient for practical model-finding use-cases.

A practical implementation of the chase must address a number of algorithmic challenges: it must provide efficient data-structures for querying models and keeping track of various search branches; it should implement effective heuristics for choosing a search branch and for selecting a sequent to evaluate; and most importantly, it must implement a fast solution for finding assignments from the free variables in 𝘅 to the elements of 𝕞.

Hello Codd!

Codd's theorem is a well known result from database theory that establishes equivalence between relational algebra and relational calculus thus first-order logic. Razor takes advantage of this equivalence to implement the chase efficiently with the help of database theory. From a database perspective, a model 𝕞 constructed by the chase can be thought of as a monotonically growing database with relational tables for each predicate of the input theory 𝓖. Consequently, positive existential formulae are expressed queries with conjunction (∧), disjunction (∨), and existential quantifier (∃) acting as relational join (⋈), set union (∪), and projection (∏). Finally, a geometric sequent 𝜑 ⊢_𝘅 𝜓 is evaluated by a query 𝛷 - 𝛹, the set difference of the relational expressions corresponding to 𝜑 and 𝜓. Intuitively, evaluating 𝛷 - 𝛹 in 𝕞 results in a set of tuples that assign values to the variables in 𝘅 such that 𝜑 is true and 𝜓 is false, as required by the chase-step. After evaluating 𝛷 - 𝛹, the chase-step proceeds by inserting the resulting tuples in the relations mentioned by 𝛹, thus, making 𝜓 true in 𝕞.

Razor's implementation of the chase uses codd, a minimal in-memory database with relational expressions as queries. The underlying database abstraction would enable Razor to take advantage of many advances in database theory and particularly streaming databases to run more efficiently. For example, codd supports incrementally updated materialized views that are used to avoid full re-evaluation of expressions during a run of the chase as the database (i.e., the model 𝕞) grows.

The algorithm based on relational algebra requires a relationalization step to transform a given first-order theory 𝓣 with function symbols to an equivalent theory 𝓣′ without functions. Relationalization is a standard transformation that involves flattening of complex terms followed by replacing function symbols and constants (i.e., nullary functions) with new predicates. This transformation also extends the original theory with integrity axioms that preserve the semantics of the eliminated functions.

A discussion about the details of relationalization is out of the scope of this document. Here, I use an example to explain the core idea. Consider the atomic formula below, where 𝚁 is a predicate, 𝚏 and 𝚐 are function symbols, 𝚡 and 𝚢 are variables, and 𝗰 is a constant:

𝚁(𝚏(𝚐(𝚡), 𝚢), 𝗰)

To flatten this formula, we introduce fresh variables, namely 𝚟₁, 𝚟₂, and 𝚟₃ to recursively replace complex terms, consisting of function applications and constants. Notice that the original and the flattened formulae are semantically equivalent:

𝚁(𝚟₁, 𝚟₂) ∧ 𝚏(𝚟₃, 𝚢) = 𝚟₁ ∧ 𝚐(𝚡) = 𝚟₃ ∧ 𝗰 = 𝚟₂

Next, for each function symbol of arity 𝚗, we introduce a predicate of arity 𝚗 + 1 and replace the equalities between function applications and variables with atomic formulae over the new predicates:

𝚁(𝚟₁, 𝚟₂) ∧ 𝙵(𝚟₃, 𝚢, 𝚟₁) ∧ 𝙶(𝚡, 𝚟₃) ∧ 𝙲(𝚟₂)

In the previous example, 𝙵, 𝙶, and 𝙲 replace 𝚏, 𝚐, and 𝗰 in that order. The last position of these predicates correspond to the output of the functions that they replace. Finally, to preserve the semantics of the eliminated functions, the following sequents are added to the resulting theory 𝓣′:

𝙵(𝚡₁, 𝚡₂, 𝚢) ∧ 𝙵(𝚡′₁, 𝚡′₂, 𝚢′) ∧ 𝚡₁ = 𝚡′₁ ∧ 𝚡₂ = 𝚡′₂ ⊢ 𝚢 = 𝚢′
𝙶(𝚡, 𝚢) ∧ 𝙶(𝚡′, 𝚢′) ∧ 𝚡 = 𝚡′ ⊢ 𝚢 = 𝚢′
𝙲(𝚢) ∧ 𝙲(𝚢′) ⊢ 𝚢 = 𝚢′

These sequents ensure that the new predicates are interpreted by relations that map every vector of input arguments to exactly one output, as the original functions do.

Note: It's common to add another set of sequents to the resulting relationalized theory to preserve the totality of functions. The chase, however, interprets the functions of the input theory as partial functions; therefore, we're not concerned with preserving totality.

Equality

Interpreting equality as a relation can be done by axiomatizing equality as a predicate, captured by the following sequents with = as a special binary predicate symbol:

 
⊤ ⊢ =(x, x)                (reflexivity) 
=(𝚡, 𝚢) ⊢ =(𝚢, 𝚡)           (symmetry) 
=(𝚡, 𝚢) ∧ =(𝚢, 𝚣) ⊢ =(𝚡, 𝚣) (transitivity)

Note: Our treatment of equality takes advantage of the fact that equality axioms can be expressed as geometric sequents.

The translation to relational algebra must account for a number of practical challenges regarding variables and their order of appearance in formulae. From a database perspective, each position on the first-order predicates represents a relational attribute (i.e., a column of a relational table). Therefore, the variables of the formulae in the input theory enforce equality constraints on the resulting select query. Specifically, the positions of two conjoined predicates that are occupied by the same variable specify the key attributes of a relatoinal join where equality between join attributes is defined by the relation = from the previous section. As a result, Razor applies a standard linearization transformation that assigns unique names to variables that occupy positions on any predicate except for =. When renaming variables, linearization introduces additional equalities between the freshly introduced unique variables that rename the same variable in the input formula.

Now, I will use another example to describe linearization:

𝙿(𝚡) ∧ 𝚀(𝚢, 𝚡, 𝚢)

A possible linearized version of the formula above is obtained by renaming the second occurances of x and 𝚢, in 𝚀(𝚢, 𝚡, 𝚢), with new variables, namely x₀ and 𝚢₀. In addition, =(x, x₀) and =(𝚢, 𝚢₀) are conjoined into the formula to preserve equality between a variable and its renaming counterpart:

𝙿(𝚡) ∧ 𝚀(𝚢, 𝚡₀, 𝚢₀) ∧ =(x, x₀) ∧ =(𝚢, 𝚢₀)

I mentioned earlier that in our implementation of the chase based on relational algebra, a sequent 𝜑 ⊢_𝘅 𝜓 is evaluated as a relational expression 𝛷 - 𝛹 where 𝛷 and 𝛹 are relational expression obtained by transforming 𝜑 and 𝜓. The set difference operator is defined only when the attributes of 𝛷 and 𝛹 align; that is, they must share the same attributes in the same order. If 𝛷 and 𝛹 share the same set of attributes but in different orders, it's always possible to project either of the expressions to match the order of attributes. If 𝛷 and 𝛹 have different attributes, two cases are possible:

An attribute 𝜶 appears in 𝛷 but not in 𝛹. In this case, we simply project 𝜶 out of 𝛷.
An attribute 𝜷 appears in 𝛹 but not in 𝛷. The treatment for this case involves range restriction as described below.

In database theory, a range-restricted query

talk about other implementations

The syntax and semantics of Razor's input is that of conventional first-order logic, also known as predicate logic.

Razor's input supports three syntactic variations for the logical symbols and follows one of the more predominant conventions for the precedence of the logical connectives. For consistency and readability purposes, we use the alpha variation everywhere in this document.

Lowercase and uppercase identifiers in Razor are defined by the following:

A lowercase identifier is a word starting with either a lowercase alphabetic character ([a-z]) or the underscore (_), followed by any number of alphanumeric characters ([a-zA-Z0-9]) and/or the underscore. For example, rusty, _razor, and rusty123_RAZOR456 are lowercase identifiers.
An uppercase identifier is a word that starts with an uppercase alphabetic character ([A-Z]) and is followed by any number of alphanumeric characters ([a-zA-Z0-9]) and/or the underscore (_). For example, Rusty, and RAZOR_123 are uppercase identifiers.

A term in Razor's input is a conventional first-order term, inductively defined by the following:

A variable is a lowercase identifier, and it is a term on its own. For example, v, variable, and _var may be used as variable symbols.
A composite term consists of a lowercase identifier as a function symbol that is applied to zero or more terms as arguments that are wrapped in parentheses. For example, f(), f(a), f(g(a, b), c), and func(plus(x, y)) are terms.

Razor treats nullary functions (of arity zero that take no arguments) as constants. An apostrophe (') followed by a lowercase identifier is a syntactic sugar for constructing a constant. For example, 'a is a constant that is syntactically equivalent to a().

Formula

A formula in Razor's input is a conventional first-order formula with equality, inductively defined by the following:

An atomic formula consists of an upper case identifier as a predicate symbol that is applied to zero or more terms as arguments that are wrapped in parentheses. For example, R(), R(x), and R(f(x, 'a), y, 'b) are atomic formulae.
An equality is a especial type of atomic formula, with a binary infix connective = that is applied on two terms as its arguments. Razor treats an equality as identity between its arguments. For example, x = y, and f(x) = g(f(y), 'a) are equalities.
The result of applying the logical connectives and, or, implies and iff as infix binary connectives to two fromulae is itself a formula. Parentheses may be used to override the conventional precedence of the connectives. For example P(x) and x = y, P(x) and (Q(y) or R(x)), and P(x, y) implies x = y and R(z) are formulae.
The result of applying the logical connectives forall and exists to one or more comma (,) separated bound variables and a formula, as a propositional function, is itself a formula. The list of bound variables and the propositional function are separated by a period (.). For example, exists x. P(x), forall x, y . Q(x, y) or R(z), and forall x. exists y. P(x, y) are formulae.

Note: A variable that is not bound by a universal (forall) or existential (exists) quantifier is said to be a free variable. Razor assumes all free variables to be universally quantified over the entire formula in which they appear. For example, P(x) -> exists y . Q(x, y) is assumed to be equivalent to forall x . (P(x) -> exists y . Q(x, y)).

Razor's input is a first-order theory, consisting of a list of zero or more formulae, separated by the semi-colon (;). The input may contain conventional C-style comments (// for comment lines and /* and */ for comment blocks). Whitespaces including new lines are allowed. See the following input for an example:

// equality axioms

forall x . x = x; /* Reflexitivity */
forall x, y . (x = y implies y = x); /* Symmetry */
forall x, y, z . (x = y and y = z implies x = z); /* Transitivity */

Razor supports three syntactic variations of the logical symbols in the input:

alpha where logical symbols are written as alphabetic words.
compact where ASCII notations represent logical symbols.
math where (Unicode) mathematical symbols are used.

Note: Currently, Razor's parser accepts inputs that are comprised of any combination of the syntactic variations mentioned above. However, future releases of Razor may restrict the input to use only one of the variations above.

The table below shows all syntactic variations of the logical symbols:

symbol	alpha	compact	math
truth	`true`	`'\|'`	`⊤` (U+22A4)
falsehood	`false`	`_\|_`	`⟘` (U+27D8)
negation	`not`	`~`	`¬` (U+00AC)
conjunction	`and`	`&`	`∧` (U+2227)
disjunction	`or`	`\|`	`∨` (U+2228)
implication	`implies`	`->`	`→` (U+2192)
bi-implication	`iff`	`<=>`	`⇔` (U+21D4)
existential quantifier	`exists`	`?`	`∃` (U+2203)
universal quantifier	`forall`	`!`	`∀` (U+2200)

When a formula contains two or more logical connectives, the connectives are applied by the following order from the highest to the lowest precedence:

Negation (not) is applied first.
Conjunction (and) is applied next.
Disjunction (or) is applied next.
Implication (implies) and bi-implication (iff) are applied next.
Existential (exists) and universal (forall) quantifiers are applied last.

A connective with a higher precedence is applied before a consecutive connective with a lower precedence; that is, the connective with the higher precedence binds tighter to the formula on which it operates. For example, P() implies not Q() and R() is a formula consisting of an implication where P() is the premise and the conjunction of not Q() and R() is the consequence.

Parentheses may be used to override the precedence of connectives. For example, in P() and (Q() or R()) the disjunction (or) is applied before the conjunction (and).

All binary connectives of equal precedence except for implication (implies) and bi-implication (iff) are left-associative. For example, P() | Q() | R() is evaluated as (P() | Q()) | R().

Implication and bi-implication are right-associative. For example, P() <=> Q() -> R() <=> S() is evaluated as P() <=> (Q() -> (R() <=> S())).

The input theory accepted by Razor is defined by the grammar below:

LOWER       ::= [a-z_][a-zA-Z0-9_]*
UPPER       ::= [A-Z][a-zA-Z0-9_]*

TRUE        ::= "true"    | "'|'" | "⊤" (U+22A4)
FALSE       ::= "false"   | "_|_" | "⟘" (U+27D8)
NOT         ::= "not"     | "~"   | "¬" (U+00AC)
AND         ::= "and"     | "&"   | "∧" (U+2227)
OR          ::= "or"      | "|"   | "∨" (U+2228)
IMPLIES     ::= "implies" | "->"  | "→" (U+2192)
IFF         ::= "iff"     | "<=>" | "⇔" (U+21D4)
EXISTS      ::= "exists"  | "?"   | "∃" (U+2203)
FORALL      ::= "forall"  | "!"   | "∀" (U+2200)

VARIABLE    ::= LOWER
FUNCTION    ::= LOWER
PREDICATE   ::= UPPER
VARIABLES   ::= VARIABLE ("," VARIABLES)*
TERM        ::= VARIABLE | FUNCTION "(" TERMS? ")"
TERMS       ::= TERM ("," TERMS)*

ATOM        ::= TRUE | FALSE
              | TERM "=" TERM | PREDICATE "(" TERMS? ")"
              | "(" FORMULA ")"
F_NOT        ::= NOT F_QUANTIFIED | ATOM
F_AND        ::= F_NOT (AND F_QUANTIFIED)?
F_OR         ::= F_AND (OR F_QUANTIFIED)?
F_QUANTIFIED ::= (EXISTS | FORALL) VARIABLES "." F_QUANTIFIED | F_OR
FORMULA      ::= F_QUANTIFIED ((IMPLIES | IFF) F_QUANTIFIED)*

THEORY       ::= (FORMULA ";")*

rusty-razor is written in Rust, so you will need Rust 1.37.0 or newer to compile it. To build rusty-razor:

git clone https://github.com/salmans/rusty-razor.git
cd rusty-razor
cargo build --release

This puts rusty-razor's executable in /target/release.

Use the solve command to find models for an input theory. The -i (short for --input) reads the input from a file:

razor solve -i <input>

Run solve without the -i option to read the input from the standard input.

The --count parameter limits the number of models to construct:

razor solve -i <input> --count <number>

Unlike conventional model-finders such as Alloy, Razor doesn't require the user to provide a bound on the size of the models that it constructs. However, Razor may never terminate when running on theories with non-finite models -- it can be shown that a run of Razor on an unsatisfiable theory (i.e., a theory with no models) is guaranteed to terminate (although it might take a very very long time).This is a direct consequence of semi-decidability of first-order logic.

To guarantee termination, limit the size of the resulting models by the number of their elements using the --bound option with a value for the domain parameter:

razor solve -i <input> --bound domain=<number>

Use the --scheduler option to choose how Razor processes search branches. The fifo scheduler (the default scheduler) schedules new branches last and is a more suitable option for processing theories with few small satisfying models. The lifo scheduler schedules new branches first, and is more suitable for processing theories with many large models.

razor solve -i <input> --scheduler <fifo/lifo>

This section presents sample first-order theories, written in Razor's syntax. All examples are inspired by the events of Game of Thrones (spoiler alert!!).

Valar Morghulis: demonstrates a run of Razor on a simple example.
Golden Head: runs Razor over an unsatisfiable theory for which no models exist.
Hold the Door: covers more advanced features of Razor on theories with infinite models.

For more examples, see Razor's example theories.

All men must die. Ser Gregor is a man.

// All men must die:
forall x. (Man(x) implies MustDie(x));

// Ser Gregor is a man:
Man('gregor);

Run Razor on the previous theory valar-morghulis.raz:

razor solve -i theories/examples/valar-morghulis.raz

Razor returns only one model:

Domain: e#0

Elements: 'gregor -> e#0

Facts: Man(e#0), MustDie(e#0)

The model contains only one element e#0 in its domain. This element denotes 'gregor, a constant in the theory that represents Ser Gregor. The model also contains two facts: Man(e#0) is a fact that is derived from the second statement of the theory (i.e., Man('gregor)). The fact MustDie(e#0) is deduced by Razor according to the first statement of the theory.

Notice that the previous model is a "minimal" model for the given theory. The element e#0 is required to represent the constant 'gregor; the fact Man(e#0) must be present because the theory says so; and, the fact MustDie(e#0) must be true because of the first statement. Removing any piece of information makes the given structure a non-model of the theory.

While reading "The Lineages and Histories of the Great Houses of the Seven Kingdoms", Lord Eddard Stark learns that throughout the history, all male members of House Baratheon were described as "black of hair" and concludes that King Robert is not Prince Joffrey's (biological) father. A judgment that eventually put his head on a spike.

The next theory describes Ned's thought process:

// A person "x" cannot be both "black of hair" and "golden head"
~(BlackOfHair(x) & GoldenHead(x));

// Traditionally, a Baratheon child "y" inherited his/her father's ("x"'s) family name
Baratheon(x) & father(y) = x -> Baratheon(y);

// King Robert Baratheon is black of hair
Baratheon('robert) & BlackOfHair('robert);

// King Robert is Joffrey's father
father('joffrey) = 'robert;

// Joffrey has golden hair
GoldenHead('joffrey);

// Ned Stark's discovery (every Baratheon "x" is black of hair)
Baratheon(x) -> BlackOfHair(x);

We can verify Ned's conclusion by running Razor on this theory golden-lion.raz, asking for a scenario (i.e., model of the theory) that justifies Joffrey's golden head:

razor solve -i theories/examples/golden-lion.raz

Razor cannot find a model for the previous theory, meaning the theory is inconsistent. Notice that this theory is satisfiable (i.e., has a model) in the absence of Ned's discovery (try running Razor after commenting out the last line).

Wyllis was a young stable boy when he heard a voice from his future: "Hold the Door!" The voice transformed Wyllis to Hodor (Hold the door, Holdde door, Hoddedor, Hodor, Hodor...!), putting him on a life-long journey, leading him to the moment that he saves Bran's life. Indeed, because of this defining moment in his future, Wyllis became Hodor in his past.

The theory below describes Hodor's journey assuming that time progresses linearly hodor-linear.raz

// Wyllis hears "Hold the Door" (at time `t`), then he becomes Hodor in the next
// point of time
HoldTheDoor(t) -> Hodor(next(t));

// Hodor, after turning into Hodor at time "t", holds the Door at some time "tt"
// in future ("tt > t")
Hodor(t) -> ? tt . HoldTheDoor(tt) & After(t, tt);

// These are the rules by which time progresses linearly:
// (1) a point of time "t1" that is the next of "t0" (i.e., "next(t0)") is a point of
// time after "t0" ("t1 > t0")
next(t0) = t1 -> After(t0, t1);

// (2) if a point of time "t1" is after "t0", it is either immediately
// after "t0" (i.e., "next(t0)") or there exists some point of time "t2"
// that is immediately after "t0" and before "t1".
After(t0, t1) -> next(t0) = t1 | ? t2 . next(t0) = t2 & After(t2, t1);

// And we know at some point of time (namely "'t_hodor"), Wyllis became Hodor
Hodor('t_hodor);

An unbounded run of Razor on the previous theory will never terminate (feel free to press ctrl + c after a few seconds):

razor solve -i theories/examples/hodor-linear.raz

Assuming that time progresses linearly, the circular causality between the two events of "holding the door" and "becoming Hodor" results in an infinitely large model where time progresses unboundedly. We can restrict the size of the structures constructed by Razor by bounding the number of their elements. For example, if we restrict the number of elements to 4, Razor will find 9 incomplete structures, which do not satisfy the theory:

razor solve -i theories/examples/hodor-linear.raz --bound domain=4

For example, the following structure corresponds to an incomplete model where e#0 denotes the starting point t_hodor and e#1, e#2 and e#4 are other points in time:

Domain: e#0, e#2, e#4, e#1

Elements: 't_hodor -> e#0, sk#0[e#0] -> e#1, next[e#0], sk#1[e#0, e#1] -> e#2,
next[e#1] -> e#4

Facts: After(e#0, e#1), After(e#2, e#1), Hodor(e#0), Hodor(e#4), HoldTheDoor(e#1)

Now consider e#1 and e#2. The incomplete model shows that e#1 is after e#2, but neither e#1 immediately follows e#2 (no next point for e#2) nor there exists a point that is after e#2 and before e#1, violating the second rule of linear time progression. In general, it may be possible to extend the incomplete structure to a model of the theory by adding more information to the model. Any model of this particular theory, however, is infinitely large.

Next, we model time as a "big ball of wibbly wobbly timey wimey stuff!" To make it simple, let's assume that time-loops can only happen at the moment that Hodor heard a voice from the future, namely 't_hodor, changing our rules of time progression (hodor-time-loop.raz):

HoldTheDoor(t) -> Hodor(next(t));

Hodor(t) -> ? tt . HoldTheDoor(tt) & After(t, tt);

next(t0) = t1 -> After(t0, t1);
After(t0, t1) -> (next(t0) = t1) | ? t2 . next(t0) = t2 & After(t2, t1);

// Hold the door moment only happens at 't_hodor
HoldTheDoor(t) -> t = 't_hodor;

Hodor('t_hodor);

In presence of time-loops, Razor can explain Hodor's curious journey:

razor solve -i theories/examples/hodor-time-loop.raz

This time, Razor produces infinitely many (finite) models with time-loops of different size. Use can use the --count option to limit the number of models and halt the process.

rusty-razor

Razor

Conventional Model-Finding

Razor: A First-Order Approach

Unbounded Algorithm

Controlled Search

Minimality

Verification and Exploration

Fitness

Geometric Logic

Satisfiability

The Chase

Chase-Step

Termination

Implementation

Hello Codd!

Relationalization

Equality

Variable Linearization

Range Restriction

Syntax

Identifier

Term

Formula

Input

Syntactic Variations

Connective Precedence

Associativity

Grammar

Build

Run

`solve`

Bounded Model-Finding

Model-Finding Scheduler

Example

Valar Morghulis

Golden Head

Hold the Door

Linear Time

Time-Loop