Heuristic Type Inference in BitC
SRL Technical Report SRL2008-01
Abstract
This paper introduces a new type system that is designed for safe systems programming. A key feature of the type system is a new mutability model that combines explicitly unboxed types with a consistent typing of mutability. The type system is provably sound, supports polymorphism and eliminates the need for alias analysis to determine the immutability of a location. The paper identifies the chalenges posed by this type system for type inference, and proposes a heuristic approach to solving this problem.
1 Introduction
Systems programs rely on fine-grain control of data representation and use of state to achieve performance, conformance to hardware specification, and temporal predictability. Robustness and reuse of systems codes can be greatly improved by leveraging modern programming language features — such as static type safety, type inference, higher order functions, and polymorphism — but only if these features can be provided without sacrificing the above mentioned requirements. No existing language supports both of these feature sets simultaneously. Therefore, systems programmers continue to use Ada [8], C [10], C++ [9] or resort to domain specific and assembly level languages. We first discuss the support for these features in existing languages, identify the challenges in combining these feature sets and describe our approach toward solving this problem.
Type Inference and Polymorphism: Type inference achieves the advantages of static typing with a lower burden on the programmer, facilitating rapid prototyping and development. Polymorphic type inference (c.f. ML [18] or Haskell [15]) combines the advantages of static type safety with much of the convenience provided by dynamically typed languages like Python [20]. Automatic inference of polymorphism simplifies generic programming, and therefore increases the reuse and reliability of code. Safe languages like Java [19], C# [16], or Vault [2] do not support type inference. Cyclone [11] supports polymorphism only for functions that are explicitly annotated with a polymorphic type.
Representation Control: A systems programming language must be expressive enough to specify details of data-structure layout (boxed or unboxed), alignment and allocation (stack or heap). This feature is essential for systems programming for reasons of performance (ex: to control cache and paging behavior), conformance to hardware specification (ex: page table entries), and interfacing with external C or assembly code and data. A careful implementation of the standard TCP/IP protocol stack in Standard ML incurs a substantial overhead of up to 10x increase in system load and a 40x slowdown in accessing external memory relative to the equivalent C implementation [ 1 ,3 ]. This shows that in systems programs, data structure representation is as important as, or even more important than high level algorithms.
The philosophy of ML-like languages is that programs specify semantics and not realization (implementation). However, in systems programs, statements about representation and location are prescriptive, not descriptive. Compilers like TIL [25] implement unboxed representation as a discretionary optimization. Prescriptive requirements are not discretionary. First class treatment of representation is required in systems programming languages.
Mutability Support: One of the key features essential for systems programming is support for first class mutability (c.f. C). The support for mutability must be first class in the sense that any location (stack or heap) can be mutable, and we should be able to specify mutability at field level granularity. Many modern programming languages — particularly those that support type inference — do not support this first class notion of mutability. ML does not support first class mutability; all mutable cells must reside in the heap. In Haskell, all state must be encapsulated within Monads [14]. In contrast to the C-like languages, these functional languages do have a mathematically sound notion of immutability.
We have identified three features — unboxed representation, mutability and polymorphic type inference — that are desirable in a systems programming language. While several existing languages support two of these features, none combine all three of them elegantly. In this paper, we endeavor to bridge this gap between systems programming and modern language designs through the BitC [22] programming language. BitC is a direct expression of typed lambda calculus with side effects, extended to be able to reflect the semantics of explicit representation. It supports polymorphic type inference and a new model of mutability which is expressive and has sound semantics.
BitC is a call-by-value expression language. BitC's support for unboxed mutability makes it desirable to allow some freedom in the compatibility of types with respect to their mutability at copy boundaries. This kind of compatibility has ramifications for type inference since there is no longer a unique way to type an expression. However, usability constraints require that we minimize the amount of type annotations required from the programmer. In this paper, we discuss some of these issues, and present a solution based on a simple extension to the Hindley-Milner inference algorithm [17] that applies certain heuristics to subjectively infer the appropriate type for all expressions.
2 The Language
BitC is a safe, systems programming language. A detailed description of the language and its support for systems programming (including a rich set of primitive data types and bit-fields, type classes [12] for overloading, representation of explicitly boxed and unboxed data-structures, discriminated unions, tagged-unions with inlined discriminators, etc.) can be found in the language definition [22]. In this paper, we limit our presentation to a core calculus of BitC called *B* in the interest of brevity. The user visible part of *B* is defined below:
Patterns |
p ::= |
_x_ | p : τ | (p, p) | [p;*] | _x_::p |
Values |
_v_ ::= |
() | true | false | λp._e_ | (_v_, _v_) | [_v_;*] |
Expressions |
_e_ ::= |
_x_ | () | true | false | λp._e_ | _e_ _e_ | _e_ : τ | _e_ := _e_ | dup(_e_) | _e_^ |
|
|
| (_e_, _e_) | _e_.1 | _e_.2 | [_e_;*] | _e_::_e_ | match _e_ with p in _e_ [else _e_] |
|
|
| if _e_ then _e_ else _e_ | let _x_[:σ] = _e_ in _e_ |
Types |
τ ::= |
α | unit | bool | τ → τ | ⇑τ | Ψτ | τ ✗ τ | [τ] | τ \ *C* |
Type Scheme |
σ ::= |
τ | ∀α*.σ |
Constraints |
*C* ::= |
∅ | {τ = τ | τ ≅ τ} |
Program variables are denoted by _x_, _y_, etc. and type variables by α, β, etc. _X_;* denotes zero or more _X_s separated by a semicolon. The [] brackets denote optional parts of syntax. ⇑τ represents a reference (pointer) type and Ψτ represents a mutable type. The type τ \ *C* is a constrained type with the set of constraints *C*. The relation ≅ will be defined in section 4. The let construct can be used for allocating (possibly mutable) stack variables and to create let-polymorphic bindings. The expression dup(_e_), where e is of type τ, returns a reference of type ⇑τ to a heap-allocated copy of the value of _e_. The ^ operator is used to dereference heap cells. The := operator updates (mutates) both stack and heap locations. Unlike ML, := does not dereference its target. Pairs (,) are unboxed structures whose constituent elements contiguously allocated on the stack, or in their containing data-structure. _e_.1 and _e_.2 perform selection from pair values. Lists [...] have a boxed representation.
3 The Mutability Model
Traditionally, there are two models of mutability studied in the case of imperative languages. One of them is the ML model, where there is a clear separation between name bindings and updatable locations. All updatable (mutable) locations live in the heap within ``ref cells''. Fetching the value inside a ref cell requires an explicit dereferencing operation. The major advantage of this approach is that types are definitive about the mutability of every location, across all aliases. In this sense, we can say that the support for mutability is mathematically ``well-founded.'' This model benefits tools that perform static analysis or model checking because conclusions drawn about location immutability need never be conservative. This model of mutability also increases the amount of optimization the compiler can safely perform without complex alias analysis.
The other well known model of mutability is the C model, wherein
the support for mutability is ``first-class'' in the sense that
locations with mutable type can be passed as arguments
and stored in data-structures. This model permits mutation of
stack variables and unboxed values. There is a notion of
lvalues which are expressions that can be the target of
an assignment, and rvalues, that are otherwise used in
computations. The extraction of the value from a (mutable)
location is implicit, and does not require dereferencing.
However, in this model, types cannot distinguish mutable values
from immutable ones. For example, in C it is legal to write:
const bool *cp = ...;
bool *p = cp;
*p = false; // OK!
The alleged ``constness'' of the location pointed to by cp is a local property (only) with respect to the alias cp and not a statement of true immutability of the target location. The compiler or other analytical engines are not entitled to believe that certain locations or fields are constant even if so declared.
*B*
supports well-founded first class mutability. Similar to
ML, we impose the ``one location, one type'' rule.
let _cp_ : ⇑bool = dup(true) in let _p_ : ⇑Ψbool = _cp_
(* Error *)
We see that
_cp_ has the type reference to bool
(⇑bool). This type is incompatible with that of
_p_, reference to mutable-bool
(⇑Ψbool).
In order to support unboxed mutability, we still need to have a
notion of lvalues. This not only preserves the programmer's
mental model of the relationship between locations storage, but
also ensures that compiler transformations are semantics
preserving. In
*B*, the legal lvalues are defined by:
_L_v ::= _x_ | _L_v.1 | _L_v.2 | _L_v^ | _L_v : τ
4 Copy Compatibility
Since *B* is a
call-by-value language, it is desirable that we allow some
freedom in the compatibility of types with respect to their
mutability at a copy boundary. For example, in the
following expression:
let _fnxn_ = λ_x_.(let ___ = _x_ := _some-fnxn_ _x_ in _fnxn-returning-unit_ _x_) in
let _y_ : bool = true in _fnxn_ _y_
the type of
_fnxn_ is
(Ψbool) → unit, whereas that of the actual argument
_y_ is
bool.
Since the formal argument
_x_ is a copy of
_y_ and occupies a different location,
this expression is type safe. We refer to this notion
of compatibility of types as copy compatibility,
denoted by ≅. In this example,
bool ≅ Ψbool.
Copy compatibility need not be restricted to the outermost mutability compatibility, but must not extend past a reference boundary in order ensure that every location has unique type. We define copy compatibility for *B* as:
|
|
|
|
Copy compatibility can be permitted at argument passing, new variable binding, assignment, and basically at argument/return positions of all expressions, except where an lvalue is expected or returned. For example, the expression (_x_ : τ) : Ψτ is illegal since identifiers are lvalues, but the branches of a conditional can have different but copy compatible types, as in: if true then _a_ : τ else _b_ : Ψτ.
5 Type Inference
We would like to employ an inference algorithm with the following properties:
-
The inference algorithm must be decidable without programmer annotations. The problem with programmer annotations is pragmatic rather than ideological: ease of prototyping requires that these annotations be minimized.
-
The inference algorithm must ideally be complete. In the absence of principal types, it must minimize programmer annotations in the common case, and must be capable of inferring all sound types at least when guided by explicit annotation. This requirement is orthogonal to requirement (1).
-
The inference algorithm must automatically infer polymorphism (without programmer annotations) in order to maximize code reuse and encourage good software engineering.
-
The inference algorithm must not require whole program analysis. Whole program inference precludes separate compilation, poses scalability problems for large projects, and can result in surprising behaviour if the inferred types for a program change due to modifications in a distant part of the code base.
With these considerations in mind, we chose a variation of the Hindley-Milner algorithm [17] in *B*. We now describe challenges for type inference due to copy compatibility, explore the design choices, and finally present our solution.
Challenges Due to Copy Compatibility:
When an exact type compatibility requirement is replaced in the
language design by copy compatibility, it is no longer possible
to infer a unique (simple) type for the expression. For example,
in the expression
let _p_ = true,
we know that the type of the literal
true is
bool, but the type of
_p_
could either be
bool or
Ψbool. Therefore, we give
_p_ the so-called ``maybe'' type
α↓bool, which is a shorthand for the
constrained type
α \ {α ≅ bool}.
The actual type will later get resolved to either
bool or
Ψbool.
Similarly, in the expression
(inferred types are shown beside
∥).
let _pr_ = (true, false) in ... ∥ _pr_ : α↓(β↓bool ✗ γ↓bool)
copy compatibility is introduced at both the pair construction
(arguments passed to the pair constructor), and the
formation of a new binding.
Why Should We Infer Mutability?
It is natural to ask why mutability should be inferred at all.
That is: why not require explicit annotation for all mutable
values, and infer immutable types by default? In a language with
copy compatibility, this will result in a proliferation of type
annotations. Constructor applications,
(polymorphic) type instantiations, accessor functions,
etc.
will have to be explicitly annotated with their types. For
example, if
_fst_
is an accessor function that returns the first element of a
pair, and
_m_
is a variable of type
Ψbool, we will have to write:
let _xyz_ = [(_fst_ (_m_, false) : Ψbool ✗ bool)] : [Ψbool] in ...
Therefore, if mutability is not inferred, it results in a
substantial increase in the number of programmer annotations,
and type inference becomes ineffective.
Incompleteness of Inference The key idea of maybe types α↓τ is to defer commitments about the mutability status of types, and thus infer most-general types wherever possible. *B* is a let-polymorphic language and enforces the value restriction [27]. This means that the decision about the mutability of types cannot — in general — be deferred past their let bindings. For example, in the case of the expression: let _p_ = [] in ..., we cannot give _p_ the type ∀ α, β .β↓[α] (or ∀α.Ψ[α] or ∀α.β↓[α]). We must instead choose one of the polymorphic type ∀α.[α] or the monomorphic type β↓[α] That is, there is no principal type that can be inferred for _p_. Given this, we must fix these maybe types to be either mutable or immutable at a let-boundary. That is, the language definition must pick some solution for unsolved copy compatibility constraints before type generalization. In *B*, we trade completeness of inference to obtain a more expressive language without making major changes to the core type system.
Inference Considerations
First, we consider how to resolve copy compatibility
constraints at a let boundary. One possibility is
to fix all unresolved maybe types to immutable
versions. For example:
let _pf_ = (_n_ : Ψbool, λ_x_._x_) in ... ∥ _pf_ : ∀α.bool ✗ (α → α)
This scheme will preserve all polymorphism possible, but will
mandate a programmer annotation for every mutable value.
If mutable variants are chosen instead, no polymorphism can be
inferred by default. Therefore, neither of these
solutions are satisfactory. Further, from the standpoint of good
programming ideology and static analysis, the inferred
types must not be promiscuous with respect to mutability.
The previous section argued that we ``lose'' precision of inferred types (with respect to mutability) by the introduction of copy compatibility. Therefore, we can choose whether (or not) to introduce copy compatibility at various constructs like new bindings, function application/return, constructors, conditional expressions, etc. Another dimension of trade-off is whether to permit copy compatibility to the maximum permissible limit (as defined in section 4), or restrict it to top-level shallow mutability compatibility only. A further option is to require that all polymorphism be contained within function types, since function types can be polymorphic even if they abstract over mutable or maybe types.
Unless handled with care, full use of copy compatibility can
result in the inferred types that are counter-intuitive to the
programmer. For example:
let _copyList_ = λ_lst_.match _lst_ with [] in [] else
match _lst_ with _x_::_rest_ in _x_::_copyList_ _rest_ in ...
For a naïve reader, the type of
_copyList_ appears to be
∀α.[α] → [α]
but is actually the more general type:
∀ α, β . [α] → [β] \ {α ≅ β}.
Even though both the argument and return types of
_copyList_ are of (boxed) list type, they are
only required to be copy compatible because
_copyList_ copies the constituent elements, thereby using
new locations.
Now, if we default maybe types that are ultimately unresolved to
immutable, in the following definition we obtain:
let _mutLst2_ = _copyList_ _mutLst_ : [Ψbool] in ... ∥ _mutLst_ : [bool] !!
which is a correct typing, but is most likely not what the
programmer expects.
6 Type Inference in *B*
Having identified the various issues and trade-offs involved in type inference, we now describe the particular design choices made in BitC/*B* for handling copy compatibility. They have been driven in part by our experience writing BitC programs. In *B*, we allow copy compatibility to the full extent, up to a reference boundary. We allow copy compatibility to be invoked at arguments and return positions of all expressions that do not expect a location (lvalue).
At a let boundary, an unresolved maybe type
α↓τ
is resolved to
τ.
Intuitively, this means that we will default maybe types to the
types of their original copies unless a better type is inferred
or specified. The type
τ
of the original value is remembered as a ``hint'' to fix
unresolved maybe types at a let boundary. Here, we are
approximating the user's intent to the lexical ``flow'' of type
information. For example, in the following expression,
let _mb_ : Ψbool = true in let _l1_ = [_mb_] in let _l2_ : Ψ[bool] = [_mb_]
the type
β↓[α↓Ψbool]
is first inferred for both
_l1_ and _l2_ (due to copies at list construction and
binding). In the case of
_l1_, the final type is resolved to
[Ψbool] in accordance to the hints since we have no
further information. The type of
_l2_
is
Ψ[bool], since we have a (consistent) annotation --
there are no unresolved maybe types. Further, under this
scheme, the
_listCopy_ example described in the previous section gets
the more intuitive type
∀α.[α] → [α].
In the case of conflicting hints, we pick the most immutable
type obtainable from all hints. This ensures that inferred types
are always deterministic. For example:
let _bp_ = if true then (true, false) : Ψbool ✗ bool
else (false, true) : bool ✗ Ψbool in ... ∥ _bp_ : bool ✗ bool
Here, the hints provided by the two branches
of the if do not agree, and we
resolve the conflict by picking
bool ✗ bool as the effective hint.
In the case of locally defined identifiers, the top-most mutability is inferred by studying the syntactic usage of the identifier. That is, if the identifier is used as the target of an assignment (:=), it is given a shallowly mutable type. This is an ad hoc rule that reduces the need for explicit programmer annotations in the common case like iterators. In a full language like BitC, this rule must not be used for globals in order to keep inference deterministic. Further, this ``infer mutability first'' rule cannot, in general, be extended beyond shallow mutability. For example, in the expression let _fPtr_ = dup(λ_x_._x_) in (_useF1_ _fPtr_, _useF2_ _fPtr_), we do not have enough local information before type inference to determine whether _useF1_ and _useF2_ set the contents of the cell pointed to by _fPtr_, or use it polymorphically.
Due to copy compatibility, two function types are equal regardless of the shallow mutability of the argument and return types. Therefore, we enforce a syntactic restriction that all function types must be written with immutable types at copy compatible positions. The intuition here is that type of a function must be described in the interface form, and must hide the ``internal'' mutability information. For example, for the function definition let _f_ = λ_x_._x_ := true in ..., the external type is _f_ : bool → unit, and the internal type is _f_ : Ψbool → unit.
We must ensure that the internal types of a function do not influence the result type of applications, but the effect of arguments on the return types must be preserved. This ensures that the implementation of an abstraction can be changed (ex: from a pure recursive computation to loop involving mutation) unbeknownst to its callers. For example:
|
let _p_ : Ψbool = true in |
∥ _p_ : Ψbool |
|
let _f_ = λ_x_._p_ in let _g_ = λ_x_._x_ in |
∥ _f_ : ∀α.α → bool ; _g_ : ∀α.α → α |
|
let _ff_ = _f_ _p_ in let _gg_ = _g_ _p_ in ... |
∥ _ff_ : bool ; _gg_ : Ψbool |
The function _f_ returns _p_ : Ψbool, regardless of its input. The external type of _f_ abstracts away the mutability of _p_, and thus, _ff_ gets the type bool. _g_ is an identity function which returns its argument. The external type of _g_ preserves the mutability of the actual argument _p_, and thus _gg_ gets the type Ψbool.
Mutability Polymorphism:
A type is said to be mutability-polymorphic if it ranges over
all [im]mutability variants of a particular type.
For example,
∀α.α↓bool is mutability polymorphic over
bool.
By far the most useful case of this feature is to
define functions that are mutability-polymorphic over concrete
types, since functions with a polymorphic type such as
∀α.α → α are polymorphic over all
types, mutable or immutable.
The above inference algorithm tries to heuristically fix the
mutability of all expressions (including functions). However,
this behavior can be overridden by explicit annotation. For
example, the definition:
let _allTrue_ : ∀α.[α↓bool] → bool = ... in ...
∥ _allTrue_ : ∀α.[α↓bool] → bool
defines
_allTrue_,
a function whose argument is mutability-polymorphic over
[bool].
7 Formalization
We now formalize our type system and inference algorithm. Due to space limitations, we will limit our presentation to the following subset of *B*, called *B*′.
Syntax
Identifiers |
_x_ ::= |
_y_ | _z_ | ... |
Stack Locations |
l ::= |
l1 | l2 | ... |
Heap Locations |
ℓ ::= |
ℓ1 | ℓ2 | ... |
Locations |
?L? ::= |
l | ℓ |
Values |
_v_ ::= |
() | true | false | ℓ | λ_x_._e_ |
lvalues |
£ ::= |
l | ℓ^ |
Expressions |
_e_ ::= |
_v_ | l | _e_ _e_ | _e_ : τ | _e_ := _e_ |
|
|
| dup(_e_) | _e_^ |
|
|
| if _e_ then _e_ else _e_ |
|
|
| letκ _x_[:τ] = _e_ in _e_ |
Let-kinds |
ϰ ::= |
- | κ | ψ | ∀ |
Types |
τ ::= |
α | unit | bool | τ → τ |
ref / pointer |
|
| ⇑τ |
Mutable type |
|
| Ψτ |
Type Scheme |
σ ::= |
τ | ∀α.σ |
All the above syntactic forms can be parenthesized without change in meaning. The let-kind ``-'' is a placeholder for the unkinded (input) let form. A substitution is of Z for Y in X is written using the standard notation: X[Z/Y].
The calculus of *B*′ defines two kinds of locations: stack locations holding unboxed values, and heap locations holding boxed values. Heap locations are first class values, but stack locations are not. This distinction is sufficient to illustrate the two cases that must be handled by the type system: the types of values passed / copied by value (only) need to be copy compatible, but the types of values passed by reference must be strictly equal. More complex rules can be constructed based on these primitive cases.
In this calculus, we make a distinction between two ``kinds'' of let expressions — letψ: monomorphic, possibly mutable definition, and let∀: polymorphic definitions. The two kinds of let expressions have different execution semantics. This distinction is similar to Smith and Volpano's Polymorphic-C [23]. However, unlike Polymorphic-C, let-kind is meta syntax, and is not a part of the input program. The correct kind of let is inferred from the static type information. The kind ``-'' is a placeholder for the unkinded input let expression. We write let to range over letψ and let∀.
Rule |
Pre-conditions |
Evaluation Step |
E-Rval |
S(l) = _v_ |
S; H; l ⇒ S; H; _v_ |
EL-Tq |
S; H; _e_ : τ ⇛ S; H; _e_ |
|
E-Tq |
S; H; _e_ : τ ⇒ S; H; _e_ |
|
E-Let-Tq |
S; H; let _x_ : τ = _e_1 in _e_2 ⇒ S; H; let _x_ = _e_1 in _e_2 |
|
E-App1# |
S; H; _e_1 ⇒ S′; H′; _e_′1 |
S; H; _e_1 _e_2 ⇒ S′; H′; _e_′1 _e_2 |
E-App2# |
S; H; _e_2 ⇒ S′; H′; _e_′2 |
S; H; _v_1 _e_2 ⇒ S′; H′; _v_1 _e_′2 |
E-App |
l ∉ dom(S) |
S; H; λ_x_._e_ _v_ ⇒ S, l ↦ _v_; H; _e_[l/_x_] |
E-If# |
S; H; _e_ ⇒ S′; H′; _e_′ |
S; H; if _e_ then _e_1 else _e_2 ⇒ S′; H′; if _e_′1 then _e_2 else _e_3 |
E-If-True |
S; H; if true then _e_1 else _e_2 ⇒ S; H; _e_1 |
|
E-If-False |
S; H; if false then _e_1 else _e_2 ⇒ S; H; _e_2 |
|
E-Dup# |
S; H; _e_ ⇒ S′; H′; _e_′ |
S; H; dup(_e_) ⇒ S′; H′; dup(_e_′) |
E-Dup |
ℓ ∉ dom(H) |
S; H; dup(_v_) ⇒ S; H, ℓ ↦ _v_; ℓ |
EL-^# |
S; H; _e_ ⇒ S′; H′; _e_′ |
S; H; _e_^ ⇛ S′; H′; _e_′^ |
E-^# |
S; H; _e_ ⇒ S′; H′; _e_′ |
S; H; _e_^ ⇒ S′; H′; _e_′^ |
E-^ |
H(ℓ) = _v_ |
S; H; ℓ^ ⇒ S; H; _v_ |
E-:=lhs# |
S; H; _e_1 ⇛ S′; H′; _e_′1 |
S; H; _e_1 := _e_2 ⇒ S′; H′; _e_′1 := _e_2 |
E-:=rhs# |
S; H; _e_2 ⇒ S′; H′; _e_′2 |
S; H; £ := _e_2 ⇒ S′; H′; £ := _e_′2 |
E-:=Stack |
S, l ↦ _v_1; H; l := _v_2 ⇒ S, l ↦ _v_2; H; () |
|
E-:=Heap |
S; H, ℓ ↦ _v_1; ℓ^ := _v_2 ⇒ S; H, ℓ ↦ _v_2; () |
|
E-Let# |
S; H; _e_1 ⇒ S′; H′; _e_′1 |
S; H; let _x_ = _e_1 in _e_2 ⇒ S′; H′; let _x_ = _e_′1 in _e_2 |
E-Let-M |
l ∉ dom(S) |
S; H; letψ _x_ = _v_1 in _e_2 ⇒ S, l ↦ _v_1; H; _e_2[l/_x_] |
E-Let-P |
S; H; let∀ _x_ = _v_1 in _e_2 ⇒ S; H; _e_2[_v_1/_x_] |
Figure 1.
Dynamic Semantics
7.1 Dynamic Semantics of *B*
Syntax
Stack |
S ::= |
∅ | S, l ↦ _v_ |
Heap |
H ::= |
∅ | H, ℓ ↦ _v_ |
The system state is represented by the triple S; H; _e_ consisting of the stack S, the heap H, and the expression _e_ to be evaluated. Evaluation itself is a two place relationship S; H; _e_ ⇒ S′; H′; _e_′ that denotes transformation in the system state due to a single step of execution. Figure 1 shows the evaluation rules for our core language. We assume that the program is alpha-converted so that there are no name collisions due to inner bindings. Following the theoretical development in [5], we give separate execution semantics for left and right execution (evaluation of expressions that appear on the LHS and RHS of an assignment _e_l := _e_r) denoted by ⇛ and ⇒ respectively.
Since the E-Dup and E-^ rules work only on the heap, we can only capture references to heap cells. Stack locations cannot escape beyond their scope since E-Rval rule performs implicit value extraction from stack locations in rvalue contexts. State updates can be performed either on the stack or on the heap (E-:=Stack and E-:=Heap). The stack is modeled as a pseudo-heap. This enables us to abstract away details such as closure-construction and garbage collection while illustrating the core semantics (they can later be reified independently).
|
|
|
|
|
Figure 2.
Location Semantics
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Figure 3.
Declarative Type Rules
|
|
|
|
|
|
Figure 4.
Copy Coercion Rules
7.2 Static Semantics of *B*
Syntax
Types |
τ ::= |
α | unit | bool | τ → τ |
|
|
| ⇑τ | Ψτ |
Constr. Type |
ρ ::= |
τ | τ \ *C* |
Type Scheme |
σ ::= |
ρ | ∀α.σ |
Constraints |
?c? ::= |
α ≼: τ |
Constraint Sets |
*C* ::= |
∅ | {?c?*} | *C* ∪ *C* |
Binding Environment |
Γ ::= |
∅ | Γ, _x_ ↦ σ |
Store Typing |
Σ ::= |
∅ | Σ, ?L? ↦ τ |
Logical Relations |
Ω ::= |
?true? | ?false? | Ω ∧ Ω | Ω ∨ Ω |
|
|
| ¬Ω | ?Predicate??(?Ω*?)? |
Solvable Entities |
ω ::= |
τ | Γ | Σ | _e_ |
We represent mathematical properties as: assumption ⊧property subject.
Figure 4 shows the copy coercion rules. The location semantics (lvalue) rules are shown in figure 2. Figure 3 shows the declarative type rules, rules for type generalization. The standard type judgment Γ; Σ ⊢ _e_ : τ is understood as: given a binding environment Γ and store typing Σ, the expression _e_ has type τ.
We use the following shorthand notation:
〚Γ; Σ ⊢ _e_ ≼: τ〛 ↝ 〚Γ; Σ ⊢ _e_ : τ′, τ ≼: τ′〛
Figure 3 shows a generalized declarative type system and associated rules for type generalization. The type judgment Γ; Σ ⊢ _e_ : τ is understood as: given a binding environment Γ and store typing Σ, the expression _e_ has type τ.
Definition 1: (Algebraic equivalences)
In our algebra of types, we define the following equivalence: ΨΨτ ≡ Ψτ. That is, the mutable type constructor is idempotent.
Definition 2: (Copy Compatibility)
We define the copy compatibility relationship (≅) as follows:
|
|
|
In terms of the copy-coercion rules shown in Figure 3, we can define copy compatibility as:
τ1 ≅ τ2 ≡ τ1 ≼: ▽(τ2)
The S-Ref rule ensures that copy compatibility does not extend beyond a ref-boundary. Since two function types are equal regardless of the shallow mutability of the argument and return positions, we write all function types in normalized form. The (contravariant) argument type is written in the maximally immutable form (devoid of shallow mutability), and the (covariant) return type is written in the maximally mutable form. This ensures that the ``outer'' type of a function is maximally permissive with respect to mutability. The S-Fn rule therefore is invariant in terms of its arguments and return types. [This normalization is different from the type displayed to the user, which is discussed in section 6.]
Definition 3: (Max and Min Mutability)
The operators △ and ▽ increase or decrease the mutability of a type, and are defined as:
△(Ψτ) = Ψτ and △(τ) = Ψτ, where τ ≠ Ψτ′
▽(Ψτ) = τ and ▽(τ) = τ, where τ ≠ Ψτ′
It is obvious that ∀τ.▽(τ) ≅ τ ≅ △(τ), and ∀τ, τ′.τ ≅ τ′ iff ▽(τ) = ▽(τ′) iff △(τ) = △(τ′).
Definition 4: (Structural Containment)
We define a structural containment relation τ ∊ ω as follows. τ ∊ τ′ if τ is structurally present as a part of τ′. τ ∊ _e_ if τ is structurally present as a part of _e_, as a type annotation. τ ∊ Γ if ∃ _x_ ↦ τ′ ∊ Γ such that τ ∊ τ′. τ ∊ Σ if ∃ ?L? ↦ τ′ ∊ Σ such that τ ∊ τ′. We write τ ∊ ω* if τ ∊ ω, for any ω ∊ {ω*}.
Definition 5: (Free Type Variables)
We denote the set of free type variables in a type τ as ?ftv?(τ).
?ftv?(α) = α
?ftv?(unit) = {}
?ftv?(bool) = {}
?ftv?(⇑τ) = ?ftv?(τ)
?ftv?(Ψτ) = ?ftv?(τ)
?ftv?(τ1 → τ2) = ?ftv?(τ1) ∪ ?ftv?(τ2)
?ftv?(τi*) = ∪ ?ftv?(τi)
?ftv?(σ) = ?ftv?(α*) ∪ ?ftv?(τ), where σ = ∀α*.τ
?ftv?(Γ) = ∪ ?ftv?(σi), ∀ _x_ ↦ σi ∊ Γ
?ftv?(Σ) = ∪ ?ftv?(τi), ∀ ?L? ↦ τi ∊ Σ
?ftv?(_e_) = ∪ ?ftv?(τi), ∀ τi ∊ _e_
?ftv?(ω*) = ∪ ?ftv?(ω).
Definition 6: (Value Restriction)
We define some definitions used in the enforcement of value restriction in Figure 3.
?Value?(_v_) = ?true?
?Value?(_x_) = ?true?
?Value?(ℓ) = ?true?
?Value?(l) = ?true?
?Value?(_e_ : τ) = ?Value??(?_e_?)?
?Value?(dup(_e_)) = ?Value??(?_e_?)?
?Value?(_e_^) = ?Value??(?_e_?)?
?Value??(?if _e_1 then _e_2 else _e_3?)?
= ?Value??(?_e_1?)? ∧ ?Value??(?_e_2?)? ∧ ?Value??(?_e_3?)?
?Value?(let _x_ = _e_1 in _e_2) = ?Value??(?_e_1?)? ∧ ?Value??(?_e_2?)?
?Value?(_e_ (otherwise) ) = ?false?
?Expansive?(_e_) = ¬?Value??(?_e_?)?
?Immutable?(unit) = ?true?
?Immutable?(bool) = ?true?
?Immutable?(τ1 → τ2) = ?true?
?Immutable?(⇑τ) = ?Immutable??(?τ?)?
?Immutable?(τ (otherwise) ) = ?false?
?Immut?(unit) = ?true?
?Immut?(bool) = ?true?
?Immut?(α) = ?true?
?Immut?(τ1 → τ2) = ?true?
?Immut?(⇑τ) = ?Immut??(?τ?)?
?Immut?(∀α*.τ) = ?Immut??(?τ?)?
?Immut?(τ (otherwise) ) = ?false?
?Mut?(τ) = ¬?Immut??(?τ?)?
Definition 7: (Stack and Heap Typing)
A heap H and a stack S are said to be well typed with respect to a binding context Γ and store typing Σ, and written Γ; Σ ⊢ H + S if
-
dom(Σ) = dom(H) ∪ dom(S)
-
∀ℓ ∊ dom(H), Γ; Σ ⊢ H(ℓ) ≼: Σ(ℓ)
-
∀l ∊ dom(S), Γ; Σ ⊢ S(l) ≼: Σ(l)
Lemma 1: (Inversion of Typing Relation)
-
If Γ; Σ ⊢ () : τ then τ = unit.
-
If Γ; Σ ⊢ true : τ then τ = bool.
-
If Γ; Σ ⊢ false : τ then τ = bool.
-
If Γ; Σ ⊢ ℓ : τ then τ = ⇑τ′.
-
If Γ; Σ ⊢ λ_x_._e_ : τ then τ = τ′1 → τ′2 such that τ′1 = ▽(τ1), τ′2 = △(τ2), and, Γ, _x_ ↦ τ1; Σ ⊢ _e_ : τ2.
-
If Γ; Σ ⊢ _e_^ : τ then Γ; Σ ⊢ _e_ ≼: τ′.
-
Other cases are similar.
Proof 1
Immediate from the definition of typing relation.
Lemma 2: (Inversion of Copy Coercion)
-
If τ ≼: bool then τ = bool or τ = Ψbool.
-
If τ ≼: unit then τ = unit or τ = Ψbool.
-
If τ ≼: τ1 → τ2 then τ = τ1 → τ2 or τ = Ψ(τ1 → τ2).
-
If τ ≼: ⇑τ′ then τ = ⇑τ′ or τ = Ψ⇑τ′.
-
If τ ≼: Ψτ′ then τ = Ψτ″ such that τ″ ≼: τ′.
-
If τ ≼: Ψτ′ then τ ≼: τ′.
Proof 1
By induction on the copy coercion derivation.
Lemma 3: (Canonical Forms)
-
If _v_ is a value, and Γ; Σ ⊢ _v_ ≼: unit, then, _v_ is ().
-
If _v_ is a value, and Γ; Σ ⊢ _v_ ≼: bool, then, _v_ is either true or false.
-
If _v_ is a value, and Γ; Σ ⊢ _v_ ≼: ⇑τ, then, _v_ is ℓ, ℓ ∊ dom(Σ).
-
If _v_ is a value, and Γ; Σ ⊢ _v_ ≼: τ1 → τ2, then, _v_ is λ_x_._e_.
Proof 1
By induction on the derivation of Γ; Σ ⊢ _v_ ≼: τ.
If Γ; Σ ⊢ _v_ ≼: bool, we have Γ; Σ ⊢ _v_ : τ and τ ≼: bool by Inversion of copy coercion relation, τ = bool or τ = Ψbool. If τ = bool, it is clear that the final rule in the derivation must be T-True, or T-False, in which case the result is immediate. The case τ = Ψbool cannot happen because there is no rule that derives a mutable type for a value, and we assume that the induction hypothesis Γ; Σ ⊢ _v_ : τ holds.
Other cases of the lemma are similar.
Lemma 4: (Progress)
If _e_ is a closed, well typed term, that is, ∅; Σ ⊢ _e_ : τ for some τ and Σ, given any heap H and stack S such that Γ; Σ ⊢ H + S,
-
If ⊢lval _e_, then _e_ is either a valid lvalue £ (that is, £ = l, l ∊ dom(S) or £ = ℓ^, ℓ ∊ dom(H)) or else ∃ _e_′, S′, H′ such that:S; H; _e_ ⇛ S′; H′; _e_′.
-
_e_ is a value _v_ or else ∃ _e_′, S′, H′ such that S; H; _e_ ⇒ S′; H′; _e_′.
Proof 1
By induction on the typing derivation.
-
Case T-Unit, T-True, T-False, T-Hloc, T-Lambda: (Values) Result is immediate for right execution, and cannot happen for right execution
-
Case T-Id: cannot happen, there is no execution rule for variables.
-
Case T-Sloc: Immediate for left execution. Right execution and can always continue with E-Rval rule as the stack is well typed (Γ; Σ ⊢ H + S).
-
Case T-App: Only right execution is possible, no application is well typed as an lvalue. We have: _e_ = _e_1 _e_2, _e_1 ≼: τ1 → τ2, and _e_2 ≼: τ1. If _e_1 or _e_2 is not a value, we can take E-App1# or E-App2#. Otherwise, when both _e_1 and _e_2 are values, by canonical forms lemma, _e_1 is of the form λ_x_._e_′, and we can take the step E-App.
-
Case T-If: Similar to T-App, only right execution is permitted.
-
Case T-Set: Only right execution is applicable. We have: _e_ = _e_1 := _e_2, _e_1 ≼: τ1 → τ2, Γ; Σ ⊢ _e_1 ≼: Ψτ, Γ; Σ ⊢ _e_2 ≼: τ, and ⊢lval _e_1. If _e_1 not an lvalue, since we have ⊢lval _e_1 we can take E-:=lhs# by induction hypothesis. Similarly, if _e_2 is not a value, we can take E-:=rhs#. Finally, if _e_1 = £ and _e_1 = _v_, we can take the step E-:=Stack or E-:=Heap as applicable.
-
Case T-Dup: Only right execution is permitted, and can take E-Dup# or E-Dup as applicable.
-
Case T-Deref: We have: _e_ = _e_1^, and Γ; Σ ⊢ _e_1 ≼: ⇑τ. Execution can take EL-^# or E-^# as applicable if _e_1 is not a value. If _e_1 ≼: ⇑τ is a value, then, from the canonical forms lemma, _e_1 = ℓ, ℓ ∊ dom(Σ). Now, since this is an lvalue, we are done in the case of left execution. In the case of right execution, we can take step E-^ since the heap is well typed (Γ; Σ ⊢ H + S).
-
Case T-Let-M: Only right execution is applicable. We have: _e_ = (letψ _x_ = _e_1 in _e_2), τ ≼: τ1, Γ; Σ; _e_1 ⊢gen τ ⋖ σ, and ⊢loc _x_ : σ, andΓ, _x_ ↦ σ; Σ ⊢ _e_2 : τ2. If _e_1 is not a value, we can take E-Let#. Otherwise, we can take E-Let-M.
-
Case T-Let-P: Only right execution is applicable. We have: _e_ = (letψ _x_ = _e_1 in _e_2), τ ≼: τ1, Γ; Σ; _e_1 ⊢gen τ ⋖ σ, and ⊢term _x_ : σ, andΓ, _x_ ↦ σ; Σ ⊢ _e_2 : τ2. If _e_1 is not a value, we can take E-Let#. Otherwise, can take E-Let-P.
-
Case T-TqExpr and Case T-Let-M-Tq, T-Let-P-Tq are similar.
Lemma 5: (Weakening)
We will write Γ; Σ ⊢ _e_ : τ ⋖ σ as a shorthand for Γ; Σ ⊢ _e_ : τ, and Γ; Σ; _e_ ⊢gen τ ⋖ σ.
-
If Γ; Σ ⊢ _e_ : τ then,
-
If Γ′ ⊇ Γ then Γ′; Σ ⊢ _e_ : τ.
-
If Σ′ ⊇ Σ then Γ; Σ′ ⊢ _e_ : τ.
-
-
If Γ; Σ ⊢ _e_ : τ ⋖ σ, σ = ∀α*.τ
-
If Γ′ ⊇ Γ and ?ftv?(Γ′) ∩ ?ftv?(α*) = ∅ then Γ′; Σ ⊢ _e_ : τ ⋖ σ.
-
If Σ′ ⊇ Σ and ?ftv?(Σ′) ∩ ?ftv?(α*) = ∅ then Γ; Σ′ ⊢ _e_ : τ ⋖ σ.
-
Proof 1
Straightforward induction on the typing derivation.
Lemma 6: (Value Substitution)
If Γ, _x_ : σ; Σ ⊢ _e_ : τ, ?Immut??(?σ?)? and Γ; Σ ⊢ _v_ : τv, and Γ; Σ; _e_ ⊢gen τv ⋖ σ then Γ; Σ ⊢ _e_[_v_/_x_] : τ
Proof 1
By induction on the typing derivation of Γ, _x_ : σ; Σ ⊢ _e_ : τ. We proceed by case analysis on the final step of the derivation.
-
Case T-Id: We have _e_ = _y_, where _y_ ∊ Γ, _x_, σ.
There are two sub cases to consider. If _x_ = _y_, then, _y_[_v_/_x_] = _v_, and the result type τ is an instantiation of the type scheme σ. One of the assumptions of the lemma states that Γ; Σ ⊢ _v_ : τv ⋖ σ. That is, τ ⊑ σ, and we can infer any more-specific type (and in particular the type being instantiated at the T-Id rule) instead for this substitution of the expression _v_. Therefore, we have Γ; Σ ⊢ _e_[_v_/_x_] : τ.
If _x_ ≠ _y_, then _y_[_v_/_x_] = _y_, and the result is immediate.
-
Case T-Lambda: We have _e_ = λ_y_._e_′, and τ = τ1 → τ2, and Γ, _x_ : σ, _y_ : τ1; Σ ⊢ _e_′ : τ2.
We can assume that _x_ ≠ _y_. Since it is clear that the type τ1 of _y_ can either use variables already in Γ or fresh type variables, we know that ?ftv?(Γ, _y_ : τ1) ∩ ?ftv?(σ) = ∅. Thus, by weakening lemma, we have: Γ, _y_ : τ1; Σ ⊢ _v_ : τv ⋖ σ , and, by induction hypothesis, Γ, _y_ : τ1; Σ ⊢ _e_′[_v_/_x_] : τ2. Finally, by the T-Lambda rule, we have: Γ; Σ ⊢ λ_x_y.(_e_′[_v_/_x_]) : τ1 → τ2, and thus Γ; Σ ⊢ λ_x_y._e_′[_v_/_x_] : τ1 → τ2, which is the desired result.
-
T-Set case is similar, except that the substitution cannot happen on the LHS of an assignment, since we do not perform substitution of mutable values.
-
Other cases are similar.
Lemma 7: (Location Substitution)
If Γ, _x_ : τ0; Σ ⊢ _e_ : τ, and for some Σ′ ⊇ Σ, Σ(l) = τ0 : , then Γ; Σ′ ⊢ _e_[l/_x_] : τ.
Proof 1
By induction on the typing derivation of Γ, _x_ : τ; Σ ⊢ _e_ : τ, similar to lemma 6.
Lemma 8: (Stack and Heap Assignment)
-
If Γ; Σ ⊢ H, ℓ ↦ _v_ + S, and Σ(ℓ) ≼: τ, and Γ; Σ ⊢ _v_′ : τ, then, Γ; Σ ⊢ H, ℓ ↦ _v_′ + S.
-
Similarly, if Γ; Σ ⊢ H + S, l ↦ _v_ and Σ(l) ≼: τ, and Γ; Σ ⊢ _v_′ : τ, then, Γ; Σ ⊢ H + S, l ↦ _v_′.
Proof 1
Immediate from the definition of stack and heap typing.
Lemma 9: (Preservation)
If Γ; Σ ⊢ _e_ : τ and Γ; Σ ⊢ H + S then,
-
If S; H; _e_ ⇛ S′; H′; _e_′, then, there exists a Σ′ ⊇ Σ such that Γ; Σ′ ⊢ _e_′ : τ and Γ; Σ′ ⊢ H′ + S′.
-
If S; H; _e_ ⇒ S′; H′; _e_′, there exists a Σ′ ⊇ Σ such that Γ; Σ′ ⊢ _e_′ ≼: τ′, Γ; Σ′ ⊢ H′ + S′ and ▽(τ) = ▽(τ′).
Proof 1
By induction on the derivation of Γ; Σ ⊢ _e_ : τ. We proceed by the case analysis of the final step.
-
Case T-Id, T-True, T-False, T-Hloc, T-Lambda cannot happen.
-
Case T-Sloc: Only right execution is applicable. We have: _e_ = l and Σ(l) : τ. The only applicable step is E-Rval, and we have _e_′ = S(l). From the definition of stack typing, we have: S(l) ≼: Σ(l) and thus _e_′ ≼: τ which implies ▽(τ) = ▽(τ′).
-
Case T-App: _e_ = _e_1 _e_2, and Γ; Σ ⊢ _e_1 ≼: τ2 → τ0, and Γ; Σ ⊢ _e_2 ≼: τ2, and τ0 ≼: τ, and _e_ : τ where τ2 = ▽(τ′2)τ0 = △(τ′0) and .
This cannot happen for left execution. For right execution, we proceed by further case analysis of the applicable execution rules for S; H; _e_ ⇒ S′; H′; _e_′.
-
Case E-App1#: We have: S; H; _e_1 ⇒ S′; H′; _e_′1 and _e_′ = _e_′1 _e_2. By induction hypothesis, we have: Γ; Σ′ ⊢ _e_′1 ≼: τ2 → τ0 for some Σ′ ⊇ Σ. One of the assumptions of the T-App rule states that Γ; Σ ⊢ _e_2 ≼: τ2, and by weakening lemma, we have, Γ; Σ′ ⊢ _e_2 ≼: τ2. Finally, by the T-App rule, we conclude that (_e_′1 _e_2) : τ.
-
Case E-App2#: Similar to the previous sub-case.
-
Case E-App: We have: _e_1 = λ_x_._e_0 and _e_2 = _v_ and and _e_′ = _e_0[l/_x_] and S, l ↦ _v_; H; _e_1 ⇒ S′; H′; _e_′1.
By the inversion lemma for λ_x_._e_0 we have Γ, Σ ⊢ _e_0 : τ′0.
Further from location substitution lemma, we have Γ; Σ′ ⊢ _e_0[l/_x_] : τ′0 where Σ′ ⊇ Σ and Σ(l) : τ′2.
Thus, we have τ0 ≼: τ′0 and τ0 ≼: τ. Therefore, it is clear that ▽(τ′0) = ▽(τ).
-
-
Case T-Set: _e_ = _e_1 := _e_2, and Γ; Σ ⊢ _e_1 ≼: Ψτ, and Γ; Σ ⊢ _e_2 ≼: τ ⊢lval _e_1
If the step taken is E-:=#lhs or E-:=#rhs, the result follows from the induction hypothesis and T-Set rule (as in the case of T-App). If the step taken is E-:=Stack or E-:=Heap, the result follows from the stack and heap assignment lemma.
-
Case T-Deref: We have: _e_ = _e_′^ and Γ; Σ ⊢ _e_′ ≼: ⇑τ. If the step taken is EL-^# or E-^#, the result follows from induction hypothesis and T-Deref rule. If the step taken is E-^ (right execution only) _e_′ is a value, and from canonical forms lemma, we know that _e_′ = ℓ and ℓ ∊ dom(Σ) and the result follows from the fact that Γ; Σ ⊢ H + S.
-
Case T-Let-P: Right execution only. We have: _e_ = (let∀ _x_ = _e_1 in _e_2) and Γ; Σ ⊢ _e_1 ≼: τ1 and τ ≼: τ1 and Γ; Σ; _e_1 ⊢gen τ ⋖ σ and Γ, _x_ ↦ σ; Σ ⊢ _e_2 : τ2. There are two sub-cases to consider:
-
If we take step E-Let#, S; H; _e_1 ⇒ S′; H′; _e_′1 and _e_′ = (let∀ _x_ = _e_′1 in _e_2). If _e_ = _v_ It is clear that ?Value??(?_e_1?)? implies ?Value??(?_e_′1?)?. Now, the result follows from the induction hypothesis and the E-Let-P rule.
-
If we take the step E-Let-P, _e_ = (let∀ _x_ = _v_ in _e_2) Since _x_ : σ has a polymorphic type, (that is, σ = ∀α*.τ) we know that ?Immut??(?τ?)?. Also, from canonical forms lemma, all values have an immutable type. Therefore, τ = τ1. Now, the result follows from value substitution lemma.
-
-
Case T-Let-M: Similar to T-Let-P, except that we should always use the GEN-EXPANSIVE rule during generalization, and use the location substitution lemma instead of the value substitution lemma
-
Cases T-If, T-Dup, T-TqExpr, and T-Let-M-Tq T-Let-P-Tq are similar.
Definition 8: (Stuck State)
A system state S; H; _e_ is said to be stuck if _e_ ≠ _v_ and there are no S′, H′, and _e_′ such that S; H; _e_ ⇒ S′; H′; _e_′.
Theorem 1: (Type Soundness)
If ∅; Σ ⊢ _e_ : τ and Γ; Σ ⊢ H + S and S; H; _e_ ⇒* S′; H′; _e_′ then S′; H′; _e_′ is not stuck. That is, execution of a well typed expression cannot lead to a stuck state. Here, ⇒* represents the reflexive-transitive-closure of ⇒.
Proof 1
By straightforward induction on the length of ⇒*. If _e_ = _v_, proof is immediate. Otherwise, from Lemma 4 (Progress), we know that we can take at least one step forward. Further, from Lemma 9 (Preservation), we know that a (left/right) execution of a well typed expression in with respect to a well typed stack and heap will always result in another well typed expression, stack and heap. Proof now follows from induction hypothesis.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Figure 5.
Intermediate Declarative System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Figure 6.
Type Inference Rules.
|
|
|
|
|
|
|
|
|
|
Figure 7.
Unification Rules.
|
|
|
|
|
|
|
|
|
|
Figure 8.
Solving Copy Compatibility Constraints.
7.3 Heuristic Type Inference
Syntax
Types |
τ ::= |
α | unit | bool | ⌊τ⌋ → ⌈τ⌉ |
|
|
| ⇑τ | Ψτ |
Maybe Type |
|
| τ↓τ′ |
Unct. Types |
ς ::= |
α | unit | bool | ⌊ς⌋ → ⌈ς⌉ |
|
|
| ⇑ς | Ψς |
Type Scheme |
σ ::= |
τ | ∀α.σ |
Constraints |
?c? ::= |
τ = τ | τ ≅ τ | τ ≼: τ |
Constraint Sets |
*C* ::= |
∅ | {?c?*} | *C* ∪ *C* |
Substitutions |
θ ::= |
〈〉 | [α ↣ τ] | [κ ↣ ϰ] | θ ∘ θ |
The application of a substitution θ on X is written as θ〈X〉. As a matter of notational convenience, we write: θa.b to mean θa ∘ θb. Note that θa.b〈x〉 = θa ∘ θb〈x〉 = θa〈θb〈x〉〉 = θb〈θa〈x〉〉.
Definition 9: (Meta Constructors)
⌊τ⌋ and ⌈τ⌉ are "meta-constructors" which (respectively) minimize and maximize the mutability of a type, but are interpreted lazily. The meta-constructors are idempotent.
Note that in our type system, we have restricted meta types to be syntactically present only as part of function types.
Definition 10: (FTVs (Extension))
We enhance the definition of ?ftv?(...) in Definition 5 as follows:
...
?ftv?(⌊τ1⌋ → ⌈τ2⌉) = ?ftv?(τ1) ∪ ?ftv?(τ2)
?ftv?(τ1 = τ2) = ?ftv?(τ1) ∪ ?ftv?(τ2)
?ftv?(τ1 ≅ τ2) = ?ftv?(τ1) ∪ ?ftv?(τ2)
?ftv?(τ1 ≼: τ2) = ?ftv?(τ1) ∪ ?ftv?(τ2)
?ftv?(*C*) = ∪ ?ftv?(?c?i), ∀ ?c?i ∊ *C*
?ftv?(ρ) = ?ftv?(τ) ∪ ?ftv?(*C*), where ρ = τ \ *C*
?ftv?(σ) = ?ftv?(α*) ∪ ?ftv?(ρ), where σ = ∀α*.ρ
?ftv?(τ1↓τ2) = ?ftv?(τ1) ∪ ?ftv?(ℑ(τ2)).
Definition 11: (MTVs)
The function ?mtv?(ω) = α* is defined to be the set of all type variables within the solvable entity ω. That is, it returns the set of all type-variables α* where α occurs within a maybe type as α↓τh.
?mtv?(α) = {}
?mtv?(unit) = {}
?mtv?(bool) = {}
?mtv?(α↓τh) = α ∪ ?mtv??(?τh?)?
?mtv?(τ↓τh) = ?mtv??(?τ?)?, where τ ≠ α
?mtv?(⇑τ) = ?mtv??(?τ?)?
?mtv?(Ψτ) = ?mtv??(?τ?)?
?mtv?(τ1 → τ2) = ?mtv??(?τ1?)? ∪ ?mtv??(?τ2?)?
?mtv?(Γ) = ∪ ?mtv??(?τi?)?, ∀ _x_i ↦ τi ∊ Γ
?mtv?(Σ) = ∪ ?mtv??(?τi?)?, ∀ ?L?i ↦ τi ∊ Σ
?mtv?(_e_) = ∪ ?mtv??(?τi?)?, ∀ τi ∊ _e_
?mtv?(ω*) = ∪ ?mtv??(?ω?)?
Definition 12: (NTVs)
The set of unconstrained variables is defined as:
?ntv?(α) = {α}
?ntv?(unit) = {}
?ntv?(bool) = {}
?ntv?(α↓τh) = ?ntv??(?τh?)?
?ntv?(τ↓τh) = ?ntv??(?τ?)?, where τ ≠ α
?ntv?(⇑τ) = ?ntv??(?τ?)?
?ntv?(Ψτ) = ?ntv??(?τ?)?
?ntv?(τ1 → τ2) = ?ntv??(?τ1?)? ∪ ?ntv??(?τ2?)?
?ntv?(Γ) = ∪ ?ntv??(?τi?)?, ∀ _x_i ↦ τi ∊ Γ
?ntv?(Σ) = ∪ ?ntv??(?τi?)?, ∀ ?L?i ↦ τi ∊ Σ
?ntv?(_e_) = ∪ ?ntv??(?τi?)?, ∀ τi ∊ _e_
?ntv?(ω*) = ∪ ?ntv??(?ω?)?
Definition 13: (TVs)
The set of all type variables in a solvable entity is given by TV() function, defined as follows:
?tv?(ω) = ?mtv??(?ω?)? ∪ ?ntv??(?ω?)?
?tv?(ω*) = ∪ ?tv??(?ω?)?
Note that this function is different from FTV(), defined in Definition 10.
Definition 14: (Constraint Set Extraction)
|
|
|
|
|
|
|
Definition 15: (Consistency of Maybe types)
We identify the following consistency property on maybe types.
|
Definition 16: (Constraint Set Closure)
A closure operation ?close??(?*C*?)? on a constraint set *C* produces an equivalent set of atomic constraints by using the copy coercion rules defined in Figure 3 (note that this conversion is total); and by explicitely adding all transitive relationships. This conversion does not affect the MPCs in *C*.
-
?close?({τ1 → τ2 ≼: τ′1 → τ′2} ∪ *C*) = ?close??(?*C* ∪ {τ1 ≼: τ′1, τ2 ≼: τ′2, τ′1 ≼: τ1, τ′2 ≼: τ2}?)?
-
?close?({⇑τ ≼: ⇑τ′} ∪ *C*) = ?close??(?*C* ∪ {τ ≼: τ′, τ′ ≼: τ}?)?
-
?close?({Ψτ ≼: Ψτ′} ∪ *C*) = ?close??(?*C* ∪ {τ ≼: τ′, τ′ ≼: τ}?)?
-
?close??(?{τ1 ≼: τ2, τ2 ≼: τ3} ∪ *C*?)? where τ1 ≼: τ3 ∉ *C*
= ?close??(?{τ1 ≼: τ2, τ2 ≼: τ3, τ1 ≼: τ3} ∪ *C*?)? -
?close??(?*C*?)? where none of the above cases are applicable = *C*
Definition 17: (Normalization of Constraint Sets)
The normalization function Ν(*C*) is used to obtain a normalized form of a constraint set *C* by re-writing it only in terms of closed subtype constraints.
Ν(*C*) = ?close??(?*C*′?)? where *C*′ is obtained from *C* using the normalizing translations:
-
〚τ1 = τ2〛 ↝ 〚τ1 ≼: τ2, τ2 ≼: τ1〛
-
〚τ1 ≅ τ2〛 ↝ 〚τ1 ≼: α, τ2 ≼: α〛, where ⊧new α.
Definition 18: (Canonical forms of Solvable Entities)
For any ω such that 〈〉 ⊧sat {|ω|}, we write ω to represent the equivalent entity in which (1) all meta-constructors in are fully interpreted, and (2) the (tautological) constraints embedded within ω are removed to obtain an equivalent unconstrained entity.
For example, if τ = Ψbool↓bool, then τ = Ψbool.
Definition 19: (Constraint Satisfiability)
A Substitution θ satisfies a constraint set *C*, written θ ⊧sat *C* iff Ν(θ〈*C*〉) consists only of tautologies (trivially).
Definition 20: (Consistency and Acyclicity)
|
|
|
|
|
|
|
Definition 21: (Type Inference)
Type inference is a program transformation that accepts a program in which let expressions are not annotated with their kinds, and returns the same programs in which let expressions are annotated with their kinds and all expressions are annotated with their types.
The type inference algorithm is as shown in Figure 6. The inference judgment Γ; Σ ⊢i _e_ : τ ∥ θ should be understood as: given the binding context Γ and the store typing Σ, we infer the type τ for the expression _e_. θ is list of substitutions obtained by unifications performed during inference, and must be propagated to further derivations. The judgment ⊧new α* identifies new type variables.
We use the following shorthand translations as a notational convenience. The translations defined in section 7.2 are repeated here.
-
〚Γ; Σ ⊢ _e_ ≼: τ〛 ↝ 〚Γ; Σ ⊢ _e_ : τ′, τ ≼: τ′〛
-
〚Γ; Σ ⊢i _e_ : τ〛 ↝ 〚Γ; Σ ⊢i _e_ : τ ∥ 〈〉〛
-
〚θ0 ∥ Γ; Σ ⊢i _e_ : τ ∥ θ〛 ↝ 〚θ0〈Γ〉; θ0〈Σ〉 ⊢i θ0〈_e_〉 : τ ∥ θ〛
-
〚θ ∥ Γ; Σ ⊢ _e_ : τ〛 ↝ 〚θ〈Γ〉; θ〈Σ〉 ⊢ θ〈_e_〉 : θ〈τ〉〛
-
〚θ ∥ Γ; Σ ⊢j _e_ : τ〛 ↝ 〚θ〈Γ〉; θ〈Σ〉 ⊢j θ〈_e_〉 : θ〈τ〉〛
-
〚θ ∥ Γ; Σ ⊢⋄ _e_ : τ〛 ↝ 〚θ〈Γ〉; θ〈Σ〉 ⊢ θ〈_e_〉 : θ〈τ〉〛
-
〚θ{...}〛 ↝ 〚θ〈{...}〉〛
-
〚θ{|...|}〛 ↝ 〚θ〈{|...|}〉〛
Unification rules are as shown in Figure 8. The unification judgment θ ⊢unf τ1 = τ2 is understood as: τ1 unifies with τ2 under the substitution θ.
A constraint solver for solving copy compatibility constraints at let-boundaries is defined in Figure 7. The judgment θ; _x_; _e_ ⊢solve τ1 :≫ τ2 should be read as: the (possibly) constrained type τ1 for the identifier _x_ (possibly) used in the expression _e_ is transformed to the unconstrained type τ2 by solving all the copy compatibility constraints.
We prove the soundness of the inference system, through an intermediate declarative system defined in Figure 5
Lemma 10: ( ≼: implies ⊴:)
If τ ≼: τ′, then τ ⊴: τ′.
Proof 1
Evident from the definition of ⊴: in Figure 5. Both ≼: and ⊴: are partial functions on types, but ⊴: clearly covers all cases that ≼: relates.
Lemma 11: (⊴: begets ≼: )
If τ ⊴: τ′, then ∃ θ such that θ〈τ〉 ≼: θ〈τ′〉.
Proof 1
By construction of θ. One possible solution is θ = ∀ α↓τh ∊ τ , [α ↣ ℑ(τh)].
Lemma 12: (Consistency of ⊴:)
If τ ⊴: τ′ and ⊪ {τ, τ′}, then ∀ θ such that θ ⊧sat {|τ|}, we have θ〈τ〉 ≼: θ〈τ′〉.
Proof 1
By straightforward induction over the derivation of τ ⊴: τ′, with case analysis over the definition of ⊴: in Figure 5. A normalizing derivation with no redundant applications of reflexive and transitive rules must be considered.
Lemma 13: (Weakening of Satisfiability and Consistency)
Weakening of properties over solvable entities:
-
If θ ⊧sat *C* and *C*′ ⊆ *C*, then θ ⊧sat *C*′.
-
If ⊧cst *C* and *C*′ ⊆ *C*, then ⊧cst *C*′.
-
If ⊪ {ωn*}, and {ωm*} ⊆ {ωn*}, then ⊪ {ωm*}.
-
If ⊪ {ω*}, then ⊧cst {|ω*|}.
Lemma 14: (Satisfied constraints are consistent)
If θ ⊧sat {|ω*|}, then ⊪ θ{ω*}.
Proof 1
Due to assumption (3), since θ satisfies all constraints in ω*, we must have ?mtv??(?θ〈ω*〉?)? = ∅. That is, there exists no α↓τ ∊ θ〈ω*〉. Further, we must also have ∀ τ↓τ′ ∊ θ〈ω*〉, τ ≅ ℑ(τ′). Now, from Definition 15, we obtain ⊪ θ{ω*}.
Lemma 15: (Substitution on Declarative Derivation)
If Γ; Σ ⊢ _e_ : τ then θ ∥ Γ; Σ ⊢ _e_ : τ.
Proof 1
Straightforward induction on the derivation of Γ; Σ ⊢ _e_ : τ, except for the fact that we should use appropriate α-renaming on generalized variables ∀σ ∊ Γ, so that generalized variables do not get substituted.
Lemma 16: (Consistency of Intermediate Derivation)
If Γ; Σ ⊢j _e_ : τ, then ⊪ {Γ, Σ, _e_, τ}.
Proof 1
Lemma 17: (Substitution Consistency of Maybe types)
If
-
⊪ {ωn*}
-
{ωm*} ⊆ {ωn*}
-
θ is a substitution such that ⊪ {θ〈ωm*〉}
-
dom(θ) ∩ ?tv??(?ωn*?)? = ?tv??(?ωm*?)?
Then, ⊪ θ{ωn*}.
Proof 1
-
Due to assumption (1), and Definition 15, we have
-
∀ α, τ, τ′ such that τ↓τ′ ∊ ωn*, τ = α or τ ≅ ℑ(τ′)
-
∀ α↓τ and α↓τ′ ∊ ωn*, τ = τ′
-
?mtv??(?ωn*?)? ∩ ?ntv??(?ωn*?)? = ∅
-
-
Due to assumption (3), Definition 15, we have
-
∀ α, τ, τ′ such that τ↓τ′ ∊ θ〈ωm*〉, τ = α or τ ≅ ℑ(τ′)
-
∀ α↓τ and α↓τ′ ∊ θ〈ωm*〉, τ = τ′
-
?mtv??(?θ〈ωm*〉?)? ∩ ?ntv??(?θ〈ωm*〉?)? = ∅
-
-
Due to assumption (4), we can write θ = θ1 ∘ θ2 ∘ θ3 such that dom(θ1) = ?mtv??(?ωm*?)?, dom(θ2) = ?ntv??(?ωm*?)?, and dom(θ3) ∩ ?tv??(?ωn*?)? = ∅.
-
From cases (2.a, 2.b, 2.c, 1.a, and 3), we can conclude that ∀ α, τ, τ′ such that τ↓τ′ ∊ θ〈ωn*〉, τ = α or τ ≅ ℑ(τ′)
-
From cases (2.b, 1.b, and 3), we conclude that ∀ α↓τ and α↓τ′ ∊ θ〈ωn*〉, τ = τ′.
-
From cases (2.c, 1.c, and 3), we conclude that ?mtv??(?θ〈ωn*〉?)? ∩ ?ntv??(?θ〈ωn*〉?)? = ∅
-
From cases (4, 5, 6) and Definition 15, we obtain ⊪ θ{ωn*}.
Lemma 18: (Partial solutions)
If
-
⊪ {ωn*}
-
{ωm*} ⊆ {ωn*}
-
θm ⊧sat {|ωm*|}
-
dom(θm) ∩ ?tv??(?ωn*?)? = ?tv??(?ωm*?)?
Then,
-
⊪ θm{ωn*}.
-
∃ θn such that θn ⊧sat θm{|ωn*|}, and θn.m{ωm*} = θm{ωm*}.
Proof 1
-
From assumption (3) and Lemma 14, we conclude that ⊪ θm{ωm*}.
-
From assumptions (1 and 2), case (1), assumptions (4) and Lemma 17, we obtain ⊪ θm{ωn*}, the required conclusion (1).
-
Conclusion (2) can be proved by construction of θn. One possibility is θn = [α ↣ ℑ(τ)], ∀ α↓τ ∊ θm〈ωn*〉 .
-
Clearly, θn ⊧sat θm{|ωn*|}.
-
We know that dom(θ) = ?mtv??(?θm{|ωn*|}?)?.
-
Due to assumption (3), we must have ?mtv??(?θm〈ωm*〉?)? = ∅.
-
By construction, dom(θ) ∩ ?ntv??(?θm{|ωn*|}?)? = ∅, and consequently, dom(θ) ∩ ?ntv??(?θm{|ωm*|}?)? = ∅.
-
From cases (3.c and 3.d), we conclude that θn.m{ωm*} = θm{ωm*}
-
Lemma 19: (Additivity of Consistent Entities)
If
-
⊪ {ω*, ωm*}
-
⊪ {ω*, ωn*}
-
?tv??(?ωm*?)? ∩ ?tv??(?ωn*?)? = ?tv??(?ω*?)?
Then, ⊪ {ω*, ωm*, ωn*}.
Proof 1
-
∀ τ↓τ′ ∊ ω*, ωm*, or ωn* , if τ ≠ α for some α, then from assumptions (1 and 2), we must have τ ≅ ℑ(τ′).
-
∀ α↓τi ∊ ωm*, and ωn* , from assumptions (1, 2, and 3), and Definition 15, we have ∃ α↓τ0 ∊ ω* , such that τi* = τ0.
-
Due to assumption (1), we must have ?mtv??(?ω*, ωm*?)? ∩ ?ntv??(?ω*, ωm*?)? = ∅. Due to assumption (2), we must have ?mtv??(?ω*, ωn*?)? ∩ ?ntv??(?ω*, ωn*?)? = ∅. This, along with assumption (3) gives us ?mtv??(?ω*, ωm*, ωn*?)? ∩ ?ntv??(?ω*, ωm*, ωn*?)? = ∅.
-
From cases (1, 2, and 3), and Definition 15, we obtain ⊪ {ω*, ωm*, ωn*}.
Theorem 2: (Soundness of Intermediate System)
If:
-
Γ; Σ ⊢j _e_ : τ
-
θ ⊧sat {|Γ, Σ, _e_, τ|}.
Then, θ ∥ Γ; Σ ⊢⋄ _e_ : τ.
Proof 1
By induction on the derivation of Γ; Σ ⊢j _e_ : τ We proceed by case analysis on the last step, assuming α-reduction vacuously.
-
Cases J-Unit, J-True. J-False, J-Id, J-Hloc, J-Sloc are trivial.
-
Case J-Lambda:
-
In this case, we have:
-
Γ, _x_ ↦ τ1; Σ ⊢j _e_ : τ2
⊪ {Γ, Σ, _e_, τ1, τ2}
Γ; Σ ⊢j λ_x_._e_ : ⌊τ1⌋ → ⌈τ2⌉
-
θ ⊧sat {|Γ, Σ, λ_x_._e_i, ⌊τ1⌋ → ⌈τ2⌉|}
and we need to show that θ ∥ Γ; Σ ⊢⋄ λ_x_._e_i : ⌊τ1⌋ → ⌈τ2⌉.
-
-
It is evident (from the syntactic structure) that {|λ_x_._e_i|} = {|_e_i|} and {|⌊τ1⌋ → ⌈τ2⌉|} = {|τ1, τ2|}. Therefore, we can re-write case (2.a.ii) as θ ⊧sat {|Γ, Σ, _e_i, τ1, τ2|}. Similarly, since {|Γ, τ1|} = {|(Γ, _x_ ↦ τ1)|}, we can write θ ⊧sat {|Γ, _x_ ↦ τ1, Σ, _e_i, τ2|}.
-
Since we have Γ, _x_ ↦ τ1; Σ ⊢ _e_i : τ2, and θ ⊧sat {|Γ, _x_ ↦ τ1, Σ, _e_i, τ2|}, from induction hypothesis, we obtain θ ∥ Γ, _x_ ↦ τ1; Σ ⊢⋄ _e_i : τ2.
-
θ ∥ Γ, _x_ ↦ τ1; Σ ⊢⋄ _e_i : τ2 is a shorthand for θ〈Γ, _x_ ↦ τ1〉; θ〈Σ〉 ⊢ θ〈_e_i〉 : θ〈τ2〉, which can be re-written as θ〈Γ〉, _x_ ↦ θ〈τ1〉; θ〈Σ〉 ⊢ θ〈_e_i〉 : θ〈τ2〉
-
From case (2.d) and T-Lambda rule in figure 3, we obtain θ〈Γ〉; θ〈Σ〉 ⊢ λ_x_.θ〈_e_i〉 : ▽(θ〈τ1〉) → △(θ〈τ2〉).
-
From Definition 18 and Definition 9, it is clear that ⌊θ〈τ1〉⌋ = ▽(θ〈τ1〉), and ⌈θ〈τ2〉⌉ = △(θ〈τ2〉). Therefore, we can write ▽(θ〈τ1〉) → △(θ〈τ2〉) = ⌊θ〈τ1〉⌋ → ⌈θ〈τ2〉⌉ = ⌊θ〈τ1〉⌋ → ⌈θ〈τ2〉⌉ = θ〈⌊τ1⌋ → ⌈τ2⌉〉. It is further evident that λ_x_.θ〈_e_i〉 = θ〈λ_x_._e_i〉.
-
Substituting the equivalencies in case (2.f) into case (2.e), we obtain: θ〈Γ〉; θ〈Σ〉 ⊢ θ〈λ_x_._e_i〉 : θ〈⌊τ1⌋ → ⌈τ2⌉〉. That is, θ ∥ Γ; Σ ⊢⋄ λ_x_._e_i : ⌊τ1⌋ → ⌈τ2⌉.
-
-
Case J-App:
-
In this case, we have:
-
(A) Γ; Σ ⊢j _e_1 : τ1
(B) τ1 ⊴: ⌊τa⌋ → ⌈τr⌉
(C) Γ; Σ ⊢j _e_2 : τ2
(D) τ2 ⊴: ℑ(τa)
(E) τr ⊴: ℑ(τ)
(F) ⊪ {Γ, Σ, _e_1, _e_2, τ, τ1, τ2, ⌊τa⌋ → ⌈τr⌉}
Γ; Σ ⊢j _e_1 _e_2 : τ
-
θ ⊧sat {|Γ, Σ, _e_1 _e_2, τ|}.
We need to show that θ ∥ Γ; Σ ⊢⋄ _e_1 _e_2 : τ.
-
-
From case (3.a.ii), we obtain θ ⊧sat {|Γ, Σ, _e_1, _e_2, τ|}.
-
-
From case (3.a.i.F), we have ⊪ {Γ, Σ, _e_1, _e_2, τ, τ1, τ2, ⌊τa⌋ → ⌈τr⌉}.
-
Clearly, {Γ, Σ, _e_1, _e_2, τ} ⊆ {Γ, Σ, _e_1, _e_2, τ, τ1, τ2, ⌊τa⌋ → ⌈τr⌉}.
-
From case (3.b), we have θ ⊧sat {|Γ, Σ, _e_1, _e_2, τ|}.
-
We can assume that dom(θ) ∩ ?tvs??(?{Γ, Σ, _e_1, _e_2, τ, τ1, τ2, ⌊τa⌋ → ⌈τr⌉}?)? = ?tvs??(?{Γ, Σ, _e_1, _e_2, τ}?)?, because this property can be obtained by suitable α-renaming using fresh type variables. The only common type variables that need to be common to both the derivations are those present in Γ or Σ.
-
Now, from cases (3.c.i, 3.c.ii, 3.c.iii and 3.c.iv) and Lemma 18, we conclude that ∃ θ″ such that if θ′ = θ″ ∘ θ, we have
-
θ″ ⊧sat θ{|Γ, Σ, _e_1, _e_2, τ, τ1, τ2, ⌊τa⌋ → ⌈τr⌉|}, and therefore,
θ′ ⊧sat {|Γ, Σ, _e_1, _e_2, τ, τ1, τ2, ⌊τa⌋ → ⌈τr⌉|}. -
θ′{Γ, Σ, _e_1, _e_2, τ} = θ{Γ, Σ, _e_1, _e_2, τ}.
-
-
-
From case (3.c.v.A) and Lemma 13 (weakening), we obtain
-
θ′ ⊧sat {|Γ, Σ, _e_1, τ1|}
-
θ′ ⊧sat {|Γ, Σ, _e_2, τ2|}
-
-
From case (3.d.i), and induction hypothesis with respect to case (3.a.i.A), we conclude that θ′ ∥ Γ; Σ ⊢⋄ _e_1 : τ1.
-
From case (3.d.ii), and induction hypothesis with respect to case (3.a.i.C), we conclude that θ′ ∥ Γ; Σ ⊢⋄ _e_2 : τ2.
-
From cases (3.a.i.F and 3.c.v.A), and Lemma 13 (weakening), we obtain ⊪ {τ1, ⌊τa⌋ → ⌈τr⌉} and θ′ ⊧sat {|τ1, ⌊τa⌋ → ⌈τr⌉|}. Now, from case (3.a.i.B) and Lemma 12, we obtain θ′〈τ1〉 ≼: θ′〈⌊τa⌋ → ⌈τr⌉〉. That is, θ′〈τ1〉 ≼: ▽(θ′〈τa〉) → △(θ′〈τr〉).
-
Similarly, for case (3.a.i.D), we obtain θ′〈τ2〉 ≼: θ′〈ℑ(τa)〉. Given case (3.c.v.A), θ′〈τ2〉 ≅ θ′〈ℑ(τa)〉. For any two types τ′ and τ″ such that τ′ ≅ τ″, we have τ′ ≼: ▽(τ′), and τ″ ≼: ▽(τ′). Therefore, we conclude that θ′〈ℑ(τa)〉 ≼: ▽(θ′〈τa〉). We can now write θ′〈τ2〉 ≼: ▽(θ′〈τa〉).
-
Similar to case (h), from case (3.a.i.E), we obtain △(θ′〈ℑ(τr)〉) ≼: θ′〈τ〉.
-
From cases (3.e, 3.f, 3.g, 3.h, and 3.i) and the T-App rule in figure 3, we obtain, θ′ ∥ Γ; Σ ⊢⋄ _e_1 _e_2 : τ.
-
From cases (3.j, and 3.c.v.B), we finally conclude that θ ∥ Γ; Σ ⊢⋄ _e_1 _e_2 : τ.
-
-
Cases J-If, J-Dup, J-Deref, J-Set, J-Tqexpr, J-Let-M[Tq], and J-Let-P[Tq] are similar.
Lemma 20: (Substitution on Intermediate Derivation)
If Γ; Σ ⊢j _e_ : τ and θ is a substitution such that ⊪ θ{Γ, Σ, _e_, τ}, then θ ∥ Γ; Σ ⊢j _e_ : τ.
Proof 1
Straightforward induction on the derivation of Γ; Σ ⊢j _e_ : τ, similar to Lemma 15.
Theorem 3: (Correctness of Unification)
If θ ⊢unf τ1 = τ2, then:
-
θ〈τ1〉 ⊴: θ〈τ2〉 and θ〈τ2〉 ⊴: θ〈τ1〉
-
⊧ca {|τ1|}, and ⊧ca {|τ2|} implies ⊧ca θ{|τ1, τ2|}.
Proof 1
By straightforward induction on the derivation of θ ⊢unf τ1 = τ2.
Lemma 21: (Consistency of Unified Types)
If θ ⊢unf τ1 = τ2, then:
-
⊪ {ω*, τ1, τ2} implies ⊪ θ{ω*, τ1, τ2}.
-
⊪ {ω*, τ1}, and ⊪ {ω*, τ2} implies ⊪ θ{ω*, τ1, τ2}.
Proof 1
By straightforward induction on the derivation of θ ⊢unf τ1 = τ2.
Theorem 4: (Correctness of the Constraint Solver)
If θ ⊢s τ :≫ τ′, then θ ⊧sat {|τ|} and θ〈τ〉 = τ′
Proof 1
By straightforward induction on the derivation of θ ⊢ τ :≫ τ′, noting that the solver infers compatible types at steps Sol-Ct-Var and that the Sol-Ct-Const explicitely checks for compatibility.
Lemma 22: (Corollary to Correctness of the Constraint Solver)
If θ ⊢solve τ :≫ τ′, then θ ⊧sat {|τ|} and θ〈τ〉 = τ′
Theorem 5: (Totality of the Constraint Solver)
If ⊧cst {|τ|} then ∃ θ and τ′ such that θ ⊢s τ :≫ τ′.
Proof 1
Evident from the definition of the solver in figure 7.
Theorem 6: (Uniqueness of Solutions Produced by the Solver)
If θ′ ⊢s τ :≫ τ′ and θ″ ⊢s τ :≫ τ″, then θ′ = θ″ and τ′ = τ″.
Proof 1
Evident from the definition of the solver in figure 7.
Theorem 7: (Decidability of Unification)
If ⊧acy {|τ1|} and ⊧acy {|τ2|}, then, a normalizing derivation of θ ⊢unf τ1 = τ2 where no two uses of U-Commut occur consecutively is decidable.
Proof 1
The unifier and constraint solver builds a solution tree by always invoking itself types having smaller shapes of types (after eliminating redundant uses of the U-Commut rule). Since types are of bounded size and acyclic, and since unification itself does not produce any cycles (Theorem 3), these derivations must be bounded.
Theorem 8: (Decidability of the Constraint Solver)
If ⊧acy {|τ|}, then, the derivation of θ ⊢s τ :≫ τ′ is decidable.
Proof 1
Similar to Theorem 7.
Lemma 23: (Structural Isomorphism)
If τ1 ⊴: τ2 and τ2 ⊴: τ1, then for any substitution θ, we have θ〈τ1〉 ⊴: θ〈τ2〉 and θ〈τ2〉 ⊴: θ〈τ1〉.
Proof 1
Since we have both τ1 ⊴: τ2 and τ2 ⊴: τ1, the two types τ1 and τ2 must be structurally equivalent expect for the fact that one of the types can be of the form τ↓τ′, and the other of the form τ. The conclusion is thus evident.
Lemma 24: (Consistency of Heuristic Type Inference)
If: Γ; Σ ⊢i _e_ : τ ∥ θ and ⊪ {Γ, Σ} then, ⊪ θ{Γ, Σ, _e_, τ}
Proof 1
By induction on the derivation of Γ; Σ ⊢i _e_ : τ ∥ θ, noting that (1) all maybe types are introduced through new type variables, (2) the two parts of a maybe type are never separated once constructed (3) all substitutions produced during inference are obtained from the unifier or the solver, which preserve consistency according to Theorem 3 (property 3), and Theorem 4.
Lemma 25: (Type Variable Propagation)
If: Γ; Σ ⊢i _e_1 : τ1 ∥ θ1 and θ1 ∥ Γ; Σ ⊢i _e_2 : τ2 ∥ θ2, then ?tv??(?θ1.2〈_e_1, τ1〉?)? ∩ ?tv??(?θ1.2〈_e_2, τ2〉?)? = ?tv??(?θ1.2〈Γ, Σ〉?)?.
Proof 1
By straightforward induction on the derivation of θ1 ∥ Γ; Σ ⊢i _e_2 : τ2 ∥ θ2. This derivation happens in an environment that already contains the substitutions obtained from the first derivation: θ1. All of the cases in the second derivation either use (1) primitive types that do not alter the type variables involved, (2) types (and thus type variables) from θ1〈Γ〉 or θ1〈Σ〉, or (3) new type variables. Therefore, it is evident that, if {α*} = ∪ ?tv??(?τ?)?, ∀ τ ∊ range(θ2) , we have (dom(θ2) ∪ {α*} ∪ ?tv??(?θ1.2〈_e_2, τ2〉?)?) ∩ ?tv??(?θ1〈_e_1, τ1〉?)? = ?tv??(?θ1〈Γ, Σ〉?)?. From this, we can conclude that ?tv??(?θ1.2〈_e_1, τ1〉?)? ∩ ?tv??(?θ1.2〈_e_2, τ2〉?)? = ?tv??(?θ1.2〈Γ, Σ〉?)?.
Note: For the sake of simplicity, type variables present as type qualifications that are unbound in the environment are treated as fresh type variables. For example, the expression if true then _e_a : α else _e_2 : α, where α ∉ ?tv??(?Γ, Σ?)? is equivalent to if true then _e_a : β else _e_2 : γ, where ⊧new βγ. This formulation obviates the need for tracking type variable scopes in the environment. If necessary, any alternate behaviour can be obtained by introducing dummy bindings (ex: through an lambda expression that encompasses all expressions that must be in scope of a type variable).
Theorem 9: (Soundness of Heuristic Type Inference)
If: Γ; Σ ⊢i _e_ : τ ∥ θ and ⊪ {Γ, Σ} then, θ ∥ Γ; Σ ⊢j _e_ : τ.
Proof 1
By induction on the derivation of Γ; Σ ⊢i _e_ : τ ∥ θ. We proceed by case analysis on the last step, assuming α-reduction vacuously.
-
Cases I-Unit, I-True. I-False, I-Id, I-Hloc, I-Sloc are trivial.
-
Case I-Lambda:
-
In this case, we have:
-
(A) Γ, _x_ ↦ α; Σ ⊢i _e_i : τ ∥ θ
(B) ⊧new α
Γ; Σ ⊢i λ_x_._e_i : ⌊θ〈α〉⌋ → ⌈τ⌉ ∥ θ
-
⊪ {Γ, Σ}
We need to show that θ ∥ Γ; Σ ⊢j λ_x_._e_i : ⌊θ〈α〉⌋ → ⌈τ⌉.
-
-
From case (2.a.ii), conclusion of case (2.a.i) and Lemma 24, we obtain ⊪ θ{Γ, Σ, λ_x_._e_i, ⌊θ〈α〉⌋ → ⌈τ⌉}. This can be re-written as ⊪ θ{Γ, Σ, _e_i, α, τ}.
-
From cases (2.a.ii and 2.a.i.B), we can write ⊪ {Γ, _x_ ↦ α, Σ}.
-
From cases (2.a.i.A and 2.c), from induction hypothesis, we conclude that θ ∥ Γ, _x_ ↦ α; Σ ⊢j _e_i : τ.
-
From cases (2.d and 2.b), and the J-Lambda rule in figure 5, we obtain θ ∥ Γ; Σ ⊢j λ_x_._e_i : ⌊α⌋ → ⌈τ⌉. This can be equivalently stated as: θ ∥ Γ; Σ ⊢j λ_x_._e_i : ⌊θ〈α〉⌋ → ⌈τ⌉.
-
-
Case I-App:
-
In this case, we have:
-
(A) Γ; Σ ⊢i _e_1 : τ1 ∥ θ1
(B) θ1 ∥ Γ; Σ ⊢i _e_2 : τ2 ∥ θ2
(C) θf ⊢unf θ2〈τ1〉 = β↓(⌊δ⌋ → ⌈α⌉)
(D) θa ⊢unf τ2 = γ↓▽(θf〈δ〉)
(E) τ = ε↓▽(θa.f〈α〉)
(F) ⊧new αβγδε
Γ; Σ ⊢i _e_1 _e_2 : τ ∥ θ1.2.f.a
-
⊪ {Γ, Σ}
-
θ = θ1.2.f.a
We need to show that θ ∥ Γ; Σ ⊢j _e_1 _e_2 : τ.
-
-
From cases (3.a.i.A and 3.a.ii) and induction hypothesis, we can write, θ1 ∥ Γ; Σ ⊢j _e_1 : τ1.
-
From cases (3.a.ii and 3.b) and Lemma 24, we obtain ⊪ θ1{Γ, Σ, _e_1, τ1}.
-
From case (3.c) and Lemma 13 (weakening), we obtain ⊪ θ1{Γ, Σ}. Using this result along with case (3.a.i.B) and induction hypothesis, we obtain θ1.2 ∥ Γ; Σ ⊢j _e_2 : τ2.
-
From case (3.d) and Lemma 16, we obtain ⊪ θ2〈θ1{Γ, Σ, _e_2, τ2}〉.
-
From cases (3.b and 3.e), and Lemma 20, we obtain θ1.2 ∥ Γ; Σ ⊢j _e_1 : τ1.
-
From case (3.f) and Lemma 16, we obtain ⊪ θ1.2{Γ, Σ, _e_1, τ1}.
-
From cases (3.a.i.A and 3.a.i.B), and Lemma 25, we obtain: ?tv??(?θ1.2〈_e_1, τ1〉?)? ∩ ?tv??(?θ1.2〈_e_2, τ2〉?)? = ?tv??(?θ1.2〈Γ, Σ〉?)?. This can be written as: ?tv??(?θ1.2〈Γ, Σ, _e_1, τ1〉?)? ∩ ?tv??(?θ1.2〈Γ, Σ, _e_2, τ2〉?)? = ?tv??(?θ1.2〈Γ, Σ〉?)?.
-
From cases (3.g, 3.e, and 3.h), we obtain ⊪ θ1.2{Γ, Σ, _e_1, τ1, _e_2, τ2}.
-
It is evident that ⊪ θ1.2{Γ, Σ, _e_1, τ1, _e_2, τ2, β↓(⌊δ⌋ → ⌈α⌉)}, since β, γ, and δ are new type variables (from case (3.a.i.F)).
-
From cases (3.a.i.C, and 3.j) and Lemma 21-conclusion (1), we obtain ⊪ θ1.2.f{Γ, Σ, _e_1, τ1, _e_2, τ2, β↓(⌊δ⌋ → ⌈α⌉)}.
-
From case (3.a.i.C), and Theorem 3-conclusion(1), we have θ2.f〈τ1〉 ⊴: θf〈β↓(⌊δ⌋ → ⌈α⌉)〉 and θf〈β↓(⌊δ⌋ → ⌈α⌉)〉 ⊴: θ2.f〈τ1〉. This can be equivalently written as θ1.2.f〈τ1〉 ⊴: θ1.2.f〈β↓(⌊δ⌋ → ⌈α⌉)〉 and θ1.2.f〈β↓(⌊δ⌋ → ⌈α⌉)〉 ⊴: θ1.2.f〈τ1〉.
-
From case (3.a.iii), we know that θ = θ1.2.f.a. Similar to cases (3.j, 3.k, and 3.l), from case (3.a.i.D) we obtain:
-
⊪ θ{Γ, Σ, _e_1, τ1, _e_2, τ2, β↓(⌊δ⌋ → ⌈α⌉), γ↓▽(θf〈δ〉)}
-
θ〈τ2〉 ⊴: θ〈γ↓▽(θf〈δ〉)〉 and θ〈γ↓▽(θf〈δ〉)〉 ⊴: θ〈τ2〉.
-
-
From case (3.m.i), and Lemma 13 (weakening-conclusion (3)), we obtain ⊪ θ{Γ, Σ, _e_1, τ1, _e_2, τ2, β↓(⌊δ⌋ → ⌈α⌉)}. From this, it is evident that ⊪ θ{Γ, Σ, _e_1, τ1, _e_2, τ2, ⌊δ⌋ → ⌈α⌉}. Now, from cases (3.m.1 and 3.a.i.E), since ε is a new type variable, and θa.f〈α〉 = θ〈α〉, and θ〈α〉 ∊ θ〈⌊δ⌋ → ⌈α⌉〉, we conclude that ⊪ θ{Γ, Σ, _e_1, τ1, _e_2, τ2, ⌊δ⌋ → ⌈α⌉, τ}.
-
From case (3.l) and Lemma 23, we obtain (as one of the conclusions): θ〈τ1〉 ⊴: θ〈β↓(⌊δ⌋ → ⌈α⌉)〉, from which we can conclude that θ〈τ1〉 ⊴: θ〈⌊δ⌋ → ⌈α⌉〉.
-
From case (3.m.ii), we have θ〈τ2〉 ⊴: θ〈γ↓▽(θf〈δ〉)〉, from which, we can conclude that θ〈τ2〉 ⊴: θ〈▽(θf〈δ〉)〉, and therefore θ〈τ2〉 ⊴: ▽(θ〈δ〉), and thus θ〈τ2〉 ⊴: ℑ(θ〈δ〉).
-
For any type τ′, it is evident from the definition of ℑ(τ′) in figure 6 that ▽(τ′) ⊴: ℑ(τ′). Therefore, ▽(θ〈α〉) ⊴: ℑ(θ〈α〉). From this, it is evident that γ↓▽(θ〈α〉) ⊴: ℑ(θ〈α〉). Now, from case (3.a.i.E), (since θa.f〈α〉 = θ〈α〉, and θ〈τ〉 = τ), we can write θ〈α〉 ⊴: ℑ(θ〈τ〉).
-
Now, from cases (3.n.i, 3.p, 3.n.ii, 3.q, 3.r, and 3.o) and the J-App rule in figure 5, we obtain θ ∥ Γ; Σ ⊢j _e_1 _e_2 : τ.
-
-
Cases I-If, I-Dup, I-Deref, I-Set, I-Tqexpr, I-Let[Tq] are similar.
Theorem 10: ((Direct Statemet of) Soundness of Heuristic Type Inference)
If Γ; Σ ⊢i _e_ : τ ∥ θu and ⊪ {Γ, Σ}, then, ∀ θs such that θs ⊧sat θu{|Γ, Σ, _e_, τ|}, we have θs.u ∥ Γ; Σ ⊢⋄ _e_ : τ.
The inference algorithm is not complete. For example, we cannot type the expression let _id_ = dup(λ_x_._x_) in _id_^ := _id_^ without the help of an annotation for _id_.
8 Related Work
Grossman [5] provides a theory of using quantified types with imperative C style mutation and & operator for Cyclone. However, his formalization requires explicit annotation for all polymorphic definitions and instantiations. Since C (and Cyclone) have no notion of immutability, both languages require explicit annotation of polymorphism. In contrast, we believe that the best way to integrate polymorphism into the systems programming paradigm is by automatic — albeit incomplete — inference. A further contribution of our work (in comparison to [5]) is that we give a formal specification and proof of correctness of the inference algorithm, not just the type system.
C's const notion of immutability-by-alias offers localized checking of immutability properties, and encourages good programming practice by serving as documentation of programmers intentions. Other systems have proposed immutability-by-name [2], referential immutability [ 21 ,26 ] (transitive immutability-by-reference), etc. These techniques are orthogonal and complementary to the immutability-by-location property in BitC/*B*. For example, we could have types like (const Ψτ) that can express both global and local usage properties of a location.
A monadic model [14] of mutability is used in pure functional languages like Haskell [15]. In this model, the type system distinguishes side-effecting computations from pure ones (and not just mutable locations from immutable ones). Even though this model is beneficial for integration with verification systems, it is considerably different from the normal programming idioms used by systems programmers. For example, Hughes argues that there is no satisfactory way of creating and using global mutable variables using monads [6]. There have been proposals for adding unboxed representation control to Haskell [ 13 ,7 ]. However, these systems are pure and therefore and do not consider the effects of mutability.
Cqual [4] provides a framework of type qualifiers, which can be used to infer maximal const qualifications for C programs. However, CQual does not deal with polymorphism of types. In a monomorphic language, we can infer types and qualifiers independently. Adding polymorphism to CQual would introduce substantial challenges, particularly if polymorphism should be automatically inferred. The inference of types and qualifiers (mutability) becomes co-dependent: we need base types to infer qualifiers; but, we also need the qualifiers to infer base types due to the value restriction. BitC/*B* supports a polymorphic language and performs simultaneous inference of base types and mutability.
9 Conclusions
In this paper, we have proposed a type system that integrates all of unboxed representation, well-founded first-class mutability, and polymorphism. The mutability model is expressive enough to permit mutation of unboxed/stack locations, and at the same time guarantees that types are definitive about the mutability of every location across all aliases. This type system meets the requirements for systems programming languages.
First class mutability introduces challenges for type inference at copy boundaries. There is a fundamental conflict of goals between the inference of principal types and copy compatibility. We have proposed an inference algorithm that resolves this conflict through a combination of hinting mechanisms and selective heuristics. This algorithm is deterministic, does not require whole program analysis, and minimizes programmer annotations in the common case. We have provided a formal framework for this type system, and implemented it as part of the BitC compiler. The source code can be obtained from http://bitc-lang.org.
The bootstrap compiler for BitC has been implemented in C++. Currently, the backend emits portable C code. The core of the compiler involves 28,686 lines of C++ code, of which implementation of the type system accounts for about 6,894 lines. An informal description of the inference algorithm for full BitC can be found in [24].
Bibliography
- E. Biagioni, R. Harper, and P. Lee ``A network protocol stack in Standard ML'' Higher Order and Symbolic Computation, Vol.14, No.4 , 2001.
- R. Deline and M. Fähndrich, ``VAULT: a programming language for reliable systems'' http://research.microsoft.com/vault , 2001
- H. Derby, ``The performance of FoxNet 2.0'' Technical Report CMU-CS-99-137 School of Computer Science, Carnegie Mellon University, June 1999.
- J. S. Foster, R. Johnson, J. Kodumal, and A. Aiken ``Flow-Insensitive Type Qualifiers'' Trans. on Programming Languages and Systems. 28(6):1035-1087, Nov. 2006.
- D. Grossman, ``Quantified Types in an Imperative Language'' ACM Transactions on Programming Languages and Systems , 2006.
- J. Hughes ``Global variables in Haskell'' Journal of Functional Programming archive Volume 14, Issue 5, Sept. 2004.
- I. S. Diatchki, M. P. Jones, and R. Leslie. ``High- level Views on Low-level Representations.'' Proc. ACM Int. Conference on Functional Programming pp. 168–179, 2005.
- International Std. Organization ISO/IEC 8652:1995 (Information Technology — Prog. Languages — Ada), 1995.
- International Std. Organization ISO/IEC FDIS 14882:1998(E) (Prog. Languages - C++), 1998.
- International Std. Organization ISO/IEC 9899:1999 (Prog. Languages - C), 1999.
- T. Jim, G. Morrisett, D. Grossman, M. Hicks, J. Cheney, and Y. Wang ``Cyclone: A safe dialect of C.'' Proc. of USENIX Annual Technical Conference pp 275288, 2002.
- M. P. Jones ``Qualified types: theory and practice.'' Cambridge Distinguished Dissertations In Computer Science ISBN:0-521-47253-9, 1995
- S. L. Peyton Jones and J. Launchbury ``Unboxed values as first class citizens in a non-strict functional language.'' Functional Programming Languages and Computer Architecture , 1991
- S. L. Peyton Jones and P. Wadler ``Imperative functional programming.'' Proc. ACM SIGPLAN Principles of Programming Languages. , 1993
- S. L. Peyton Jones (ed.). Haskell 98 Language and Libraries: The Revised report. Cambridge University Press, 2003.
- ECMA International ``Standard ECMA-334 C# Language Specification'' http://www.ecma-international.org/publications/standards/Ecma-334.htm
- R. Milner ``A theory of type polymorphism in programming.'' Journal of Computer and System Sciences pp 348-375, 1978.
- R. Milner, M. Tofte, R. Harper, and D.MacQueen. The Definition of Standard ML - Revised The MIT Press, May 1997.
- J. Gosling, B. Joy, G. Steele, and G. Bracha ``The Java Language Specification,'' Third Edition http://java.sun.com/docs/books/jls
- G. van Rossum, ``Python Reference Manual'' F. L. Drake, Jr. (ed.) http://docs.python.org/ref/ref.html , 2006.
- J. S. Shapiro, J. M. Smith, and D. J. Farber. ``EROS: a fast capability system'' ACM Symposium on Operating Systems Principles , Dec. 1999.
- J. S. Shapiro, S. Sridhar, M. S. Doerrie, ``BitC Language Specification'' http://www.bitc-lang.org/docs/bitc/spec.html
- G. Smith and D. Volpano. ``A sound polymorphic type system for a dialect of C.'' Science of Computer Programming 32(2--3):49--72, 1998.
- S. Sridhar and J. S. Shapiro. ``Type Inference for Unboxed Types and First Class Mutability'' Proc. 3rd Workshop on Prog. Languages and Operating Systems, 2006.
- D. Tarditi, G. Morrisett, P. Cheng, C. Stone, R. Harper, and P. Lee. ``TIL: A type-directed optimizing compiler for ML'' Proc. ACM SIGPLAN PLDI , 1996.
- M. S. Tschantz and M. D. Ernst, ``Javari: Adding reference immutability to Java'' Object-Oriented Programming Systems, Languages, and Applications , Oct 2005.
- A. Wright, ``Simple Imperative Polymorphism'' Lisp and Symbolic Comp. 8(4):343-355, 1995.