We need a set of rules for initialization that ensure the following properties about the state of the program:
The state of the program at the first instruction of main() should be well defined, and should not depend (as, e.g., in O'Caml) on the accident of link order. In particular, we should avoid the C++ bug that constructors on globals run in unspecified order, as this fact routinely bites C++ programmers in the ass.
In abstract, it should be possible for all initializing expressions to be evaluated at compile time by a whole-program compiler, or at pre-link time in an environment supporting dynamic linking. No compiler should be obligated to implement this, but it should be possible for an aggressive compiler to choose to implement this.
One implication of this is that external proclaimed procedures that are not part of the core bitc libraries should not be callable by initializers.
The result of initialization should respect dependency ordering, in the sense that any identifier that is evaluated by an initializing expression must have been initialized prior to evaluation.
Certain forms of forward reference need to be permitted, however. In particular it must be legal to evaluate an identifier naming a procedure object (thereby obtaining the location of the procedure object) that has not yet been defined. What is not permitted is to apply such a procedure. In the terminology of the specification, application requires that the procedure definition be complete.
Requirement 3 must be preserved in the face of separate compilation. This is a restatement of [1], but it motivates some restrictions on use-occurrences in initializing expressions.
No working program should break in the field as a consequence of a dynamic library update, unless due to bugs in the library. In particular, the satisfaction of initialization ordering constraints and type checking shoudl not lead to breakage in the field.
Note that shap is no longer sure what he meant here. Possibly he was concerned about init ordering again.
Causal ordering in the presence of separate compilation is moderately difficult to establish. We can (just barely) achieve it if we require that initializing expressions are pure, meaning that the result of any initializing expression is some deterministic function of the inputs to that expression plus the pre-existing bindings in the lexical environment of that function. If we allow initializing expressions to have side effects, the ordering requirements must be strengthened beyond what can be achieved under separate compilation.
Perhaps a better way to say that is to emphasize my choice of wording of the requirement. The requirement is that the initializing expressions must be pure in the mathematical sense. This is not the same as requiring them to be functional in the sense of ML or Haskell.
We need to allow initializing expressions to refer to global mutable variables (that is: to source them, not to target them). The constraint we will impose (by construction) is that globally reachable state cannot be modified until after all initializers have been executed.
The following rules are sufficient to ensure causal dependency ordering:
If an interface B imports an interface A, all symbols bound in interface A will be initialized before any symbol in interface B is initialized.
At any use occurrence where a symbol X is evaluated by an initializing expression, that symbol must be defined (not just proclaimed) in the lexical environment of the use occurrence.
At any use occurrence where an IMPORTED symbol X bound in an interface I is defined, we examine the transitive closure of free variables that are evaluated by the initializing expression of X, and we require that for each such free variable V, either:
V is defined (not just proclaimed) in the lexical environment where X is bound [That is: in the *interface* environment], OR
V has been imported into the lexical environment where X is bound [That is: the *interface* environment], OR
V is defined in the source unit of compilation where X is defined, and further is defined in the lexical environment at the point where X is defined [That is: the environment at the point of definition].
This restriction is a bit subtle. What it ensures is that anything appearing in the initializer for X respects (transitively) the interface initialization ordering requirement.
When rule (3B) applies, the interface supplying B must still be imported into the defining unit of compilation in order for symbol resolution requirements to be satisfied. The purpose of (3B) is to ensure that causal ordering of exported symbols cannot be violated.
Every initializer expression must be a pure expression. This ensures that no intertwining of causal dependencies can occur as a consequence of having some initializers check a value before an assignment while others check after the assignment.
1.4.7