Sunday, February 05, 2017

On the Meaning of Casts and Blame for Gradual Typing

Gradually typed languages enable programmers to choose which parts of their programs are statically typed and which parts are dynamically typed. Thus, gradually typed languages perform some type checking at compile time and some type checking at run time. When specifying the semantics of a gradually typed language, we usually express the run time checking in terms of casts. Thus, the semantics of a gradually typed language depends crucially on the semantics of casts. This blog post tries to answer the question: "What is a cast?"

Syntax and Static Semantics of Casts

Syntactically, a cast is an expression of the form \[ e : A \Rightarrow^{\ell} B \] where \(e\) is a subexpression; \(A\) and \(B\) are the source and target types, respectively. The \(\ell\) is what we call a blame label, which records the location of the cast (e.g. line number and character position).

Regarding the static semantics (compile-time type checking), a cast enables the sub-expression \(e\) of static type \(A\) to be used in a context expecting a different type \(B\). \[ \frac{\Gamma \vdash e : A} {\Gamma \vdash (e : A \Rightarrow^{\ell} B) : B} \] In gradual typing, \(A\) and \(B\) typically differ in how "dynamic" they are but are otherwise similar to each other. So we often restrict the typing rule for casts to only allow source and target types that have some values in common, that is, when \(A\) and \(B\) are consistent. \[ \frac{\Gamma \vdash e : A \quad A \sim B} {\Gamma \vdash (e : A \Rightarrow^{\ell} B) : B} \] For example, if we let \(\star\) be the unknown type (aka. \(\mathtt{dynamic}\)), then we have \(\mathtt{Int} \sim \star\) and \(\star \sim \mathtt{Int}\) but \(\mathtt{Int} \not\sim \mathtt{Int}\to\mathtt{Int}\). Here are the rules for consistency with integers, functions, and the dynamic type. \begin{gather*} \mathtt{Int} \sim \mathtt{Int} \qquad \frac{A \sim B \qquad A' \sim B'} {A \to A' \sim B \to B'} \qquad A \sim \star \qquad \star \sim B \end{gather*}

Dynamic Semantics of Casts

The dynamic semantics of a cast is to check whether the value produced by subexpression \(e\) is of the target type \(B\) and if so, return the value; otherwise signal an error. The following is a strawman denotational semantics that expresses this basic intuition about casts. Suppose we have already defined the meaning of types, so \(\mathcal{T}[\!| A |\!]\) is the set of values of type \(A\). The meaning function \(\mathcal{E}[\!| e |\!]\) maps an expression to a result (either a value \(v\) or error \(\mathsf{blame}\,\ell\)). \begin{align*} \mathcal{E} [\!| e : A \Rightarrow^{\ell} B |\!] &= \begin{cases} v & \text{if } v \in \mathcal{T}[\!| B |\!] \\ \mathsf{blame}\,\ell & \text{if } v \notin \mathcal{T}[\!| B |\!] \end{cases} \\ & \text{where } v = \mathcal{E} [\!| e |\!] \end{align*}

If we restrict ourselves to first-order types such as \(\mathtt{Int}\), it is straightforward to define \(\mathcal{T}\) and check whether a value is in the set. \begin{align*} \mathcal{T}[\!| \mathtt{Int} |\!] &= \mathbb{Z} \end{align*} The story for function types, that is, for \(A \to B\), is more complicated. In a denotational setting, it traditionally takes sophisticated mathematics to come up with mathematical entities that can serve as function values when the \(\star\) type is involved (Scott 1970, 1976). The primary challenge is that one cannot simply use the usual notion of a mathematical function to represent function values because of a cardinality problem. Suppose that \(D\) is the set of all values. The set of mathematical functions whose domain and codomain is \(D\) is necessarily larger than \(D\), so the mathematical functions cannot fit into the set of all values. There is nothing wrong with sophisticated mathematics per se, but when it comes to using a specification for communication (e.g. between language designers and compiler writers), it is less desirable to require readers of the specification to fully understand a large number of auxiliary definitions and decide whether those definitions match their intuitions.

Competing Operational Semantics for Casts

We'll come back to denotational semantics in a bit, but first let's turn to operational semantics, in particular reduction semantics, which is what the recent literature uses to explains casts and the type \(\star\) (Gronski 2006, Siek 2007, Wadler 2009). In a reduction semantics, we give rewrite rules to say what happens when a syntactic value flows into a cast, that is, we say what expression the cast reduces to. Recall that a syntactic value is just an expression that cannot be further reduced. We can proceed by cases on the consistency of the source type \(A\) and target type \(B\).

  • Case \((v : \mathtt{Int} \Rightarrow^{\ell} \mathtt{Int})\). This one is easy, the static type system ensures that \(v\) has type \(\mathtt{Int}\), so there is nothing to check and we can rewrite to \(v\). \[ v : \mathtt{Int} \Rightarrow^{\ell} \mathtt{Int} \longrightarrow v \]
  • Case \((v : \star \Rightarrow^{\ell} \star)\). This one is also easy. \[ v : \star \Rightarrow^{\ell} \star \longrightarrow v \]
  • Case \((v : A \to A' \Rightarrow^{\ell} B \to B')\). This one is more complicated. We'd like to check that the function \(v\) has type \(B \to B'\). Suppose \(B'=\mathtt{Int}\). How can we determine whether a function returns an integer? In general, that's just as hard as the halting problem, which is undecidable. So instead of checking now, we'll delay the checking until when the function is called. We can accomplish this by rewriting to a lambda expression that casts the input, calls \(v\), and then casts the output. \[ v : A \to A' \Rightarrow^{\ell} B \to B' \longrightarrow \lambda x{:}B. (v \; (x : B \Rightarrow^{\ell} A)) : A' \Rightarrow^{\ell} B' \] Here we see the importance of attaching blame labels to casts. Because of the delayed checking, the point of error can be far removed from the original source code location, but thanks to the blame label we can point back to the source location of the cast that ultimately failed (Findler and Felleisen 2002).
  • Case \((v : A \Rightarrow^{\ell} \star)\). For this one there are multiple options in the literature. One option is declare this as a syntactic value (Siek 2009), so no rewrite rule is necessary. Another option is to factor all casts to \(\star\) through the ground types \(G\): \[ G ::= \mathtt{Int} \mid \star \to \star \] Then we expand the cast from \(A\) to \(\star\) into two casts that go through the unique ground type for \(A\). \begin{align*} v : A \Rightarrow^{\ell} \star &\longrightarrow (v : A \Rightarrow^{\ell} G) : G \Rightarrow^{\ell} \star\\ & \text{where } A \sim G, A \neq G, A \neq \star \end{align*} and then declare that expressions of the form \((v : G \Rightarrow^{\ell} \star)\) are values (Wadler 2009).
  • Case \((v : \star \Rightarrow^{\ell} B)\). There are multiple options here as well, but the choice is linked to the above choice regarding casting from \(A\) to \(\star\). If \(v = (v' : A \Rightarrow^{\ell'} \star)\), then we need the following rewrite rules \begin{align*} (v' : A \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} B &\longrightarrow v' : A \Rightarrow^{\ell} B & \text{if } A \sim B \\[2ex] (v' : A \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} B &\longrightarrow \mathsf{blame}\,\ell & \text{if } A \not\sim B \end{align*} On the other hand, if we want to factor through the ground types, we have the following reduction rules. \begin{align*} v : \star \Rightarrow^{\ell} B &\longrightarrow v : \star \Rightarrow^{\ell} G \Rightarrow^{\ell} B \\ & \text{if } B \sim G, B \neq G, B \neq \star \\[2ex] (v : G \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} G &\longrightarrow v \\[2ex] (v : G \Rightarrow^{\ell'} \star) : \star \Rightarrow^{\ell} G' &\longrightarrow \mathsf{blame}\,\ell\\ & \text{if } G \neq G' \end{align*}

Given that we have multiple options regarding the reduction semantics, an immediate question is whether it matters, that is, can we actually observe different behaviors for some program? Yes, in the following example we cast the identity function on integers to an incorrect type. \begin{equation} \begin{array}{l} \mathtt{let}\, id = (\lambda x{:}\mathtt{Int}. x)\, \mathtt{in}\\ \mathtt{let}\, f = (id : \mathtt{Int}\to \mathtt{Int} \Rightarrow^{\ell_1} \star) \, \mathtt{in} \\ \mathtt{let}\, g = (f : \star \Rightarrow^{\ell_2} (\mathtt{Int}\to \mathtt{Int}) \to \mathtt{Int})\,\mathtt{in} \\ \quad g \; id \end{array} \tag{P0}\label{P0} \end{equation} If we choose the semantics that factors through ground types, the above program reduces to \(\mathsf{blame}\, \ell_1\). If we choose the other semantics, the above program reduces to \(\mathsf{blame}\, \ell_2\). Ever since around 2008 I've been wondering which of these is correct, though for the purposes of full disclosure, I've always felt that \(\mathsf{blame}\,\ell_2\) was the better choice for this program. I've also been thinking for a long time that it would be nice to have some alternative, hopefully more intuitive, way to specify the semantics of casts, with which we could then compare the above two alternatives.

A Denotational Semantics of Functions and Casts

I've recently found out that there is a simple way to represent function values in a denotational semantics. The intuition is that, although a function may be able to deal with an infinite number of different inputs, the function only has to deal with a finite number of inputs on any one execution of the program. Thus, we can represent functions with finite tables of input-output pairs. An empty table is written \(\emptyset\), a single-entry table has the form \(v \mapsto v'\) where \(v\) is the input and \(v'\) is the corresponding output. We build a larger table out of two smaller tables \(v_1\) and \(v_2\) with the notation \(v_1 \sqcup v_2\). So, with the addition of integer values \(n \in \mathbb{Z}\), the following grammar specifies the values. \[ v ::= n \mid \emptyset \mid v \mapsto v \mid v \sqcup v \]

Of course, we can't use just one fixed-size table as the denotation of a lambda expression. Depending on the context of the lambda, we may need a bigger table that handles more inputs. Therefore we map each lambda expression to the set of all finite tables that jive with that lambda. To be more precise, we shall define a meaning function \(\mathcal{E}\) that maps an expression and an environment to a set of values, and an auxiliary function \(\mathcal{F}\) that determines whether a table jives with a lambda expression in a given environment. Here's a first try at defining \(\mathcal{F}\). \begin{align*} \mathcal{F}(n, \lambda x{:}A. e, \rho) &= \mathsf{false} \\ \mathcal{F}(\emptyset, \lambda x{:}A. e, \rho) &= \mathsf{true} \\ \mathcal{F}(v \mapsto v', \lambda x{:}A. e, \rho) &= \mathcal{T}(A,v) \text{ and } v' \in \mathcal{E}[\!| e |\!]\rho(x{:=}v) \\ \mathcal{F}(v_1 \sqcup v_2, \lambda x{:}A. e, \rho) &= \mathcal{F}(v_1, \lambda x{:}A. e, \rho) \text{ and } \mathcal{F}(v_2, \lambda x{:}A. e, \rho) \end{align*} (We shall define \(\mathcal{T}(A,v)\) shortly.) We then define the semantics of a lambda-expression in terms of \(\mathcal{F}\). \[ \mathcal{E}[\!| \lambda x{:}A.\, e|\!]\rho = \{ v \mid \mathcal{F}(v, \lambda x{:}A. e, \rho) \} \] The semantics of function application is essentially that of table lookup. We write \((v_2 \mapsto v) \sqsubseteq v_1\) to say, roughly, that \(v_2 \mapsto v\) is an entry in the table \(v_1\). (We give the full definition of \(\sqsubseteq\) in the Appendix.) \[ \mathcal{E}[\!| e_1 \, e_2 |\!]\rho = \left\{ v \middle| \begin{array}{l} \exists v_1 v_2.\; v_1 \in \mathcal{E}[\!| e_1 |\!]\rho \text{ and } v_2 \in \mathcal{E}[\!| e_2 |\!]\rho \\ \text{ and } (v_2 \mapsto v) \sqsubseteq v_1 \end{array} \right\} \] Finally, to give meaning to lambda-bound variables, we simply look them up in the environment. \[ \mathcal{E}[\!| x |\!]\rho = \{ \rho(x) \} \]

Now that we have a good representation for function values, we can talk about giving meaning to higher-order casts, that is, casts from one function type to another. Recall that in our strawman semantics, we got stuck when trying to define the meaning of types in the form of map \(\mathcal{T}\) from a type to a set of values. Now we can proceed based on the above definition of values \(v\). (To make the termination of \(\mathcal{T}\) more obvious, we'll instead define \(\mathcal{T}\) has a map from a type and a value to a Boolean. The measure is a lexicographic ordering on the size of the type and then the size of the value.) \begin{align*} \mathcal{T}(\mathtt{Int}, v) &= (\exists n. \; v = n) \\ \mathcal{T}(\star, v) &= \mathsf{true} \\ \mathcal{T}(A \to B, n) &= \mathsf{false} \\ \mathcal{T}(A \to B, \emptyset) &= \mathsf{true} \\ \mathcal{T}(A \to B, v \mapsto v') &= \mathcal{T}(A, v) \text{ and } \mathcal{T}(B, v') \\ \mathcal{T}(A \to B, v_1 \sqcup v_2) &= \mathcal{T}(A \to B, v_1) \text{ and } \mathcal{T}(A \to B, v_2) \end{align*} With \(\mathcal{T}\) defined, we define the meaning to casts as follows. \begin{align*} \mathcal{E} [\!| e : A \Rightarrow^{\ell} B |\!]\rho &= \{ v \mid v \in \mathcal{E} [\!| e |\!]\rho \text{ and } \mathcal{T}(B, v) \}\\ & \quad\; \cup \left\{ \mathsf{blame}\,\ell \middle| \begin{array}{l} \exists v.\; v \in \mathcal{E} [\!| e |\!]\rho \text{ and } \neg \mathcal{T}(B, v)\\ \text{and } (\forall l'. v \neq \mathsf{blame}\,l') \end{array}\right\}\\ & \quad\; \cup \{ \mathsf{blame}\,\ell' \mid \mathsf{blame}\,\ell' \in \mathcal{E} [\!| e |\!]\rho \} \end{align*} This version says that the result of the cast should only be those values of \(e\) that also have type \(B\). It also says that we signal an error when a value of \(e\) does not have type \(B\). Also, if there was an error in \(e\) then we propagate it. The really interesting thing about this semantics is that, unlike the reduction semantics, we actually check functions at the moment they go through the cast, instead of delaying the check to when they are called. We immediately determine whether the function is of the target type. If the function is not of the target type, we can immediately attribute blame to this cast, so there is no need for complex blame tracking rules.

Of course, we need to extend values to include blame: \[ v ::= n \mid \emptyset \mid v \mapsto v \mid v \sqcup v \mid \mathsf{blame}\,\ell \] and augment \(\mathcal{T}\) and \(\mathcal{F}\) to handle \(\mathsf{blame}\,\ell\). \begin{align*} \mathcal{T}(A\to B, \mathsf{blame}\,\ell) &= \mathsf{false} \\ \mathcal{F}(\mathsf{blame}\,\ell, \lambda x{:}A.e, \rho) &= \mathsf{false} \end{align*} To propagate errors to the meaning of the entire program, we augment the meaning of other language forms, such as function application to pass along blame. \begin{align*} \mathcal{E}[\!| e_1 \, e_2 |\!]\rho &= \left\{ v \middle| \begin{array}{l} \exists v_1 v_2.\; v_1 \in \mathcal{E}[\!| e_1 |\!]\rho \text{ and } v_2 \in \mathcal{E}[\!| e_2 |\!]\rho \\ \text{and } (v_2 \mapsto v) \sqsubseteq v_1 \end{array} \right\} \\ & \quad\; \cup \{ \mathsf{blame}\, \ell \mid \mathsf{blame}\, \ell \in \mathcal{E}[\!| e_1 |\!]\rho \text{ or } \mathsf{blame}\,\ell \in \mathcal{E}[\!| e_2 |\!]\rho\} \end{align*}

Two Examples

Let us consider the ramifications of this semantics. The following example program creates a function \(f\) that returns \(1\) on non-zero input and returns the identity function when applied to \(0\). We cast this function to the type \(\mathtt{Int}\to\mathtt{Int}\) on two separate occasions, cast \(\ell_3\) and cast \(\ell_4\), to create \(g\) and \(h\). We apply \(g\) to \(1\) and \(h\) to its result. \[ \begin{array}{l} \mathtt{let}\,f = \left(\lambda x:\mathtt{Int}.\; \begin{array}{l} \mathtt{if}\, x \,\mathtt{then}\, (0: \mathtt{Int}\Rightarrow^{\ell_1}\,\star)\\ \mathtt{else}\, ((\lambda y:\mathtt{Int}.\; y) : \mathtt{Int}\to\mathtt{Int}\Rightarrow{\ell_2} \, \star) \end{array} \right) \; \mathtt{in} \\ \mathtt{let}\,g = (f : \mathtt{Int}\to\star \Rightarrow^{\ell_3} \mathtt{Int}\to\mathtt{Int})\, \mathtt{in} \\ \mathtt{let}\,h = (f : \mathtt{Int}\to\star \Rightarrow^{\ell_4} \mathtt{Int}\to\mathtt{Int})\, \mathtt{in} \\ \mathtt{let}\,z = (g \; 1)\, \mathtt{in} \\ \quad (h\; z) \end{array} \] The meaning of this program is \(\{ \mathsf{blame}\,\ell_3, \mathsf{blame}\,\ell_4\}\). To understand this outcome, we can analyze the meaning of the various parts of the program. (The semantics is compositional!) Toward writing down the denotation of \(f\), let's define auxiliary functions \(id\) and \(F\). \begin{align*} id(n) &= \mathsf{false} \\ id(\emptyset) &= \mathsf{true} \\ id(v \mapsto v') &= (v = v') \\ id(v_1 \sqcup v_2) &= id(v_1) \text{ and } id(v_2) \\ id(\mathsf{blame}\,\ell) &= \mathsf{false} \\ \\ F(n) &= \textsf{false} \\ F(\emptyset) &= \textsf{true} \\ F(0 \mapsto v) &= \mathit{id}(v) \\ F(n \mapsto 0) &= (n \neq 0)\\ F(v_1 \sqcup v_2) &= F(v_1) \text{ and } F(v_2) \\ F(\mathsf{blame}\,\ell) &= \mathsf{false} \end{align*} The denotation of \(f\) is \[ \mathcal{E}[\!| f |\!] = \{ v \mid F(v) \} \] To express the denotation of \(g\), we define \(G\) \begin{align*} G(n) &= \textsf{false} \\ G(\emptyset) &= \textsf{true} \\ G(n \mapsto 0) &= (n \neq 0) \\ G(v_1 \sqcup v_2) &= G(v_1) \text{ and } G(v_2) \\ G(\mathsf{blame}\,\ell) &= \mathsf{false} \end{align*} The meaning of \(g\) is all the values that satisfy \(G\) and also \(\mathsf{blame}\,\ell_3\). \[ \mathcal{E}[\!| g |\!] = \{ v \mid G(v) \} \cup \{ \mathsf{blame}\, \ell_3 \} \] The meaning of \(h\) is similar, but with different blame. \[ \mathcal{E}[\!| h |\!] = \{ v \mid G(v) \} \cup \{ \mathsf{blame}\, \ell_4 \} \] The function \(g\) applied to \(1\) produces \(\{ 0, \mathsf{blame}\, \ell_3\}\), whereas \(h\) applied to \(0\) produces \(\{ \mathsf{blame}\, \ell_4\}\). Thus, the meaning of the whole program is \(\{ \mathsf{blame}\,\ell_3, \mathsf{blame}\,\ell_4\}\).

Because cast \(\ell_3\) signals an error, one might be tempted to have the meaning of \(g\) be just \(\{ \mathsf{blame}\,\ell_3\}\). However, we want to allow implementations of this language that do not blame \(\ell_3\) (\(g\) is never applied to \(0\) after all, so its guilt was not directly observable) and instead blame \(\ell_4\), who was caught red handed. So it is important for the meaning of \(g\) to include the subset of values from \(f\) that have type \(\mathtt{Int}\to\mathtt{Int}\) so that we can carry on and find other errors as well. We shall expect implementations of this language to be sound with respect to blame, that is, if execution results in blame, it should blame one of the labels that is in the denotation of the program (and not some other innocent cast).

Let us return to the example (P0). The denotation of that program is \(\{\mathsf{blame}\,\ell_2\}\) because the cast at \(\ell_2\) is a cast to \((\mathtt{Int}\to \mathtt{Int}) \to \mathtt{Int}\) and the identity function is not of that type. The other case at \(\ell_1\) is innocent because it is a cast to \(\star\) and all values are of that type, including the identity cast.

Discussion

By giving the cast calculus a denotational semantics in terms of finite function tables, it became straightforward to define whether a function value is of a given type. This in turn made it easy to define the meaning of casts, even casts at function type. A cast succeeds if the input value is of the target type and it fails otherwise. With this semantics we assign blame to a cast in an eager fashion, without the need for the blame tracking machinery that is present in the operational semantics.

We saw an example program where the reduction semantics that factors through ground types attributes blame to a cast that the denotational semantics says is innocent. This lends some evidence to that semantics being less desirable.

I plan to investigate whether the alternative reduction semantics is sound with respect to the denotational semantics in the sense that the reduction semantics only blames a cast if the denotational semantics says it is guilty.

Appendix

We give the full definition of the cast calculus here in the appendix. The relation \(\sqsubseteq\) that we used to define table lookup is the dual of the subtyping relation for intersection types. The denotational semantics is a mild reformation of the intersection type system that I discussed in previous blog posts.

Syntax \[ \begin{array}{lcl} A &::=& \mathtt{Int} \mid A \to B \mid \star \\ e &::= &n \mid \mathit{op}(e,e) \mid \mathtt{if}\, e\, \mathtt{then}\, e \,\mathtt{else}\, e \mid x \mid \lambda x{:}A \mid e \; e \mid e : A \Rightarrow^\ell B \end{array} \] Consistency \begin{gather*} \mathtt{Int} \sim \mathtt{Int} \qquad \frac{A \sim B \qquad A' \sim B'} {A \to A' \sim B \to B'} \qquad A \sim \star \qquad \star \sim B \end{gather*} Type System \begin{gather*} \frac{}{\Gamma \vdash n : \mathtt{Int}} \quad \frac{\Gamma \vdash e_1 : \mathtt{Int} \quad \Gamma \vdash e_2 : \mathtt{Int}} {\Gamma \vdash \mathit{op}(e_1,e_2) : \mathtt{Int}} \\[2ex] \frac{\Gamma \vdash e_1 : \mathtt{Int} \quad \Gamma \vdash e_2 : A \quad \Gamma \vdash e_3 : A} {\Gamma \vdash \mathtt{if}\, e_1\, \mathtt{then}\, e_2 \,\mathtt{else}\, e_3 : A} \\[2ex] \frac{x{:}A \in \Gamma}{\Gamma \vdash x : A} \quad \frac{\Gamma,x{:}A \vdash e : B}{\Gamma \vdash \lambda x{:}A.\; e : A \to B} \quad \frac{\Gamma e_1 : A \to B \quad \Gamma e_2 : A} {\Gamma \vdash e_1 \; e_2 : B} \\[2ex] \frac{\Gamma \vdash e : A \quad A \sim B} {\Gamma \vdash (e : A \Rightarrow^\ell B) : B} \end{gather*} Values \[ v ::= n \mid \emptyset \mid v \mapsto v \mid v \sqcup v \mid \mathsf{blame}\,\ell \] Table Lookup (Value Information Ordering) \begin{gather*} \frac{}{n \sqsubseteq n} \quad \frac{v'_1 \sqsubseteq v_1 \quad v_2 \sqsubseteq v'_2} {v_1 \mapsto v_2 \sqsubseteq v'_1 \mapsto v'_2} \quad \frac{}{\mathsf{blame}\,\ell \sqsubseteq \mathsf{blame}\,\ell} \\[2ex] \frac{}{v_1 \sqsubseteq v_1 \sqcap v_2} \quad \frac{}{v_2 \sqsubseteq v_1 \sqcap v_2} \quad \frac{v_1 \sqsubseteq v_3 \quad v_2 \sqsubseteq v_3} {v_1 \sqcup v_2 \sqsubseteq v_3} \\[2ex] \frac{}{v_1 \mapsto (v_2 \sqcup v_3) \sqsubseteq (v_1 \mapsto v_2) \sqcup (v_1 \mapsto v_3)} \quad \frac{}{\emptyset \sqsubseteq v_1 \mapsto v_2} \quad \frac{}{\emptyset \sqsubseteq \emptyset} \end{gather*} \noindent Semantics of Types \begin{align*} \mathcal{T}(\mathtt{Int}, v) &= (\exists n. \; v = n) \\ \mathcal{T}(\star, v) &= \mathsf{true} \\ \mathcal{T}(A \to B, n) &= \mathsf{false} \\ \mathcal{T}(A \to B, \emptyset) &= \mathsf{true} \\ \mathcal{T}(A \to B, v \mapsto v') &= \mathcal{T}(A, v) \text{ and } \mathcal{T}(B, v') \\ \mathcal{T}(A \to B, v_1 \sqcup v_2) &= \mathcal{T}(A \to B, v_1) \text{ and } \mathcal{T}(A \to B, v_2) \\ \mathcal{T}(A\to B, \mathsf{blame}\,\ell) &= \mathsf{false} \end{align*} Denotational Semantics \begin{align*} \mathcal{E}[\!| n |\!]\rho &= \{ n \}\\ \mathcal{E}[\!| \mathit{op}(e_1,e_2) |\!]\rho &= \left\{ v \middle| \begin{array}{l} \exists v_1 v_2 n_1 n_2.\; v_1 \in \mathcal{E}[\!| e_1 |\!]\rho \land v_2 \in \mathcal{E}[\!| e_2 |\!]\rho \\ \land\; n_1 \sqsubseteq v_1 \land n_2 \sqsubseteq v_2 \land v = [\!| \mathit{op} |\!](n_1,n_2) \end{array} \right\}\\ & \quad\; \cup \{ \mathsf{blame}\,\ell' \mid \mathsf{blame}\,\ell' \in (\mathcal{E} [\!| e_1 |\!]\rho \cup \mathcal{E} [\!| e_2 |\!]\rho) \} \\ \mathcal{E}[\!| \mathtt{if}\, e_1\, \mathtt{then}\, e_2 \,\mathtt{else}\, e_3 |\!]\rho &= \left\{ v \middle| \begin{array}{l} \exists v_1 n. v_1 \in \mathcal{E}[\!| e_1 |\!]\rho \land n \sqsubseteq v_1 \\ \land\; (n = 0 \Longrightarrow v \in \mathcal{E}[\!| e_3 |\!]\rho) \\ \land\; (n \neq 0 \Longrightarrow v \in \mathcal{E}[\!| e_2 |\!]\rho) \end{array} \right\}\\ & \quad\; \cup \{ \mathsf{blame}\,\ell' \mid \mathsf{blame}\,\ell' \in (\mathcal{E} [\!| e_1 |\!]\rho \cup \mathcal{E} [\!| e_2 |\!]\rho \cup \mathcal{E} [\!| e_3 |\!]\rho ) \} \\ \mathcal{E}[\!| x |\!]\rho &= \{ \rho(x) \}\\ \mathcal{E}[\!| \lambda x{:}A.\, e|\!]\rho &= \{ v \mid \mathcal{F}(v, \lambda x{:}A.\, e, \rho) \} \\ \mathcal{E}[\!| e_1 \, e_2 |\!]\rho &= \left\{ v \middle| \begin{array}{l} \exists v_1 v_2.\; v_1 \in \mathcal{E}[\!| e_1 |\!]\rho \land v_2 \in \mathcal{E}[\!| e_2 |\!]\rho \\ \land\; (v_2 \mapsto v) \sqsubseteq v_1 \end{array} \right\} \\ & \quad\; \cup \{ \mathsf{blame}\,\ell' \mid \mathsf{blame}\,\ell' \in (\mathcal{E} [\!| e_1 |\!]\rho \cup \mathcal{E} [\!| e_2 |\!]\rho) \} \\ \mathcal{E} [\!| e : A \Rightarrow^{\ell} B |\!]\rho &= \{ v \mid v \in \mathcal{E} [\!| e |\!]\rho \text{ and } \mathcal{T}(B, v) \}\\ & \quad\; \cup \left\{ \mathsf{blame}\,\ell \middle| \begin{array}{l}\exists v.\; v \in \mathcal{E} [\!| e |\!] \rho \text{ and } \neg \mathcal{T}(B, v)\\ \text{and } (\forall \ell'. v \neq \mathsf{blame}\,\ell') \end{array} \right\} \\ & \quad\; \cup \{ \mathsf{blame}\,\ell' \mid \mathsf{blame}\,\ell' \in \mathcal{E} [\!| e |\!]\rho \} \\ \mathcal{F}(n, \lambda x{:}A.\, e, \rho) &= \mathsf{false} \\ \mathcal{F}(\emptyset, \lambda x{:}A.\, e, \rho) &= \mathsf{true} \\ \mathcal{F}(v \mapsto v', \lambda x{:}A.\, e, \rho) &= \mathcal{T}(A,v) \text{ and } v' \in \mathcal{E}[\!| e |\!]\rho(x{:=}v) \\ \mathcal{F}(v_1 \sqcup v_2, \lambda x{:}A.\, e, \rho) &= \mathcal{F}(v_1, \lambda x{:}A.\, e, \rho) \text{ and } \mathcal{F}(v_2, \lambda x{:}A.\, e, \rho) \\ \mathcal{F}(\mathsf{blame}\,\ell, \lambda x{:}A.e, \rho) &= \mathsf{false} \end{align*}

References

  • (Findler 2002) Contracts for higher-order functions. R. B. Findler and M. Felleisen. International Conference on Functional Programming. 2002.
  • (Gronski 2006) Sage: Hybrid Checking for Flexible Specifications. Jessica Gronski and Kenneth Knowles and Aaron Tomb and Stephen N. Freund and Cormac Flanagan. Scheme and Functional Programming Workshop, 2006.
  • (Scott 1970) Outline of a Mathematical Theory of Computation. Dana Scott. Oxford University. 1970. Technical report PRG-2.
  • (Scott 1976) Data Types as Lattices. Dana Scott. SIAM Journal on Computing. 1976. Volume 5, Number 3.
  • (Siek 2009) Exploring the Design Space of Higher-Order Casts. Jeremy G. Siek and Ronald Garcia and Walid Taha. European Symposium on Programming. 2009.
  • (Wadler 2009) Well-typed programs can't be blamed. Philip Wadler and Robert Bruce Findler. European Symposium on Programming. 2009.

1 comment:

  1. Great post! I found these explanations very interesting, helpful and thought-provoking.

    In particular: I must admit that (1) I always prefer an operational view of semantics, if there is one that's applicable and (2) I humbly take objection to classifying casts as values.

    In particular, I'd prefer an operational account of casts/blame that does not classify casts as values (that is, the ordinary syntax of values is preserved, and has no cast form).

    So my question is: Has anyone explored an environment-passing-style way of doing casts that accomplishes (1) and (2) above?

    Specifically, suppose I extend environments with lists of "cast types" for each variable-value pairing:

    Closing environments (with cast-type lists):
    ρ ::= ε | ρ, x : v ► τs
    τs ::= ε | τs, τ

    We could normalize the list of types, where we coalesce types that are equivalent. Alternatively, we could keep the whole list and just check against *all* of them. Those two approaches should be equivalent (ignoring space/time costs).

    Finally, I’m proposing that we hijack the usual environment-projection notation `ρ(x) = v` to mean something like this:

    (x : v ► τs) in Γ
    ⊢ v : τs
    ----------------- :: env-lookup
    ρ(x) "is" v

    Where `⊢ v : τs` is the ordinary notion of value typing, but “lifted” to lists of types.

    I should note that we're doing something like this in our "VM of the Future" work (https://arxiv.org/abs/1608.06012). In particular, we use environment-passing style, and keep type information in the environment as we interleave program evaluation and checking. My proposal above is an attempt to transport that idea back into the world of gradual typing.

    Any comments or advice is much appreciated!

    ReplyDelete