Monday, October 08, 2012

Is TypeScript gradually typed? Part 2

Consider the following TypeScript program, in which a number is stored in variable x of type any and then passed to the display function that expects a string. As we saw in the previous post, a gradual type system allows the implicit down-cast from any to string, so this is a well-typed program.

function display(y : string) {
   document.body.innerHTML = y.charAt(0);
}

var x : any = 3;
display(x);
But what happens at run-time? The answer for TypeScript is that this program hits an error at the y.charAt(0) method call because charAt is not supported by numbers like 3. But isn't y guaranteed to be a string? No, not in TypeScript. TypeScript does not guarantee that the run-time value in a variable is consistent with the static type of the variable. The reason for this is simple, TypeScript does not perform any run-time checks at down-casts to ensure that the incoming value is of the target type. In the above program, the call display(x) causes an implicit cast to string, but there's no run-time check to make sure that the value is in fact a string. TypeScript is implemented as a compiler to JavaScript, and the compiler simply ignores the implicit casts. Let's refer to this no-checking semantics as level 1 gradual typing. I briefly describe this approach in the paper Gradual Typing for Objects.

Level 2 Gradual Typing

A second alternative semantics for gradual typing is to perform run-time checking to ensure that values are consistent with the static types. For implicit casts concerning simple types like number and string, this run-time checking is straightforward. In the above example program, an error would be signaled just prior to the call to display, saying that display expects a string, not a number.

For implicit casts concerning complex types, such as function and object types, run-time checking is more subtle. Consider the following program that defines a deriv function that takes another function as a parameter.

function deriv(d:number, f:(number)=>number, x:number) {
 return (f(x + d) - f(x - d)) / (2.0 * d);
}

function fun(y):any {
 if (y > 0)
  return Math.pow(y,3) - y - 1;
 else
  return "yikes";
}

deriv(0.01, fun, 3.0);
deriv(0.01, fun, -3.0);
The function fun has type (any)=>any, and at each call to deriv, this function is implicitly cast to (number)=>number. The fundamental challenge in casting functions is that it's impossible to tell in general how a function will behave, and in particular, what the return value will be. Here we don't know whether fun will return a number or a string until we've actually called it.

The standard way to deal with function casts is to delay the checking until subsequent calls. One way to visualize this semantics is to imagine the compiler generating the following wrapper function, casted_fun, that applies casts to the argument and return value.

function casted_fun(z:number):number {
 return <number>fun(<any>z);
}

deriv(0.01, casted_fun, 3.0);
deriv(0.01, casted_fun, -3.0);

My first two papers on gradual typing, Gradual Typing for Functional Languages and Gradual Typing for Objects, both used level 2 gradual typing.

Level 3 Gradual Typing

The down-side of delayed checking of function casts is that when an error is finally caught, the location of the error can be far away from the cast that failed. In the above example, the error would occur during the call f(x + d), not at the call to deriv. Findler and Felleisen solved this problem by introducing the notion of blame tracking in their paper Contracts for higher-order functions. The idea is to associate source location information with each cast and then to carry along this information at run-time, in the wrapper functions, so that when the cast in a wrapper fails, it can emit an error that mentions the source location of the original cast, in this example, the call to deriv.

Implementing casts in a way that supports blame tracking while also keeping space overheads to a constant factor is challenging. My paper Threesomes, With and Without Blame shows how to do this.

Discussion

Each of the three levels comes with some advantages and disadvantages. Level 1 gradual typing is the easiest to implement, an important engineering concern, and it comes with no run-time overhead, as there is no run-time checking. On the other hand, level 1 gradual typing does not provide run-time support for catching broken invariants, such as deriv's expectation that its arguments have type string. Thus, a TypeScript programmer that really wants to enforce such an invariant would need to add code to check the type of the argument, a common practice today in JavaScript.

Level 2 gradual typing ensures that the value stored in a variable is consistent with the variable's static type and it provides the run-time checking to catch when this invariant is about to be broken. Thus, level 2 gradual typing removes the need for hand-written type tests. Also, level 2 gradual typing opens up the possibility of compiling statically-typed regions of a program in a more efficient, type-specific manner. (This is an active area of my current research.) The disadvantages of level 2 gradual typing are the run-time overhead from cast checking and the increased implementation complexity.

Level 3 gradual typing improves on level 2 by adding blame tracking, thereby improving the diagnostic errors reported when a cast fails. The extra cost of blame tracking is not very significant, so I would always suggest level 3 over level 2.

6 comments:

  1. Martin12:09 AM

    The goal of TypeScript is to replace the "Any" type completely, and thus to catch all wrong types. The purpose for the Any type is only to support "backwards"-compatibility. Your code example is just broken, as you should really not use Any if you don't really need it. If you have some old JavaScript code or a framework not made by you, then you should instead introduce type definitions to support type helping for those too, then you don't need to use the Any type.

    ReplyDelete
    Replies
    1. The example is representative of situations in which statically typed code interacts with dynamically typed code. There is a lot of JavaScript code out there, so such interactions are going to be the norm for a very long time.

      Regarding whether all that JavaScript code should eventually be rewritten in TypeScript, my position is that the natural steady state should include both statically and dynamically-typed code. They both have their strengths and weaknesses, as we're finding out in studies by researchers like Hanenberg and Tanter.

      I'd be interested to hear where you heard about this goal to replace "any", or is that your personal interpretation?

      Delete
    2. Martin8:36 AM

      This is my personal interpretation and I should have made that clear, sorry about that. But still I see the Any type only for legacy code, and there you obviously don't have types. TypeScript doesn't want to offer run time type checks, because that would also slow down the code. For JavaScript you really should just write definition files, then you don't have to use the Any-type anymore and you get perfectly good type checking.

      Delete
  2. TypeScript does not have enough type constructs to eliminate 'any' completely.

    Consider this method where I need an option type:

    // bytes: (string|number[])
    var toByteArray = function (bytes) {
    if (bytes.substr === undefined) {
    return bytes;
    }

    var strBytes = bytes;

    var oBytes: number[] = [];

    for (var i = 0; i < strBytes.length; i++) {
    oBytes.push(strBytes.charCodeAt(i));
    }

    return oBytes;
    }

    Since there is no option type (yet?), I am stuck with any.

    However, what is neat about TypeScript is the ability to have a hybrid type:

    var a: { propA:string }; // Type checked
    a['dynamicProp'] = "New property"; // Not-typed checked - programmer's responsibility

    This can help eliminate many use cases for 'any'.

    Also, as TypeScript matures, it should improve its dynamic type checking skills.
    It would be great if the compiler gets mature enough to allow the following:

    var a: { propA:string };
    //a: {propA:string}
    a['dynamicProp'] = "New property";
    //a: {propA:string; dynamicProp:string} - during the rest of local scope

    Thanks for the blog

    ReplyDelete
  3. Forbidding [any] would drastically disconnect TypeScript from JavaScript, and not for he better. A real strength of jS is the ability to understand messages (mostly in JSON) in a very tolerant way. If [any] was not fully supported or deprecated, you'd enforce contracts for messages, and take a look at XML schemes and all the versioning nightmare you gain. The web runs on Postel's law.

    ReplyDelete
  4. Thanks!

    I was referred to your blog from the soundness discussion on TypeScript language forum (http://typescript.codeplex.com/discussions/428572). The cavalier attitude to type soundness in TypeScript has been quite disappointing. Contracts as per Findler & Felleisen would be ideal, but I expected at the minimum to have a language subset with a sound type system; unfortunately there isn't one, or at least this subset does not include functions.

    Here is an example that fails with a type error without even using the `any` type:

    http://gist.github.com/4449351#file-unsound-ts

    I maintain a similar system, WebSharper, that compiles F# to JavaScript. We are really enjoying the sound and reliable ML core of F#. Yes, F# has a universal type (obj), but it has explicit downcasts - so you can stray from the sound core, but you have to be explicit about it, similar to "unsafePerformIO" in Haskell. While this has worked OK, type errors still leak in, mostly through the FFI layer where JavaScript functions are imported with an incorrect type declaration. So now we are looking at implementing contract checking at the FFI boundary. I will be reading your papers on contracts.

    -- toyvo

    ReplyDelete