Last update April 29, 2009

Ownership Types In D

Difference (last change) (Author, normal page display)

Changed: 99,101c99,104

# Objects and structs use the local interface[[code]
f.push(5); // Error: push is not part of the scope interface
temp.push(5); // Okay, push is part of the local interface

# Stack Objects and structs use the local interface, by default. [[code]
auto sf = new auto(Stack!(Foo))(); // declared as class Stack(T) {}, stack interface defaults
auto f1 = new Foo();
auto f2 = new auto(Foo)();
sf.push(f1); // Okay, push(local Foo) is defined
sf.push(f2); // Error: push(stack Foo) is is not defined, use a scope class declaration instead

A proposal for a concurrency type system for D

Highlights

Fixes Bug 2095
Scope delegates do not require heap allocation (i.e. may safely behave like D 1.0).
Permits thread-local garbage collection
Permits multiple threading models.
No costly escape analysis.
No complex function constraints.

Background

Why a type system for concurrency?

Mostly, people talk about threading models, i.e. locks, actors, message passing, which concerns themselves with how one shares state. But, consider David Callahan’s Pillars of concurrency: Isolation, Scalability and Consistency. Isolation is provided by the type system. Essentially, its job is to separate shared state from un-shared state. In general, the former has to be designed in order to prevent deadlocks, races, etc. and has to use slower techniques, such as memory fences or locks. Un-shared state, on the other hand, doesn’t require this overhead and/or can use other algorithms and is therefore faster. The type system can also allow for the garbage collector to use thread-local heaps, which increases both its performance and scalability. The type system also helps the consistency of the language (although Callahan’s meaning was specific to shared state).

What’s thread local heaps?

A thread local heap is just like it sounds: a heap of memory that solely belongs to a single thread. This means that: 1) a lock isn’t needed during allocation, etc. 2) collection can be run by the parent thread so there’s no expensive kernel calls, context switches, etc. 3) each thread’s heap is relatively smaller and only visible by its own stack, reducing spurious pointers and collection time. 4) collection occurs in parallel with other running threads, i.e. a worker thread may collect without pausing an interactive GUI or rendering thread. Thus they increase performance and scalability. The caveat is, that shared state doesn’t get these benefits and must use a single, shared heap.

Where’s D at?

Currently, a ‘shared’ and ‘scope’ type have been implemented in D 2.0, though there are no rules (or even correct error messages) associated with them yet. Walter posted on his blog a while ago about escape analysis. Escape analysis determines how and in what ways a piece of state moves between scopes, i.e. a reference to a ‘local’ class is saved to a ‘shared’ class, and if it is valid, i.e. not in this case. While possible in dynamic languages, in static languages virtual functions and function pointers generally prohibit escape analysis. Therefore, Walter suggested using the type system to document a function’s properties and there was a decent discussion about it in the newsgroup about how to do this. Notably, there was a proposal by Michel Fortin about using constraints on the relative scopes of a function’s parameters, similar to template constraints. Bartosz is currently reading/research threading models their associated concurrency type systems. He has been posting very informative blogs about them for those interested.

Overview

types	Description	Ownership	Transitivity
scope	super-interface	unknown	deep†
stack	current scope	stack	implicit
local	object default	local-heap	deep†
shared	thread safe	shared-heap	deep†
mobile	unique objects	shared-heap	shallow

This proposal concerns using five distinct ownership types to provide the same protection and almost the same flexibility as complete escape analysis. It has the advantage of not requiring complex analysis or ownership inference/propagation. The programmer also need not decorate each function call with complex constraints. ‘Scope’ acts as a common super-like interface, allowing many functions to only be written once and not a combinatorial number of times for each possible type combination. As such, it fills the same role as const does for immutable-mutable type system. However, compared to previously proposed ‘no escape’ types, it’s less conservative and the rules providing its isolation properties are listed in the section on ‘scope’ below. ‘local’ objects are restricted to the current thread while ‘shared’ objects may be shared between threads. An object with a single owner is typically referred to as being unique or ‘mobile’ and allows an object to be shared, but retain the simplicity and performance of local objects. There is also an implicit ownership class for all data located on the stack.

This plugs several well known holes in current stack classes and prevents pointers to variables on the stack from escaping, both to the heap and to shallower locations on the stack. ‘mobile’ is similar to other unique pointer wrappers, such as in std.typecons. The principal reason for making this a language level construct, instead of the current library solution is that one of the major benefits of ‘scope’ is the ability to pass a ‘mobile’ to a function as ‘scope’ without move semantics, in many situations, making writing code easier and more generic.

Classes support qualifier polymorphism, like to Boyapati and Rinard’s GRFJ (and probably others). i.e. each ownership type is considered a template parameter to the class. However, unlike GRFJ, it is a single, implicit property, and the parameters of the classes’ methods may mix and match valid ownership types as needed. Class definers must also declare which ownership classes are supported. Thus, by default a class can not be shared. This creates a strong separation between the ownership types, which results in clean isolation.

†Transitivity

Technically, all ownership types are shallow, as they represent the physical location of an object’s memory. Thus, using transitivity is mostly about syntactic sugar. This doesn’t reduce expressiveness as scope visibility rules always moves the scope broader. i.e. an object on the local heap can not store a reference to the stack without casting.

Issues

scope(T) in a function body conflicts with scope guard statements. This is a general problem with Walter’s choice of using the scope keyword for this concept. A clean solution is to mandate the use of {} in scope guard statements. Others include using an alternative keyword(auto, final, scope!(exit)), let the ambiguity stand, add a keyword, etc.
Clear, concise ddoc documentation is an issue (Multiple entries per class (one per ownership type) vs One large interleaved entry). This is a general problem with templated classes.

scope [ Common Super, Unknown Allocation, Transitive† ]

Use of the scope keyword for the common ownership-type is based upon Walter’s original escape analysis blog. However, this design is based upon using the type system restrictions as opposed to full escape analysis to prevent object escape. Full escape analysis would alleviate the restrictions in rule 6. Basic Rules:

Refers to scope definitions inside a function body.

May only be assigned at declaration

scope Node!(int) n = mySharedNode;
n.next = new Node!(int)(); // Error: Possible escape
n = n.next;                // Error: see relaxation of this rule below

Applies to references taken from scope types

scope int* value = &(n.value);   
scope(int)* v2   = &(n.value);

Implicit conversion is always fully transitive. See Bug 2095
Foo[] y; scope Foo[] x = y;

Mixed implicit conversion is illegal.

See Bug 2095

scope(Foo)[] z = y; // Error: cannot implicitly convert...

Functions with (scope U) ref, out, * or return parameters are said to be scope_ escape(T) where T is U, a member return of U or subtype of U.

Implicit conversions of stack T to scope T are illegal if a function is scope_escape(T). This prevents deep stack objects escaping to shallower contexts.
A mobile T may be passed to a non-scope_escape(T) function _without_ movement if it is not also passed to a another, mobile parameter.

Relaxation of Rule 2

Technically, only the tail of a scope type must obey rule 2). Therefore, assigning to the head of a scope type is valid. This allows for more imperative style programming and for things like swap to be valid, however, I don’t know how difficult this is to implement.

n = n.next;
auto n2 = n;
swap(n, n2);
swap(n, n.next); // Error: Cannot take the reference of a scope tail
Node!(int) m = new Node!(int)();
swap(n, m); // Error: m is local, not scope

Relaxation of Rule 6

Rule 6 may be partially relaxed using local analysis of the function for the escape of each particular variable. Practically, this might not help much since it would have to treat called functions or receiving functions in a conservative manner, i.e. if it could happen assume it does. This is a local escape analysis system; a whole-program escape analysis system, would eliminate the need for this rule.

Interfaces to Scope Objects (or structs)

The interface to scope objects is automatically generated from the intersection of the public shared and local interfaces of the class. Member variables that only differ by ownership and member functions that only differ by their return’s ownership are included and considered of scope ownership.

stack [ Current Scope, Stack Allocation, Implicit ]

This is the ownership type of all things located on a thread’s stack. As the keyword stack should not be reserved, I’m choosing to not have a keyword and just have new scope(Foo) or new auto(Foo) return a stack allocated object, with a type that’s internal to the compiler. Rules:

Refers to all variables located on the stack.

scope Foo f  = new Foo();        // Old syntax. Sugar for
auto  temp   = new auto(Foo)();  // auto used to be used for RAII (DMD 0.043)
auto  temp2  = new scope(Foo)(); // other possible syntax
scope Foo f2 = temp;
int x        = 3;                // Including value types

Shallow, does no alter the tail type
int* y = new int; *y = x;

Applies to references taken from stack types

int* z = &x;                    // Error: can not convert type auto(int)* to int*

Stack Objects and structs use the local interface, by default.

auto sf = new auto(Stack!(Foo))();  // declared as class Stack(T) {}, stack interface defaults
auto f1 = new Foo();
auto f2 = new auto(Foo)();
sf.push(f1);                        // Okay,  push(local Foo) is defined
sf.push(f2);                        // Error: push(stack Foo) is is not defined, use a scope class declaration instead

Note that this catches all of Walter’s examples from the Escape Analysis blog via rule 3:

int* foo()
{
    int x = 3;
    return &x;                         // Error: can not convert type auto(int)* to int*
}

int* bar(int* p) { return p; }
int* foo()
{
    int x = 3;
    return bar(&x);                    // Error: can not convert type auto(int)* to int*
}

void abc(int x, int** p) { *p = &x; } // Error: can not convert type auto(int)* to int*

Heap types

types	Description	Allocation	Transitivity
local	Object Default	Local-Heap?	Deep†
shared	Thread Safe	Shared-Heap?	Deep†
mobile	Unique Objects	Shared-Heap?	Shallow

There are three styles of heap allocated objects: default (a.k.a. local), shared and mobile. A class is implicitly templated on each style of heap allocation and only inherits from the super type of the same style. Thus each style may have very different interfaces, though all implicitly implement the automatically generated ‘scope’ interface. Mobile references implement move semantics, with the exception of scope rule 6b. Thus mobiles do not require garbage collection themselves (though they still need to be scanned), since they can be deterministically deleted on scope exit.

Class Instantiation

auto a = new T();         // object allocated using the class’ default
auto b = new shared(T)(); //   safe shared object
auto c = new mobile(T)(); // unsafe shared object, protected by move semantics

Class Definitions

Class Definitions	Restricted to	Default
class	local (deprecated stack)	local
scope class	stack	stack
shared(_model_) class	shared	shared
shared( mobile) class	mobile	mobile
shared class	local shared	local
mobile class	local mobile	local
shared mobile class	local shared mobile	local
shared scope class	stack local shared	local
mobile scope class	stack local mobile	local
shared mobile scope class	stack local shared mobile	local
shared(_model_) mobile class	shared mobile	shared

Rules: shared(_model_, ...) defines both allocation and protection methodology. It may apply to variable, function or class definitions. Each methodology provides a set of rules and optional syntactic sugar specific to that style of thread-safe programming. This also provides a way of simply adding new styles of thread programming as they are developed. Here are some initial suggestions:

Model	Description
unsafe	Local/stack members are invalid. Members default to shared.
volatile	unsafe + sequential consistency guaranteed.
mobile	volatile + represents the mobile ownership class.
manual	volatile + no public mutable member variables. Default model.
atomic	manual + all methods must be called from STM atomic blocks
synchronized	manual + all methods wrapped in synchronized blocks
actor	manual + methods may only take shared or mobile parameters. Methods are automatically wrapped with the appropriate runtime backend. i.e. task creation and returning of a future/promise	etc.

Conditional compilation Extensions to the is expression syntax, e.g. is(T==mobile), make scopeof(T) templates, etc possible. One possible piece of syntactic sugar, is for scope to mean scopeof(this) inside classes.