Types
Every variable, item and value in a Rust program has a type. The type of a value defines the interpretation of the memory holding it.
Built-in types and type-constructors are tightly integrated into the language, in nontrivial ways that are not possible to emulate in user-defined types. User-defined types have limited capabilities.
Primitive types
The primitive types are the following:
- The boolean type bool, with values true and false.
- The machine types (integer and floating-point).
- The machine-dependent integer types.
- Arrays
- Tuples
- Slices
- Function pointers
Machine types
The machine types are the following:
- The unsigned word types u8, u16, u32 and u64, with values drawn from the integer intervals [0, 2^8 - 1], [0, 2^16 - 1], [0, 2^32 - 1] and [0, 2^64 - 1] respectively.
- The signed two's complement word types i8, i16, i32 and i64, with values drawn from the integer intervals [-(2^7), 2^7 - 1], [-(2^15), 2^15 - 1], [-(2^31), 2^31 - 1] and [-(2^63), 2^63 - 1] respectively.
- The IEEE 754-2008 binary32 and binary64 floating-point types: f32 and f64, respectively.
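As a minimal sketch of the intervals listed above, the bounds can be read back from the MIN and MAX constants of each integer type. The helper byte_ranges is a hypothetical name introduced only for this illustration.

```rust
use std::mem::size_of;

// Hypothetical helper (not from the text): the (min, max) bounds of u8 and i8.
fn byte_ranges() -> ((u8, u8), (i8, i8)) {
    ((u8::MIN, u8::MAX), (i8::MIN, i8::MAX))
}

fn main() {
    let (unsigned, signed) = byte_ranges();
    assert_eq!(unsigned, (0, 255)); // [0, 2^8 - 1]
    assert_eq!(signed, (-128, 127)); // [-(2^7), 2^7 - 1]

    // The upper bound of u64 is 2^64 - 1, checked here in wider u128 arithmetic.
    assert_eq!(u64::MAX as u128, (1u128 << 64) - 1);

    // f32 and f64 occupy 32 and 64 bits respectively.
    assert_eq!(size_of::<f32>() * 8, 32);
    assert_eq!(size_of::<f64>() * 8, 64);
}
```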
Machine-dependent integer types
The usize type is an unsigned integer type with the same number of bits as the
platform's pointer type. It can represent every memory address in the process.
The isize type is a signed integer type with the same number of bits as the
platform's pointer type. The theoretical upper bound on object and array size
is the maximum isize value. This ensures that isize can be used to calculate
differences between pointers into an object or array and can address every byte
within an object along with one byte past the end.
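The relationship between these types and pointers can be sketched as follows. The helper element_distance is a hypothetical name used only here; it derives a signed element offset from two addresses within one slice.

```rust
use std::mem::size_of;

// Hypothetical helper: signed distance, in elements, from data[from] to data[to],
// computed from the raw addresses of the two elements.
fn element_distance(data: &[u32], from: usize, to: usize) -> isize {
    let start = &data[from] as *const u32 as isize;
    let end = &data[to] as *const u32 as isize;
    (end - start) / size_of::<u32>() as isize
}

fn main() {
    // Both types are exactly as wide as a (thin) pointer.
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
    assert_eq!(size_of::<isize>(), size_of::<*const u8>());

    let data = [10u32, 20, 30, 40];
    assert_eq!(element_distance(&data, 0, 3), 3);
    // A backwards distance is negative, which is why the result is an isize.
    assert_eq!(element_distance(&data, 3, 1), -2);
}
```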
Textual types
The types char and str hold textual data.
A value of type char is a Unicode scalar value (i.e. a code point that
is not a surrogate) represented as a 32-bit unsigned word in the 0x0000 to
0xD7FF or 0xE000 to 0x10FFFF range. A [char] array is effectively a UCS-4 /
UTF-32 string.
A value of type str is a Unicode string, represented as an array of 8-bit
unsigned bytes holding a sequence of UTF-8 encoded code points. Since str is of
unknown size, it is not a first-class type, and can only be instantiated
through a pointer type, such as &str.
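A short sketch of these representations follows. The helper byte_and_char_len is a hypothetical name for this illustration, and the sample string is arbitrary.

```rust
use std::mem::size_of;

// Hypothetical helper: length of a string in bytes and in chars.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // A char is always four bytes wide and holds a Unicode scalar value.
    assert_eq!(size_of::<char>(), 4);
    assert_eq!('\u{e9}' as u32, 0xE9);

    // Surrogate code points are not valid char values.
    assert!(char::from_u32(0xD800).is_none());
    assert!(char::from_u32(0x10FFFF).is_some());

    // U+00E9 takes two bytes in UTF-8, so the byte length exceeds the char count.
    let s: &str = "h\u{e9}llo";
    assert_eq!(byte_and_char_len(s), (6, 5));
}
```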
Tuple types
A tuple type is a heterogeneous product of other types, called the elements of the tuple. It has no nominal name and is instead structurally typed.
Tuple types and values are denoted by listing the types or values of their elements, respectively, in a parenthesized, comma-separated list.
Because tuple elements don't have a name, they can only be accessed by
pattern-matching or by using the index N directly as a field to access the
Nth element, counting from zero.
An example of a tuple type and its use:
type Pair<'a> = (i32, &'a str);
let p: Pair<'static> = (10, "ten");
let (a, b) = p;

assert_eq!(a, 10);
assert_eq!(b, "ten");
assert_eq!(p.0, 10);
assert_eq!(p.1, "ten");
For historical reasons and convenience, the tuple type with no elements (())
is often called ‘unit’ or ‘the unit type’.
Array and slice types
Rust has two different types for a list of items:
- [T; N], an 'array'
- &[T], a 'slice'
An array has a fixed size, and can be allocated on either the stack or the heap.
A slice is a 'view' into an array. It doesn't own the data it points to; it borrows it.
Examples:
// A stack-allocated array
let array: [i32; 3] = [1, 2, 3];

// A heap-allocated array
let vector: Vec<i32> = vec![1, 2, 3];

// A slice into an array
let slice: &[i32] = &vector[..];
As you can see, the vec! macro allows you to create a Vec<T> easily. Like
Vec<T> itself, the vec! macro is part of the standard library, rather than the
language.
All in-bounds elements of arrays and slices are always initialized, and access to an array or slice is always bounds-checked.
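The bounds check can be sketched with the get method, which reports an out-of-range index as None instead of panicking. The helper third is a hypothetical name for this illustration.

```rust
// Hypothetical helper: the element at index 2, or None when out of bounds.
fn third(items: &[i32]) -> Option<i32> {
    items.get(2).copied()
}

fn main() {
    let array: [i32; 3] = [1, 2, 3];
    let short: &[i32] = &array[..2]; // a slice viewing the first two elements

    assert_eq!(third(&array), Some(3));
    assert_eq!(third(short), None);

    // Indexing uses the same bounds check but panics instead of returning None:
    // short[2] would panic at run time rather than read past the end of the slice.
}
```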
Struct types
A struct type is a heterogeneous product of other types, called the
fields of the type. struct types are analogous to struct types in C,
the record types of the ML family,
or the struct types of the Lisp family.
New instances of a struct can be constructed with a struct expression.
The memory layout of a struct is undefined by default to allow for compiler
optimizations like field reordering, but it can be fixed with the
#[repr(...)] attribute. In either case, fields may be given in any order in
a corresponding struct expression; the resulting struct value will always
have the same memory layout.
The fields of a struct may be qualified by visibility modifiers, to allow
access to data in a struct outside a module.
A tuple struct type is just like a struct type, except that the fields are anonymous.
A unit-like struct type is like a struct type, except that it has no fields. The one value constructed by the associated struct expression is the only value that inhabits such a type.
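The three struct forms can be sketched as follows. The types Point, Meters and Marker and the helper make_points are hypothetical names chosen for this illustration.

```rust
use std::mem::size_of;

#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// A tuple struct: its single field is anonymous and accessed as .0.
#[derive(Debug, PartialEq)]
struct Meters(f64);

// A unit-like struct: no fields, exactly one value.
#[derive(Debug, PartialEq)]
struct Marker;

// Hypothetical helper: builds the same point twice, listing the fields in
// a different order in each struct expression.
fn make_points() -> (Point, Point) {
    (Point { x: 1, y: 2 }, Point { y: 2, x: 1 })
}

fn main() {
    let (a, b) = make_points();
    assert_eq!(a, b); // field order in the expression does not matter

    let distance = Meters(3.5);
    assert_eq!(distance.0, 3.5);

    assert_eq!(Marker, Marker);
    assert_eq!(size_of::<Marker>(), 0);
}
```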
Enumerated types
An enumerated type is a nominal, heterogeneous disjoint union type, denoted
by the name of an enum item. The enum type is analogous to a data
constructor declaration in ML, or a pick ADT in Limbo.
An enum item declares both the type and a number of variant
constructors, each of which is independently named and takes an optional tuple
of arguments.
New instances of an enum can be constructed by calling one of the variant
constructors, in a call expression.
Any enum value consumes as much memory as the largest variant of its
corresponding enum type, along with any space needed to record which variant
is stored.
Enum types cannot be denoted structurally as types, but must be denoted by
named reference to an enum item.
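A brief sketch of an enum item, its variant constructors, and its size follows. Shape and area are hypothetical names for this illustration.

```rust
use std::mem::size_of;

// A hypothetical enum with three independently named variant constructors.
enum Shape {
    Empty,
    Square(f64),
    Rect(f64, f64),
}

// Hypothetical helper: dispatches on the variant by pattern matching.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Empty => 0.0,
        Shape::Square(side) => side * side,
        Shape::Rect(width, height) => width * height,
    }
}

fn main() {
    // Variants that carry data are constructed with a call expression.
    let rect = Shape::Rect(3.0, 4.0);
    assert_eq!(area(&rect), 12.0);
    assert_eq!(area(&Shape::Square(2.0)), 4.0);
    assert_eq!(area(&Shape::Empty), 0.0);

    // Every Shape value is at least as large as its largest variant's payload.
    assert!(size_of::<Shape>() >= size_of::<(f64, f64)>());
}
```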
Recursive types
Nominal types (enumerations and structs) may be recursive. That is, each enum
constructor or struct field may refer, directly or indirectly, to the
enclosing enum or struct type itself. Such recursion has restrictions:
- Recursive types must include a nominal type in the recursion (not mere type definitions, or other structural types such as arrays or tuples).
- A recursive enum item must have at least one non-recursive constructor (in order to give the recursion a base case).
- The size of a recursive type must be finite; in other words the recursive fields of the type must be pointer types.
- Recursive type definitions can cross module boundaries, but not module visibility boundaries, or crate boundaries (in order to simplify the module system and type checker).
An example of a recursive type and its use:
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>)
}

let a: List<i32> = List::Cons(7, Box::new(List::Cons(13, Box::new(List::Nil))));
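To illustrate why the non-recursive constructor matters, the following sketch walks such a list. The function sum is a hypothetical helper, and Nil serves as the base case that ends the recursion.

```rust
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

// Hypothetical helper: adds up the elements by recursing through the boxes.
fn sum(list: &List<i32>) -> i32 {
    match list {
        // The non-recursive constructor terminates the recursion.
        List::Nil => 0,
        List::Cons(value, rest) => *value + sum(rest),
    }
}

fn main() {
    let a: List<i32> = List::Cons(7, Box::new(List::Cons(13, Box::new(List::Nil))));
    assert_eq!(sum(&a), 20);
    assert_eq!(sum(&List::Nil), 0);
}
```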
Pointer types
All pointers in Rust are explicit first-class values. They can be copied, stored into data structures, and returned from functions. There are two varieties of pointer in Rust:
- References (&): These point to memory owned by some other value. A reference type is written &type, or &'a type when you need to specify an explicit lifetime. Copying a reference is a "shallow" operation: it involves only copying the pointer itself. Releasing a reference has no effect on the value it points to, but a reference to a temporary value will keep it alive during the scope of the reference itself.
- Raw pointers (*): Raw pointers are pointers without safety or liveness guarantees. Raw pointers are written as *const T or *mut T; for example, *const i32 means a raw pointer to a 32-bit integer. Copying or dropping a raw pointer has no effect on the lifecycle of any other value. Dereferencing a raw pointer, including to convert it into a reference, is an unsafe operation. Raw pointers are generally discouraged in Rust code; they exist to support interoperability with foreign code, and writing performance-critical or low-level functions.
The standard library contains additional 'smart pointer' types beyond references and raw pointers.
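A minimal sketch of both pointer varieties follows. The helper bump_via_raw is a hypothetical name; it is sound only because the raw pointer is derived from a live exclusive reference immediately before use.

```rust
// Hypothetical helper: increments the target through a raw pointer and
// returns the new value.
fn bump_via_raw(value: &mut i32) -> i32 {
    // Creating a raw pointer from a reference is a safe coercion.
    let p: *mut i32 = value;
    // Dereferencing it is not, so it needs an unsafe block.
    unsafe {
        *p += 1;
        *p
    }
}

fn main() {
    let n = 41;
    let r1: &i32 = &n;
    let r2 = r1; // shallow copy: only the pointer is duplicated
    assert!(std::ptr::eq(r1, r2));
    assert_eq!(*r1, *r2);

    let mut m = 41;
    assert_eq!(bump_via_raw(&mut m), 42);
    assert_eq!(m, 42);
}
```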
Function item types
When referred to, a function item yields a zero-sized value of its function item type. That type explicitly identifies the function - its name, its type arguments, and its early-bound lifetime arguments (but not its late-bound lifetime arguments, which are only assigned when the function is called) - so the value does not need to contain an actual function pointer, and no indirection is needed when the function is called.
There is currently no syntax that directly refers to a function item type, but
the compiler will display the type as something like fn() {foo::<u32>} in error
messages.
Because the function item type explicitly identifies the function, the item types of different functions - different items, or the same item with different generics - are distinct, and mixing them will create a type error:
fn foo<T>() { }
let x = &mut foo::<i32>;
*x = foo::<u32>; //~ ERROR mismatched types
However, there is a coercion from function items to function pointers
with the same signature, which is triggered not only when a function item
is used when a function pointer is directly expected, but also when different
function item types with the same signature meet in different arms of the same
if or match:
let want_i32 = false;
fn foo<T>() { }

// `foo_ptr_1` has function pointer type `fn()` here
let foo_ptr_1: fn() = foo::<i32>;

// ... and so does `foo_ptr_2` - this type-checks.
let foo_ptr_2 = if want_i32 {
    foo::<i32>
} else {
    foo::<u32>
};
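The difference between a zero-sized function item and a function pointer can be observed with size_of_val. This is a sketch: item_and_pointer_sizes is a hypothetical helper, and the comparison with usize is an assumption that holds on common targets rather than a language guarantee.

```rust
use std::mem::{size_of, size_of_val};

fn foo<T>() {}

// Hypothetical helper: sizes of a function item value and of a function
// pointer obtained from the same item by coercion.
fn item_and_pointer_sizes() -> (usize, usize) {
    let item = foo::<i32>; // function item type, identifies foo::<i32>
    let pointer: fn() = foo::<i32>; // coerced to a function pointer
    (size_of_val(&item), size_of_val(&pointer))
}

fn main() {
    let (item_size, pointer_size) = item_and_pointer_sizes();
    // The item carries no data: the function is known from the type alone.
    assert_eq!(item_size, 0);
    // Assumption: on common targets a function pointer is as wide as usize.
    assert_eq!(pointer_size, size_of::<usize>());
}
```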
Function pointer types
Function pointer types, created using the fn type constructor, refer
to a function whose identity is not necessarily known at compile-time. They
can be created via a coercion from both function items
and non-capturing closures.
A function pointer type consists of a possibly-empty set of function-type
modifiers (such as unsafe or extern), a sequence of input types and an
output type.
An example of a fn type:
fn add(x: i32, y: i32) -> i32 {
    x + y
}

let mut x = add(5,7);

type Binop = fn(i32, i32) -> i32;
let bo: Binop = add;
x = bo(5,7);
Closure types
A closure expression produces a closure value with a unique, anonymous type that cannot be written out.
Depending on the requirements of the closure, its type implements one or more of the closure traits:
- FnOnce: The closure can be called once. A closure called as FnOnce can move out values from its environment.
- FnMut: The closure can be called multiple times as mutable. A closure called as FnMut can mutate values from its environment. FnMut inherits from FnOnce (i.e. anything implementing FnMut also implements FnOnce).
- Fn: The closure can be called multiple times through a shared reference. A closure called as Fn can neither move out from nor mutate values from its environment, but read-only access to such values is allowed. Fn inherits from FnMut, which itself inherits from FnOnce.
Closures that don't use anything from their environment ("non-capturing closures")
can be coerced to function pointers (fn) with the matching signature.
To adapt the example from the section above:
let add = |x, y| x + y;

let mut x = add(5,7);

type Binop = fn(i32, i32) -> i32;
let bo: Binop = add;
x = bo(5,7);
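The three closure traits can be contrasted in one sketch. The functions call_once, call_mut, call_fn and demo are hypothetical names; each accepts only closures that satisfy the named trait bound.

```rust
// Hypothetical helpers, one per closure trait.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn call_mut<F: FnMut() -> u32>(mut f: F) -> u32 {
    f();
    f()
}

fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn demo() -> (String, u32, usize) {
    // Moves a captured value out when called, so it is FnOnce only.
    let owned = String::from("moved");
    let once = move || owned;

    // Mutates a captured variable, so it is FnMut (and FnOnce) but not Fn.
    let mut counter = 0;
    let bump = || {
        counter += 1;
        counter
    };

    // Only reads its environment, so it implements Fn, FnMut and FnOnce.
    let text = String::from("abc");
    let len = || text.len();

    (call_once(once), call_mut(bump), call_fn(len))
}

fn main() {
    assert_eq!(demo(), (String::from("moved"), 2, 6));
}
```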
Trait objects
In Rust, a type like &SomeTrait or Box<SomeTrait> is called a trait object.
Each instance of a trait object includes:
- a pointer to an instance of a type T that implements SomeTrait
- a virtual method table, often just called a vtable, which contains, for each method of SomeTrait that T implements, a pointer to T's implementation (i.e. a function pointer).
The purpose of trait objects is to permit "late binding" of methods. Calling a method on a trait object results in virtual dispatch at runtime: that is, a function pointer is loaded from the trait object vtable and invoked indirectly. The actual implementation for each vtable entry can vary on an object-by-object basis.
Note that for a trait object to be instantiated, the trait must be object-safe. Object safety rules are defined in RFC 255.
Given a pointer-typed expression E of type &T or Box<T>, where T
implements trait R, casting E to the corresponding pointer type &R or
Box<R> results in a value of the trait object R. This result is
represented as a pair of pointers: the vtable pointer for the T
implementation of R, and the pointer value of E.
An example of a trait object:
trait Printable {
    fn stringify(&self) -> String;
}

impl Printable for i32 {
    fn stringify(&self) -> String { self.to_string() }
}

fn print(a: Box<Printable>) {
    println!("{}", a.stringify());
}

fn main() {
    print(Box::new(10) as Box<Printable>);
}
In this example, the trait Printable occurs as a trait object in both the
type signature of print, and the cast expression in main.
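The pair-of-pointers representation and per-object dispatch can be sketched as follows. Trait objects are spelled here with the dyn keyword, which current editions of Rust require. The impl for bool and the helper stringify_all are hypothetical additions for this illustration.

```rust
use std::mem::size_of;

trait Printable {
    fn stringify(&self) -> String;
}

impl Printable for i32 {
    fn stringify(&self) -> String {
        self.to_string()
    }
}

// Hypothetical second implementation, so the two objects use different vtables.
impl Printable for bool {
    fn stringify(&self) -> String {
        if *self { "yes".to_string() } else { "no".to_string() }
    }
}

// Hypothetical helper: each call is dispatched through the object's vtable.
fn stringify_all(items: &[Box<dyn Printable>]) -> Vec<String> {
    items.iter().map(|item| item.stringify()).collect()
}

fn main() {
    let items: Vec<Box<dyn Printable>> = vec![Box::new(10_i32), Box::new(true)];
    assert_eq!(stringify_all(&items), vec!["10", "yes"]);

    // A trait object pointer is a data pointer plus a vtable pointer.
    assert_eq!(size_of::<&dyn Printable>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Printable>>(), 2 * size_of::<usize>());
}
```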
Since a trait object can contain references, the lifetimes of those references need to be expressed as part of the trait object. The assumed lifetime of references held by a trait object is called its default object lifetime bound. These were defined in RFC 599 and amended in RFC 1156.
For traits that themselves have no lifetime parameters, the default bound is based on what kind of trait object is used:
// For the following trait...
trait Foo { }
// ...these two are the same:
Box<Foo>
Box<Foo + 'static>
// ...and so are these:
&'a Foo
&'a (Foo + 'a)
The + 'static and + 'a refer to the default bounds of those kinds of trait
objects, and also to how you can directly override them. Note that the innermost
object sets the bound, so &'a Box<Foo> is still &'a Box<Foo + 'static>.
For traits that have lifetime parameters of their own, the default bound is based on that lifetime parameter:
// For the following trait...
trait Bar<'a>: 'a { }
// ...these two are the same:
Box<Bar<'a>>
Box<Bar<'a> + 'a>
The default for user-defined trait objects is based on the object type itself.
If a type parameter has a lifetime bound, then that lifetime bound becomes the
default bound for trait objects of that type. For example, std::cell::Ref<'a, T>
contains a T: 'a bound, therefore trait objects of type Ref<'a, SomeTrait>
are the same as Ref<'a, (SomeTrait + 'a)>.
Type parameters
Within the body of an item that has type parameter declarations, the names of its type parameters are types:
fn to_vec<A: Clone>(xs: &[A]) -> Vec<A> {
    if xs.is_empty() {
        return vec![];
    }
    let first: A = xs[0].clone();
    let mut rest: Vec<A> = to_vec(&xs[1..]);
    rest.insert(0, first);
    rest
}
Here, first has type A, referring to to_vec's A type parameter; and rest
has type Vec<A>, a vector with element type A.
Self types
The special type Self has a meaning within traits and impls. In a trait
definition, it refers to an implicit type parameter representing the
"implementing" type. In an impl, it is an alias for the implementing type.
For example, in:
pub trait From<T> {
    // The parameter is named because anonymous trait method parameters
    // are rejected from the 2018 edition onward.
    fn from(value: T) -> Self;
}

impl From<i32> for String {
    fn from(x: i32) -> Self {
        x.to_string()
    }
}
The notation Self in the impl refers to the implementing type: String. In another
example:
trait Printable {
    fn make_string(&self) -> String;
}

impl Printable for String {
    fn make_string(&self) -> String {
        (*self).clone()
    }
}
The notation &self is a shorthand for self: &Self. In this case,
in the impl, Self is the type String, and self refers to the borrowed String
value that is the receiver for a call to the method make_string.