Puya Arrays
- Status: Decided
- Owner: Daniel McGregor (MakerX)
- Deciders: Neil Campbell (MakerX), Daniel McGregor (MakerX), Tristan Menzel (MakerX), Larkin Young (Algorand Foundation), David Rojas (Algorand Foundation), Brian Whippo (Algorand Foundation), Joe Polny (Algorand Foundation), Giorgio Ciotti (Algorand Foundation)
- Date created: 2024-12-10
- Date decided: 2024-12-19
- Date updated: 2024-12-19
Context
Section titled “Context”Arrays are a fundamental primitive in a high level languages and as such broad support is desirable, even in a constrained environment like the AVM. Currently Algorand Python only supports ARC-4 encoded arrays that have been designed to be efficient for reading, letting clients interacting with smart contracts manage the additional work required to ARC-4 encode the values. Given they are not efficient for creating or manipulating values, we have room to improve and introduce a more general purpose array.
In addition to how an array is encoded, the other facet to consider is where the array is stored in memory. The AVM supports two different types of memory which can be leveraged when writing a program: the stack and scratch space. These concepts are akin to a stack and heap found in more traditional virtual machine environments like the JVM.
The AVM memory types have some constraints that need to be considered. The stack can hold up to 1000 values, but only 256 are addressable at any point. Scratch slots support up to 256 addressable values. Both support values types of either a single uint64 or a byte slice of up to 4096 bytes. The stack is addressed using relative static offsets whereas scratch slots are addressed using dynamic absolute indexes. The stack is a good fit for value based semantics as each value is a copy, whereas scratch slots are a good fit for reference based semantics as each value can be uniquely addressed.
In our target high level languages (Python and JavaScript), the builtin arrays (or lists) types have reference based semantics and mutable API’s. Meaning that when an array is modified, all references to that array see the modification.
The developer experience with Puya Arrays will depend on the API design (mutable vs immutable) and where those arrays reside in memory (stack vs scratch) this ADR will outline the pros and cons of some different approaches with these facets in mind.
Requirements
Section titled “Requirements”- Provide a more efficient array implementation for general purpose use
- Provide an API that communicates clearly the effect of operations and is semantically correct
- Support a broad range of types (including non ARC-4 types)
Principles
Section titled “Principles”- AlgoKit Guiding Principles - specifically Seamless onramp, Meet developers where they are, Leverage existing ecosystem
- Algorand Python Principles - specifically Least surprise and Leverage existing ecosystem by maintaining semantic compatibility.
- Algorand TypeScript Guiding Principles - specifically TEALScript compatibility and minimise Algorand Python disparity.
Options
Section titled “Options”Option 1. Support mutable array and element API with a stack based implementation (e.g. ARC4 style)
Section titled “Option 1. Support mutable array and element API with a stack based implementation (e.g. ARC4 style)”This option introduces a mutable array API, which is familiar and would closely align with native collection types in all currently supported languages (arrays in TypeScript, lists in Python). A stack based implementation is relatively simple to implement in the AVM as it is a stack based virtual machine. Additionally the existing arc4 arrays are stack based and we’d likely repurpose a large portion of the existing functionality.
However a mutable API implies reference semantics, but a stack based implementation is semantically incompatible with this, so various restrictions are required to make the API semantically compatible.
Supported Operations
const arr = new Array<uint64>();// add elementsarr.push(42);arr.push(123);// replace elementarr[0] = 123;// remove elementarr.pop();// read elementconst value = arr[0];// pass to subroutine without .copy() - additional complexity in compiler to support thissomeFunction(arr);
// awkward consequences// forced copies - enforcing this is complex and has been a source of bugsconst arr2 = arr.copy();structs.push(someStruct.copy());Unsupported Operations
const structs = new Array<MutableStruct>();
//unsupported - introducing additional references without a .copy()const arr2 = arr;structs.push(someStruct);structs[0] = someStruct;
// passing an array multiple timessomeFunction(arr, arr);Pros
- A mutable API supports both mutable and immutable elements and arrays have an API that is similar to the native collection API
- Can repurpose the existing ARC-4 arrays and extend to support non ARC-4 values
Cons
- There is a risk of subtle unknown bugs due to the incompatibility of the API and the implementation
- Requires
.copy()whenever introducing additional references to a mutable array or struct - Requires additional checks in the compiler to detect when
.copy()is needed - To reduce awkwardness, we’d want to reduce the need for
.copy()where possible i.e. implicitly updating array values when passed to a subroutine - The total array memory usage is limited to 4096 bytes
Option 2. Support immutable array and element API with a stack based implementation
Section titled “Option 2. Support immutable array and element API with a stack based implementation”This option introduces an immutable array API, which may be less familiar to developers, but is not uncommon (e.g. Redux), however it is a more natural fit for a stack based implementation. Arrays typically have reference semantics, however with an immutable API there isn’t an observable difference between an immutable reference based array and an immutable value based array.
Supported Operations
let arr = new Array<uint64>();
// add elementsarr = arr.concat(42, 123);// replace elementarr = arr.with(0, 123);// remove elementsarr = arr.slice(0, 1);// read elementconst value = arr[index];// pass to subroutine, does not require additional complexity to support thissomeReadonlyFunction(arr);// for a subroutine to return a modified array, it must be an explicitly returned valuearr = someFunctionThatMutates(arr);
let arr2 = new Array<ImmutableStruct>();arr2 = arr2.concat(someStruct);// mutating elements requires creating a copy of the new element and the arrayarr2 = arr2.with(0, someStruct.evolve((someProp = "someValue")));Unsupported Operations
// unsupported operationsarr.push(...)arr.pop(...)arr.splice(...)arr2[0].someProp = "someValue"
// mutatating an array without returning the modified copysomeFunctionThatMutates(arr)Pros
- The API is semantically compatible with the implementation making it easy to reason about
- It won’t require
.copy()or other “tricks” to ensure semantic compatibility
Cons
- Will result in more verbose code as each modification requires reassignment
- Is less familiar and more verbose for those used to mutable array and element API’s
- The total array memory usage is limited to 4096 bytes
Option 3. Support mutable array with immutable element API with a scratch based implementation
Section titled “Option 3. Support mutable array with immutable element API with a scratch based implementation”This option introduces a scratch based array, which is a better fit for a mutable API as it has reference semantics, which makes for a more familiar developer experience. Because of the constraints that exist with scratch slots, supporting mutable elements could result in exhausting available resources in unpredictable ways. As a result only immutable elements shoud be supported.
Supported Operations
const arr = new Array<uint64>();// add elementsarr.push(42);arr.push(123);// replace elementarr[0] = 123;// remove elementarr.pop();// read elementconst value = arr[0];
const structs = new Array<ImmutableStruct>();// reassigning still points to the same arrayconst sameStructs = structs;// can modify arraystructs.push(someStruct);// can replace elementsstructs[0] = someStruct.evolve((someProp = "someValue"));Unsupported Operations
//unsupported mutable typesconst structs = new Array<MutableStruct>();//modifications to elementsstructs[0].someProp = "someValue";Pros
- A familiar API
- Won’t require
.copy()or other “tricks” to ensure semantic compatibility - Has predicable scratch slot usage
Cons
- Is restricted to immutable elements
- Still has potential to exhaust scratch slot allocations when a large number of arrays are used
- The total array memory usage is limited to 4096 bytes
Option 4. Support mutable array and element API with a scratch based implementation
Section titled “Option 4. Support mutable array and element API with a scratch based implementation”This option is similar to option 3, but expands support to include mutable elements. In this implementation each mutable element would reside in it’s own slot so that multiple references can all address it independently.
Supported Operations
const arr = new Array<uint64>()// add elementsarr.push(42)arr.push(123)// replace elementarr[0] = 123// remove elementarr.pop()// read elementconst value = arr[0]
const structs = new Array<MutableStruct>()structs.push(someStruct)// reassigning still points to the same arrayconst structs2 = structs// modifying an element updates all referenciesstructs2[0].someProp = "someValue"assert structs[0].someProp == structs2[0].someProp// can pass the same reference multiple times// any mutations to either reference will be seen in calling functionsomeFunctionThatMutates(structs, structs2)Pros
- A familiar and fully supported API
- An array of mutable elements is not limited to 4096 bytes, however each element would still have this limit
Cons
- Easily possible to exhaust scratch slots quickly when working with an array of mutable elements as each element would require it’s own slot
- It’s the most complex option to fully implement
Preferred option
Section titled “Preferred option”In an ideal world being able to fully support option 4 would be great, however due to the constrained nature of the AVM it’s not really feasible. The other options on their own will work, however there is some differences to what developers would be used to. As a result we feel that a combination of both option 2 and option 3 would give developers the familiarity and control they would expect. If the front-end language authors choose to leverage a native collection type, having both options available also allows them to choose which implementation makes the most sense.
Selected option
Section titled “Selected option”We have chosen to implement option 2 and option 3 in the form of ImmutableArray and MutableArray variants. The exact naming and API design will be fleshed out as support is built into the Algorand Python and Algorand TypeScript languages.