ARC4 types
Architecture Decision Record - ARC4 types
- Status: Draft
- Owner: Tristan Menzel
- Deciders: Alessandro Cappellato (Algorand Foundation), Bruno Martins (Algorand Foundation), Rob Moore (MakerX)
- Date created: 2024-06-06
- Date decided: N/A
- Date updated: 2024-06-06
Context
ARC4 describes a number of types which can be used when designing the public API of your smart contract. It also prescribes how these types should be encoded in binary when included in a transaction. The ARC describes many more types than are available on the AVM, which is limited to just uint64 and byte[]. The encoding is optimised for read performance. Writes to statically sized data are also efficient; however, updating dynamically sized data (variable-length arrays, strings, etc.) is more costly, particularly for variable-length data nested in compound types.
There are several types, such as the variously sized uints between 8 and 512 bits, which don’t make a lot of sense to implement natively in Algorand TypeScript. Using a Uint8 over a Uint64 might make sense on an externally facing method because it decreases the number of bytes in the transaction payload, but there are no efficiencies to be found on the AVM. Instead, operations become more expensive, as an overflow check is needed after each operation to remain semantically correct.
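To make the encoding concrete, the following is a minimal sketch of ARC4-style encoders in plain TypeScript. It is not the Algorand TypeScript API: `encodeUintN`, `encodeString`, `encodeTuple` and `toHex` are hypothetical helper names chosen for illustration. It relies only on the ARC4 rules that a uintN is a big-endian integer of N/8 bytes, a string is a uint16 byte length followed by UTF-8 bytes, and a tuple places static values and 2-byte offsets in a head section with dynamic data in a tail section.

```typescript
// Illustrative ARC4 encoders in plain TypeScript (hypothetical helpers, not the Algorand TypeScript API).

/** Encode an unsigned integer as a big-endian value `bits` wide (ARC4 uintN). */
function encodeUintN(value: bigint, bits: number): Uint8Array {
  if (bits % 8 !== 0 || bits < 8 || bits > 512) throw new Error(`invalid bit size ${bits}`)
  if (value < 0n || value >= 1n << BigInt(bits)) throw new Error(`${value} does not fit in uint${bits}`)
  const out = new Uint8Array(bits / 8)
  let rest = value
  // Fill from the least significant byte (last position) backwards.
  for (let i = out.length - 1; i >= 0; i--) {
    out[i] = Number(rest & 0xffn)
    rest >>= 8n
  }
  return out
}

/** Encode an ARC4 string: uint16 byte length followed by the UTF-8 bytes. */
function encodeString(text: string): Uint8Array {
  const utf8 = new TextEncoder().encode(text)
  const out = new Uint8Array(2 + utf8.length)
  out.set(encodeUintN(BigInt(utf8.length), 16), 0)
  out.set(utf8, 2)
  return out
}

/** Encode the tuple (uint8, string, uint64): head holds static values and offsets, tail holds dynamic data. */
function encodeTuple(a: bigint, s: string, b: bigint): Uint8Array {
  const headLength = 1 + 2 + 8 // uint8 + 2-byte offset to the string + uint64
  const tail = encodeString(s)
  const out = new Uint8Array(headLength + tail.length)
  out.set(encodeUintN(a, 8), 0)
  out.set(encodeUintN(BigInt(headLength), 16), 1) // offset measured from the start of the tuple
  out.set(encodeUintN(b, 64), 3)
  out.set(tail, headLength)
  return out
}

/** Render bytes as lowercase hex for inspection. */
const toHex = (bytes: Uint8Array): string =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')

console.log(toHex(encodeTuple(1n, 'hi', 2n))) // 01000b000000000000000200026869
```

Reading the uint64 is a fixed-position slice of the head, which is why reads are cheap. Replacing `'hi'` with a longer string changes the length of the tail, and in a tuple with several dynamic elements every later offset would have to be rewritten as well, which is where the write cost for nested dynamic data comes from.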
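The extra cost of a narrow integer can be sketched as follows. This is plain TypeScript, with `addUint8` and `assertUint8` as hypothetical names; it only illustrates that every operation on an emulated uint8 needs a range check on top of the underlying wide addition, and says nothing about how a compiler would actually emit it.

```typescript
// Hypothetical helpers emulating uint8 arithmetic on top of a wider integer type.
const UINT8_MAX = 255

/** Reject values that are not valid uint8 inputs. */
function assertUint8(value: number): void {
  if (!Number.isInteger(value) || value < 0 || value > UINT8_MAX) {
    throw new Error(`${value} is not a valid uint8`)
  }
}

/** Add two uint8 values, failing on overflow to preserve uint8 semantics. */
function addUint8(a: number, b: number): number {
  assertUint8(a)
  assertUint8(b)
  const sum = a + b // the wide addition, analogous to a single uint64 add
  if (sum > UINT8_MAX) {
    // The extra step a uint64 add would not need.
    throw new Error(`uint8 overflow: ${a} + ${b} = ${sum}`)
  }
  return sum
}

console.log(addUint8(200, 55)) // 255
```

Calling `addUint8(200, 56)` throws, whereas the same addition on a 64-bit value would simply succeed.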
Requirements
- It must be possible to define a smart contract that uses any of the types defined by ARC4
Principles
- AlgoKit Guiding Principles - specifically Seamless onramp, Leverage existing ecosystem, Meet devs where they are
- Algorand Python Principles
- Algorand TypeScript Guiding Principles
Options
Option 1 - Implement only ARC4 types
Algorand TypeScript defines all types specified by ARC4, and their in-memory representation matches the ARC4 encoding.
Pros:
- Zero cost decoding of ABI args (other than packed args for methods using more than 15 arguments)
- When storing a value in global/local/box storage, the binary format is already well-defined by ARC4
Cons:
- Contracts that frequently mutate dynamic structs will consume a large opcode budget
- Math operators that work on byte arrays are expensive (10x), so smaller (<= 64-bit) uints would need to be converted to uint64 for math, then back again.
Option 2 - Implement only ‘native’ types
Algorand TypeScript defines all types specified by ARC4 but chooses its own in-memory representation. Values are encoded/decoded automatically at applicable boundaries.
This is the approach taken by TealScript.
Pros:
- In-memory representation can be optimised
Cons:
- In some scenarios a value may be unnecessarily decoded and re-encoded when it could have been passed through untouched.
- A number of types defined by ARC4 do not make sense to implement natively (e.g. UFixedNxM) as the AVM has no support for decimal math.
Option 3 - Implement ‘native’ types which make sense + implement a full set of ARC4 types
Algorand TypeScript defines native types which make sense and excludes ones which don’t (e.g. UFixedNxM), but also implements all ARC4 types in a separate module. It’s possible to convert between the two where such a conversion exists. Native types can have optimised in-memory representations whilst ARC4 types follow the prescribed encoding.
This is the approach taken by Algorand Python.
Pros:
- Zero-cost decoding of ABI args is available as an option (by opting to use the ARC4 type)
- “native” types can be used in ABI args and the compiler will automatically decode them
- ARC4 types can optionally be used as a serialization format for box/global/local storage
- Developer has control to choose the solution which works best for their scenario rather than having one prescribed by the compiler
Cons:
- Several types will exist twice (string/uint64/bool etc.) and this could be confusing to new developers who aren’t fully familiar with ARC4
- Developers will need to be aware of ARC4 encoding/decoding and its implications.
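The trade-off between the two families of types can be sketched with a small wrapper class. This is a self-contained illustration under stated assumptions, not the Algorand TypeScript API: `Arc4Str`, `fromNative`, `fromBytes` and `native` are hypothetical names. The wrapper keeps the ARC4-encoded bytes as its only state, so wrapping already-encoded data costs nothing, while converting to or from the native string is an explicit step.

```typescript
// Hypothetical sketch of an ARC4-encoded string type alongside the native `string`.
class Arc4Str {
  /** The ARC4 encoding: uint16 byte length followed by UTF-8 bytes. */
  readonly bytes: Uint8Array

  private constructor(bytes: Uint8Array) {
    this.bytes = bytes
  }

  /** Encode a native string (the cost paid when crossing from native to ARC4). */
  static fromNative(value: string): Arc4Str {
    const utf8 = new TextEncoder().encode(value)
    if (utf8.length > 0xffff) throw new Error('string too long for a uint16 length prefix')
    const bytes = new Uint8Array(2 + utf8.length)
    bytes[0] = utf8.length >> 8 // high byte of the length prefix
    bytes[1] = utf8.length & 0xff // low byte of the length prefix
    bytes.set(utf8, 2)
    return new Arc4Str(bytes)
  }

  /** Wrap already-encoded bytes without decoding or validating them (the pass-through path). */
  static fromBytes(bytes: Uint8Array): Arc4Str {
    return new Arc4Str(bytes)
  }

  /** Decode to a native string (the cost paid when crossing from ARC4 to native). */
  get native(): string {
    return new TextDecoder().decode(this.bytes.subarray(2))
  }
}

// An encoded ABI argument can be stored or forwarded as-is...
const arg = Arc4Str.fromBytes(new Uint8Array([0, 2, 104, 105]))
// ...and decoded only if the contract logic actually needs the value.
console.log(arg.native) // hi
```

In this sketch a value that is only passed through never pays for decoding, which is the scenario the ARC4 types are meant to serve; code that manipulates the value would typically call `native` once and work with the native type from there.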
Preferred option
Option 1 can be excluded as it doesn’t allow us to optimise the in-memory representation of types independently of the one prescribed by the ARC4 encoding spec.
Option 2 is feasible; however, Puya would need to implement math operators on sized uints smaller than 64 bits, and that effort would largely be wasted as this math would be more expensive than 64-bit math. TealScript does not face this problem as it does not attempt to maintain semantic compatibility. This option also does not allow power users to deal directly with encoded data.
The roadmap for Puya includes introducing an optimised in-memory representation of arrays and structs, making use of scratch space as a heap. This will greatly improve the UX and performance of mutating arrays/structs (no more .copy() or .clone() to work around ‘pass by value’ semantics); however, the cost of decoding an ARC4 structure into a ‘native’ representation will not be inconsequential. As such, the option to avoid this decoding step is desirable (e.g. when working with an Address, i.e. StaticArray<Byte, 32>, where mutation is not required).
Option 3 gives us this control, with the caveat of introducing confusion between two types with the same or similar names. Algorand TypeScript can alleviate some of this pain by exposing the ARC4 types through a namespace that requires them to be referenced via arc4.TypeName. JSDoc on the different types can also explain the differences further, with a suggestion to use the ‘native’ version unless there’s good reason to use the encoded version. This option also aligns with Algorand Python, which reduces complexity in the Puya compiler and makes it easier to share documentation/tutorials between the two languages.
For the reasons listed above, Option 3 is the preferred option.
Selected option
Option 3 has been selected.