Data Encoding
Introduction
The Polkadot SDK uses a lightweight and efficient encoding/decoding mechanism to optimize data transmission across the network. This mechanism, known as the SCALE codec, is used for serializing and deserializing data.
The SCALE codec enables communication between the runtime and the outer node. This mechanism is designed for high-performance, copy-free data encoding and decoding in resource-constrained environments like the Polkadot SDK Wasm runtime.
It is not self-describing, meaning the decoding context must fully know the encoded data types.
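To illustrate what "not self-describing" means in practice, the sketch below reads the same two bytes under two different type assumptions. It uses only the Rust standard library; decode_as_u16 and decode_as_u8_bool are hypothetical helper names written for this illustration, not part of parity-scale-codec. They follow the fixed-width integer and boolean rules described in the data types table on this page.

```rust
/// Interprets two bytes as a fixed-width u16 (little-endian).
fn decode_as_u16(bytes: [u8; 2]) -> u16 {
    u16::from_le_bytes(bytes)
}

/// Interprets the same two bytes as a (u8, bool) tuple:
/// the first byte is the u8, the second must be 0x00 or 0x01.
fn decode_as_u8_bool(bytes: [u8; 2]) -> Option<(u8, bool)> {
    let flag = match bytes[1] {
        0x00 => false,
        0x01 => true,
        _ => return None, // not a valid boolean byte
    };
    Some((bytes[0], flag))
}
```

Given the bytes 0x2a00, the first helper yields 42 and the second yields (42, false). Nothing in the bytes says which reading is intended, so the encoder and decoder must agree on the type in advance.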
Parity's libraries use the parity-scale-codec crate (a Rust implementation of the SCALE codec) to handle encoding and decoding for interactions between RPCs and the runtime.
The codec mechanism is ideal for Polkadot SDK-based chains because:
- It is lightweight compared to generic serialization frameworks like serde, which add unnecessary bulk to binaries.
- It doesn’t rely on Rust’s libstd, making it compatible with no_std environments like the Wasm runtime.
- It integrates seamlessly with Rust, allowing easy derivation of encoding and decoding logic for new types using #[derive(Encode, Decode)].
Defining a custom encoding scheme for Polkadot SDK-based chains, rather than using an existing Rust codec library, is crucial for enabling cross-platform and multi-language support.
SCALE Codec
The codec is implemented using the following traits:
Encode
The Encode trait handles data encoding into SCALE format and includes the following key functions:
- size_hint(&self) -> usize: Estimates the number of bytes required for encoding to prevent multiple memory allocations. This should be inexpensive and avoid complex operations. Optional if the size isn’t known.
- encode_to<T: Output>(&self, dest: &mut T): Encodes the data, appending it to a destination buffer.
- encode(&self) -> Vec<u8>: Encodes the data and returns it as a byte vector.
- using_encoded<R, F: FnOnce(&[u8]) -> R>(&self, f: F) -> R: Encodes the data and passes it to a closure, returning the result.
- encoded_size(&self) -> usize: Calculates the encoded size. Should be used when the encoded data isn’t required.
Tip
For best performance, value types should override using_encoded, and allocating types should override encode_to. It's recommended to implement size_hint for all types where possible.
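The relationship between the Encode functions listed above can be sketched with a simplified stand-in trait. The block below is not the parity-scale-codec definition: SimpleEncode and Point are hypothetical names, the destination is fixed to Vec<u8> instead of a generic Output, and only the standard library is used. It shows one way encode and using_encoded can be built on top of encode_to and size_hint.

```rust
/// Simplified stand-in for the Encode trait (assumption: Vec<u8> replaces
/// the generic Output destination used by the real trait).
trait SimpleEncode {
    /// Cheap estimate of the encoded length, used to pre-allocate.
    fn size_hint(&self) -> usize {
        0
    }

    /// Appends the encoding of `self` to `dest`.
    fn encode_to(&self, dest: &mut Vec<u8>);

    /// Encodes into a fresh byte vector, pre-sized from `size_hint`.
    fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(self.size_hint());
        self.encode_to(&mut out);
        out
    }

    /// Encodes and hands the bytes to a closure, returning its result.
    fn using_encoded<R, F: FnOnce(&[u8]) -> R>(&self, f: F) -> R {
        let bytes = self.encode();
        f(bytes.as_slice())
    }
}

impl SimpleEncode for u32 {
    fn size_hint(&self) -> usize {
        4
    }
    fn encode_to(&self, dest: &mut Vec<u8>) {
        // Fixed-width integers are written little-endian.
        dest.extend_from_slice(&self.to_le_bytes());
    }
}

impl SimpleEncode for bool {
    fn size_hint(&self) -> usize {
        1
    }
    fn encode_to(&self, dest: &mut Vec<u8>) {
        dest.push(*self as u8);
    }
}

/// Hypothetical struct: fields are encoded in declaration order.
struct Point {
    x: u32,
    visible: bool,
}

impl SimpleEncode for Point {
    fn size_hint(&self) -> usize {
        self.x.size_hint() + self.visible.size_hint()
    }
    fn encode_to(&self, dest: &mut Vec<u8>) {
        self.x.encode_to(dest);
        self.visible.encode_to(dest);
    }
}
```

With this sketch, a Point with x set to 16777215 and visible set to true encodes to the five bytes 0xffffff0001. Implementing only encode_to and size_hint is enough here because the other two functions have default bodies.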
Decode
The Decode trait handles decoding SCALE-encoded data back into the appropriate types:
- fn decode<I: Input>(value: &mut I) -> Result<Self, Error>: Decodes data from the SCALE format, returning an error if decoding fails.
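A minimal sketch of what decoding involves, using only the standard library: the input is a mutable byte-slice cursor that each helper advances, and running out of bytes produces an error. DecodeError, take, decode_u32, and decode_bool are hypothetical names for illustration and are not meant to mirror parity-scale-codec internals.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    NotEnoughData,
    InvalidBool(u8),
}

/// Splits `n` bytes off the front of the cursor, advancing it.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Result<&'a [u8], DecodeError> {
    if input.len() < n {
        return Err(DecodeError::NotEnoughData);
    }
    let (head, tail) = input.split_at(n);
    *input = tail;
    Ok(head)
}

/// Reads a fixed-width u32 (four little-endian bytes).
fn decode_u32(input: &mut &[u8]) -> Result<u32, DecodeError> {
    let bytes = take(input, 4)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Ok(u32::from_le_bytes(buf))
}

/// Reads a boolean: a single byte that must be 0x00 or 0x01.
fn decode_bool(input: &mut &[u8]) -> Result<bool, DecodeError> {
    match take(input, 1)?[0] {
        0 => Ok(false),
        1 => Ok(true),
        other => Err(DecodeError::InvalidBool(other)),
    }
}
```

Calling decode_u32 and then decode_bool on the bytes 0xffffff0001 yields 16777215 and true, after which the cursor is empty and a further read returns NotEnoughData. The caller chooses which helpers to call and in what order, which is where the type knowledge enters.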
CompactAs
The CompactAs trait wraps custom types for compact encoding:
- encode_as(&self) -> &Self::As: Encodes the type as a compact type.
- decode_from(_: Self::As) -> Result<Self, Error>: Decodes from a compact encoded type.
HasCompact
The HasCompact trait indicates a type supports compact encoding.
EncodeLike
The EncodeLike trait is used to ensure multiple types that encode similarly are accepted by the same function. When using derive, it is automatically implemented.
Data Types
The table below outlines how the Rust implementation of the Parity SCALE codec encodes different data types.
Type | Description | Example SCALE Decoded Value | SCALE Encoded Value |
---|---|---|---|
Boolean | Boolean values are encoded using the least significant bit of a single byte. | false / true | 0x00 / 0x01 |
Compact/general integers | A "compact" or general integer encoding is sufficient for encoding large integers (up to 2^536) and is more efficient at encoding most values than the fixed-width version. | unsigned integer 0 / unsigned integer 1 / unsigned integer 42 / unsigned integer 69 / unsigned integer 65535 / BigInt(100000000000000) | 0x00 / 0x04 / 0xa8 / 0x1501 / 0xfeff0300 / 0x0b00407a10f35a |
Enumerations (tagged-unions) | A fixed number of variants, each mutually exclusive and potentially implying a further value or series of values. Encoded as the first byte identifying the index of the variant that the value is. Any further bytes are used to encode any data that the variant implies. Thus, no more than 256 variants are supported. | Int(42) and Bool(true) where enum IntOrBool { Int(u8), Bool(bool) } | 0x002a and 0x0101 |
Fixed-width integers | Basic integers are encoded using a fixed-width little-endian (LE) format. | signed 8-bit integer 69 / unsigned 16-bit integer 42 / unsigned 32-bit integer 16777215 | 0x45 / 0x2a00 / 0xffffff00 |
Options | One or zero values of a particular type. | Some / None | 0x01 followed by the encoded value / 0x00 |
Results | Results are commonly used enumerations which indicate whether certain operations were successful or unsuccessful. | Ok(42) / Err(false) | 0x002a / 0x0100 |
Strings | Strings are Vectors of bytes (Vec<u8>) containing a valid UTF-8 sequence. | | |
Structs | For structures, the values are named, but that is irrelevant for the encoding (names are ignored - only order matters). | SortedVecAsc::from([3, 5, 2, 8]) | [3, 2, 5, 8] |
Tuples | A fixed-size series of values, each with a possibly different but predetermined and fixed type. This is simply the concatenation of each encoded value. | Tuple of compact unsigned integer and boolean: (3, false) | 0x0c00 |
Vectors (lists, series, sets) | A collection of same-typed values is encoded, prefixed with a compact encoding of the number of items, followed by each item's encoding concatenated in turn. | Vector of unsigned 16-bit integers: [4, 8, 15, 16, 23, 42] | 0x18040008000f00100017002a00 |
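The compact integer and vector rows above can be reproduced with a short hand-written sketch. It uses only the standard library; compact_encode and encode_u16_vec are hypothetical helper names, not functions from parity-scale-codec, and the sketch only covers values that fit in a u64 (the format itself goes further). The low two bits of the first byte select the mode: 0b00 for a single byte, 0b01 for two bytes, 0b10 for four bytes, and 0b11 for a length-prefixed big-integer form.

```rust
/// Hand-rolled compact encoding for values that fit in a u64.
fn compact_encode(value: u64) -> Vec<u8> {
    if value < (1 << 6) {
        // Single-byte mode: six value bits, mode bits 0b00.
        vec![(value as u8) << 2]
    } else if value < (1 << 14) {
        // Two-byte mode: little-endian u16, mode bits 0b01.
        (((value as u16) << 2) | 0b01).to_le_bytes().to_vec()
    } else if value < (1 << 30) {
        // Four-byte mode: little-endian u32, mode bits 0b10.
        (((value as u32) << 2) | 0b10).to_le_bytes().to_vec()
    } else {
        // Big-integer mode: the upper six bits of the first byte hold
        // (number of value bytes - 4); the value follows little-endian.
        let len = 8 - (value.leading_zeros() / 8) as usize;
        let mut out = vec![(((len - 4) as u8) << 2) | 0b11];
        out.extend_from_slice(&value.to_le_bytes()[..len]);
        out
    }
}

/// Encodes a slice of u16 as a vector: compact length prefix,
/// then each item as fixed-width little-endian.
fn encode_u16_vec(items: &[u16]) -> Vec<u8> {
    let mut out = compact_encode(items.len() as u64);
    for item in items {
        out.extend_from_slice(&item.to_le_bytes());
    }
    out
}
```

Under these assumptions, compact_encode(69) yields the bytes 0x1501 and encode_u16_vec(&[4, 8, 15, 16, 23, 42]) yields 0x18040008000f00100017002a00, matching the table.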
Encode and Decode Rust Trait Implementations
Here's how the Encode and Decode traits are derived and used:
// Requires the parity-scale-codec crate with its derive feature enabled.
use parity_scale_codec::{Decode, Encode};

#[derive(Debug, PartialEq, Encode, Decode)]
enum EnumType {
    // Explicit variant index: A encodes as 0x0f instead of 0x00.
    #[codec(index = 15)]
    A,
    B(u32, u64),
    C {
        a: u32,
        b: u64,
    },
}

fn main() {
    let a = EnumType::A;
    let b = EnumType::B(1, 2);
    let c = EnumType::C { a: 1, b: 2 };

    // Encoding: one index byte, then each field as a fixed-width little-endian integer.
    a.using_encoded(|ref slice| {
        assert_eq!(slice, &b"\x0f");
    });
    b.using_encoded(|ref slice| {
        assert_eq!(slice, &b"\x01\x01\0\0\0\x02\0\0\0\0\0\0\0");
    });
    c.using_encoded(|ref slice| {
        assert_eq!(slice, &b"\x02\x01\0\0\0\x02\0\0\0\0\0\0\0");
    });

    // Decoding: each byte string round-trips to the original variant.
    let mut da: &[u8] = b"\x0f";
    assert_eq!(EnumType::decode(&mut da).ok(), Some(a));
    let mut db: &[u8] = b"\x01\x01\0\0\0\x02\0\0\0\0\0\0\0";
    assert_eq!(EnumType::decode(&mut db).ok(), Some(b));
    let mut dc: &[u8] = b"\x02\x01\0\0\0\x02\0\0\0\0\0\0\0";
    assert_eq!(EnumType::decode(&mut dc).ok(), Some(c));

    // Index 0 is not assigned to any variant, so decoding fails.
    let mut dz: &[u8] = &[0];
    assert_eq!(EnumType::decode(&mut dz).ok(), None);
}
SCALE Codec Libraries
Several SCALE codec implementations are available in various languages. Here's a list of them:
- AssemblyScript: LimeChain/as-scale-codec
- C: MatthewDarnell/cScale
- C++: qdrvm/scale-codec-cpp
- JavaScript: polkadot-js/api
- Dart: leonardocustodio/polkadart
- Haskell: airalab/hs-web3
- Golang: itering/scale.go
- Java: splix/polkaj
- Python: polkascan/py-scale-codec
- Ruby: wuminzhe/scale_rb
- TypeScript: parity-scale-codec-ts, scale-ts, soramitsu/scale-codec-js-library, subsquid/scale-codec
Created: October 16, 2024