multicodec package

Submodules

multicodec.code module

Multicodec Code type and core functionality.

This module provides:

- Code class for type-safe codec handling
- ReservedStart and ReservedEnd constants
- known_codes() function to list all registered codes

class multicodec.code.Code(value: int)[source]

Bases: object

Code describes an integer reserved in the multicodec table.

This class provides:

- Type-safe codec handling
- String conversion (name lookup)
- Set from string (name or hex number)
- Tag lookup

Example:
>>> code = Code(0x12)
>>> str(code)
'sha2-256'
>>> code = Code.from_string("sha2-256")
>>> int(code)
18

classmethod from_string(text: str) Code[source]

Create a Code from a string, interpreting it as a multicodec name or number.

The input string can be the name or number for a known code. A number can be in decimal or hexadecimal format (with 0x prefix).

Numbers in the reserved range 0x300000-0x3FFFFF are also accepted.

Parameters:

text (str) – The codec name or number

Returns:

A Code instance

Return type:

Code

Raises:

ValueError – If the text is not a valid codec
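The parsing rules described for from_string can be sketched as follows. This is a minimal illustration, not the package's implementation: parse_code and the two-entry _NAME_TO_CODE table are hypothetical stand-ins for the full multicodec table, and rejecting unknown numbers outside the reserved range is an assumption based on the description above.

```python
# Hypothetical two-entry table; the real package ships the full multicodec table.
_NAME_TO_CODE = {"sha2-256": 0x12, "raw": 0x55}

# Reserved range quoted in this reference (bounds assumed inclusive).
RESERVED_START, RESERVED_END = 0x300000, 0x3FFFFF


def parse_code(text: str) -> int:
    """Resolve a codec name, a decimal number, or a 0x-prefixed hex number."""
    if text in _NAME_TO_CODE:
        return _NAME_TO_CODE[text]
    try:
        value = int(text, 0)  # base 0 accepts both "18" and "0x12"
    except ValueError:
        raise ValueError(f"not a valid codec: {text!r}") from None
    known = value in _NAME_TO_CODE.values()
    reserved = RESERVED_START <= value <= RESERVED_END
    if known or reserved:
        return value
    raise ValueError(f"not a valid codec: {text!r}")
```

With this sketch, "sha2-256", "18", and "0x12" all resolve to the same integer, and any other string raises ValueError.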

property name: str

Return the codec name, or ‘<unknown>’ if not found.

set(text: str) None[source]

Set the code from a string.

Parameters:

text (str) – The codec name or number

Raises:

ValueError – If the text is not a valid codec

tag() str[source]

Return the tag for this codec.

Tags categorize codecs (e.g., “multihash”, “multiaddr”, “ipld”).

multicodec.code.is_reserved(code: int | Code) bool[source]

Check if a code falls within the reserved range.

The reserved range (0x300000-0x3FFFFF) is designated for internal and experimental use.

Parameters:

code – The codec code to check

Returns:

True if the code is in the reserved range

Return type:

bool
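The reserved-range test amounts to a bounds comparison. The sketch below assumes both bounds are inclusive; in_reserved_range and the constant names RESERVED_START and RESERVED_END are illustrative, not the package's identifiers.

```python
RESERVED_START = 0x300000  # first code of the reserved range
RESERVED_END = 0x3FFFFF    # last code of the reserved range (assumed inclusive)


def in_reserved_range(code: int) -> bool:
    """Return True when the code lies inside the reserved range."""
    # int() lets the check accept any object that converts to an integer.
    return RESERVED_START <= int(code) <= RESERVED_END
```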

multicodec.code.known_codes() list[Code][source]

Return a list of all codes registered in the multicodec table.

The returned list should be treated as read-only.

Returns:

List of all known Code objects

Return type:

list[Code]

multicodec.code_table module

multicodec.constants module

multicodec.exceptions module

Exception classes for multicodec.

This module defines the exception hierarchy for multicodec operations, providing specific error types for different failure modes.

exception multicodec.exceptions.CodecError[source]

Bases: MulticodecError

Base exception for codec-related errors.

exception multicodec.exceptions.DecodeError[source]

Bases: CodecError

Raised when decoding fails.

exception multicodec.exceptions.EncodeError[source]

Bases: CodecError

Raised when encoding fails.

exception multicodec.exceptions.MulticodecError[source]

Bases: Exception

Base exception for all multicodec-related errors.

exception multicodec.exceptions.UnknownCodecError[source]

Bases: CodecError

Raised when an unknown codec is requested.

multicodec.multicodec module

multicodec.multicodec.add_prefix(multicodec: str, bytes_: bytes) bytes[source]

Adds multicodec prefix to the given bytes input

Parameters:
  • multicodec (str) – multicodec to use for prefixing

  • bytes_ (bytes) – data to prefix

Returns:

prefixed byte data

Return type:

bytes
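A multicodec prefix is the codec's integer code written as an unsigned varint, followed by the payload. The sketch below shows that encoding in isolation; encode_uvarint and prefix_with_code are hypothetical helpers, not functions exported by this package.

```python
def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint.

    Each byte carries 7 bits, least significant group first; the high bit
    is set on every byte except the last.
    """
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def prefix_with_code(code: int, payload: bytes) -> bytes:
    """Prepend the varint-encoded code to the payload."""
    return encode_uvarint(code) + payload
```

Codes below 0x80 fit in one byte, so sha2-256 (0x12) becomes the single byte 0x12, while json (0x0200) needs two bytes, 0x80 0x04.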

multicodec.multicodec.extract_prefix(bytes_: bytes) int[source]

Extracts the prefix from multicodec prefixed data

Parameters:

bytes_ (bytes) – multicodec prefixed data

Returns:

prefix for the prefixed data

Return type:

int

Raises:

ValueError – if an invalid varint is provided
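Reading the prefix back means decoding the leading unsigned varint. The following sketch, with the hypothetical helper decode_uvarint, returns both the code and the number of bytes it occupied, and raises ValueError when the input ends before the varint does. It illustrates the technique and is not the package's own code.

```python
def decode_uvarint(data: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed) for the varint at the start of data."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)  # 7 payload bits per byte
        if not byte & 0x80:                    # high bit clear: last byte
            return value, index + 1
    raise ValueError("truncated or empty varint")
```

The byte count tells the caller where the payload starts, which is what a prefix-removing function needs.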

multicodec.multicodec.get_codec(bytes_: bytes) str[source]

Gets the name of the codec used to prefix the multicodec-prefixed data

Parameters:

bytes_ (bytes) – multicodec prefixed data bytes

Returns:

name of the multicodec used to prefix

Return type:

str

multicodec.multicodec.get_prefix(multicodec: str) bytes[source]

Returns prefix for a given multicodec

Parameters:

multicodec (str) – multicodec codec name

Returns:

the prefix for the given multicodec

Return type:

bytes

Raises:

ValueError – if an invalid multicodec name is provided

multicodec.multicodec.is_codec(name: str) bool[source]

Check if the codec is a valid codec or not

Parameters:

name (str) – name of the codec

Returns:

True if the codec is valid, False otherwise

Return type:

bool

multicodec.multicodec.remove_prefix(bytes_: bytes) bytes[source]

Removes the prefix from multicodec-prefixed data

Parameters:

bytes_ (bytes) – multicodec prefixed data bytes

Returns:

data bytes with the prefix removed

Return type:

bytes

multicodec.serialization module

Serialization module for multicodec.

This module provides a codec interface for serializing and deserializing data with multicodec prefixes. It includes built-in codecs for common formats:

- JSON: Structured data serialization
- Raw: Pass-through codec for binary data

The design follows a similar pattern to js-multiformats and rust-multicodec, providing a clean interface for encoding/decoding operations.

Example usage:
>>> from multicodec.serialization import json_codec, raw_codec, encode, decode
>>> # Using JSON codec
>>> data = {"hello": "world"}
>>> encoded = json_codec.encode(data)
>>> decoded = json_codec.decode(encoded)
>>> assert decoded == data
>>>
>>> # Using the generic encode/decode with codec name
>>> encoded = encode("json", {"key": "value"})
>>> decoded = decode(encoded)

class multicodec.serialization.Codec[source]

Bases: ABC, Generic[T]

Abstract base class for multicodec serialization codecs.

A codec provides methods to encode data to bytes and decode bytes back to data. Each codec is identified by its multicodec name and code.

Subclasses must implement:

- name: The multicodec name (e.g., ‘json’, ‘raw’)
- code: The multicodec code (e.g., 0x0200 for json)
- _encode: Transform data to bytes (without prefix)
- _decode: Transform bytes to data (without prefix)

abstract property code: int

Return the multicodec code for this codec.

decode(data: bytes) T[source]

Decode multicodec-prefixed bytes to data.

Parameters:

data – Multicodec-prefixed bytes to decode

Returns:

Decoded data

Raises:

DecodeError – If decoding fails or the prefix does not match this codec

decode_raw(data: bytes) T[source]

Decode bytes without expecting a multicodec prefix.

Parameters:

data – Raw bytes to decode (no prefix)

Returns:

Decoded data

Raises:

DecodeError – If decoding fails

encode(data: T) bytes[source]

Encode data to bytes with multicodec prefix.

Parameters:

data – Data to encode

Returns:

Multicodec-prefixed encoded bytes

Raises:

EncodeError – If encoding fails

abstract property name: str

Return the multicodec name for this codec.
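The division of labour between the abstract members and the public encode/decode methods can be sketched with a self-contained stand-in. SketchCodec below mirrors the documented interface but is not the package's Codec class; TextCodec, its name "x-text", and its code 0x300001 (taken from the reserved range) are hypothetical, and ValueError stands in for the documented DecodeError.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")


def _uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


class SketchCodec(ABC, Generic[T]):
    """Stand-in for the documented base class (illustration only)."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def code(self) -> int: ...

    @abstractmethod
    def _encode(self, data: T) -> bytes: ...

    @abstractmethod
    def _decode(self, data: bytes) -> T: ...

    def encode(self, data: T) -> bytes:
        # The base class adds the prefix; subclasses only produce the payload.
        return _uvarint(self.code) + self._encode(data)

    def decode(self, data: bytes) -> T:
        prefix = _uvarint(self.code)
        if not data.startswith(prefix):
            # The documented class raises DecodeError here.
            raise ValueError(f"data is not prefixed with codec {self.name!r}")
        return self._decode(data[len(prefix):])


class TextCodec(SketchCodec[str]):
    """Hypothetical UTF-8 text codec using a code from the reserved range."""

    @property
    def name(self) -> str:
        return "x-text"

    @property
    def code(self) -> int:
        return 0x300001

    def _encode(self, data: str) -> bytes:
        return data.encode("utf-8")

    def _decode(self, data: bytes) -> str:
        return data.decode("utf-8")
```

Keeping the prefix logic in the base class means a new codec only has to supply two properties and two payload transforms.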

exception multicodec.serialization.CodecError[source]

Bases: MulticodecError

Base exception for codec-related errors.

exception multicodec.serialization.DecodeError[source]

Bases: CodecError

Raised when decoding fails.

exception multicodec.serialization.EncodeError[source]

Bases: CodecError

Raised when encoding fails.

class multicodec.serialization.JSONCodec[source]

Bases: Codec[Any]

JSON codec for encoding/decoding JSON-serializable data.

Uses the standard library json module with UTF-8 encoding. The multicodec code for JSON is 0x0200.

Example:
>>> codec = JSONCodec()
>>> encoded = codec.encode({"hello": "world"})
>>> decoded = codec.decode(encoded)
>>> assert decoded == {"hello": "world"}

property code: int

Return the multicodec code for this codec.

property name: str

Return the multicodec name for this codec.

class multicodec.serialization.RawCodec[source]

Bases: Codec[bytes]

Raw codec for pass-through binary data.

This codec performs no transformation on the data, useful for binary data that should be stored as-is with a multicodec prefix. The multicodec code for raw is 0x55.

Example:
>>> codec = RawCodec()
>>> data = b"binary data"
>>> encoded = codec.encode(data)
>>> decoded = codec.decode(encoded)
>>> assert decoded == data

property code: int

Return the multicodec code for this codec.

property name: str

Return the multicodec name for this codec.

exception multicodec.serialization.UnknownCodecError[source]

Bases: CodecError

Raised when an unknown codec is requested.

multicodec.serialization.decode(data: bytes, codec_name: str | None = None) Any[source]

Decode multicodec-prefixed data.

If codec_name is provided, uses that specific codec (and verifies prefix matches). If codec_name is None, auto-detects codec from the prefix.

Parameters:
  • data – Multicodec-prefixed bytes to decode

  • codec_name – Optional codec name to use for decoding

Returns:

Decoded data
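The two modes (explicit codec name with prefix verification, or auto-detection from the prefix) can be illustrated with a small dispatcher. This is a sketch under assumptions: decode_prefixed, _read_uvarint, and the _DECODERS table are hypothetical, only the json (0x0200) and raw (0x55) codes come from this reference, and the built-in LookupError and ValueError stand in for the package's own exception types.

```python
import json
from typing import Any, Optional


def _read_uvarint(data: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed) for the leading unsigned varint."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            return value, index + 1
    raise ValueError("truncated or empty varint")


# name -> (code, payload decoder); illustrative registry with two entries
_DECODERS = {
    "json": (0x0200, lambda payload: json.loads(payload.decode("utf-8"))),
    "raw": (0x55, lambda payload: payload),
}


def decode_prefixed(data: bytes, codec_name: Optional[str] = None) -> Any:
    code, size = _read_uvarint(data)
    if codec_name is None:
        # Auto-detect: pick the codec whose code equals the prefix.
        matches = [n for n, (c, _) in _DECODERS.items() if c == code]
        if not matches:
            raise LookupError(f"no codec registered for code {code:#x}")
        codec_name = matches[0]
    expected, decoder = _DECODERS[codec_name]
    if expected != code:
        # Explicit name given, but the data carries a different prefix.
        raise ValueError(f"prefix {code:#x} does not match codec {codec_name!r}")
    return decoder(data[size:])
```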

Raises:

multicodec.serialization.encode(codec_name: str, data: Any) bytes[source]

Encode data using a registered codec by name.

Parameters:
  • codec_name – Name of the codec to use (e.g., ‘json’, ‘raw’)

  • data – Data to encode

Returns:

Multicodec-prefixed encoded bytes

Raises:

multicodec.serialization.get_registered_codec(name: str) Codec[Any][source]

Get a registered codec by name.

Parameters:

name – The codec name

Returns:

The codec instance

Raises:

UnknownCodecError – If codec is not registered

multicodec.serialization.is_codec_registered(name: str) bool[source]

Check if a codec is registered.

Parameters:

name – The codec name to check

Returns:

True if codec is registered, False otherwise

multicodec.serialization.list_registered_codecs() list[str][source]

List all registered codec names.

Returns:

List of registered codec names

multicodec.serialization.register_codec(codec: Codec[Any]) None[source]

Register a custom codec in the global registry.

Parameters:

codec – The codec instance to register

Raises:

ValueError – If codec name is already registered

multicodec.serialization.unregister_codec(name: str) None[source]

Unregister a codec from the global registry.

Parameters:

name – The codec name to unregister

Raises:

KeyError – If codec is not registered
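The registry functions above behave like operations on a name-keyed mapping. The sketch below reproduces the documented error behaviour (ValueError on a duplicate name, KeyError when unregistering an unknown name) with hypothetical names; NamedCodec and the module-level _registry dict are illustrative and say nothing about how the package stores its codecs.

```python
class NamedCodec:
    """Hypothetical minimal codec: only the name matters for the registry."""

    def __init__(self, name: str) -> None:
        self.name = name


_registry: dict[str, NamedCodec] = {}


def register(codec: NamedCodec) -> None:
    if codec.name in _registry:
        raise ValueError(f"codec {codec.name!r} is already registered")
    _registry[codec.name] = codec


def unregister(name: str) -> None:
    del _registry[name]  # raises KeyError when the name is absent


def is_registered(name: str) -> bool:
    return name in _registry


def list_registered() -> list[str]:
    return list(_registry)
```

Because the registry is global, tests that register a custom codec would normally unregister it again during teardown.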

Module contents

Top-level package for py-multicodec.

class multicodec.Code(value: int)[source]

Bases: object

Code describes an integer reserved in the multicodec table.

This class provides:

- Type-safe codec handling
- String conversion (name lookup)
- Set from string (name or hex number)
- Tag lookup

Example:
>>> code = Code(0x12)
>>> str(code)
'sha2-256'
>>> code = Code.from_string("sha2-256")
>>> int(code)
18

classmethod from_string(text: str) Code[source]

Create a Code from a string, interpreting it as a multicodec name or number.

The input string can be the name or number for a known code. A number can be in decimal or hexadecimal format (with 0x prefix).

Numbers in the reserved range 0x300000-0x3FFFFF are also accepted.

Parameters:

text (str) – The codec name or number

Returns:

A Code instance

Return type:

Code

Raises:

ValueError – If the text is not a valid codec

property name: str

Return the codec name, or ‘<unknown>’ if not found.

set(text: str) None[source]

Set the code from a string.

Parameters:

text (str) – The codec name or number

Raises:

ValueError – If the text is not a valid codec

tag() str[source]

Return the tag for this codec.

Tags categorize codecs (e.g., “multihash”, “multiaddr”, “ipld”).

class multicodec.Codec[source]

Bases: ABC, Generic[T]

Abstract base class for multicodec serialization codecs.

A codec provides methods to encode data to bytes and decode bytes back to data. Each codec is identified by its multicodec name and code.

Subclasses must implement:

- name: The multicodec name (e.g., ‘json’, ‘raw’)
- code: The multicodec code (e.g., 0x0200 for json)
- _encode: Transform data to bytes (without prefix)
- _decode: Transform bytes to data (without prefix)

abstract property code: int

Return the multicodec code for this codec.

decode(data: bytes) T[source]

Decode multicodec-prefixed bytes to data.

Parameters:

data – Multicodec-prefixed bytes to decode

Returns:

Decoded data

Raises:

DecodeError – If decoding fails or the prefix does not match this codec

decode_raw(data: bytes) T[source]

Decode bytes without expecting a multicodec prefix.

Parameters:

data – Raw bytes to decode (no prefix)

Returns:

Decoded data

Raises:

DecodeError – If decoding fails

encode(data: T) bytes[source]

Encode data to bytes with multicodec prefix.

Parameters:

data – Data to encode

Returns:

Multicodec-prefixed encoded bytes

Raises:

EncodeError – If encoding fails

abstract property name: str

Return the multicodec name for this codec.

exception multicodec.CodecError[source]

Bases: MulticodecError

Base exception for codec-related errors.

exception multicodec.DecodeError[source]

Bases: CodecError

Raised when decoding fails.

exception multicodec.EncodeError[source]

Bases: CodecError

Raised when encoding fails.

class multicodec.JSONCodec[source]

Bases: Codec[Any]

JSON codec for encoding/decoding JSON-serializable data.

Uses the standard library json module with UTF-8 encoding. The multicodec code for JSON is 0x0200.

Example:
>>> codec = JSONCodec()
>>> encoded = codec.encode({"hello": "world"})
>>> decoded = codec.decode(encoded)
>>> assert decoded == {"hello": "world"}

property code: int

Return the multicodec code for this codec.

property name: str

Return the multicodec name for this codec.

exception multicodec.MulticodecError[source]

Bases: Exception

Base exception for all multicodec-related errors.

class multicodec.RawCodec[source]

Bases: Codec[bytes]

Raw codec for pass-through binary data.

This codec performs no transformation on the data, useful for binary data that should be stored as-is with a multicodec prefix. The multicodec code for raw is 0x55.

Example:
>>> codec = RawCodec()
>>> data = b"binary data"
>>> encoded = codec.encode(data)
>>> decoded = codec.decode(encoded)
>>> assert decoded == data

property code: int

Return the multicodec code for this codec.

property name: str

Return the multicodec name for this codec.

exception multicodec.UnknownCodecError[source]

Bases: CodecError

Raised when an unknown codec is requested.

multicodec.add_prefix(multicodec: str, bytes_: bytes) bytes[source]

Adds multicodec prefix to the given bytes input

Parameters:
  • multicodec (str) – multicodec to use for prefixing

  • bytes_ (bytes) – data to prefix

Returns:

prefixed byte data

Return type:

bytes

multicodec.decode(data: bytes, codec_name: str | None = None) Any[source]

Decode multicodec-prefixed data.

If codec_name is provided, uses that specific codec (and verifies prefix matches). If codec_name is None, auto-detects codec from the prefix.

Parameters:
  • data – Multicodec-prefixed bytes to decode

  • codec_name – Optional codec name to use for decoding

Returns:

Decoded data

Raises:

multicodec.encode(codec_name: str, data: Any) bytes[source]

Encode data using a registered codec by name.

Parameters:
  • codec_name – Name of the codec to use (e.g., ‘json’, ‘raw’)

  • data – Data to encode

Returns:

Multicodec-prefixed encoded bytes

Raises:

multicodec.extract_prefix(bytes_: bytes) int[source]

Extracts the prefix from multicodec prefixed data

Parameters:

bytes_ (bytes) – multicodec prefixed data

Returns:

prefix for the prefixed data

Return type:

int

Raises:

ValueError – if an invalid varint is provided

multicodec.get_codec(bytes_: bytes) str[source]

Gets the name of the codec used to prefix the multicodec-prefixed data

Parameters:

bytes_ (bytes) – multicodec prefixed data bytes

Returns:

name of the multicodec used to prefix

Return type:

str

multicodec.get_prefix(multicodec: str) bytes[source]

Returns prefix for a given multicodec

Parameters:

multicodec (str) – multicodec codec name

Returns:

the prefix for the given multicodec

Return type:

bytes

Raises:

ValueError – if an invalid multicodec name is provided

multicodec.get_registered_codec(name: str) Codec[Any][source]

Get a registered codec by name.

Parameters:

name – The codec name

Returns:

The codec instance

Raises:

UnknownCodecError – If codec is not registered

multicodec.is_codec(name: str) bool[source]

Check if the codec is a valid codec or not

Parameters:

name (str) – name of the codec

Returns:

True if the codec is valid, False otherwise

Return type:

bool

multicodec.is_codec_registered(name: str) bool[source]

Check if a codec is registered.

Parameters:

name – The codec name to check

Returns:

True if codec is registered, False otherwise

multicodec.is_reserved(code: int | Code) bool[source]

Check if a code falls within the reserved range.

The reserved range (0x300000-0x3FFFFF) is designated for internal and experimental use.

Parameters:

code – The codec code to check

Returns:

True if the code is in the reserved range

Return type:

bool

multicodec.known_codes() list[Code][source]

Return a list of all codes registered in the multicodec table.

The returned list should be treated as read-only.

Returns:

List of all known Code objects

Return type:

list[Code]

multicodec.list_registered_codecs() list[str][source]

List all registered codec names.

Returns:

List of registered codec names

multicodec.register_codec(codec: Codec[Any]) None[source]

Register a custom codec in the global registry.

Parameters:

codec – The codec instance to register

Raises:

ValueError – If codec name is already registered

multicodec.remove_prefix(bytes_: bytes) bytes[source]

Removes the prefix from multicodec-prefixed data

Parameters:

bytes_ (bytes) – multicodec prefixed data bytes

Returns:

data bytes with the prefix removed

Return type:

bytes

multicodec.unregister_codec(name: str) None[source]

Unregister a codec from the global registry.

Parameters:

name – The codec name to unregister

Raises:

KeyError – If codec is not registered