japhb/CBOR-Simple

NAME

CBOR::Simple - Simple codec for the CBOR serialization format

SYNOPSIS

use CBOR::Simple;

# Encode a Raku value to CBOR, or vice-versa
my $cbor = cbor-encode($value);
my $val1 = cbor-decode($cbor);               # Fails if more data past first decoded value
my $val2 = cbor-decode($cbor, my $pos = 0);  # Updates $pos after decoding first value

# By default, cbor-decode() marks partially corrupt parsed structures with
# Failure nodes at the point of corruption
my $bad  = cbor-decode(buf8.new(0x81 xx 3));  # [[[Failure]]]

# Callers can instead force throwing exceptions on any error
my $*CBOR_SIMPLE_FATAL_ERRORS = True;
my $boom = cbor-decode(buf8.new(0x81 xx 3));  # BOOM!

# Decode CBOR into diagnostic text, used for checking encodings and complex structures
my $diag = cbor-diagnostic($cbor);

# Force the encoder to tag a value with a particular tag number
my $tagged      = CBOR::Simple::Tagged.new(:$tag-number, :$value);
my $tagged-cbor = cbor-encode($tagged);

DESCRIPTION

CBOR::Simple is an easy-to-use implementation of the core functionality of the CBOR serialization format, implementing the standard as of RFC 8949, plus a collection of common tag extensions as described below in TAG IMPLEMENTATION STATUS.

PERFORMANCE

CBOR::Simple is one of the fastest data structure serialization codecs available for Raku. Its round-trip speed is comparable to JSON::Fast for the most JSON-friendly data structures. For all other cases tested, CBOR::Simple produces smaller, higher-fidelity encodings in less time. For more detail, and comparisons with other Raku serialization codecs, see serializer-perf.

NYI

Currently known NOT to work:

  • Any tag marked '✘' (valid but not yet supported) or 'D' (deprecated spec) in the ENCODE or DECODE column of the Tag Status Details table below, or any tag not explicitly listed therein, will be treated as an opaque tagged value rather than converted to a native type.

  • Packed arrays of 128-bit floats (num128); these are not supported in Rakudo yet.

  • Encoding finite 16-bit floats (num16). Encoding 16-bit NaN and ±Inf works, as does decoding any num16. This is a performance tradeoff rather than a technical limitation; detecting whether a finite num32 can be shrunk to 16 bits without losing information is costly and rarely results in space savings except in trivial cases (e.g. Nums containing only small integers).

TAG CONTENT STRICTNESS

When encoding, CBOR::Simple makes every attempt to encode tagged content strictly within the tag standards as written, always producing spec-compliant encoded values.

When decoding, CBOR::Simple will often slightly relax the allowed content types in tagged content, especially when later tag proposals made no change other than to extend the allowed content types and allocate a new tag number for that. In the extension case CBOR::Simple is likely to allow both the old and new tag to accept the same content domain when decoding.

For example, when encoding, CBOR::Simple will always encode Instant or DateTime as a CBOR epoch-based date/time (tag 1), using standard integer or floating point content data. But when decoding, CBOR::Simple will accept any content that decodes properly as a Raku Real value, and in particular will handle a CBOR Rational (tag 30) as another valid content type.
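The relaxed decoding described above can be sketched by hand-assembling a tag 1 item whose content is a tag 30 rational. The byte values follow the RFC 8949 encoding rules; the variable names are illustrative, and the sketch deliberately prints the resulting type rather than asserting it.

```raku
use CBOR::Simple;

# Hand-assembled CBOR: tag 1 wrapping tag 30 wrapping the array [3, 2],
# i.e. an epoch-based time of 3/2 seconds expressed as a rational
my $bytes = buf8.new(
    0xC1,        # tag 1: epoch-based date/time
    0xD8, 0x1E,  # tag 30: rational number
    0x82,        # array of two items
    0x03, 0x02,  # numerator 3, denominator 2
);

my $when = cbor-decode($bytes);
say $when.^name;
```

A strict decoder would reject this input, since tag 1 content is specified as an integer or floating point number.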

DATE, DATETIME, INSTANT

Raku's builtin time handling is richer than the default CBOR data model (though certain tag extensions improve this), so the following mappings apply:

  • Encoding

    • Instant and DateTime are both written as tag 1 (epoch-based date/time) with integer (if lossless) or floating point content.

    • Other Dateish are written as tag 100 (RFC 8943 days since 1970-01-01).

  • Decoding

    • Tag 0 (date/time string) is parsed as a DateTime.

    • Tag 1 (epoch-based date/time) is parsed via Instant.from-posix(), and handles any Real type in the tag content.

    • Tag 100 (days since 1970-01-01) is parsed via Date.new-from-daycount().

    • Tag 1004 (date string) is parsed as a Date.
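As a rough illustration of the two numeric decode mappings, the sketch below uses only core Raku. The helper names are hypothetical and are not part of CBOR::Simple. It assumes the tag 100 day count has to be shifted before calling Date.new-from-daycount(), because Raku counts days from 1858-11-17 rather than from 1970-01-01; how CBOR::Simple itself applies that shift is not shown in this document.

```raku
# Raku's daycount for the POSIX epoch date (days since 1858-11-17)
my $epoch-daycount = Date.new(1970, 1, 1).daycount;

# Hypothetical helper: tag 100 content (days since 1970-01-01) to a Date
sub date-from-tag100(Int $days --> Date) {
    Date.new-from-daycount($epoch-daycount + $days)
}

# Hypothetical helper: tag 1 content (any Real number of seconds) to an Instant
sub instant-from-tag1(Real $seconds --> Instant) {
    Instant.from-posix($seconds)
}

say date-from-tag100(0);                  # 1970-01-01
say date-from-tag100(365);                # 1971-01-01
say DateTime.new(instant-from-tag1(0));   # 1970-01-01T00:00:00Z
```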

UNDEFINED VALUES

  • CBOR's null is translated as Any in Raku.

  • CBOR's undefined is translated as Mu in Raku.

  • A real Nil in an array (which must be bound, not assigned) is encoded as a CBOR Absent tag (31). Absent values will be recognized on decode as well, but since array contents are assigned into their parent array during decoding, a Nil in an array will be translated to Any by Raku's array assignment semantics.
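The difference between assigning and binding Nil into an array can be seen in core Raku alone. This is a minimal sketch with illustrative variable names; it does not call CBOR::Simple.

```raku
# Assignment: Nil resets the element to the array's default value, Any
my @assigned = 1, Nil, 3;
say @assigned[1].raku;   # Any

# Binding: the slot refers to Nil itself instead of a reset container
my @bound;
@bound[0] := 1;
@bound[1] := Nil;
@bound[2] := 3;
say @bound[1].raku;
```

Only an array built like @bound can carry a real Nil to the encoder; the decoder assigns elements, so a round trip yields Any in that slot.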

OTHER SPECIAL CASES

  • To mark a substructure for lazy decoding (treating it as an opaque Blob until explicitly decoded), use the tagged value idiom in the SYNOPSIS with :tag-number(24) (encoded CBOR value) or :tag-number(63) (encoded CBOR Sequence).

  • CBOR strings claiming to be longer than 2⁶³-1 are treated as malformed.

  • Bigfloats and decimal fractions (tags 4, 5, 264, 265) with very large exponents may result in numeric overflow when decoded.

  • Keys for Associative types are sorted using Raku's internal sort method rather than the RFC 8949 default sort, because the latter is much slower.

  • cbor-diagnostic() always adds encoding indicators for float values.
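The lazy decoding idiom from the first item above might look like the following sketch. It uses only the functions and constructor shown in the SYNOPSIS; the data and variable names are illustrative, and no particular output is claimed.

```raku
use CBOR::Simple;

# Encode the substructure separately so it can travel as opaque bytes
my $inner = cbor-encode({ name => 'payload', size => 3 });

# Tag 24 marks the byte string as an encoded CBOR data item
my $lazy = CBOR::Simple::Tagged.new(:tag-number(24), :value($inner));

# Embed the tagged bytes in a larger structure and inspect the result
my $outer = cbor-encode([ 'header', $lazy ]);
say cbor-diagnostic($outer);
```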

TAG IMPLEMENTATION STATUS

Note that unrecognized tags will decode to their contents wrapped with a CBOR::Simple::Tagged object that records its tag-number; check marks in the details table indicate conversion to/from an appropriate native Raku type rather than this default behavior.
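A minimal sketch of this fallback, assuming that CBOR::Simple::Tagged exposes tag-number and value accessors matching its constructor arguments (the SYNOPSIS shows only the constructor). Tag 1234 is used because it falls in a range listed as unassigned in the details table.

```raku
use CBOR::Simple;

# Hand-assembled CBOR: tag 1234 (0xD9 0x04 0xD2) wrapping the integer 1
my $bytes = buf8.new(0xD9, 0x04, 0xD2, 0x01);

my $decoded = cbor-decode($bytes);
if $decoded ~~ CBOR::Simple::Tagged {
    # Assumed accessors; see the note above
    say "tag ", $decoded.tag-number, " wrapping ", $decoded.value.raku;
}
else {
    say "decoded to native type ", $decoded.^name;
}
```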

Tag Status Overview: Native Raku Types

GROUP           SUPPORT  NOTES
Core            Good     Core RFC 8949 CBOR data model and syntax
Collections     Good     Sets, maps with only object or only string keys
Graph           NONE     Cyclic, indirected, and self-referential structures
Numbers         Good     Rational/BigInt/BigFloat support except non-finite triplets
Packed Arrays   Partial  Packed num16/32/64 arrays supported; packed int arrays not
Special Arrays  NONE     Explicit multi-dim/homogeneous arrays
Tag Fallbacks   Good     Round tripping of unknown tagged content
Date/Time       Partial  All but tagged time (tags 1001-1003) supported

Tag Status Overview: Specialty Types

GROUP           SUPPORT  NOTES
Encodings       NONE     baseN, MIME, YANG, BER, non-UTF-8 strings
Geo             NONE     Geographic coordinates and shapes
Identifiers     NONE     URI, IRI, UUID, IPLD CID, general identifiers
Networking      NONE     IPv4/IPv6 addresses, subnets, and masks
Security        NONE     COSE and CWT
Specialty       NONE     IoT data, Openswan, PlatformV, DOTS, ERIS, RAINS
String Hints    NONE     JSON conversions, language tags, regex

Tag Status Details
SPEC TAGS ENCODE DECODE NOTES
RFC 8949 0 DateTime strings → Encoded as tag 1
RFC 8949 1 DateTime/Instant
RFC 8949 2,3 (Big) Int
RFC 8949 4,5 Big fractions → Encoded as tag 30
unassigned 6-15
COSE 16-18 MAC/Signatures
unassigned 19-20
RFC 8949 21-23 Expected JSON conversion to baseN
RFC 8949 24 T Encoded CBOR data item
[Lehmann] 25 String backrefs
[Lehmann] 26,27 General serialized objects
[Lehmann] 28,29 Shareable referenced values
[Occil] 30 Rational numbers
[Vaarala] 31 * Absent values
RFC 8949 32-34 URIs and base64 encoding
RFC 7094 35 D D PCRE/ECMA 262 regex (DEPRECATED)
RFC 8949 36 Text-based MIME message
[Clemente] 37 Binary UUID
[Occil] 38 Language-tagged string
[Clemente] 39 Identifier semantics
RFC 8746 40 Row-major multidim array
RFC 8746 41 Homogeneous array
[Mische] 42 IPLD content identifier
[YANG] 43-47 YANG datatypes
unassigned 48-51
draft 52 D D IPv4 address/network (DEPRECATED)
unassigned 53
draft 54 D D IPv6 address/network (DEPRECATED)
unassigned 55-60
RFC 8392 61 CBOR Web Token (CWT)
unassigned 62
[Bormann] 63 T Encoded CBOR Sequence
RFC 8746 64-79 ✘! ✘! Packed int arrays
RFC 8746 80-87 Packed num arrays (except 128-bit)
unassigned 88-95
COSE 96-98 Encryption/MAC/Signatures
unassigned 99
RFC 8943 100 Date
unassigned 101-102
[Vidovic] 103 Geo coords
[Clarke] 104 Geo coords ref system WKT/EPSG
unassigned 105-109
RFC 9090 110-112 BER-encoded object ID
unassigned 113-119
[Vidovic] 120 IoT data point
unassigned 121-255
[Lehmann] 256 String backrefs (see tag 25)
[Occil] 257 Binary MIME message
[Napoli] 258 Set
[Holloway] 259 T Map with object keys
[Raju] 260-261 IPv4/IPv6/MAC address/network
[Raju] 262-263 Embedded JSON/hex strings
[Occil] 264-265 * Extended fractions → Encoded as tag 30
[Occil] 266-267 IRI/IRI reference
[Occil] 268-270 ✘✘ ✘✘ Triplet non-finite numerics
RFC 9132 271 ✘✘ ✘✘ DDoS Open Threat Signaling (DOTS)
[Vaarala] 272-274 Non-UTF-8 strings
[Cormier] 275 T Map with only string keys
[ERIS] 276 ERIS binary read capability
[Meins] 277-278 Geo area shape/velocity
unassigned 279-1000
[Bormann] 1001-1003 Extended time representations
RFC 8943 1004 → Encoded as tag 100
unassigned 1005-1039
RFC 8746 1040 Column-major multidim array
unassigned 1041-22097
[Lehmann] 22098 Hint for additional indirection
unassigned 22099-25440
[Broadwell] 25441 Capture: reference implementation
unassigned 25442-49999
[Tongzhou] 50000-50011 ✘✘ ✘✘ PlatformV
unassigned 50012-55798
RFC 8949 55799 Self-described CBOR
[Richardson] 55800 Self-described CBOR Sequence
unassigned 55801-65534
invalid 65535 Invalid tag detected
unassigned 65536-15309735
[Trammell] 15309736 ✘✘ ✘✘ RAINS message
unassigned 15309737-1330664269
[Hussain] 1330664270 ✘✘ ✘✘ CBOR-encoded Openswan config file
unassigned 1330664271-4294967294
invalid 4294967295 Invalid tag detected
unassigned ...
invalid 18446744073709551615 Invalid tag detected

Tag Table Symbol Key

SYMBOL  MEANING
✓       Fully supported
*       Supported, but see notes below
T       Encoding supported by explicitly tagging contents
→       Raku values will be encoded using a different tag
D       Deprecated and unsupported tag spec; may eventually be decodable
✘       Not yet implemented
✘!      Not yet implemented, but already requested
✘?      Not yet implemented, but may be easy to add
✘✘      Probably won't be implemented in CBOR::Simple

AUTHOR

Geoffrey Broadwell <gjb@sonic.net>

COPYRIGHT AND LICENSE

Copyright 2021 Geoffrey Broadwell

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.
