<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc [
  <!ENTITY nbsp "&#160;">
]>
<rfc version="3"
     ipr="trust200902"
     submissionType="IETF"
     category="info"
     consensus="false"
     docName="draft-smith-uber-00"
     sortRefs="true"
     symRefs="true"
     tocDepth="4">
  <front>
    <title abbrev="UBER Format">The Universal Basic Element Representation (ÜBER) Format</title>
    <seriesInfo name="Internet-Draft" value="draft-smith-uber-00"/>
    <author fullname="Curtis Allen Smith" initials="C.A." surname="Smith">
      <organization>Independent</organization>
      <address>
        <email>curtis.allen.smith@gmail.com</email>
      </address>
    </author>
    <date day="20" month="March" year="2026"/>
    <workgroup>Network Working Group</workgroup>
    <keyword>UBER</keyword>
    <keyword>serialization</keyword>
    <keyword>configuration</keyword>
    <keyword>grammar</keyword>
    <abstract>
      <t>
        This document defines the Universal Basic Element Representation
        (ÜBER), a language-independent, lightweight, text-based
        serialization format designed for the portable representation,
        transmission, and storage of structured data.
      </t>
      <t>
        ÜBER employs a unified node architecture in which each node serves
        simultaneously as a value-carrier and a structural container. This
        recursive model allows any element to encapsulate a discrete data
        payload while concurrently functioning as a parent to nested
        members. The same structure also provides the granularity needed by
        implementations that target high-concurrency access patterns with
        localized, low-contention updates.
      </t>
      <t>
        A top-level ÜBER profile consists of either an explicit root object
        or a sequence of members and directives. To prioritize
        human-centric design and authoring ergonomics, the grammar supports
        comments, flexible arbitrary-length key-value separators, optional
        commas, and dotted member names.
      </t>
      <t>
        The ÜBER type system is defined by recursive containers and scalar
        forms. Associative arrays (<tt>{}</tt>) and collections
        (<tt>[]</tt>) support arbitrary nesting, while concrete ordering
        behavior remains implementation-dependent. Scalar forms include
        deterministically parsed numbers with arbitrary-precision integer
        syntax and Java-aligned unsuffixed floating-point lexical forms,
        which implementations can fit and promote automatically based on
        magnitude; several string representations, including text blocks;
        flexible boolean literals; and both explicit and omitted null
        states.
      </t>
      <t>
        This specification defines the lexical and syntactic grammar of ÜBER.
        It does not define implementation-specific runtime facilities except
        where a grammar element, such as directives, requires a
        brief statement of purpose.
      </t>
    </abstract>
  </front>
  <middle>
    <section anchor="introduction">
      <name>Introduction</name>
      <t>
        ÜBER is a lightweight, text-based, language-independent
        data-serialization and data-interchange format for structured data. A
        complete top-level document is a profile, written either as an
        explicit root object or as a top-level sequence of profile
        statements, where each profile statement is a member or directive.
        The grammar is deliberately deterministic: prefixed forms are
        distinguished by their leading syntax, and bare-token scalar forms
        are interpreted using a fixed fallback order in which the unquoted
        string is the final alternative. This keeps parsing predictable even
        though the surface syntax is permissive.
      </t>
      <t>
        The core object model is recursive. A member can hold a scalar value,
        nested child members, or both at the same time. This unified node
        architecture is what allows valued members to exist without
        introducing a separate node taxonomy for container and scalar forms.
      </t>
      <t>
        The format is intentionally human-oriented at the surface-syntax level.
        Comments, optional commas, unquoted forms, and implicit top-level
        profile forms are provided so that human-authored documents can remain
        concise without changing the deterministic underlying parse model.
      </t>
      <t>
        The grammar defines recursive containers and scalar forms.
        Associative arrays and collections may nest arbitrarily. Scalar forms
        include deterministic numbers with arbitrary-precision integer syntax
        and Java-aligned unsuffixed floating-point lexical forms, multiple
        string syntaxes including text blocks,
        flexible boolean literals, and a distinction between explicit
        <tt>null</tt> and omitted values in member position.
      </t>
      <t>
        The grammar defined here has the following high-level properties:
      </t>
      <ul spacing="normal">
        <li>All valid JSON texts are valid ÜBER texts.</li>
        <li>
          The core value model consists of objects, arrays, strings, numbers,
          booleans, null, and unquoted strings.
        </li>
        <li>
          Objects and arrays permit optional commas between elements or members.
        </li>
        <li>
          Object members support both explicit separators and implicit
          whitespace separation.
        </li>
        <li>
          Object member names can be dotted to express hierarchical
          composition.
        </li>
        <li>
          The profile model supports both top-level members and top-level
          directives.
        </li>
      </ul>
      <t>
        This document is restricted to the grammar of the format. Features such
        as defaults chaining, macro resolution, and concrete directive effects
        are outside the scope of this specification.
      </t>
    </section>

    <section anchor="conventions">
      <name>Requirements Language and Conventions</name>
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
        "OPTIONAL" in this document are to be interpreted as described in
        BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and
        only when, they appear in all capitals, as shown here.
      </t>
      <t>
        For convenience, this document also cites the combined requirements
        language specification as <xref target="BCP14"/>.
      </t>
      <t>
        ABNF in this document uses the notation of <xref target="RFC5234"/>.
        Case-sensitive literal strings use the extensions defined by
        <xref target="RFC7405"/>.
      </t>
      <t>
        For readability, some productions use ABNF prose values for character
        classes that are more naturally stated as exclusions, such as
        "any character except DQUOTE or reverse solidus". Such prose values are
        normative.
      </t>
    </section>

    <section anchor="overview">
      <name>Overview of the Grammar Model</name>
      <t>
        A complete top-level ÜBER document is a <tt>profile</tt>. A
        <tt>profile</tt> is either:
      </t>
      <ol spacing="normal">
        <li>an explicit root object, or</li>
        <li>
          an implicit object formed from a top-level sequence of profile
          statements.
        </li>
      </ol>
      <t>
        The profile is the complete document unit. In the implicit form, the
        top level consists of profile statements, and a profile statement is
        either a member or a directive. The implicit form exists so that
        configuration-oriented documents can omit outer braces. Commas between
        statements are optional both at the top level and inside objects, and
        whitespace alone can separate adjacent statements. Nested objects,
        however, are always explicit.
      </t>
      <t>
        Directives therefore belong to the profile model itself rather than to
        some separate outer feature layer. The grammar defines their shape but
        does not define directive names or effects. Specific directive
        vocabularies are left to implementations or to future specifications.
      </t>
      <figure anchor="fig-overall-grammar">
        <name>Overall Top-Level Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
profile             = *ws object *ws
                    / *ws profile-statements *ws

profile-statement   = member / directive

profile-statements  = profile-statement
                    / profile-statements [ "," ] profile-statement
]]></sourcecode>
      </figure>
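      <t>
        As a non-normative sketch, the two documents below show the same
        members written first as an explicit root object and then as an
        implicit profile. The member names are invented for illustration and
        are not defined by this specification.
      </t>
      <sourcecode markers="true"><![CDATA[
{
  name = "demo",
  port = 8080
}
]]></sourcecode>
      <sourcecode markers="true"><![CDATA[
name = "demo"
port = 8080
]]></sourcecode>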
    </section>

    <section anchor="lexical-elements">
      <name>Lexical Elements</name>

      <section anchor="whitespace-and-comments">
        <name>Whitespace and Comments</name>
        <t>
          Whitespace in ÜBER includes inline spacing, line terminators, and
          comments. Comments are treated as whitespace wherever the grammar
          permits whitespace.
        </t>
        <figure anchor="fig-whitespace-abnf">
          <name>Whitespace and Comment Grammar</name>
          <sourcecode markers="true" type="abnf"><![CDATA[
ws                  = whitespace / comment
whitespace          = inline-space / line-terminator
inline-space        = SP / HTAB / %x0B / %x0C
line-terminator     = LF / CR / CRLF
line-end            = line-terminator / eof
eof                 = <end of input>
LF                  = %x0A
CR                  = %x0D
control-character   = %x00-1F

comment             = single-line-comment / block-comment
single-line-comment = single-line-marker [ comment-chars ] line-end
single-line-marker  = "//" / "#" / "!"
block-comment       = "/*" [ block-comment-chars ] "*/"
comment-chars       = <characters not containing CR or LF>
block-comment-chars = <any character sequence not containing "*/">
]]></sourcecode>
        </figure>
        <t>
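          The grammar in <xref target="fig-whitespace-abnf"/> defines both
          comment forms. A single-line comment begins with <tt>//</tt>,
          <tt>#</tt>, or <tt>!</tt> and extends to the next line terminator
          or the end of input. A block comment begins with <tt>/*</tt> and
          ends at the first subsequent <tt>*/</tt>; block comments therefore
          do not nest.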
        </t>
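        <t>
          The following non-normative sketch shows each comment form accepted
          by the grammar above. The single member is a placeholder chosen for
          illustration.
        </t>
        <sourcecode markers="true"><![CDATA[
// a double-slash comment
# a hash comment
! a bang comment
/* a block comment
   spanning two lines */
timeout = 30
]]></sourcecode>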
      </section>

      <section anchor="digits-and-literals">
        <name>Digits and Literal Keywords</name>
        <figure anchor="fig-digit-abnf">
          <name>Digits and Basic Literals</name>
          <sourcecode markers="true" type="abnf"><![CDATA[
digit                      = DIGIT
onenine                    = %x31-39
octdigit                   = %x30-37
bindigit                   = %x30 / %x31
hexdigit                   = HEXDIG

digit-or-underscore        = digit    / "_"
hex-digit-or-underscore    = hexdigit / "_"
octal-digit-or-underscore  = octdigit / "_"
binary-digit-or-underscore = bindigit / "_"

decimal-digits             = digit-or-underscore
                           / decimal-digits digit-or-underscore
hex-digits                 = hex-digit-or-underscore
                           / hex-digits hex-digit-or-underscore
octal-digits               = octal-digit-or-underscore
                           / octal-digits octal-digit-or-underscore
binary-digits              = binary-digit-or-underscore
                           / binary-digits binary-digit-or-underscore

true-literal               = %s"true" / %s"yes" / %s"on"
false-literal              = %s"false" / %s"no" / %s"off"
null-literal               = %s"null"
]]></sourcecode>
        </figure>
        <t>
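          Underscores are permitted inside digit runs as visual separators.
          The digit-run productions above do not restrict where an underscore
          may appear within a run: leading, trailing, and consecutive
          underscores are all grammatically accepted. The number productions
          that reference these runs can impose additional structure, such as
          the leading non-zero digit of a decimal integer.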
        </t>
      </section>
    </section>

    <section anchor="values">
      <name>Values</name>
      <t>
        A value is parsed deterministically. For prefixed forms, the leading
        syntax selects the parse path directly. Precedence-sensitive fallback
        applies only among the bare-token scalar candidates. The grammar
        accepts the following value forms:
      </t>
      <figure anchor="fig-value-abnf">
        <name>Value Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
element      = *ws value *ws

value        = text-block
             / object
             / array
             / double-quoted-string
             / single-quoted-string
             / number
             / boolean
             / null-literal
             / unquoted-string

scalar-value = text-block
             / array
             / double-quoted-string
             / single-quoted-string
             / number
             / boolean
             / null-literal
             / unquoted-string

boolean      = true-literal / false-literal
]]></sourcecode>
      </figure>
      <t>
        The <tt>scalar-value</tt> production is <tt>value</tt> without the
        <tt>object</tt> alternative; despite its name, it retains
        <tt>array</tt>. The <tt>object</tt> alternative is omitted because
        member syntax separately permits a trailing object after an optional
        scalar value. This is what allows a member to hold both a scalar
        value and nested members.
      </t>
      <t>
        In the reference implementation, precedence matters only after the
        prefixed forms have been excluded. A bare token is then tried, in
        order, as a <tt>number</tt>, a <tt>boolean</tt>, and <tt>null</tt>;
        if none of these matches, it is an unquoted string.
      </t>
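      <t>
        As a non-normative sketch of the fallback order described above, the
        bare tokens below would be classified as noted in the preceding
        comment lines. The member names are invented for illustration.
      </t>
      <sourcecode markers="true"><![CDATA[
// numbers
count   = 42
ratio   = 1.5e3
// booleans (true-literal and false-literal keywords)
enabled = yes
legacy  = off
// null
missing = null
// none of the above matches, so this is an unquoted string
host    = localhost
]]></sourcecode>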
    </section>

    <section anchor="objects">
      <name>Objects and Members</name>
      <t>
        An object is a collection of members enclosed in braces. Members can be
        separated by commas, whitespace, or both. Trailing commas are not
        permitted.
      </t>
      <figure anchor="fig-object-abnf">
        <name>Object and Member Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
object              = "{" *ws [ members ] "}"
members             = member
                    / members [ "," ] member

member              = *ws dotted-name *ws separator
                      [ scalar-element ] *ws [ object ]

scalar-element      = *ws scalar-value *ws

separator           = explicit-separator / ws
explicit-separator  = separator-char
                    / explicit-separator separator-char
separator-char      = ":" / "="
]]></sourcecode>
      </figure>
      <t>
        A member MAY therefore hold only a scalar value, only an object, or a
        scalar value followed by an object. The last form creates a valued
        member: the member simultaneously has a scalar value and child members.
      </t>
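      <t>
        The three member shapes can be sketched as follows. This example is
        non-normative, and its member names and values are invented for
        illustration.
      </t>
      <sourcecode markers="true"><![CDATA[
{
  // scalar value only
  host = "example.net"

  // object only; whitespace acts as the separator
  limits { max = 10, min = 1 }

  // valued member: a scalar value followed by an object
  listener = "0.0.0.0" {
    port = 8080
    tls  = on
  }
}
]]></sourcecode>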
      <t>
        The <tt>explicit-separator</tt> production permits any non-empty run of
        <tt>:</tt> and <tt>=</tt> characters. Accordingly, <tt>:</tt>,
        <tt>=</tt>, <tt>::</tt>, <tt>:=</tt>, <tt>==</tt>, <tt>::=</tt>, and
        other mixed runs are all valid explicit separators.
      </t>
      <t>
        This grammar assigns no distinct meaning to different explicit
        separator runs. They are syntactically equivalent ways to separate a
        member name from its value. An implementation or profile layered on top
        of ÜBER SHOULD NOT assign different semantics to different separator
        runs unless it intentionally defines a narrower grammar than the one
        specified here.
      </t>
      <figure anchor="fig-separator-runs">
        <name>Equivalent Explicit Separator Runs</name>
        <sourcecode markers="true"><![CDATA[
alpha   :   1
beta    =   2
gamma   :=  3
delta   ::  4
epsilon ==  5
zeta    ::= 6
]]></sourcecode>
      </figure>

      <section anchor="member-names">
        <name>Member Names and Dotted Composition</name>
        <t>
          Member names are parsed as dotted paths. Each name atom contributes
          one level of path structure, and dots separate adjacent atoms unless
          the dot is escaped or appears inside a single-quoted name atom.
        </t>
        <figure anchor="fig-name-abnf">
          <name>Name and Separator Grammar</name>
          <sourcecode markers="true" type="abnf"><![CDATA[
dotted-name         = name-atom
                    / dotted-name *ws "." *ws name-atom

name-atom           = single-quoted-string / dq-name-atom / uq-name

dq-name-atom        = DQUOTE dq-name-content DQUOTE
dq-name-content     = dq-name-segment
                    / dq-name-content "." dq-name-segment
dq-name-segment     = *dq-name-char
dq-name-char        = dq-name-unescaped / "\" escape-sequence
dq-name-unescaped   = <any character except DQUOTE,
                      "\", ".", or control-character>

uq-name             = *uq-name-char
uq-name-char        = uq-name-unescaped / "\" escape-sequence
uq-name-unescaped   = <any character except whitespace,
                      ",", "{", "}", "[", "]",
                      ":", "=", DQUOTE, "'", "\", or ".">
]]></sourcecode>
        </figure>
      <t>
          Name atoms may use any of the three non-text-block string forms:
          single-quoted, double-quoted, or unquoted. Each of these forms
          admits zero characters, so empty name atoms are permitted in all
          three. A dotted name can therefore legally begin with a dot, end
          with a dot, contain adjacent dots, or contain an explicit empty
          quoted atom, each of which yields an empty-string key at that path
          level.
      </t>
      <t>
          In the double-quoted and unquoted name forms, a bare dot is a chain
          separator. A literal dot in those forms therefore requires escaping.
          In the single-quoted name form, escapes are not processed, so
          <tt>\.</tt> remains two literal characters, and a single-quoted
          atom such as <tt>'a.b'</tt> contributes a single path segment that
          contains a literal dot.
      </t>
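        <t>
          The following non-normative sketch shows how the name grammar
          divides several member names into path levels. The names and values
          are invented for illustration.
        </t>
        <sourcecode markers="true"><![CDATA[
// three path levels: database, pool, size
database.pool.size = 16

// whitespace is permitted around the dot separator
database . pool . timeout = 30

// an escaped dot stays inside one atom: hosts, then example.net
hosts.example\.net = up

// a bare dot inside a double-quoted atom still separates: tls, enabled
"tls.enabled" = yes
]]></sourcecode>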
      </section>
    </section>

    <section anchor="arrays">
      <name>Arrays</name>
      <t>
        An array is an ordered sequence of elements enclosed in brackets.
        Commas between elements are optional. As with objects, trailing commas
        are not permitted.
      </t>
      <figure anchor="fig-array-abnf">
        <name>Array Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
array             = "[" *ws [ elements ] "]"
elements          = element
                  / elements [ "," ] element
]]></sourcecode>
      </figure>
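      <t>
        As a non-normative sketch, the arrays below use comma-separated,
        whitespace-separated, nested, and mixed-type elements. The member
        names are invented for illustration.
      </t>
      <sourcecode markers="true"><![CDATA[
primes = [ 2, 3, 5, 7 ]
ports  = [ 80 443 8080 ]
matrix = [ [ 1 2 ] [ 3 4 ] ]
mixed  = [ "a", 'b', c, 1.5, yes, null, { k = v } ]
]]></sourcecode>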
    </section>

    <section anchor="strings">
      <name>Strings and Escapes</name>
      <t>
        ÜBER supports four string forms:
      </t>
      <ul spacing="normal">
        <li>double-quoted strings,</li>
        <li>single-quoted strings,</li>
        <li>text blocks, and</li>
        <li>unquoted strings.</li>
      </ul>
      <t>
        Double-quoted strings, text blocks, and unquoted strings recognize the
        general escape syntax. Single-quoted strings are literal except for the
        closing quote.
      </t>
      <t>
        Text blocks follow the Java text-block model described by
        JEP&nbsp;378.
      </t>
      <figure anchor="fig-string-abnf">
        <name>String Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
text-block           = 3DQUOTE
                       line-terminator
                       [ text-block-chars ]
                       3DQUOTE
text-block-chars     = text-block-char
                     / text-block-chars text-block-char
text-block-char      = text-block-unescaped
                     / line-terminator
                     / "\" escape-sequence
text-block-unescaped = <any character except "\",
                        control-character,
                        or the closing three-DQUOTE delimiter>

double-quoted-string = DQUOTE [ dq-string-chars ] DQUOTE
dq-string-chars      = dq-string-char
                     / dq-string-chars dq-string-char
dq-string-char       = dq-string-unescaped / "\" escape-sequence
dq-string-unescaped  = <any character except DQUOTE,
                        "\" or control-character>

quoted-string        = double-quoted-string / single-quoted-string

single-quoted-string = "'" [ sq-string-chars ] "'"
sq-string-chars      = sq-string-char
                     / sq-string-chars sq-string-char
sq-string-char       = <any character except "'"
                        or control-character>

unquoted-string      = uq-string-chars
uq-string-chars      = uq-string-char
                     / uq-string-chars uq-string-char
uq-string-char       = uq-string-unescaped / "\" escape-sequence
uq-string-unescaped  = <any character except whitespace,
                        control-character, ",", "{", "}",
                        "[", "]", ":", "=", DQUOTE,
                        "'", or "\">
]]></sourcecode>
      </figure>
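      <t>
        The four string forms can be sketched as follows. This example is
        non-normative, and the member names are invented for illustration.
        The single-quoted value keeps its backslashes as literal characters,
        and the text block begins on the line after its opening delimiter.
      </t>
      <sourcecode markers="true"><![CDATA[
// double-quoted string
greeting = "Hello, world"
// single-quoted string: no escape processing
pattern  = 'C:\temp\new'
// unquoted string
word     = hello
// text block
banner   = """
  first line
  second line
  """
]]></sourcecode>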

      <section anchor="escapes">
        <name>Escape Syntax</name>
        <figure anchor="fig-escape-abnf">
          <name>Escape Grammar</name>
          <sourcecode markers="true" type="abnf"><![CDATA[
escape-sequence       = simple-escape
                      / unicode-escape
                      / hex-escape
                      / octal-escape
                      / ws-escape
                      / ","
                      / "{"
                      / "}"
                      / "["
                      / "]"
                      / ":"
                      / "="

simple-escape         = %s"a" / %s"b" / %s"e" / %s"f" / %s"n" / %s"r"
                      / %s"s" / %s"t" / %s"v"
                      / "\" / "'" / DQUOTE / "/"
                      / %s"0" / "." / "#" / "!" / "@"

ws-escape             = SP

unicode-escape        = %s"u" 4hexdigit
                      / %s"u" 6hexdigit
                      / %s"u" 8hexdigit
                      / %s"u{" braced-unicode-digits "}"

braced-unicode-digits = hexdigit
                      / braced-unicode-digits hex-digit-or-underscore

hex-escape            = %s"x" hexdigit
                      / hex-escape hexdigit
octal-escape          = octdigit
                      / octdigit octdigit
                      / octdigit octdigit octdigit
]]></sourcecode>
        </figure>
        <t>
          The punctuation escapes and the whitespace escape each yield the
          literal character represented by the escaped form.
        </t>
        <t>
          In unquoted strings, punctuation characters that would otherwise
          terminate the token MUST be escaped to be included literally.
        </t>
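        <t>
          As a non-normative sketch, the members below use punctuation and
          whitespace escapes to keep characters inside a single unquoted
          token that would otherwise end it. The member names are invented
          for illustration.
        </t>
        <sourcecode markers="true"><![CDATA[
// the value is the five characters a:b=c
label  = a\:b\=c
// the comma stays inside the token
items  = one\,two
// the space stays inside the token
title  = two\ words
// a Unicode escape; this string compares equal to "a"
letter = "\u0061"
]]></sourcecode>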
      </section>
    </section>

    <section anchor="numbers">
      <name>Numbers</name>
      <t>
        ÜBER numbers include ordinary decimal numbers, hexadecimal integers,
        octal integers in both legacy and explicit-prefixed forms, binary
        integers, decimal floating-point literals, hexadecimal floating-point
        literals, and special floating-point keywords. This section defines
        only the grammar of those forms.
      </t>
        <t>
          The digit productions used here are the underscore-permissive lexical
          productions defined in <xref target="digits-and-literals"/>. In
          particular, <tt>decimal-digits</tt>, <tt>hex-digits</tt>,
          <tt>octal-digits</tt>, and <tt>binary-digits</tt> each accept
          underscores wherever those productions allow them.
        </t>
      <figure anchor="fig-number-abnf">
        <name>Number Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
number                 = special-value / numeric-literal

special-value          = [ sign ] special-keyword
special-keyword        = %s"NaN" / %s"Infinity"

numeric-literal        = integer-literal / floating-point-literal

integer-literal        = [ sign ] integer-value
integer-value          = decimal-integer
                       / hexadecimal-integer
                       / octal-integer
                       / binary-integer

decimal-integer        = onenine [ decimal-digits ]
                       / "0"
hexadecimal-integer    = "0" hex-indicator hex-digits
octal-integer          = "0" octal-digits
                       / "0" octal-indicator octal-digits
binary-integer         = "0" binary-indicator binary-digits

hex-indicator          = %s"x" / %s"X"
binary-indicator       = %s"b" / %s"B"
octal-indicator        = %s"o" / %s"O"

floating-point-literal = [ sign ] decimal-floating-point-literal
                       / [ sign ]
                         hexadecimal-floating-point-literal
decimal-floating-point-literal
                       = decimal-digits decimal-float-tail
                       / "." decimal-digits [ exponent ]
hexadecimal-floating-point-literal
                       = hex-significand binary-exponent
hex-significand        = "0" hex-indicator hex-digits
                       / "0" hex-indicator
                         hex-digits "." [ hex-digits ]
                       / "0" hex-indicator "." hex-digits
decimal-float-tail     = "." [ decimal-digits ] [ exponent ]
                       / exponent

exponent               = exponent-indicator [ sign ] decimal-digits
exponent-indicator     = %s"e" / %s"E"
binary-exponent        = binary-exponent-indicator
                         [ sign ]
                         decimal-digits
binary-exponent-indicator
                       = %s"p" / %s"P"

sign                   = "+" / "-"
]]></sourcecode>
      </figure>
      <t>
        A decimal integer either consists of the single digit zero or begins
        with a non-zero digit. This avoids alternative decimal spellings with
        leading zeros, leaving the leading-zero form available for the distinct
        octal grammar. Octal literals are accepted both in that legacy
        leading-zero form and in explicit <tt>0o</tt> or <tt>0O</tt>-prefixed
        form. Floating-point literals follow Java-style unsuffixed decimal and
        hexadecimal spellings, including leading-dot decimal forms such as
        <tt>.5</tt> and hexadecimal forms such as <tt>0x1.fp3</tt>; numeric
        type suffixes such as <tt>L</tt>, <tt>l</tt>, <tt>F</tt>, <tt>f</tt>,
        <tt>D</tt>, and <tt>d</tt> are not part of the ÜBER grammar.
      </t>
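      <t>
        The following non-normative sketch gives one literal for each numeric
        form in the grammar above. The member names are invented for
        illustration.
      </t>
      <sourcecode markers="true"><![CDATA[
// decimal integers, with and without underscores and sign
million  = 1_000_000
negative = -42
// hexadecimal integer (decimal 65535)
mask     = 0xFF_FF
// explicit-prefixed and legacy octal (both decimal 493)
mode     = 0o755
legacy   = 0755
// binary integer (decimal 165)
flags    = 0b1010_0101
// decimal floating-point, including the leading-dot form
avogadro = 6.022e23
half     = .5
// hexadecimal floating-point (decimal 15.5)
hexfloat = 0x1.fp3
// special floating-point keywords
unknown  = NaN
floor    = -Infinity
]]></sourcecode>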
      <section anchor="numeric-interpretation">
        <name>Numeric Interpretation and Promotion</name>
        <t>
          The grammar places no fixed upper bound on the length of integer
          digit sequences or decimal digit sequences. Accordingly, ÜBER
          syntactically permits arbitrarily large integers and arbitrarily
          precise decimal literals.
        </t>
        <t>
          An implementation that maps numeric literals into host-language
          numeric types SHOULD preserve the value of the literal as faithfully
          as possible. This specification does not require any particular set
          of runtime numeric classes, so the concrete representation of values
          outside common fixed-width ranges or common floating-point precision
          is implementation-defined. A common strategy is automatic fitting and
          promotion of unsuffixed literals based on magnitude and required
          precision.
        </t>
        <ul spacing="normal">
          <li>
            Unsuffixed integer literals are promoted from smaller fixed-width
            integer types to larger ones and then, when available, to an
            arbitrary-precision integer type or other implementation-defined
            exact representation when necessary.
          </li>
          <li>
            Unsuffixed floating-point literals are promoted to an
            implementation-defined higher-precision or exact decimal
            representation when they cannot be represented within the range or
            precision of the implementation's ordinary binary floating-point
            type.
          </li>
          <li>
            Implementations are not required to expose distinct runtime
            classes for every numeric magnitude. Typed-access APIs or
            equivalent implementation mechanisms MAY provide additional
            control over requested numeric categories.
          </li>
        </ul>
        <t>
          The Java reference implementation, for example, promotes unsuffixed
          integers through <tt>Integer</tt> and <tt>Long</tt> before using
          <tt>BigInteger</tt>. For decimal floating-point literals, it selects
          <tt>Float</tt>, <tt>Double</tt>, or <tt>BigDecimal</tt> based on the
          authored mantissa precision together with range. For hexadecimal
          floating-point literals, it selects <tt>Float</tt> or
          <tt>Double</tt>. Other implementations MAY choose different
          concrete numeric classes while preserving the same grammar.
        </t>
        <t>
          For example, a very large integer such as
          <tt>999999999999999999999999999999</tt> remains grammatically valid,
          and an unsuffixed literal such as <tt>1e400</tt> remains
          grammatically valid even when it exceeds the range of common binary
          floating-point types.
        </t>
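        <t>
          As a non-normative sketch, the comment lines below note how the
          Java reference implementation described above would be expected to
          classify each literal. Other implementations may choose
          differently, and the member names are invented for illustration.
        </t>
        <sourcecode markers="true"><![CDATA[
// fits in 32 bits: Integer
small = 42
// exceeds 32 bits but fits in 64 bits: Long
wide  = 3000000000
// exceeds 64 bits: BigInteger
huge  = 999999999999999999999999999999
// exceeds the range of common binary floating-point types;
// the concrete representation is implementation-defined
big   = 1e400
]]></sourcecode>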
      </section>
    </section>

    <section anchor="directives">
      <name>Directive Facility</name>
      <t>
        A directive is a top-level statement that begins with the at-sign
        character. The directive grammar is part of this specification because
        it affects the accepted top-level syntax of the format.
      </t>
      <t>
        This document does not define any directive names or directive effects.
        Implementations MAY define directive vocabularies, and future
        specifications MAY standardize such vocabularies, but those semantics
        are outside the scope of this document.
      </t>
      <figure anchor="fig-directive-abnf">
        <name>Directive Grammar</name>
        <sourcecode markers="true" type="abnf"><![CDATA[
directive      = "@"
                 [ inline-space ]
                 directive-name
                 1*inline-space
                 value

directive-name = 1*LOWALPHA
LOWALPHA       = %x61-7A
]]></sourcecode>
      </figure>
      <t>
        Because the directive payload is a normal <tt>value</tt>, the grammar
        permits any value form after a directive name, including structured
        values such as arrays or objects.
      </t>
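      <t>
        The following non-normative sketch shows directives with a string
        payload and an object payload, interleaved with an ordinary member.
        The directive names <tt>include</tt> and <tt>limits</tt> are
        hypothetical; this document defines no directive vocabulary.
      </t>
      <sourcecode markers="true"><![CDATA[
@include "common.uber"
name = demo
@limits { max = 10 }
]]></sourcecode>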
    </section>

    <section anchor="json-compatibility">
      <name>Compatibility with JSON and Additional ÜBER Features</name>
      <t>
        All valid JSON texts as defined by <xref target="RFC8259"/> are valid
        ÜBER texts. An implementation or
        protocol that wishes to remain within the JSON subset can do so by
        using explicit root objects or arrays, comma-separated collections,
        explicit colon separators, double-quoted strings, decimal numbers, and
        the JSON literals <tt>true</tt>, <tt>false</tt>, and <tt>null</tt>.
      </t>
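      <t>
        As a non-normative sketch, the text below stays within the JSON
        subset described above, so it is both a JSON text and an ÜBER
        profile. The member names are invented for illustration.
      </t>
      <sourcecode markers="true"><![CDATA[
{
  "name": "demo",
  "ports": [80, 443],
  "debug": true,
  "owner": null
}
]]></sourcecode>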
      <t>
        ÜBER additionally defines the following grammar features, none of which
        are available in JSON:
      </t>
      <ul spacing="normal">
        <li>comments treated as whitespace,</li>
        <li>implicit top-level objects,</li>
        <li>optional commas in objects and arrays,</li>
        <li>whitespace as a member separator,</li>
        <li>single-quoted strings, unquoted strings, and text blocks,</li>
        <li>dotted member names,</li>
        <li>valued members that carry both a scalar and child members,</li>
        <li>non-decimal integer syntaxes, hexadecimal floating-point syntax, leading-dot decimal floats, and underscore separators,</li>
        <li>bare <tt>NaN</tt> and <tt>Infinity</tt> numeric tokens,</li>
        <li>additional boolean keywords such as <tt>yes</tt> and <tt>on</tt>, and</li>
        <li>top-level directive syntax.</li>
      </ul>
      <t>
        These extensions improve authoring ergonomics while preserving a
        deterministic parse order and a grammar that remains tractable for
        implementers.
      </t>
    </section>

    <section anchor="character-encoding">
      <name>Character Encoding</name>
      <t>
        For interchange between systems that are not part of a closed
        ecosystem, an ÜBER text MUST be encoded using UTF-8.
      </t>
      <t>
        Implementations MAY accept other encodings when those encodings are
        selected out of band, by local policy, or by an explicit API
        parameter. This specification does not define an in-band charset
        declaration.
      </t>
      <t>
        Generators intended for interoperable exchange MUST emit UTF-8 when
        producing octets.
      </t>
    </section>

    <section anchor="unicode-and-string-comparison">
      <name>Unicode and String Comparison</name>
      <t>
        Strings and member-name atoms denote Unicode character sequences after
        quote handling and escape processing have been applied.
      </t>
      <t>
        Implementations comparing strings or member names for semantic equality
        MUST compare the resulting character sequences, not the original source
        spellings. For example, <tt>"a"</tt> and <tt>"\u0061"</tt> compare
        equal, and <tt>"\\"</tt> and <tt>"\u005C"</tt> compare equal.
      </t>
      <t>
        Single-quoted strings do not process escapes. Accordingly,
        <tt>'\u0061'</tt> is six literal characters, not the single character
        <tt>a</tt>.
      </t>
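      <t>
        The comparison rules above can be illustrated with a small sketch. The
        member names are hypothetical and chosen only for this example.
      </t>
      <sourcecode markers="true"><![CDATA[
{
  plain   : "a"        # one character: a
  escaped : "\u0061"   # one character: a; equal to plain
  literal : '\u0061'   # six characters; not equal to plain
}
]]></sourcecode>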
      <t>
        This specification does not require Unicode normalization. Distinct
        Unicode character sequences remain distinct unless another
        specification defines additional normalization rules.
      </t>
    </section>

    <section anchor="parsers">
      <name>Parsers</name>
      <t>
        A parser MUST accept every text that conforms to the grammar in this
        document.
      </t>
      <t>
        A parser MAY accept extensions in addition to ÜBER. Such extensions are
        outside this specification. An implementation that accepts extensions
        SHOULD make it possible to disable them when strict conformance is
        required.
      </t>
      <t>
        Parsers MAY set limits on input size, nesting depth, collection size,
        numeric length, comment length, text-block length, and other resource
        dimensions. A parser MAY reject texts that exceed such limits.
      </t>
      <t>
        Because directives are specified here only as syntax, a parser MAY
        recognize directive-shaped input yet reject or ignore particular
        directive names at a higher semantic layer.
      </t>
    </section>

    <section anchor="generators">
      <name>Generators</name>
      <t>
        A generator MUST emit texts that conform to this grammar.
      </t>
      <t>
        When more than one surface spelling is available for the same data, a
        generator MAY choose any conforming spelling. For maximum
        interoperability, generators SHOULD prefer explicit root objects,
        explicit separators, comma-separated collections, and UTF-8 octet
        output.
      </t>
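      <t>
        As a hedged illustration, the two texts below are intended to carry
        the same data. The first uses permissive spellings that the grammar
        allows; the second uses the spellings recommended above for
        interoperability. The member names are hypothetical, and the sketch
        assumes that an unquoted string and a double-quoted string with the
        same characters denote the same value.
      </t>
      <sourcecode markers="true"><![CDATA[
name example
tags [ a b ]
retries = 3
]]></sourcecode>
      <sourcecode markers="true"><![CDATA[
{
  "name"    : "example",
  "tags"    : [ "a", "b" ],
  "retries" : 3
}
]]></sourcecode>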
      <t>
        Generators intended for recipients that only understand the JSON subset
        SHOULD avoid comments, directives, implicit top-level objects,
        single-quoted strings, unquoted strings, text blocks, non-decimal
        numbers, non-JSON boolean keywords, dotted-name shorthand, and valued
        members.
      </t>
    </section>

    <section anchor="interoperability">
      <name>Interoperability Considerations</name>
      <t>
        ÜBER permits several syntactic conveniences that can reduce
        interoperability when documents are exchanged with implementations that
        support only a narrower subset.
      </t>
      <t>
        In particular, repeated effective member paths, mixing dotted-name
        shorthand with explicit nested objects, empty name segments, valued
        members, directive usage, non-decimal numbers, and additional boolean
        keywords can all reduce interoperability unless another specification
        defines their use more narrowly.
      </t>
      <t>
        Different syntactic spellings can denote the same effective tree. For
        example, dotted-name shorthand and explicitly nested objects can encode
        the same hierarchical result. Specifications that require stable
        round-tripping SHOULD define a canonical surface form.
      </t>
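      <t>
        For instance, the two texts sketched below are different surface
        spellings that are expected to denote the same effective tree,
        assuming dotted names expand to nested members. The member names are
        illustrative only.
      </t>
      <sourcecode markers="true"><![CDATA[
server.host : "127.0.0.1"
server.port : 8080
]]></sourcecode>
      <sourcecode markers="true"><![CDATA[
server {
  host : "127.0.0.1"
  port : 8080
}
]]></sourcecode>
      <t>
        A canonical surface form would select one of these spellings for
        output so that a parse followed by a re-emit yields stable text.
      </t>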
      <t>
        Valued members are part of the ÜBER grammar but do not map directly to
        ordinary JSON objects. Protocols that rely on plain JSON object models
        SHOULD avoid valued members.
      </t>
      <t>
        For maximum interoperability, documents SHOULD avoid duplicate
        effective member paths, empty name segments, directives, valued
        members, and non-JSON literal forms unless those features are required
        by the application protocol.
      </t>
    </section>

    <section anchor="media-type-registration">
      <name>Media Type Registration</name>
      <t>
        This document requests registration of the media type
        <tt>application/uber</tt>.
      </t>
      <ul spacing="normal">
        <li>Type name: application</li>
        <li>Subtype name: uber</li>
        <li>Required parameters: none</li>
        <li>Optional parameters: none</li>
        <li>Encoding considerations: binary</li>
        <li>Security considerations: see <xref target="security-considerations"/></li>
        <li>Interoperability considerations: see <xref target="interoperability"/></li>
        <li>Published specification: this document</li>
        <li>Applications that use this media type: configuration and data-interchange systems that consume ÜBER texts</li>
        <li>Fragment identifier considerations: none defined by this document</li>
        <li>File extension(s): <tt>.uber</tt></li>
        <li>Person and email address to contact for further information: Curtis Allen Smith, <tt>curtis.allen.smith@gmail.com</tt></li>
        <li>Intended usage: COMMON</li>
        <li>Restrictions on usage: none</li>
        <li>Author: Curtis Allen Smith</li>
        <li>Change controller: IETF</li>
      </ul>
    </section>

    <section anchor="conformance">
      <name>Conformance Requirements</name>
      <t>
        A conforming ÜBER parser:
      </t>
      <ul spacing="normal">
        <li>MUST accept all grammar productions defined in this document.</li>
        <li>
          MUST treat comments as whitespace in every grammar position where
          whitespace is permitted.
        </li>
        <li>
          MUST preserve deterministic bare-token scalar recognition after the
          prefixed forms have been excluded.
        </li>
        <li>
          MUST accept both explicit-object and statement-sequence top-level
          forms.
        </li>
        <li>
          MUST recognize directive-shaped top-level syntax as defined by the
          grammar.
        </li>
        <li>
          MUST compare strings and member-name atoms using the resulting
          character sequence after escape processing.
        </li>
        <li>
          MUST accept UTF-8 encoded input for interoperable octet exchange.
        </li>
      </ul>
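      <t>
        As an illustration of the comment requirement in the list above, the
        sketch below places block comments in positions where the grammar
        permits whitespace: before a member, between the name and the
        separator, and between the separator and the value. Under that
        requirement it is expected to parse the same as
        <tt>port : 8080</tt>. The member name is hypothetical.
      </t>
      <sourcecode markers="true"><![CDATA[
/* leading */
port /* after name */ : /* after separator */ 8080
]]></sourcecode>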
      <t>
        A conforming specification of directive semantics MUST define directive
        names and their processing behavior separately from this grammar
        document.
      </t>
    </section>

    <section anchor="examples">
      <name>Grammar Examples</name>
      <t>
        The following examples illustrate the grammar in more detail. They are
        examples of syntax and structure only; they do not standardize any
        directive semantics or other implementation-defined runtime behavior.
      </t>

      <section anchor="example-json-subset">
        <name>JSON-Compatible Subset</name>
        <figure anchor="fig-example-json-subset">
          <name>Pure JSON Form</name>
          <sourcecode markers="true"><![CDATA[
{
  "server"    : {
    "host"    : "127.0.0.1",
    "port"    : 8080,
    "enabled" : true
  },
  "paths": [ "/srv/app", "/srv/log" ]
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-human-oriented">
        <name>Human-Oriented Implicit Object</name>
        <figure anchor="fig-example-human-oriented">
          <name>Implicit Top-Level Object with Mixed Separators</name>
          <sourcecode markers="true"><![CDATA[
// host and port use different separator forms
server.host : "127.0.0.1"
server.port = 8080
enabled       yes

# commas remain optional
paths [
  /srv/app
  /srv/log,
  /srv/cache
]
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-comments-and-commas">
        <name>Comments and Optional Commas</name>
        <figure anchor="fig-example-comments-and-commas">
          <name>Collections with Comments and Mixed Delimiters</name>
          <sourcecode markers="true"><![CDATA[
{
  users: [
    alice
    bob,   // comma allowed but not required
    carol
  ]

  retry-count : 3
  timeout-ms  = 5000
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-separators">
        <name>Separator Variants</name>
        <figure anchor="fig-example-separators">
          <name>Separator Runs with Equivalent Grammar Meaning</name>
          <sourcecode markers="true"><![CDATA[
alpha       1
beta    :   2
gamma   =   3
delta   :=  4
epsilon ::  5
zeta    ==  6
eta     ::= 7
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-names">
        <name>Member Names and Dotted Composition</name>
        <figure anchor="fig-example-names">
          <name>Quoted, Unquoted, and Empty-Segment Names</name>
          <sourcecode markers="true"><![CDATA[
{
  simple.name           : 1
  "quoted.segment".name : 2
  'literal.dot.name'    : 3
  escaped\.dot.name     : 4
  .leading.empty        : 5
  trailing.empty.       : 6
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-valued-members">
        <name>Valued Members</name>
        <figure anchor="fig-example-valued-member">
          <name>Scalar Value and Nested Members on the Same Key</name>
          <sourcecode markers="true"><![CDATA[
entry: scalar {
  child: 1
  nested.flag: on
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-strings">
        <name>String Forms and Escapes</name>
        <figure anchor="fig-example-strings">
          <name>Double-Quoted, Single-Quoted, Text-Block, and Unquoted Strings</name>
          <sourcecode markers="true"><![CDATA[
{
  dq : "line\nbreak and escaped \{ braces \}"
  sq : 'backslash sequences stay literal: \n \u0041'
  block
    """
      multi-line text block
      with "quotes" and embedded line breaks
    """
  uq : bareword
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-numbers">
        <name>Numeric Forms</name>
        <figure anchor="fig-example-numbers">
          <name>Decimal, Hexadecimal, Octal, Binary, Floating-Point, and Special Values</name>
          <sourcecode markers="true"><![CDATA[
{
  decimal       = 1_000_000
  hexadecimal   = 0xFF_EC_DE_5E
  octal         = 0755
  octal-alt     = 0o755
  binary        = 0b1010_0110
  leading-dot   = .5
  scientific    = 6.022e23
  hex-float     = 0x1.fp3
  wider-int     = 3_000_000_000
  big-integer   = 999999999999999999999999999999
  big-decimal   = 1e400
  not-a-number  = NaN
  infinity      = -Infinity
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-directives">
        <name>Directive Shape</name>
        <figure anchor="fig-example-directive">
          <name>Directive Syntax Only</name>
          <sourcecode markers="true"><![CDATA[
@import imports/user.profile  # implementation-defined semantics
@ example {
  payload : true,
  note    : "semantics are implementation-defined"
}
]]></sourcecode>
        </figure>
      </section>

      <section anchor="example-composite">
        <name>Composite Example</name>
        <figure anchor="fig-example-composite">
          <name>Representative ÜBER Document</name>
          <sourcecode markers="true"><![CDATA[
# human-oriented top-level form
app.name    : "Example Service"
app.version : 1.2.0
app.enabled   yes

server {
  host   : "127.0.0.1"
  port   = 8080
  banner : """
    Example Service
    ready for requests
    """
}

paths.static /srv/www
paths.logs   /srv/log

limits {
  retries    : 3
  backoff-ms : 1_500
  mask       : 0xFF00
}

feature: on {
  child.flag: on
}

@example [alpha beta gamma]
]]></sourcecode>
        </figure>
      </section>
    </section>

    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>
        This document defines only grammar. Even so, parsers for the format are
        exposed to the usual risks of processing nested and potentially large
        textual inputs. Implementations SHOULD apply reasonable limits to input
        size, nesting depth, string length, numeric length, and comment length
        in order to mitigate resource-exhaustion attacks.
      </t>
      <t>
        Implementations that define directive semantics should additionally
        consider the security implications of those semantics. Such semantics
        are not specified here.
      </t>
    </section>

    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>
        IANA is requested to register the media type
        <tt>application/uber</tt> as described in
        <xref target="media-type-registration"/>.
      </t>
    </section>
  </middle>
  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <reference anchor="BCP14" target="https://www.rfc-editor.org/info/bcp14">
          <front>
            <title>Best Current Practice 14</title>
            <author>
              <organization>Internet Engineering Task Force</organization>
            </author>
            <date month="May" year="2017"/>
          </front>
          <seriesInfo name="BCP" value="14"/>
        </reference>
        <reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="Scott Bradner" initials="S." surname="Bradner"/>
            <date month="March" year="1997"/>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
        <reference anchor="RFC5234" target="https://www.rfc-editor.org/info/rfc5234">
          <front>
            <title>Augmented BNF for Syntax Specifications: ABNF</title>
            <author fullname="D. Crocker" initials="D." surname="Crocker"/>
            <author fullname="P. Overell" initials="P." surname="Overell"/>
            <date month="January" year="2008"/>
          </front>
          <seriesInfo name="STD" value="68"/>
          <seriesInfo name="RFC" value="5234"/>
          <seriesInfo name="DOI" value="10.17487/RFC5234"/>
        </reference>
        <reference anchor="RFC7405" target="https://www.rfc-editor.org/info/rfc7405">
          <front>
            <title>Case-Sensitive String Support in ABNF</title>
            <author fullname="P. Kyzivat" initials="P." surname="Kyzivat"/>
            <date month="December" year="2014"/>
          </front>
          <seriesInfo name="RFC" value="7405"/>
          <seriesInfo name="DOI" value="10.17487/RFC7405"/>
        </reference>
        <reference anchor="RFC8174" target="https://www.rfc-editor.org/info/rfc8174">
          <front>
            <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
            <author fullname="B. Leiba" initials="B." surname="Leiba"/>
            <date month="May" year="2017"/>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="8174"/>
          <seriesInfo name="DOI" value="10.17487/RFC8174"/>
        </reference>
      </references>
      <references>
        <name>Informative References</name>
        <reference anchor="RFC8259" target="https://www.rfc-editor.org/info/rfc8259">
          <front>
            <title>The JavaScript Object Notation (JSON) Data Interchange Format</title>
            <author fullname="T. Bray" initials="T." surname="Bray"/>
            <date month="December" year="2017"/>
          </front>
          <seriesInfo name="RFC" value="8259"/>
          <seriesInfo name="DOI" value="10.17487/RFC8259"/>
        </reference>
      </references>
    </references>
    <section anchor="appendix-complete-abnf">
      <name>Complete ABNF Grammar</name>
      <t>
        This appendix consolidates the full grammar into one ABNF block. The
        production inventory and rule names are the ones defined by this
        document.
      </t>
      <sourcecode markers="true" type="abnf"><![CDATA[
; Lexical structure

ws                         = whitespace / comment
whitespace                 = inline-space / line-terminator
inline-space               = SP / HTAB / %x0B / %x0C
line-terminator            = CR / LF / CRLF
line-end                   = line-terminator / eof
eof                        = <end of input>
LF                         = %x0A
CR                         = %x0D
control-character          = %x00-1F

comment                    = single-line-comment / block-comment
single-line-comment        = single-line-marker
                             [ comment-chars ]
                             line-end
single-line-marker         = "//" / "#" / "!"
block-comment              = "/*" [ block-comment-chars ] "*/"
comment-chars              =
                             <characters not containing CR or LF>
block-comment-chars        =
                             <any character sequence
                             not containing "*/">

digit                      = DIGIT
onenine                    = %x31-39
octdigit                   = %x30-37
bindigit                   = %x30 / %x31
hexdigit                   = HEXDIG

digit-or-underscore        = digit / "_"
hex-digit-or-underscore    = hexdigit / "_"
octal-digit-or-underscore  = octdigit / "_"
binary-digit-or-underscore = bindigit / "_"

decimal-digits             = digit-or-underscore
                           / decimal-digits digit-or-underscore
hex-digits                 = hex-digit-or-underscore
                           / hex-digits hex-digit-or-underscore
octal-digits               = octal-digit-or-underscore
                           / octal-digits octal-digit-or-underscore
binary-digits              = binary-digit-or-underscore
                           / binary-digits binary-digit-or-underscore

true-literal               = %s"true" / %s"yes" / %s"on"
false-literal              = %s"false" / %s"no" / %s"off"
null-literal               = %s"null"

; Syntactic structure

; Top-level profile

profile                    = *ws object *ws
                           / *ws profile-statements *ws
profile-statement          = member / directive
profile-statements         = profile-statement
                           / profile-statements
                             [ "," ]
                             profile-statement
directive                  = "@"
                             [ inline-space ]
                             directive-name
                             1*inline-space
                             value
directive-name             = 1*LOWALPHA
LOWALPHA                   = %x61-7A

; Values

element                    = *ws value *ws
value                      = text-block
                           / object
                           / array
                           / double-quoted-string
                           / single-quoted-string
                           / number
                           / boolean
                           / null-literal
                           / unquoted-string
scalar-value               = text-block
                           / array
                           / double-quoted-string
                           / single-quoted-string
                           / number
                           / boolean
                           / null-literal
                           / unquoted-string
boolean                    = true-literal / false-literal

; Objects and members

object                     = "{" [ members ] "}"
members                    = member
                           / members [ "," ] member
member                     = *ws dotted-name *ws separator
                             [ scalar-element ] *ws [ object ]
scalar-element             = *ws scalar-value *ws

dotted-name                = name-atom
                           / dotted-name *ws "." *ws name-atom
name-atom                  = single-quoted-string
                           / dq-name-atom
                           / uq-name
dq-name-atom               = DQUOTE dq-name-content DQUOTE
dq-name-content            = dq-name-segment
                           / dq-name-content "." dq-name-segment
dq-name-segment            = *dq-name-char
dq-name-char               =
                             <any character except DQUOTE, "\",
                             ".", or control-character>
                           / "\" escape-sequence
uq-name                    = *uq-name-char
uq-name-char               =
                             <any character except whitespace, ",",
                             "{", "}", "[", "]", ":", "=",
                             DQUOTE, "'", "\", or ".">
                           / "\" escape-sequence

separator                  = explicit-separator / ws
explicit-separator         = separator-char
                           / explicit-separator separator-char
separator-char             = ":" / "="

; Arrays

array                      = "[" [ elements ] "]"
elements                   = element
                           / elements [ "," ] element

; Strings

text-block                 = 3DQUOTE
                             line-terminator
                             [ text-block-chars ]
                             3DQUOTE
text-block-chars           = text-block-char
                           / text-block-chars text-block-char
text-block-char            =
                             <any character except "\",
                             control-character,
                             or the closing three-DQUOTE delimiter>
                           / line-terminator
                           / "\" escape-sequence

double-quoted-string       = DQUOTE [ dq-string-chars ] DQUOTE
dq-string-chars            = dq-string-char
                           / dq-string-chars dq-string-char
dq-string-char             =
                             <any character except DQUOTE, "\",
                             or control-character>
                           / "\" escape-sequence

quoted-string              = double-quoted-string
                           / single-quoted-string
single-quoted-string       = "'" [ sq-string-chars ] "'"
sq-string-chars            = sq-string-char
                           / sq-string-chars sq-string-char
sq-string-char             =
                             <any character except "'"
                             or control-character>

unquoted-string            = uq-string-chars
uq-string-chars            = uq-string-char
                           / uq-string-chars uq-string-char
uq-string-char             =
                             <any character except whitespace,
                             control-character, ",", "{", "}",
                             "[", "]", ":", "=", DQUOTE, "'",
                             or "\">
                           / "\" escape-sequence

ws-escape                  = SP
escape-sequence            = simple-escape
                           / unicode-escape
                           / hex-escape
                           / octal-escape
                           / ws-escape
                           / ","
                           / "{"
                           / "}"
                           / "["
                           / "]"
                           / ":"
                           / "="
simple-escape              = %s"a" / %s"b" / %s"e"
                           / %s"f" / %s"n" / %s"r"
                           / %s"s" / %s"t" / %s"v"
                           / "\" / "'" / DQUOTE / "/"
                           / %s"0" / "." / "#" / "!" / "@"
unicode-escape             = %s"u" 4hexdigit
                           / %s"u" 6hexdigit
                           / %s"u" 8hexdigit
                           / %s"u{" braced-unicode-digits "}"
braced-unicode-digits      = hexdigit
                           / braced-unicode-digits
                             hex-digit-or-underscore
hex-escape                 = %s"x" hexdigit
                           / hex-escape hexdigit
octal-escape               = octdigit
                           / octdigit octdigit
                           / octdigit octdigit octdigit

; Numbers

number                     = special-value / numeric-literal
special-value              = [ sign ] special-keyword
special-keyword            = %s"NaN" / %s"Infinity"
numeric-literal            = integer-literal / floating-point-literal
integer-literal            = [ sign ] integer-value
integer-value              = decimal-integer
                           / hexadecimal-integer
                           / octal-integer
                           / binary-integer
decimal-integer            = onenine [ decimal-digits ]
                           / "0"
hexadecimal-integer        = "0" hex-indicator hex-digits
octal-integer              = "0" octal-digits
                           / "0" octal-indicator octal-digits
binary-integer             = "0" binary-indicator binary-digits
hex-indicator              = %s"x" / %s"X"
binary-indicator           = %s"b" / %s"B"
octal-indicator            = %s"o" / %s"O"
floating-point-literal     = [ sign ] decimal-floating-point-literal
                           / [ sign ]
                             hexadecimal-floating-point-literal
decimal-floating-point-literal
                           = decimal-digits decimal-float-tail
                           / "." decimal-digits [ exponent ]
hexadecimal-floating-point-literal
                           = hex-significand binary-exponent
hex-significand            = "0" hex-indicator hex-digits
                           / "0" hex-indicator
                             hex-digits "." [ hex-digits ]
                           / "0" hex-indicator "." hex-digits
decimal-float-tail         = "." [ decimal-digits ] [ exponent ]
                           / exponent
exponent                   = exponent-indicator
                             [ sign ]
                             decimal-digits
exponent-indicator         = %s"e" / %s"E"
binary-exponent            = binary-exponent-indicator
                             [ sign ]
                             decimal-digits
binary-exponent-indicator  = %s"p" / %s"P"
sign                       = "+" / "-"
]]></sourcecode>
    </section>
  </back>
</rfc>
