<?xml version="1.0" encoding="UTF-8"?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude"
     category="std"
     consensus="true"
     docName="draft-agent-gw-01"
     ipr="trust200902"
     submissionType="IETF"
     version="3">

  <front>
    <title>Agent Communication Gateway for Semantic Routing and Working Memory</title>

    <seriesInfo name="Internet-Draft" value="draft-agent-gw-01"/>

    <author initials="X." surname="Xie" fullname="Xiaohui Xie">
      <organization>Tsinghua University</organization>
      <address>
        <email>xiexiaohui@tsinghua.edu.cn</email>
      </address>
    </author>

    <author initials="Z." surname="Wang" fullname="Zian Wang">
      <organization>Beijing University of Posts and Telecommunications</organization>
      <address>
        <email>zianwang@bupt.edu.cn</email>
      </address>
    </author>

    <author initials="T." surname="Hu" fullname="Tianshuo Hu">
      <organization>Tsinghua University</organization>
      <address>
        <email>huts22@mails.tsinghua.edu.cn</email>
      </address>
    </author>

    <author initials="Y." surname="Cui" fullname="Yong Cui">
      <organization>Tsinghua University</organization>
      <address>
        <email>cuiyong@tsinghua.edu.cn</email>
      </address>
    </author>

    <date day="1" month="March" year="2026"/>
    <workgroup>Agent-GW</workgroup>

    <abstract>
      <t>
        This document presents an architectural framework for an Agent Communication Gateway (Agent-GW),
        designed to support large-scale, heterogeneous, and dynamic multi-agent collaboration across
        administrative and protocol boundaries.
      </t>
      <t>
        As agents evolve from isolated entities to a collaborative digital workforce, the infrastructure must
        transition from rigid, endpoint-based connectivity to intent-based interaction. This draft proposes Agent-GW
        as an infrastructure hub that provides native primitives for Semantic Routing (dispatching tasks by intent
        and capability), Working Memory (shared structured context across multi-step workflows), automated protocol
        adaptation (normalizing heterogeneous interfaces into a unified agent-facing protocol), oracle-free agent
        evaluation, and collaborative inference acceleration via a Knowledge Delivery Network (KDN).
      </t>
    </abstract>
  </front>

  <middle>

    <section anchor="introduction">
      <name>Introduction</name>
      <t>
        The rapid advancement of Large Language Models (LLMs) has catalyzed the emergence of an "Internet of Agents",
        where autonomous software entities and tool-like services interconnect to form collaborative workflows.
        Unlike traditional microservices, agents vary in autonomy and reasoning capability and expose diverse
        interface standards. Early deployments were often siloed within proprietary frameworks, limiting cross-domain
        collaboration.
      </t>
      <t>
        As these systems scale, the bottleneck shifts from basic connectivity to context management and efficient
        orchestration. Delivering the right context to the right agent at the right time, while controlling the cost
        of inference, becomes an infrastructure challenge. Existing gateways, optimized for static endpoints and
        stateless forwarding, lack the semantic awareness needed to interpret intents or manage multi-step task lifecycles.
      </t>
      <t>
        This document introduces the Agent Communication Gateway (Agent-GW), situated between agents and external tools
        or services. Agent-GW elevates the network from a passive transport layer to an active semantic intermediary by
        introducing two core primitives: Semantic Routing (intent/capability-based dispatch) and Working Memory
        (shared, incrementally updated context). It further defines protocol adaptation, evaluation, observability,
        and KDN-based inference acceleration.
      </t>
    </section>

    <section anchor="conventions">
      <name>Conventions used in this document</name>
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",
        "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14
        <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.
      </t>
    </section>

    <section anchor="terminology">
      <name>Terminology</name>
      <t>The following terms are defined in this draft:</t>

      <dl spacing="normal">
        <dt>Agent-GW (Agent Communication Gateway)</dt>
        <dd>
          An infrastructure component coordinating multi-agent communication, responsible for protocol adaptation,
          semantic routing, and context management.
        </dd>

        <dt>Internal Semantic Domain (ISD)</dt>
        <dd>
          The internal network domain, typically a LAN or on-prem cluster, where agent-to-agent messages follow
          standardized or controlled protocols (e.g., A2A, MCP, or natural language over a controlled channel).
          The ISD is considered within a primary trust boundary.
        </dd>

        <dt>External Heterogeneous Ecosystem (EHE)</dt>
        <dd>
          External networks and services outside the LAN trust boundary, often with diverse and unstructured protocols
          and varying security postures (e.g., public SaaS APIs, Internet services, third-party tools).
        </dd>

        <dt>Semantic Routing</dt>
        <dd>
          Routing a request based on the semantic intent of a task and the capabilities/trust state of available agents,
          rather than static endpoint addresses.
        </dd>

        <dt>Working Memory</dt>
        <dd>
          A structured, temporary storage mechanism that maintains context and state across a multi-step or multi-turn
          workflow. It can be session-scoped and policy-controlled.
        </dd>

        <dt>KDN (Knowledge Delivery Network)</dt>
        <dd>
          A mechanism that treats inference artifacts (e.g., LLM KV caches) as reusable and distributable objects,
          enabling cooperative acceleration across agents or gateways.
        </dd>

        <dt>APA (Adaptive Protocol Adapter)</dt>
        <dd>
          An automated protocol adaptation function that discovers/infers external interface schemas and normalizes
          heterogeneous protocols (e.g., HTTP, gRPC, MQTT) into an internal standardized format.
        </dd>

        <dt>MCP (Model Context Protocol)</dt>
        <dd>
          An open protocol for connecting AI assistants/agents to tools and data sources, used here as an example
          of a normalized internal interaction format.
        </dd>

        <dt>A2A</dt>
        <dd>
          Agent-to-agent messaging format/protocol used inside the ISD. This draft treats A2A as a generic class of
          agent messaging and illustrates how Agent-GW routes and adapts it.
        </dd>

        <dt>Peer Agent-GW</dt>
        <dd>
          Another Agent-GW instance in a different node/site/domain. Peer Agent-GWs may synchronize selected state or
          inference artifacts subject to policy.
        </dd>
      </dl>
    </section>

    <section anchor="deployment-boundary">
      <name>Deployment Model and Trust Boundary</name>

      <t>
        Many deployments distinguish "internal" versus "external" entities by a network boundary aligned with a LAN.
        In this draft, the Internal Semantic Domain (ISD) refers to the internal LAN/on-prem cluster where agents and
        enterprise tools operate under shared governance, while the External Heterogeneous Ecosystem (EHE) refers to
        networks and services outside that boundary.
      </t>

      <t>
        Agent-GW logically sits at the intersection of these domains. It provides (1) semantic routing and state functions
        within the ISD, and (2) border adaptation functions for controlled egress/ingress across the trust boundary.
        Policies MAY restrict what context, memory, or inference artifacts can cross from ISD to EHE.
      </t>

      <t>
        The internal/external distinction is a deployment choice. The same Agent-GW architecture can be deployed as a
        pure internal hub (no external egress), a border gateway, or a hybrid topology with peer synchronization.
      </t>
    </section>

    <section anchor="requirements">
      <name>Network and Infrastructure Requirements</name>

      <t>
        Agent interactions are typically context-heavy, short-lived, and driven by high-level goals. To support this,
        the infrastructure SHOULD satisfy the following requirements:
      </t>

      <t><strong>Intent-Based Addressing:</strong> The infrastructure SHOULD support addressing based on capabilities
        and intent rather than topology.</t>

      <t><strong>Stateful Context Management:</strong> Agentic workflows often involve multi-step reasoning where context
        accumulates. The gateway MUST support policy-controlled state retention and retrieval.</t>

      <t><strong>Heterogeneous Interoperability:</strong> The ecosystem includes diverse protocols. The gateway SHOULD
        provide automated adaptation layers (e.g., APA) and normalization into standardized internal formats.</t>

      <t><strong>Dynamic Capability Discovery:</strong> The gateway SHOULD provide real-time capability discovery and
        health/status tracking for dispatch decisions.</t>

      <t><strong>Inference Efficiency:</strong> The gateway MAY cache and share inference artifacts such as KV caches
        (KDN) to reduce redundant computation and improve time to first token (TTFT).</t>

      <t><strong>Trust Boundary Enforcement:</strong> The gateway MUST enforce policies for data privacy, context
        leakage prevention, and capability spoofing mitigation, especially across ISD/EHE boundaries.</t>
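
      <t>
        As a non-normative illustration of the Stateful Context Management and Trust Boundary Enforcement
        requirements above, the following Python sketch shows one way a session-scoped store could combine
        retention (a TTL) with a per-entry privacy scope. The class name, the method names, the "ISD"/"EHE"
        scope labels as string values, and the default TTL are assumptions made for this example only; they are
        not defined by this document.
      </t>
      <sourcecode type="python"><![CDATA[
import time
from dataclasses import dataclass, field


@dataclass
class Entry:
    value: object
    scope: str          # "ISD" = internal only, "EHE" = may cross the boundary
    expires_at: float   # absolute expiry time, in seconds


@dataclass
class WorkingMemory:
    """Hypothetical session-scoped store with a TTL and a privacy scope."""
    ttl_s: float = 600.0
    _sessions: dict = field(default_factory=dict)

    def put(self, session_id, key, value, scope="ISD", now=None):
        now = time.time() if now is None else now
        session = self._sessions.setdefault(session_id, {})
        session[key] = Entry(value, scope, now + self.ttl_s)

    def get(self, session_id, key, requester_scope="ISD", now=None):
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id, {}).get(key)
        if entry is None or now >= entry.expires_at:
            return None   # missing or expired
        if entry.scope == "ISD" and requester_scope != "ISD":
            return None   # internal-only context is not released outside the ISD
        return entry.value


wm = WorkingMemory(ttl_s=600)
wm.put("s-123", "ctx", {"line": "B"}, scope="ISD", now=0)
internal_view = wm.get("s-123", "ctx", requester_scope="ISD", now=10)
external_view = wm.get("s-123", "ctx", requester_scope="EHE", now=10)
expired_view = wm.get("s-123", "ctx", requester_scope="ISD", now=601)
]]></sourcecode>
      <t>
        In this sketch the read path fails closed: an expired entry and an out-of-scope requester both receive
        no value, so a caller cannot distinguish "absent" from "not permitted".
      </t>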
    </section>

    <section anchor="architecture">
      <name>Architecture Overview</name>

      <t>
        This section describes the reference architecture of the Agent Communication Gateway (Agent-GW). Agent-GW functions as
        a semantic intermediary operating at the application and cognitive layers, with explicit separation between
        core semantic/state functions and border adaptation functions.
      </t>

      <section anchor="arch-entity-relationship">
        <name>Internal/External Entity Relationship</name>

        <t>
          <xref target="fig-isd-ehe"/> illustrates Agent-GW within a LAN-scoped Internal Semantic Domain (ISD) and its controlled interfaces to
          an External Heterogeneous Ecosystem (EHE). This figure is intentionally structured to highlight (a) internal
          agents and clients, (b) Agent-GW core state/routing functions, (c) border adaptation functions, and (d) southbound
          targets including legacy APIs, native agents, and peer gateways.
        </t>

        <figure anchor="fig-isd-ehe">
          <name>Agent-GW Entity Relationship across Internal Semantic Domain and External Ecosystem</name>

          <artwork type="ascii-art"><![CDATA[
..................................................................................
Internal Semantic Domain (Standardized A2A / MCP / NL)
| +----------------------+----------------------+
| | Client Agent (A1)    | User Interface (U1)  |
| +----------+-----------+-----------+----------+
|            | (A2A Msg)             | (A2A Msg)
|            v                       v
| +----------+-----------------------+-------------------------------------------+
| |    Agent Communication Gateway (Agent-GW)                                    |
| |                                                                              |
| |  [ Core State & Routing Functions ]                                          |
| |  +----------------+   +--------------------+   +---------------------------+ |
| |  | Capability Dir |   | Semantic Router    |   | Working Memory & KDN      | |
| |  | (Trust State)  |   | (Intent -> Target) |   | (Context / KV Cache)      | |
| |  +-------+--------+   +---------+----------+   +-----------+---------------+ |
| |          |                      |                          |                 |
| |  [ Border Adaptation Functions ]                                             |
| |  +--------------------+  +--------------------+  +--------------------------+|
| |  | Auto-Adapter (APA) |  | Native Passthrough |  | Sync & State Transfer    ||
| |  | (Protocol Trans.)  |  | (Direct Routing)   |  | (Knowledge Trans.)       ||
| |  +---------+----------+  +---------+----------+  +-----------+--------------+|
| +------------+-----------------------+-------------------------+---------------+
|              | (REST/RPC)            | (MCP/A2A)               | (A2A+KV Cache)
|              v                       v                         v
|        +-----------+          +-------------+          +---------------+
|        | Legacy    |          | Native      |          | Peer Agent-GW |
|        | APIs T1   |          | Agents T2   |          | Node N2       |
|        +-----------+          +-------------+          +---------------+
..................................................................................
External Heterogeneous Ecosystem (Unstructured / Diverse Protocols)
]]></artwork>
        </figure>

        <t>
          Operationally, messages originating from internal clients/agents enter Agent-GW via standardized internal
          formats (e.g., A2A or MCP). If a target resides in the EHE, Agent-GW invokes APA for protocol adaptation and
          applies egress policies to prevent unintended context leakage.
        </t>
      </section>

      <section anchor="arch-functional-planes">
        <name>Functional Planes</name>
        <t>
          Agent-GW can be described as four logical planes:
        </t>
        <ol spacing="normal">
          <li>Ingress/Access: protocol detection, authentication, sandboxing, normalization.</li>
          <li>Cognitive Orchestration: intent parsing, planning, semantic routing, dispatch, observability hooks.</li>
          <li>Knowledge &amp; State: working memory, experience/evolutionary memory, KDN cache and artifact management.</li>
          <li>Egress/Ecosystem Interface: drivers for legacy systems, native agents, physical-world bridges, and peer sync.</li>
        </ol>
      </section>
    </section>

    <section anchor="protocol-and-io">
      <name>Protocol Model and Input/Output Examples</name>

      <t>
        Agent-GW supports a mixed protocol environment. Within the ISD, interactions are RECOMMENDED to use standardized
        agent messaging (e.g., A2A or MCP). For southbound access to targets, Agent-GW MAY translate into external protocols
        such as REST/HTTP, gRPC, MQTT, OPC UA, ROS, or vendor-specific SDKs.
      </t>

      <t>
        The following subsections provide illustrative (non-normative) examples of message shapes and I/O mapping.
        These examples are intended to clarify how semantic routing, working memory, and adaptation interact.
      </t>

      <section anchor="proto-a2a-example">
        <name>Example: A2A Request Normalization</name>
        <t>Illustrative A2A message (ingress) that Agent-GW normalizes into an internal semantic request:</t>
        <artwork type="json"><![CDATA[
{
  "a2a_version": "1",
  "session_id": "s-123",
  "from": "agent:A1",
  "intent": "Inspect Assembly Line B",
  "constraints": {
    "latency_ms": 800,
    "privacy": "internal_only"
  },
  "context_ref": ["wm://s-123/ctx"]
}
]]></artwork>

        <t>Illustrative normalized semantic request inside Agent-GW (after parsing and policy checks):</t>
        <artwork type="json"><![CDATA[
{
  "session_id": "s-123",
  "intent": {
    "task": "inspect",
    "target": "assembly_line",
    "id": "B"
  },
  "routing_hints": {
    "privacy_scope": "ISD",
    "required_capabilities": ["iot_read", "robot_navigation"]
  },
  "context": {
    "working_memory_keys": ["ctx", "last_actions"],
    "kdn_cache_allowed": true
  }
}
]]></artwork>
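
        <t>
          The mapping between the two message shapes above can be sketched in Python as follows. This is a minimal,
          non-normative illustration: the regular expression, the CAPABILITIES and PRIVACY_SCOPES lookup tables, and
          the function name are assumptions made for the example, and a deployed gateway would more likely rely on a
          learned intent parser. The sketch covers only a subset of the fields (for instance, it derives
          working-memory keys solely from "context_ref" and omits "kdn_cache_allowed").
        </t>
        <sourcecode type="python"><![CDATA[
import re

# Hypothetical lookup tables, not defined by this document.
CAPABILITIES = {("inspect", "assembly_line"): ["iot_read", "robot_navigation"]}
PRIVACY_SCOPES = {"internal_only": "ISD", "public": "EHE"}


def normalize(msg):
    """Map an illustrative A2A message to the internal request shape."""
    # "<verb> <target words> <identifier>", e.g. "Inspect Assembly Line B"
    m = re.fullmatch(r"(\w+) ([\w ]+?) (\w+)", msg["intent"].strip())
    if m is None:
        raise ValueError("unparseable intent")
    task = m.group(1).lower()
    target = m.group(2).lower().replace(" ", "_")
    privacy = msg.get("constraints", {}).get("privacy", "internal_only")
    refs = msg.get("context_ref", [])
    return {
        "session_id": msg["session_id"],
        "intent": {"task": task, "target": target, "id": m.group(3)},
        "routing_hints": {
            # Unknown privacy labels fall back to the most restrictive scope.
            "privacy_scope": PRIVACY_SCOPES.get(privacy, "ISD"),
            "required_capabilities": CAPABILITIES.get((task, target), []),
        },
        "context": {
            # "wm://s-123/ctx" -> "ctx"
            "working_memory_keys": [r.rsplit("/", 1)[-1] for r in refs],
        },
    }


a2a_message = {
    "a2a_version": "1",
    "session_id": "s-123",
    "from": "agent:A1",
    "intent": "Inspect Assembly Line B",
    "constraints": {"latency_ms": 800, "privacy": "internal_only"},
    "context_ref": ["wm://s-123/ctx"],
}
request = normalize(a2a_message)
]]></sourcecode>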
      </section>

      <section anchor="proto-egress-example">
        <name>Example: Southbound Output Mapping</name>

        <t>
          When dispatching to a legacy IoT array, Agent-GW MAY translate a sub-task into MQTT or REST. When dispatching to
          an embodied agent, Agent-GW MAY translate into a ROS bridge. These mappings are policy-controlled and can be
          produced by APA or pre-registered drivers.
        </t>

        <t>Example REST payload for a legacy API target:</t>
        <artwork type="json"><![CDATA[
{
  "req": "temp_read",
  "loc": "Line B"
}
]]></artwork>

        <t>Example ROS command topics for an embodied agent target:</t>
        <artwork type="ascii-art"><![CDATA[
/cmd_vel
/navigate_to
]]></artwork>
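
        <t>
          One possible way to organize these southbound mappings is a registry of per-protocol drivers behind a policy
          check, sketched below in Python. The sub-task field names ("protocol", "action", "location"), the DRIVERS
          registry, and the allowed-protocol set are hypothetical and serve only to make the example self-contained;
          the output shapes follow the REST payload and ROS topic shown above.
        </t>
        <sourcecode type="python"><![CDATA[
def to_rest(subtask):
    """Legacy API driver: emit the compact REST body shown above."""
    return {"req": subtask["action"], "loc": subtask["location"]}


def to_ros(subtask):
    """ROS bridge driver: emit a topic name plus its arguments."""
    return {"topic": "/" + subtask["action"],
            "args": {"loc": subtask["location"]}}


# Hypothetical driver registry keyed by southbound protocol.
DRIVERS = {"REST": to_rest, "ROS": to_ros}


def map_southbound(subtask, allowed_protocols):
    protocol = subtask["protocol"]
    if protocol not in allowed_protocols:   # mappings are policy-controlled
        raise PermissionError("egress via " + protocol + " not permitted")
    if protocol not in DRIVERS:
        raise LookupError("no driver registered for " + protocol)
    return DRIVERS[protocol](subtask)


rest_payload = map_southbound(
    {"protocol": "REST", "action": "temp_read", "location": "Line B"},
    {"REST", "ROS"},
)
ros_command = map_southbound(
    {"protocol": "ROS", "action": "navigate_to", "location": "Line B"},
    {"REST", "ROS"},
)
]]></sourcecode>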
      </section>
    </section>

    <section anchor="infrastructure-functions">
      <name>Infrastructure Functions</name>

      <section anchor="agent-id-directory">
        <name>Agent Identification and Capability Directory</name>

        <t>
          This function establishes a root of trust for the agent network and mitigates capability spoofing. Agent-GW
          maintains a dynamic directory where entries represent active, verified agent states rather than static records.
        </t>

        <t>
          <strong>Cryptographic Identity:</strong> Participating agents SHOULD possess a cryptographic Agent ID (AID)
          bound to credentials (e.g., an X.509 certificate). An agent registers by submitting an AgentCard that binds
          identity to a capability descriptor (e.g., capability hash, policy tags).
        </t>

        <t>
          <strong>Capability Claim and Verification (CCV):</strong> To reduce malicious registration, Agent-GW MAY implement
          challenge-response verification based on metamorphic testing principles (semantic variants of a task) to evaluate
          functional consistency without requiring access to internal model weights.
        </t>

        <t>
          <strong>Semantic Heartbeat:</strong> To maintain freshness, Agent-GW MAY periodically verify application-layer
          (Layer 7) functional integrity, going beyond lower-layer keep-alives. Agents that fail these challenges MAY
          be quarantined or pruned from the directory.
        </t>
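
        <t>
          To make the AgentCard binding more concrete, the following Python sketch derives a capability hash from a
          canonical JSON encoding of a capability descriptor and checks a later presentation against it. This is an
          assumption-laden illustration: the AgentCard field names, the "sha256:" prefix, and the canonicalization
          (sorted keys, compact separators) are choices made for the example, and verification of the credential
          itself (for example, an X.509 signature check) is deliberately left out.
        </t>
        <sourcecode type="python"><![CDATA[
import hashlib
import json


def capability_hash(descriptor):
    """SHA-256 over a canonical JSON encoding of the descriptor."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_agent_card(aid, descriptor, policy_tags):
    """Hypothetical AgentCard: identity bound to a capability hash."""
    return {
        "aid": aid,
        "capability_hash": capability_hash(descriptor),
        "policy_tags": sorted(policy_tags),
    }


def card_matches(card, presented_descriptor):
    """True if the presented descriptor is the one that was registered."""
    return card["capability_hash"] == capability_hash(presented_descriptor)


card = make_agent_card(
    "aid:example-a1",
    {"skills": ["iot_read"], "version": 1},
    ["internal_only"],
)
# Key order does not matter because the encoding is canonical.
same = card_matches(card, {"version": 1, "skills": ["iot_read"]})
# Claiming an extra skill changes the hash and is rejected.
tampered = card_matches(
    card, {"version": 1, "skills": ["iot_read", "robot_navigation"]}
)
]]></sourcecode>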
      </section>

      <section anchor="protocol-adaptation">
        <name>Automated Protocol Adaptation and Interface Normalization (APA)</name>

        <t>
          As one of the border adaptation functions, APA normalizes heterogeneous external protocols (HTTP, MQTT, gRPC,
          proprietary SDKs) into an internal standardized request format (e.g., MCP or A2A). For poorly documented interfaces,
          APA MAY apply active probing to infer schemas and refine bindings with feedback loops.
        </t>
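
        <t>
          As a rough, non-normative sketch of schema inference from probing, the Python function below aggregates
          sample responses into a per-field summary of observed types and whether the field appeared in every sample.
          The sample data, the output layout, and the function name are assumptions for illustration; this document
          does not prescribe an inference algorithm, and real interfaces (nested objects, enumerations, error
          responses) would need considerably more care.
        </t>
        <sourcecode type="python"><![CDATA[
def infer_schema(samples):
    """Summarize observed field types across sample JSON objects."""
    seen = {}
    for sample in samples:
        for name, value in sample.items():
            entry = seen.setdefault(name, {"types": set(), "count": 0})
            entry["types"].add(type(value).__name__)
            entry["count"] += 1
    return {
        name: {
            "types": sorted(entry["types"]),
            # A field is treated as required only if every sample had it.
            "required": entry["count"] == len(samples),
        }
        for name, entry in seen.items()
    }


# Hypothetical responses collected by probing a poorly documented API.
probe_responses = [
    {"temp": 21.5, "loc": "Line B"},
    {"temp": 22, "loc": "Line B", "unit": "C"},
]
schema = infer_schema(probe_responses)
]]></sourcecode>
        <t>
          A feedback loop in this style could re-run the inference as new responses arrive and tighten or relax
          the binding whenever the summary changes.
        </t>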
      </section>

      <section anchor="agent-evaluation">
        <name>Infrastructure-Level Agent Evaluation and Compliance</name>

        <t>
          Agents are often black boxes. Agent-GW introduces infrastructure-level evaluation to estimate reliability and compliance
          without access to model weights. Using oracle-free metamorphic testing, Agent-GW generates semantic variants of tasks
          and evaluates response consistency. Results MAY contribute to a dynamic reliability score used in routing.
        </t>
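
        <t>
          The consistency check at the heart of oracle-free metamorphic testing can be sketched as follows. The Python
          code scores an agent by the fraction of variant pairs on which its answers agree, without knowing the
          correct answer. The scoring formula, the toy agents, and the prompt variants are assumptions for this
          example only; how variants are generated and how equivalence is judged are left open by this document.
        </t>
        <sourcecode type="python"><![CDATA[
import re
from itertools import combinations


def consistency_score(agent, variants, equivalent):
    """Fraction of variant pairs on which the agent's answers agree."""
    answers = [agent(v) for v in variants]
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0   # fewer than two variants: nothing to compare
    agreeing = sum(1 for a, b in pairs if equivalent(a, b))
    return agreeing / len(pairs)


# Toy black-box agents (hypothetical stand-ins for real agents).
def stable_agent(prompt):
    # Answers depend only on the numbers in the prompt.
    return sum(int(n) for n in re.findall(r"\d+", prompt))


def wording_sensitive_agent(prompt):
    # Answers depend on surface wording, which variants deliberately change.
    return len(prompt)


# Semantic variants of the same task: all ask for 2 + 3.
variants = ["add 2 and 3", "what is 3 plus 2", "sum of 2 and 3"]


def same(a, b):
    return a == b


stable_score = consistency_score(stable_agent, variants, same)
sensitive_score = consistency_score(wording_sensitive_agent, variants, same)
]]></sourcecode>
        <t>
          No reference answer is consulted at any point; only agreement across variants is measured, which is
          what makes the approach usable on black-box agents.
        </t>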
      </section>

      <section anchor="semantic-routing-detail">
        <name>Dynamic Orchestration and Semantic Routing</name>

        <t>
          Static routing tables are insufficient for dynamic collaboration. Agent-GW performs semantic routing by decomposing
          complex intents into a DAG of sub-tasks and dispatching them to suitable targets based on capability matching,
          trust score, privacy constraints, and operational metrics.
        </t>
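
        <t>
          A minimal Python sketch of this dispatch step is given below, assuming the intent has already been
          decomposed into sub-tasks with explicit dependencies. It orders sub-tasks topologically with the standard
          library's graphlib module and, for each one, picks the highest-trust agent that offers the required
          capability within the permitted privacy scope. The directory layout, the field names, and the use of trust
          as the only ranking signal are assumptions for illustration; operational metrics are omitted.
        </t>
        <sourcecode type="python"><![CDATA[
from graphlib import TopologicalSorter


def pick_target(subtask, directory):
    """Highest-trust agent with the capability inside the privacy scope."""
    candidates = [
        agent for agent in directory
        if subtask["capability"] in agent["capabilities"]
        and (subtask["privacy_scope"] != "ISD" or agent["domain"] == "ISD")
    ]
    if not candidates:
        raise LookupError("no target for sub-task " + subtask["id"])
    return max(candidates, key=lambda agent: agent["trust"])["name"]


def route(subtasks, directory):
    """Return (sub-task id, target) pairs in dependency order."""
    by_id = {t["id"]: t for t in subtasks}
    graph = {t["id"]: t.get("after", []) for t in subtasks}
    order = TopologicalSorter(graph).static_order()
    return [(tid, pick_target(by_id[tid], directory)) for tid in order]


# Hypothetical capability directory entries.
directory = [
    {"name": "LegacyIoTArray", "capabilities": {"iot_read"},
     "trust": 0.90, "domain": "ISD"},
    {"name": "CloudSensorAPI", "capabilities": {"iot_read"},
     "trust": 0.95, "domain": "EHE"},
    {"name": "RoboticDog", "capabilities": {"robot_navigation"},
     "trust": 0.80, "domain": "ISD"},
]
subtasks = [
    {"id": "t2", "capability": "robot_navigation",
     "privacy_scope": "ISD", "after": ["t1"]},
    {"id": "t1", "capability": "iot_read", "privacy_scope": "ISD"},
]
plan = route(subtasks, directory)
]]></sourcecode>
        <t>
          In this example the external CloudSensorAPI is not selected for "t1" even though its trust score is
          higher, because the sub-task is scoped to the ISD.
        </t>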
      </section>

      <section anchor="evolutionary-memory-detail">
        <name>Evolutionary Knowledge Management</name>

        <t>
          Agent-GW MAY incorporate evolutionary memory that captures execution traces, success/failure outcomes, and user corrections.
          This enables continuous improvement in routing policies and can provide feedback guidance to terminal agents.
        </t>
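
        <t>
          One simple way to turn recorded outcomes into a routing signal is an exponential moving average per agent,
          sketched below in Python. The choice of an exponential moving average, the smoothing factor, the neutral
          prior, and the trace fields are all assumptions made for this example; this document does not specify how
          evolutionary memory updates routing policy.
        </t>
        <sourcecode type="python"><![CDATA[
from collections import defaultdict


class EvolutionaryMemory:
    """Hypothetical trace log plus a per-agent reliability score."""

    def __init__(self, alpha=0.2, prior=0.5):
        self.alpha = alpha                          # weight of the newest outcome
        self.scores = defaultdict(lambda: prior)    # unseen agents start neutral
        self.traces = []

    def record(self, agent, intent, success, correction=None):
        self.traces.append({
            "agent": agent,
            "intent": intent,
            "success": success,
            "correction": correction,   # optional user correction
        })
        outcome = 1.0 if success else 0.0
        previous = self.scores[agent]
        self.scores[agent] = (1 - self.alpha) * previous + self.alpha * outcome
        return self.scores[agent]


memory = EvolutionaryMemory(alpha=0.5, prior=0.5)
after_success = memory.record("RoboticDog", "inspect", True)
after_failure = memory.record(
    "RoboticDog", "inspect", False, correction="wrong line"
)
]]></sourcecode>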
      </section>

      <section anchor="kdn-detail">
        <name>Collaborative Inference Acceleration (KDN)</name>

        <t>
          Multi-agent workflows often repeat reasoning over shared context. KDN treats inference artifacts (e.g., LLM KV caches)
          as reusable objects, enabling sharing across co-located agents or peer Agent-GWs subject to policy. This can reduce TTFT
          and total compute.
        </t>
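
        <t>
          Because a KV cache is specific to a model and to the exact token prefix it was computed over, a KDN index
          can be content-addressed by both. The Python sketch below publishes artifact handles under a hash of the
          model identifier and token prefix, then looks up the longest cached prefix of a new request. The class,
          the key derivation, and the handle format are assumptions for illustration; policy checks, eviction, and
          the transfer of the artifact itself are out of scope here.
        </t>
        <sourcecode type="python"><![CDATA[
import hashlib


def prefix_key(model_id, tokens):
    """Content address for the KV cache of `tokens` under `model_id`."""
    data = model_id + "\x00" + "\x1f".join(tokens)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


class KDNIndex:
    """Hypothetical index from (model, token prefix) to artifact handle."""

    def __init__(self):
        self._handles = {}

    def publish(self, model_id, tokens, handle):
        self._handles[prefix_key(model_id, tokens)] = handle

    def longest_prefix(self, model_id, tokens):
        """Return (handle, reused_token_count), or (None, 0) on a miss."""
        # Try the full sequence first, then progressively shorter prefixes.
        for n in range(len(tokens), 0, -1):
            handle = self._handles.get(prefix_key(model_id, tokens[:n]))
            if handle is not None:
                return handle, n
        return None, 0


index = KDNIndex()
index.publish("model-a", ["sys", "report", "q3"], "kdn://artifact/kv/abc")

hit = index.longest_prefix("model-a", ["sys", "report", "q3", "compare"])
other_model = index.longest_prefix("model-b", ["sys", "report", "q3"])
other_prefix = index.longest_prefix("model-a", ["sys", "memo", "q3"])
]]></sourcecode>
        <t>
          On a hit, only the tokens after the reused prefix need to be prefilled, which is where the TTFT saving
          comes from. The linear scan over prefix lengths keeps the sketch short; a production index would more
          likely use a trie or block-level hashing.
        </t>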
      </section>
    </section>

    <section anchor="scenarios">
      <name>Representative Deployment Scenarios</name>

      <t>
        This section provides representative scenarios with explicit internal/external boundaries and concrete input/output
        protocol examples. These scenarios are illustrative and non-normative.
      </t>

      <section anchor="scenario-hetero-copilot">
        <name>Scenario 1: Enterprise Copilot with Local Secure Routing and External Egress</name>

        <t>
          An employee copilot receives a natural language request that requires both private on-prem data and public market
          information. Agent-GW routes sensitive processing to an internal small or large language model (SLM/LLM) while
          allowing limited external API egress for public data, enforcing privacy and context minimization.
        </t>

        <figure anchor="fig-scenario-copilot">
          <name>Enterprise Copilot: Split Routing by Privacy and Capability Tags</name>

          <artwork type="ascii-art"><![CDATA[
[ Employee Copilot Agent ]
  (NL/MCP: "Summarize Q3 Private Report & compare with global markets")
=========================================|=========================================
[ Agent-GW ]  (Semantic Router evaluates Data Privacy & Capability Tags)
     +----------------------------+----------------------------+
     | (Contains sensitive Data)  | (Needs external info)      |
     | [Local Secure Routing]     | [External Egress]          |
     | (Read KV Cache from KDN)   | (APA to API)               |
     v                            v
+--------------------+        +--------------------+
| Local Secure SLM   |        | Public Cloud API   |
| (Data on-prem)     |        | (Global Markets)   |
+--------------------+        +--------------------+
==================================================================================
]]></artwork>
        </figure>

        <t>
          Example ingress request (MCP-like); its policy block carries the fields that drive the split dispatch:
        </t>
        <artwork type="json"><![CDATA[
{
  "protocol": "MCP",
  "task": "summarize_and_compare",
  "inputs": {
    "private_doc_ref": "vault://reports/q3-private",
    "public_topic": "global markets"
  },
  "policy": {
    "private_data_scope": "ISD",
    "allow_external_egress": true,
    "egress_context_budget_tokens": 200
  }
}
]]></artwork>
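
        <t>
          How the policy block might drive the split can be sketched in Python as follows. The function plans an
          internal leg that keeps the private document reference inside the ISD and, if egress is allowed, an
          external leg that carries only the public topic, truncated to the context budget. Whitespace-separated
          words stand in for tokens, and the plan layout and function name are hypothetical; a real gateway would
          use the target model's tokenizer and richer minimization.
        </t>
        <sourcecode type="python"><![CDATA[
def split_dispatch(request):
    """Plan the internal and external legs of a mixed-privacy request."""
    policy = request["policy"]
    plan = [{
        "leg": "internal",
        "scope": policy["private_data_scope"],
        "inputs": {"doc_ref": request["inputs"]["private_doc_ref"]},
    }]
    if policy["allow_external_egress"]:
        budget = policy["egress_context_budget_tokens"]
        # Whitespace words approximate tokens in this sketch.
        words = request["inputs"]["public_topic"].split()[:budget]
        plan.append({
            "leg": "external",
            "scope": "EHE",
            # Only the public topic crosses the boundary.
            "inputs": {"query": " ".join(words)},
        })
    return plan


request = {
    "protocol": "MCP",
    "task": "summarize_and_compare",
    "inputs": {
        "private_doc_ref": "vault://reports/q3-private",
        "public_topic": "global markets",
    },
    "policy": {
        "private_data_scope": "ISD",
        "allow_external_egress": True,
        "egress_context_budget_tokens": 200,
    },
}
plan = split_dispatch(request)
]]></sourcecode>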
      </section>

      <section anchor="scenario-industrial">
        <name>Scenario 2: Industrial Planning with IoT (MQTT/REST) and Embodied Agent (ROS Bridge)</name>

        <t>
          A factory planning agent issues an A2A request: "Inspect Assembly Line B". Agent-GW decomposes the intent into
          two sub-tasks: (1) read sensor data from a legacy IoT array and (2) command a robotic dog to navigate to the
          location. Agent-GW uses APA to translate A2A into MQTT/REST for IoT, and native passthrough/bridge for ROS control.
        </t>

        <figure anchor="fig-scenario-industrial">
          <name>Industrial Scenario: Intent Decomposition and Protocol Translation</name>

          <artwork type="ascii-art"><![CDATA[
[ Factory Planning Agent (The "Brain") ]
  (A2A Protocol: "Inspect Assembly Line B")
=========================================|=========================================
[ Agent-GW ] (Semantic Router decomposes intent into 2 sub-tasks)
     +----------------------------+----------------------------+
     | [Auto-Adapter (APA)]       | [Native Passthrough]       |
     | (A2A -> MQTT/REST)         | (A2A -> ROS Bridge)        |
     | Payload:                   | Payload:                   |
     |  {"req":"temp_read",       |  /cmd_vel,                 |
     |   "loc":"Line B"}          |  /navigate_to              |
     v                            v
+--------------------+        +--------------------+
| Legacy IoT Array   |        | Robotic Dog        |
| (Temp/Vision)      |        | (Embodied Agent)   |
+--------------------+        +--------------------+
==================================================================================
]]></artwork>
        </figure>

        <t>Illustrative sub-task outputs:</t>
        <artwork type="json"><![CDATA[
{
  "subtasks": [
    {
      "id": "t1",
      "target": "LegacyIoTArray",
      "protocol": "MQTT/REST",
      "payload": {"req":"temp_read","loc":"Line B"}
    },
    {
      "id": "t2",
      "target": "RoboticDog",
      "protocol": "ROS",
      "payload": {"topic":"/navigate_to","args":{"loc":"Line B"}}
    }
  ]
}
]]></artwork>
      </section>

      <section anchor="scenario-peer-sync">
        <name>Scenario 3: Peer Agent-GW Synchronization and Inference Artifact Transfer</name>

        <t>
          For multi-site deployments, a local Agent-GW MAY synchronize selected working memory snapshots or KDN artifacts
          with a peer Agent-GW. This supports mobility, disaster recovery, and cooperative acceleration. Synchronization
          MUST be policy-gated and can be limited to anonymized summaries or encrypted artifacts.
        </t>

        <t>
          Example: transfer a session context digest and a KV cache handle rather than full raw prompts.
        </t>

        <artwork type="json"><![CDATA[
{
  "sync": {
    "peer": "agent-gw://node-n2",
    "session_id": "s-123",
    "transfer": {
      "working_memory_digest": "sha256:...",
      "kdn_artifact_handle": "kdn://artifact/kv/abc",
      "encryption": "HPKE",
      "policy_tags": ["no_raw_pii", "ttl_10m"]
    }
  }
}
]]></artwork>
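
        <t>
          The following Python sketch shows how such an envelope could be assembled so that only a digest of the
          working memory and an artifact handle leave the local gateway. The "sync_allowed" tag used as the policy
          gate, the function name, and the canonical JSON encoding are assumptions for this example. Encryption
          (HPKE in the message above) is not performed here, so the "encryption" field is omitted rather than
          claimed.
        </t>
        <sourcecode type="python"><![CDATA[
import hashlib
import json


def build_sync_message(peer, session_id, working_memory, handle, policy_tags):
    """Build a sync envelope with a digest and a handle, not raw context."""
    # Hypothetical policy gate: synchronization must be explicitly allowed.
    if "sync_allowed" not in policy_tags:
        raise PermissionError("synchronization is not permitted by policy")
    canonical = json.dumps(
        working_memory, sort_keys=True, separators=(",", ":")
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "sync": {
            "peer": peer,
            "session_id": session_id,
            "transfer": {
                "working_memory_digest": "sha256:" + digest,
                "kdn_artifact_handle": handle,
                "policy_tags": sorted(
                    t for t in policy_tags if t != "sync_allowed"
                ),
            },
        }
    }


message = build_sync_message(
    "agent-gw://node-n2",
    "s-123",
    {"ctx": "raw prompt text that must stay local"},
    "kdn://artifact/kv/abc",
    {"sync_allowed", "no_raw_pii", "ttl_10m"},
)
]]></sourcecode>
        <t>
          The receiving peer can compare the digest against its own copy of the session state to decide whether a
          fuller, separately authorized transfer is needed.
        </t>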
      </section>

    </section>

    <section anchor="security">
      <name>Security Considerations</name>

      <t>
        Introducing an active Agent-GW raises specific security challenges including agent identity spoofing, capability poisoning,
        context leakage, inference artifact theft, and cross-boundary data exfiltration.
      </t>

      <t>
        Agent-GW deployments MUST define explicit trust boundaries (e.g., ISD vs. EHE) and enforce policies for:
      </t>
      <ol spacing="normal">
        <li>authentication/authorization for agent registration and dispatch,</li>
        <li>privacy scoping for working memory,</li>
        <li>egress filtering and context minimization,</li>
        <li>encryption and access control for KDN artifacts,</li>
        <li>observability and audit trails for routing decisions and protocol adaptation.</li>
      </ol>
    </section>

    <section anchor="iana">
      <name>IANA Considerations</name>
      <t>This document has no IANA actions at this time.</t>
    </section>

    <section anchor="ack">
      <name>Acknowledgements</name>
      <t>TBD</t>
    </section>

  </middle>

  <back>
    <references>
      <name>References</name>

      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
      </references>

    </references>
  </back>

</rfc>