|
15 | 15 | latestVersion: null,
|
16 | 16 | edDraftURI: "https://w3c-cg.github.io/webagents/TaskForces/Interoperability/Reports/report-interoperability.html",
|
17 | 17 | editors: [{ name: "Your Name", url: "https://your-site.com" }],
|
| 18 | + authors: [ |
| 19 | + { |
| 20 | + name: "Jérémy Lemée", |
| 21 | + url: "https://www.alexandria.unisg.ch/entities/person/Jeremy_Lemee" |
| 22 | + } |
| 23 | + ], |
18 | 24 | github: "https://github.com/w3c-cg/webagents/",
|
19 | 25 | shortName: "webagents-interop",
|
20 | 26 | xref: "web-platform",
|
@@ -131,11 +137,8 @@ <h2>Terminology</h2>
|
131 | 137 | <dt><dfn id="dfn-artifact">Artifact</dfn> or <dfn id="dfn-tool">Tool</dfn></dt>
|
132 | 138 | <dd>A <a href="https://www.w3.org/TR/webarch/#def-resource">resource</a> [[WEBARCH]] that can be shared and used by <a href="#dfn-agent">agents</a> to support their activities. In some <a href="#dfn-mas">multi-agent systems</a>, agents can construct artifacts to instrument their environments [[JACAMO]]. In the context of agentic AI, a tool is a functional interface to a program that a language model can use. A tool can enable an LLM to perceive or act in an environment or to perform computations [[TOOL]].</dd>
|
133 | 139 |
|
134 |
| - <dt><dfn id="dfn-augmented-llm">Augmented Language Model</dfn></dt> |
135 |
| - <dd>A language model augmented with abilities such as reasoning, tool use, information retrieval, or storing context across interactions. Unlike an <a href="#dfn-agent">agent</a>, an augmented language model does not actively pursue goals and is not <a href="#dfn-situated">situated</a> in an environment. See also [[TMLR23]] and [[ANTHROPIC24]].</dd> |
136 |
| - |
137 |
| - <dt><dfn id="dfn-language-agent">Language Agent</dfn></dt> |
138 |
| - <dd>A language agent is an agent that relies on a language model to interact with their environment. [[COALA]] The language model can be used to process observations represented in natural or formal languages, generate the actions to perform, and make decisions [[COALA]] </dd> |
| 140 | + <dt><dfn id="dfn-augmented-llm">Augmented Language Model</dfn> or <dfn id="dfn-language-agent">Language Agent</dfn></dt> |
| 141 | + <dd>A language model augmented with abilities such as reasoning, tool use, information retrieval, or storing context across interactions. Unlike an <a href="#dfn-agent">agent</a>, an augmented language model does not actively pursue goals and is not <a href="#dfn-situated">situated</a> in an environment. See also [[TMLR23]] and [[ANTHROPIC24]]. A language agent is an <a href="#dfn-agent">agent</a> that relies on a language model to interact with its environment. The language model can be used to process observations represented in natural or formal languages, generate the actions to perform, and make decisions [[COALA]]. These agents can be created using an <a href="#dfn-augmented-llm">augmented language model</a> as a building block [[ANTHROPIC24]].</dd> |
139 | 142 |
|
140 | 143 | <dt><dfn id="dfn-mas">Multi-Agent System (MAS)</dfn></dt>
|
141 | 144 | <dd>A system composed of <a href="#dfn-agent">agents</a> that are situated in a shared environment and interact with one another to achieve individual or collective goals. Agents can work in collaboration, cooperation, and/or competition. A MAS can be either an open or a closed system. This report is primarily concerned with open MAS.</dd>
|
@@ -337,9 +340,9 @@ <h3>Agentic AI</h3>
|
337 | 340 | <p>This section is to summarize relevant developments around AI agents and agentic AI (e.g., MCP, A2A, ANP, LMOS, etc.).</p>
|
338 | 341 | </aside>
|
339 | 342 | <p>The concept of Agentic AI refers to AI systems that are able to take autonomous decisions in order to achieve goals. The term is commonly used to refer more specifically to autonomous generative AI systems. </p>
|
340 |
| - <p>Large Language Models (LLMs) are a core technology to create agentic AI systems. More precisely, a core component to create <a>language agents</a>, is an <a>Augmented Language Model</a> (ALM), which is an LLM extended with the ability to reason and the ability to use <a>tools</a> [[TMLR23]]. These ALMs are building blocks to create agents [[ANTHROPIC24]]. The <a href="https://modelcontextprotocol.io/">Model Context Protocol (MCP)</a> is a protocol to enable ALMs and language agents to connect with external tools and data sources. The protocol thus enables a separation of concerns between agents and tools/data sources. In practice, MCP servers can be run on the same machine or can be accessed through the Internet via streamable HTTP. <a href="https://github.com/nlweb-ai/NLWeb">NLWeb</a> relies on MCP to integrate conversational interfaces within websites, thus aiming to become the HTML of the agentic web. </p> |
| 343 | + <p>Large Language Models (LLMs) are a core technology to create agentic AI systems. More precisely, a core component for creating <a href="#dfn-language-agent">language agents</a> is an <a href="#dfn-augmented-llm">Augmented Language Model</a> (ALM), which is an LLM extended with the ability to reason and the ability to use <a>tools</a> [[TMLR23]]. These ALMs are building blocks to create agents [[ANTHROPIC24]]. The <a href="https://modelcontextprotocol.io/">Model Context Protocol (MCP)</a> is a protocol to enable ALMs and language agents to connect with external tools and data sources. The protocol thus enables a separation of concerns between agents and tools/data sources. In practice, MCP servers can be run on the same machine or can be accessed through the Internet via streamable HTTP. <a href="https://news.microsoft.com/source/features/company-news/introducing-nlweb-bringing-conversational-interfaces-directly-to-the-web/">NLWeb</a> relies on MCP to integrate conversational interfaces within websites, thus aiming to become the HTML of the Agentic Web. </p> |
341 | 344 |
|
342 |
| - <p> Agentic AI is also considering communication among language agents. Different protocols are being developed to enable communication of language agents on the Web. The <a href="https://www.a2aprotocol.net/docs/introduction">Agent to Agent (A2A)</a> protocol is a protocol that is meant as a complement to MCP for agent communication. Agents using this protocol describe themselves and their capabilities in an Agent Card that is available on the Web for other agents to read and use. The protocol defines tasks that an agent can achieve on behalf of another and messages to support communication among agents. The protocol relies on <a href="https://www.jsonrpc.org/specification">JSON-RPC</a> for communication. The <a href="">Agora protocol</a> is protocol meant to be as versatile, efficient, and portable as possible, within the limit of the Agent Communication Dilemma between these three properties [[AGORA]]. The <a href="https://agent-network-protocol.com/specs/white-paper.html"> Agent Network Protcol (ANP)</a> is another protocol for agents on the Web. ANP defines three layers: the Identity layer, the Meta-Protocol layer, and the Application layer. The Identity layer relies on <a href="https://www.w3.org/TR/did-1.0/">Decentralized Identifiers (DID)</a> to identity the agents. ANP defines a custom DID method <code>did:wba</code>, for Web-based Agents, to enable agents to prove their identities without relying on a central authority. The Meta-Protocol layer enables agents to select which protocol to use for communication. Once a protocol has been selected, the agents communicate using that protocol. Finally, the Application layer defines a JSON-LD Agent Description (AD) to enable agents to provide information about themselves to other agents and an Agent Discovery Protocol to enable agents to discover the ADs of other agents. <a href="https://eclipse.dev/lmos/">Eclipse LMOS (Language Model Operating System)</a> is another project to build an Internet of Agents. 
Eclipse LMOS relies on DIDs to identify software agents. It also defines an Agent Description Format to describe agents and a Tool Description Format to describe tools. Both description formats are defined as built on top of the <a href="https://www.w3.org/TR/wot-thing-description/"> Thing Description (TD) Format</a>. Eclipse LMOS also defines mecanisms for discovery, and a communication protocol that relies on WebSocket. </p> |
| 345 | + <p> Agentic AI also considers communication among language agents. Different protocols are being developed to enable communication of language agents on the Web. The <a href="https://www.a2aprotocol.net/docs/introduction">Agent to Agent (A2A)</a> protocol is meant as a complement to MCP for agent communication. Agents using this protocol describe themselves and their capabilities in an Agent Card that is available on the Web for other agents to read and use. The protocol defines tasks that an agent can achieve on behalf of another and messages to support communication among agents. The protocol relies on <a href="https://www.jsonrpc.org/specification">JSON-RPC</a> for communication. The <a href="https://agoraprotocol.org/docs/getting-started">Agora protocol</a> is a protocol for communication among language agents meant to be as versatile, efficient, and portable as possible, within the limit of the Agent Communication Dilemma between these three properties [[AGORA]]. The Agora protocol enables agents to choose at run time which specific protocol to use for interaction [[AGORA]]. The <a href="https://agent-network-protocol.com/specs/white-paper.html">Agent Network Protocol (ANP)</a> is another protocol for agents on the Web. ANP defines three layers: the Identity layer, the Meta-Protocol layer, and the Application layer. The Identity layer relies on <a href="https://www.w3.org/TR/did-1.0/">Decentralized Identifiers (DIDs)</a> to identify the agents. ANP defines a custom DID method <code>did:wba</code>, for Web-based Agents, to enable agents to prove their identities without relying on a central authority. The Meta-Protocol layer enables agents to select which protocol to use for communication. Once a protocol has been selected, the agents communicate using that protocol. 
Finally, the Application layer defines a JSON-LD Agent Description (AD) to enable agents to provide information about themselves to other agents and an Agent Discovery Protocol to enable agents to discover the ADs of other agents. <a href="https://eclipse.dev/lmos/">Eclipse LMOS (Language Model Operating System)</a> is another project to build an Internet of Agents. Eclipse LMOS relies on DIDs to identify software agents. It also defines an Agent Description Format to describe agents and a Tool Description Format to describe tools. Both description formats build on the <a href="https://www.w3.org/TR/wot-thing-description/">Thing Description (TD) Format</a>. Eclipse LMOS also defines mechanisms for discovery, and a communication protocol that relies on WebSocket. </p> |
343 | 346 | </section>
|
344 | 347 |
|
345 | 348 | </section>
|
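The Agentic AI paragraph in the diff above states that A2A relies on JSON-RPC for communication. As a rough illustration of what that implies on the wire, the sketch below builds a JSON-RPC 2.0 request envelope and unpacks a response. Only the envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) come from the JSON-RPC 2.0 specification; the method name `example/sendMessage`, the parameter shape, and the helper names are hypothetical and are not taken from the A2A specification.

```python
import json
from itertools import count

# Monotonic request ids, so that responses can be matched to requests.
_request_ids = count(1)


def build_jsonrpc_request(method, params):
    """Return a JSON-RPC 2.0 request object as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }


def parse_jsonrpc_response(raw):
    """Return the result of a JSON-RPC 2.0 response, or raise on an error."""
    response = json.loads(raw)
    if response.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 response")
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err['code']}: {err['message']}")
    return response["result"]


# Hypothetical method name and parameters, for illustration only.
request = build_jsonrpc_request(
    "example/sendMessage", {"text": "What is the room temperature?"}
)
wire_payload = json.dumps(request)
```

An agent would send `wire_payload` to the endpoint advertised by the other agent and pass the body it receives to `parse_jsonrpc_response`. The transport itself is left out of the sketch on purpose.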
|
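The same paragraph says that ANP and Eclipse LMOS identify agents with Decentralized Identifiers, and that ANP defines the `did:wba` method. The sketch below splits a DID into its method name and method-specific identifier, following a simplified reading of the generic `did:<method>:<method-specific-id>` syntax from DID Core. The identifier `did:wba:example.com:agents:alice` is a made-up example, and the sketch does not attempt to reproduce how `did:wba` resolves an identifier to a DID document.

```python
import re
from typing import NamedTuple


class ParsedDid(NamedTuple):
    method: str
    method_specific_id: str


# Characters allowed in the method-specific id (simplified from DID Core):
# letters, digits, ".", "-", "_", or a percent-encoded byte.
_IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"
# "did:" + lowercase method name + ":" + colon-separated id segments,
# where the final segment must not be empty.
_DID_PATTERN = re.compile(rf"^did:([a-z0-9]+):((?:{_IDCHAR}*:)*{_IDCHAR}+)$")


def parse_did(did):
    """Split a DID into (method, method_specific_id) or raise ValueError."""
    match = _DID_PATTERN.match(did)
    if match is None:
        raise ValueError(f"not a valid DID: {did!r}")
    return ParsedDid(match.group(1), match.group(2))


# Hypothetical identifier, for illustration only.
parsed = parse_did("did:wba:example.com:agents:alice")
```

A caller could use `parsed.method` to pick a method-specific resolver, for example to tell a `did:wba` identifier apart from identifiers of other DID methods.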