MVP Design Doc - (WIP, incrementally updating) #9
xintongsong
started this conversation in
Ideas
Shall we avoid using the term |
Overview
The above figure shows the overall idea of how we expect users to use Flink Agents.
1. Flink Agents provides APIs in various programming languages for users to define agents.
   a. We start with Python and Java APIs, and may support more languages in the future.
   b. APIs in different programming languages should share a common set of concepts and primitives. A given language may support only a subset of them. E.g., with declarative languages like YAML, we may only support the ReAct-style agent but not the workflow-style agent.
   c. Users may even mix languages, e.g., build an agent in Java that includes a custom action defined in Python. (Not necessarily supported in the MVP.)
2. While building agents, users may leverage built-in or third-party libraries to integrate their agents with a rich ecosystem.
   a. This is similar to Flink's connectors and filesystems: the framework defines standard interfaces for the common building blocks of agents (LLMs, tools, vector stores, etc.), and different pluggable implementations of those interfaces can integrate with various systems and providers.
3. Once the agents are defined, we need to connect them with inputs and outputs.
   a. An agent can be either part of a Flink job that consumes an input datastream and produces an output datastream, or a standalone service that accepts REST/RPC requests and replies with responses (not necessarily supported in the MVP).
   b. How we connect agents with inputs / outputs (using the DataStream or Table API, in Java or Python, or using SQL) should be independent of how we define the agents (using the Java or Python Agent API).
4. The agents can be executed in various runtimes.
   a. A local Python runtime is helpful for easy experimentation and debugging, without relying on a Flink cluster, a Docker / Kubernetes environment, or even a JVM. The limitation is that it cannot support agents implemented in Java, datastreams as inputs and outputs, or Flink's state management and fault tolerance.
   b. A Flink (StateFun) runtime runs the agent as a Flink job, in a distributed manner and with Flink's state management and fault tolerance support.
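The pluggable-building-block idea in point 2 can be sketched in plain Python. Note that this is illustrative only: `BaseChatModel`, `EchoChatModel`, and `run_agent` are hypothetical names for this sketch, not the actual Flink Agents API.

```python
from abc import ABC, abstractmethod


class BaseChatModel(ABC):
    """Hypothetical standard interface for the chat-model building block."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        ...


class EchoChatModel(BaseChatModel):
    """A trivial pluggable implementation, used here only for demonstration.
    A real deployment would plug in a provider-specific implementation."""

    def chat(self, prompt: str) -> str:
        return f"echo: {prompt}"


def run_agent(model: BaseChatModel, user_input: str) -> str:
    # The agent depends only on the abstract interface, so any provider
    # implementation can be swapped in without changing the agent itself.
    return model.chat(user_input)


print(run_agent(EchoChatModel(), "hello"))  # -> echo: hello
```

The design mirrors Flink's connector/filesystem plugins: the framework owns the interface, while integrations with concrete systems live behind it.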
Core Concepts
This section discusses the core concepts and primitives for defining an agent.
We divide the concepts into two categories: fundamental concepts and extended concepts.
Regardless of which Agent API / programming language is used, a user-defined agent will be translated into a declarative, language-independent Agent Plan, which is then executed by the different runtimes.
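To make the Agent Plan idea concrete, here is a toy sketch of what "declarative and language-independent" could mean in practice: the plan is plain data that round-trips through JSON. The field names (`actions`, `listen_to`, `emits`, `resources`) are invented for illustration and are not the actual Agent Plan schema.

```python
import json

# Illustrative only: a user-defined agent (authored in Python, Java, or
# YAML) could be lowered into a declarative plan like this one.
agent_plan = {
    "actions": [
        {"name": "call_llm", "listen_to": ["InputEvent"], "emits": ["ChatResponseEvent"]},
        {"name": "format_output", "listen_to": ["ChatResponseEvent"], "emits": ["OutputEvent"]},
    ],
    "resources": {"chat_model": {"type": "chat_model", "provider": "example"}},
}

# Because the plan is plain data, any runtime (local Python, Flink) can
# deserialize and execute it, independent of the authoring language.
serialized = json.dumps(agent_plan)
restored = json.loads(serialized)
assert restored == agent_plan
```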
Fundamental Concepts
Extended Concepts
We propose to introduce the following most commonly needed extended concepts for the MVP. More concepts can be added in the future if needed.
How to build extended concepts on top of fundamental concepts?
Taking ChatModel as an example, we'll have the following built-in implementations:
To use this feature, users need to:
API
Runtime
We propose to have 2 runtimes:
Flink Runtime
Execution
The above figure shows how an agent is executed in the Flink Runtime.
The main components are:
An agent run works as follows:
a. During the execution of actions, we may need to access resources. For resources implemented in the same language, we can directly invoke the resource provider. For resources implemented in another language, we use Pemja for cross-language object access.
a. If the event is an OutputEvent, the dispatcher will emit its payload to the output datastream.
b. For other events, the dispatcher will loop back to 2).
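The dispatcher loop described above can be sketched as a minimal event loop (a toy model under assumed names; the real dispatcher also involves state, actors, and cross-language executors):

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    payload: object


class InputEvent(Event):
    pass


class OutputEvent(Event):
    pass


def dispatch(initial_events, handlers):
    """Minimal sketch of the dispatcher loop (illustrative).

    `handlers` maps an event type to an action returning follow-up events.
    OutputEvents are emitted to the output; all other events loop back.
    """
    queue = deque(initial_events)
    output = []
    while queue:
        event = queue.popleft()
        if isinstance(event, OutputEvent):
            output.append(event.payload)  # emit payload to the output datastream
        else:
            # loop back: hand the event to the matching action, enqueue results
            queue.extend(handlers[type(event)](event))
    return output


# Toy usage: one action that turns any input event into an OutputEvent.
result = dispatch(
    [InputEvent("hi")],
    {InputEvent: lambda e: [OutputEvent(e.payload.upper())]},
)
print(result)  # -> ['HI']
```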
Deployment
From Flink's perspective, the whole Flink Runtime behaves like one large custom operator (more precisely, a topology of multiple operators).
During job compiling, the above code snippet wraps the user-defined agent into the Flink Runtime operators, appends them to the input datastream, and returns an output datastream to which more downstream operators can be added. Here, job compiling refers to the process of translating the user code into a Flink StreamGraph / JobGraph.
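The wrapping described above can be mimicked with a toy "compile" step. No Flink is involved here; a datastream is simulated as a plain Python list, and `apply_agent` is a hypothetical name for this sketch.

```python
# Toy sketch of job compiling (illustrative; no Flink APIs involved).
# A "datastream" is simulated as a plain Python list of records.

def apply_agent(input_stream, agent_fn):
    """Wrap a user-defined agent into a (simulated) runtime operator,
    append it to the input stream, and return the output stream so
    further downstream operators can be chained after it."""
    return [agent_fn(record) for record in input_stream]


output_stream = apply_agent([1, 2, 3], lambda r: r * 10)
downstream = [r + 1 for r in output_stream]  # chain more downstream operators
print(downstream)  # -> [11, 21, 31]
```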
During task initialization, the Flink Runtime operators bring up the dispatcher, the Java executor, and the Python executor. For the Python executor, we can either launch a separate Python process or run it inside the JVM with Pemja. See the Process and Thread modes of Flink Python UDFs for more details.
Local Runtime
Omitted as the design of Local Runtime is quite straightforward.
Project Structure