From 757ddd48bdbe4cb74446b40c834013b437789e5a Mon Sep 17 00:00:00 2001 From: Alexa Kreizinger Date: Fri, 21 Nov 2025 10:08:40 -0800 Subject: [PATCH 1/3] consolidate YAML / processor info Signed-off-by: Alexa Kreizinger --- .gitbook.yaml | 1 + SUMMARY.md | 5 +- administration/configuring-fluent-bit/yaml.md | 22 +- .../yaml/configuration-file.md | 282 ------------------ .../yaml/pipeline-section.md | 202 ++++++------- .../yaml/service-section.md | 44 +-- pipeline/processors.md | 54 +++- pipeline/processors/filters.md | 85 +++++- 8 files changed, 258 insertions(+), 437 deletions(-) delete mode 100644 administration/configuring-fluent-bit/yaml/configuration-file.md diff --git a/.gitbook.yaml b/.gitbook.yaml index bb883ebf1..9ee9d0179 100644 --- a/.gitbook.yaml +++ b/.gitbook.yaml @@ -100,3 +100,4 @@ redirects: installation/supported-platforms: ./installation/downloads.md about/sandbox-and-lab-resources: ./about/resources.md installation/downloads/amazon-ec2: ./installation/downloads/linux/amazon-linux.md + administration/configuring-fluent-bit/yaml/configuration-file: ./administration/configuring-fluent-bit/yaml.md diff --git a/SUMMARY.md b/SUMMARY.md index 12245f3ba..b1457feb2 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -39,7 +39,6 @@ * [Configure Fluent Bit](administration/configuring-fluent-bit.md) * [YAML configuration](administration/configuring-fluent-bit/yaml.md) - * [Configuration file](administration/configuring-fluent-bit/yaml/configuration-file.md) * [Environment variables](administration/configuring-fluent-bit/yaml/environment-variables-section.md) * [Includes](administration/configuring-fluent-bit/yaml/includes-section.md) * [Service](administration/configuring-fluent-bit/yaml/service-section.md) @@ -133,14 +132,14 @@ * [LTSV](pipeline/parsers/ltsv.md) * [Regular expression](pipeline/parsers/regular-expression.md) * [Processors](pipeline/processors.md) - * [Conditional processing](pipeline/processors/conditional-processing.md) * [Content 
modifier](pipeline/processors/content-modifier.md) - * [Filters as processors](pipeline/processors/filters.md) * [Labels](pipeline/processors/labels.md) * [Metrics selector](pipeline/processors/metrics-selector.md) * [OpenTelemetry envelope](pipeline/processors/opentelemetry-envelope.md) * [Sampling](pipeline/processors/sampling.md) * [SQL](pipeline/processors/sql.md) + * [Filters as processors](pipeline/processors/filters.md) + * [Conditional processing](pipeline/processors/conditional-processing.md) * [Filters](pipeline/filters.md) * [AWS metadata](pipeline/filters/aws-metadata.md) * [CheckList](pipeline/filters/checklist.md) diff --git a/administration/configuring-fluent-bit/yaml.md b/administration/configuring-fluent-bit/yaml.md index 8c87887ef..d498f0c64 100644 --- a/administration/configuring-fluent-bit/yaml.md +++ b/administration/configuring-fluent-bit/yaml.md @@ -2,15 +2,27 @@ -## Before you get started +In Fluent Bit v3.2 and later, YAML configuration files support all of the settings +and features that [classic configuration files](../administration/configuring-fluent-bit/classic-mode.md) support, plus additional features that classic configuration files +don't support, like processors. -YAML has become essential in a cloud ecosystem. To minimize friction and provide a more intuitive experience for creating data pipelines, users are encouraged to transition to YAML. +YAML configuration files support the following top-level sections: -Fluent Bit traditionally offered a `classic` configuration mode, a custom configuration format that's phasing out. While `classic` mode has served well for many years, it has several limitations. Its basic design only supports grouping sections with key-value pairs and lacks the ability to handle sub-sections or complex data structures like lists. +- `env`: Configures [environment variables](../administration/configuring-fluent-bit/yaml/environment-variables-section). 
+- `includes`: Specifies additional YAML configuration files to [include as part of a parent file](../administration/configuring-fluent-bit/yaml/includes-section). +- `service`: Configures global properties of the Fluent Bit [service](../administration/configuring-fluent-bit/yaml/service-section). +- `pipeline`: Configures active [`inputs`, `filters`, and `outputs`](../administration/configuring-fluent-bit/yaml/pipeline-section). +- `parsers`: Defines [custom parsers](../administration/configuring-fluent-bit/yaml/parsers-section). +- `multiline_parsers`: Defines [custom multiline parsers](../administration/configuring-fluent-bit/yaml/multiline-parsers-section). +- `plugins`: Defines paths for [custom plugins](../administration/configuring-fluent-bit/yaml/plugins-section). +- `upstream_servers`: Defines [nodes](../administration/configuring-fluent-bit/yaml/upstream-servers-section) for output plugins. -The YAML format enables features, such as processors, that aren't possible to configure in `classic` mode. +{% hint style="info" %} +YAML configuration is used in the smoke tests for containers. An always-correct up-to-date example is here: . +{% endhint %} + +---- -As of Fluent Bit v3.2, you can configure everything in YAML. ## List of available sections diff --git a/administration/configuring-fluent-bit/yaml/configuration-file.md b/administration/configuring-fluent-bit/yaml/configuration-file.md deleted file mode 100644 index 0f41ef1f1..000000000 --- a/administration/configuring-fluent-bit/yaml/configuration-file.md +++ /dev/null @@ -1,282 +0,0 @@ ---- -description: Learn about the YAML configuration file used by Fluent Bit ---- - -# YAML configuration file - - - -One of the ways to configure Fluent Bit is using a YAML configuration file that works at a global scope. These YAML configuration files support the following top-level sections: - -- `env`: Configures [environment variables](../administration/configuring-fluent-bit/yaml/environment-variables-section). 
-- `includes`: Specifies additional YAML configuration files to [include as part of a parent file](../administration/configuring-fluent-bit/yaml/includes-section). -- `service`: Configures global properties of the Fluent Bit [service](../administration/configuring-fluent-bit/yaml/service-section). -- `pipeline`: Configures active [`inputs`, `filters`, and `outputs`](../administration/configuring-fluent-bit/yaml/pipeline-section). -- `parsers`: Defines [custom parsers](../administration/configuring-fluent-bit/yaml/parsers-section). -- `multiline_parsers`: Defines [custom multiline parsers](../administration/configuring-fluent-bit/yaml/multiline-parsers-section). -- `plugins`: Defines paths for [custom plugins](../administration/configuring-fluent-bit/yaml/plugins-section). -- `upstream_servers`: Defines [nodes](../administration/configuring-fluent-bit/yaml/upstream-servers-section) for output plugins. - -{% hint style="info" %} -YAML configuration is used in the smoke tests for containers. An always-correct up-to-date example is here: . -{% endhint %} - -## `env` - -The `env` section allows the definition of configuration variables that will be used later in the configuration file. - -Example: - -```yaml -# Set up a local environment variable -env: - flush_interval: 1 - -# service configuration -service: - flush: ${flush_interval} - log_level: info - http_server: on -``` - -## Includes - -The `includes` section allows the files to be merged into the YAML configuration to be identified as a list of filenames. If no path is provided, then the file is assumed to be in a folder relative to the file referencing it. - -Example: - -```yaml -# defining file(s) to include into the current configuration. This includes illustrating using a relative path reference -includes: - - inclusion-1.yaml - - subdir/inclusion-2.yaml - -``` - -## Service - -The `service` section defines the global properties of the service. 
The Service keys available as of this version are described in the following table: - -| Key | Description | Default Value | -| --- | ----------- | ------------- | -| `flush` | Set the flush time in `seconds.nanoseconds`. The engine loop uses a Flush timeout to define when to flush the records ingested by input plugins through the defined output plugins. | `5` | -| `grace` | Set the grace time in `seconds` as an Integer value. The engine loop uses a Grace timeout to define the wait time on exit. | `5` | -| `daemon` | Boolean value to set if Fluent Bit should run as a Daemon (background) or not. Allowed values are: `yes`, `no`, `on`, and `off`. If you are using a Systemd based unit like the one provided in the Fluent Bit packages, don't turn on this option. | `Off` | -| `dns.mode` | Sets the primary transport layer protocol used by the asynchronous DNS resolver, which can be overridden on a per plugin basis | `UDP` | -| `log_file` | Absolute path for an optional log file. By default, all logs are redirected to the standard error interface(`stderr`). | _none_ | -| `log_level` | Set the logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are accumulative. For example, if `debug` is set, it will include `error`, `warning`, `info`, and `debug`. `trace` mode is only available if Fluent Bit was built with the `WITH_TRACE` option enabled. | `info` | -| `parsers_file` | Path for a file that defines custom parsers. Only a single entry is supported. | _none_ | -| `plugins_file` | Path for a `plugins` configuration file. A `plugins` configuration file allows the definition of paths for external plugins; for an example, [see here](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). | _none_ | -| `streams_file` | Path for the Stream Processor configuration file. Learn more about [Stream Processing configuration](../../../stream-processing/overview.md). 
| _none_ | -| `http_server` | Enable built-in HTTP server. | `Off` | -| `http_listen` | Set listening interface for HTTP server when it's enabled. | `0.0.0.0` | -| `http_port` | Set TCP Port for the HTTP server | `2020` | -| `coro_stack_size` | Set the coroutines stack size in bytes. The value must be greater than the page size of the running system. Don't set too small a value (for example, `4096`), or coroutine threads can overrun the stack buffer. Don't change the default value of this parameter unless you know what you are doing. | `24576` | -| `scheduler.cap` | Set a maximum retry time in seconds. Supported from v1.8.7. | `2000` | -| `scheduler.base` | Sets the base of exponential backoff. Supported from v1.8.7. | `5` | -| `json.convert_nan_to_null` | If enabled, NaN is converted to null when Fluent Bit converts `msgpack` to JSON. | `false` | -| `json.escape_unicode` | Controls how Fluent Bit serializes non‑ASCII / multi‑byte Unicode characters in JSON strings. When enabled, Unicode characters are escaped as `\uXXXX` sequences (characters outside BMP become surrogate pairs). When disabled, Fluent Bit emits raw UTF‑8 bytes. | `true` | -| `sp.convert_from_str_to_num` | If enabled, Stream processor converts from number string to number type. | `true` | -| `windows.maxstdio` | If specified, the limit of stdio is adjusted. Only provided for Windows. From 512 to 2048 is allowed. | `512` | - -The following is an example of a `service` section: - -```yaml -service: - flush: 5 - daemon: off - log_level: debug -``` - -For scheduler and retry details, see [scheduling and retries](../../scheduling-and-retries.md#Scheduling-and-Retries) - -## Pipeline - -A `pipeline` section will define a complete pipeline configuration, including `inputs`, `filters`, and `outputs` subsections. - -```yaml -pipeline: - inputs: - ... - filters: - ... - outputs: - ... 
-``` - -Each of the subsections for `inputs`, `filters`, and `outputs` constitutes an array of maps that has the parameters for each. Most properties are either strings or numbers and can be defined directly. - -For example: - -```yaml -pipeline: - inputs: - - name: tail - tag: syslog - path: /var/log/syslog - - name: http - tag: http_server - port: 8080 -``` - -This pipeline consists of two `inputs`: a tail plugin and an HTTP server plugin. Each plugin has its own map in the array of `inputs` consisting of basic properties. To use more advanced properties that consist of multiple values the property itself can be defined using an array, such as the `record` and `allowlist_key` properties for the `record_modifier` `filter`: - -```yaml -pipeline: - inputs: - - name: tail - tag: syslog - path: /var/log/syslog - filters: - - name: record_modifier - match: syslog - record: - - powered_by calyptia - - name: record_modifier - match: syslog - allowlist_key: - - powered_by - - message -``` - -In the cases where each value in a list requires two values they must be separated by a space, such as in the `record` property for the `record_modifier` filter. - -### Input - -An `input` section defines a source (related to an input plugin). Each section has a base configuration. Each input plugin can add it own configuration keys: - -| Key | Description | -| --- |------------ | -| `Name` | Name of the input plugin. Defined as subsection of the `inputs` section. | -| `Tag` | Tag name associated to all records coming from this plugin. | -| `Log_Level` | Set the plugin's logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Defaults to the `SERVICE` section's `Log_Level`. | - -The `Name` is mandatory and defines for Fluent Bit which input plugin should be loaded. `Tag` is mandatory for all plugins except for the `input forward` plugin which provides dynamic tags. 
- -#### Example input - -The following is an example of an `input` section for the `cpu` plugin. - -```yaml -pipeline: - inputs: - - name: cpu - tag: my_cpu -``` - -### Filter - -A `filter` section defines a filter (related to a filter plugin). Each section has a base configuration and each filter plugin can add its own configuration keys: - -| Key | Description | -| ----------- | ------------------------------------------------------------ | -| `Name` | Name of the filter plugin. Defined as a subsection of the `filters` section. | -| `Match` | A pattern to match against the tags of incoming records. It's case-sensitive and supports the star (`*`) character as a wildcard. | -| `Match_Regex` | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regular expression syntax. | -| `Log_Level` | Set the plugin's logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Defaults to the `SERVICE` section's `Log_Level`. | - -`Name` is mandatory and lets Fluent Bit know which filter plugin should be loaded. The `Match` or `Match_Regex` is mandatory for all plugins. If both are specified, `Match_Regex` takes precedence. - -#### Example filter - -The following is an example of a `filter` section for the `grep` plugin: - -```yaml -pipeline: - filters: - - name: grep - match: '*' - regex: log aa -``` - -### Output - -The `outputs` section specifies a destination that certain records should follow after a `Tag` match. Fluent Bit can route up to 256 `OUTPUT` plugins. The configuration supports the following keys: - -| Key | Description | -| ----------- | ------------------------------------------------------------ | -| `Name` | Name of the output plugin. Defined as a subsection of the `outputs` section. | -| `Match` | A pattern to match against the tags of incoming records. It's case-sensitive and supports the star (`*`) character as a wildcard. 
| -| `Match_Regex` | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regular expression syntax. | -| `Log_Level` | Set the plugin's logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. The output log level defaults to the `SERVICE` section's `Log_Level`. | - -#### Example output - -The following is an example of an `output` section: - -```yaml -pipeline: - outputs: - - name: stdout - match: 'my*cpu' -``` - -#### Collecting `cpu` metrics example - -The following configuration file example demonstrates how to collect CPU metrics and flush the results every five seconds to the standard output: - -```yaml -service: - flush: 5 - daemon: off - log_level: debug - -pipeline: - inputs: - - name: cpu - tag: my_cpu - outputs: - - name: stdout - match: 'my*cpu' -``` - -## Processors - -Fluent Bit 2.1.2 and greater implements an interface called `processor` to extend the processing capabilities in input and output plugins directly without routing the data. The input and output plugins can run in separate threads. This interface allows users to apply data transformations and filtering to incoming data records before they're processed further in the pipeline. - -This capability is only exposed in YAML configuration and not in classic configuration mode due to the restriction of nested levels of configuration. - -[Processor example](configuration-file.md#example-using-processors) - -### Example: Using processors - -The following configuration file example demonstrates the use of processors to change the log record in the input plugin section by adding a new key `hostname` with the value `monox`. It uses Lua to append the tag to the log record. The output plugin section adds a new key named `output` with the value `new data`. 
- -```yaml - service: - log_level: info - http_server: on - http_listen: 0.0.0.0 - http_port: 2021 - pipeline: - inputs: - - name: random - tag: test-tag - interval_sec: 1 - processors: - logs: - - name: modify - add: hostname monox - - name: lua - call: append_tag - code: | - function append_tag(tag, timestamp, record) - new_record = record - new_record["tag"] = tag - return 1, timestamp, new_record - end - outputs: - - name: stdout - match: '*' - processors: - logs: - - name: lua - call: add_field - code: | - function add_field(tag, timestamp, record) - new_record = record - new_record["output"] = "new data" - return 1, timestamp, new_record - end -``` diff --git a/administration/configuring-fluent-bit/yaml/pipeline-section.md b/administration/configuring-fluent-bit/yaml/pipeline-section.md index 73dd9374d..3dbaf9f9a 100644 --- a/administration/configuring-fluent-bit/yaml/pipeline-section.md +++ b/administration/configuring-fluent-bit/yaml/pipeline-section.md @@ -14,156 +14,135 @@ Unlike filters, processors and parsers aren't defined within a unified section o {% endhint %} -## Example configuration +## Format -Here's an example of a pipeline configuration: +A `pipeline` section will define a complete pipeline configuration, including `inputs`, `filters`, and `outputs` subsections. -{% tabs %} -{% tab title="fluent-bit.yaml" %} +```yaml +pipeline: + inputs: + ... + filters: + ... + outputs: + ... +``` + +Each of the subsections for `inputs`, `filters`, and `outputs` constitutes an array of maps that has the parameters for each. Most properties are either strings or numbers and can be defined directly. + +For example: ```yaml pipeline: inputs: - name: tail - path: /var/log/example.log - parser: json + tag: syslog + path: /var/log/syslog + - name: http + tag: http_server + port: 8080 +``` - processors: - logs: - - name: record_modifier +This pipeline consists of two `inputs`: a tail plugin and an HTTP server plugin. 
Each plugin has its own map in the array of `inputs` consisting of basic properties. To use more advanced properties that consist of multiple values, the property itself can be defined using an array, such as the `record` and `allowlist_key` properties for the `record_modifier` `filter`: +```yaml +pipeline: + inputs: + - name: tail + tag: syslog + path: /var/log/syslog filters: - - name: grep - match: '*' - regex: key pattern - - outputs: - - name: stdout - match: '*' + - name: record_modifier + match: syslog + record: + - powered_by calyptia + - name: record_modifier + match: syslog + allowlist_key: + - powered_by + - message ``` -{% endtab %} -{% endtabs %} +In cases where each item in a list requires two values, the values must be separated by a space, such as in the `record` property for the `record_modifier` filter. -## Pipeline processors +### Input -Processors operate on specific signals such as logs, metrics, and traces. They're attached to an input plugin and must specify the signal type they will process. +An `input` section defines a source (related to an input plugin). Each section has a base configuration. Each input plugin can add its own configuration keys: -### Example of a processor +| Key | Description | +| --- |------------ | +| `Name` | Name of the input plugin. Defined as a subsection of the `inputs` section. | +| `Tag` | Tag name associated with all records coming from this plugin. | +| `Log_Level` | Set the plugin's logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Defaults to the `SERVICE` section's `Log_Level`. | -In the following example, the `content_modifier` processor inserts or updates (upserts) the key `my_new_key` with the value `123` for all log records generated by the tail plugin. This processor is only applied to log signals: +`Name` is mandatory and lets Fluent Bit know which input plugin should be loaded. 
`Tag` is mandatory for all plugins except for the `input forward` plugin which provides dynamic tags. -{% tabs %} -{% tab title="fluent-bit.yaml" %} +#### Example input -```yaml -parsers: - - name: json - format: json +The following is an example of an `input` section for the `cpu` plugin. +```yaml pipeline: - inputs: - - name: tail - path: /var/log/example.log - parser: json - - processors: - logs: - - name: content_modifier - action: upsert - key: my_new_key - value: 123 - - filters: - - name: grep - match: '*' - regex: key pattern - - outputs: - - name: stdout - match: '*' + inputs: + - name: cpu + tag: my_cpu ``` -{% endtab %} -{% endtabs %} +### Filter -Here is a more complete example with multiple processors: +A `filter` section defines a filter (related to a filter plugin). Each section has a base configuration and each filter plugin can add its own configuration keys: -{% tabs %} -{% tab title="fluent-bit.yaml" %} +| Key | Description | +| ----------- | ------------------------------------------------------------ | +| `Name` | Name of the filter plugin. Defined as a subsection of the `filters` section. | +| `Match` | A pattern to match against the tags of incoming records. It's case-sensitive and supports the star (`*`) character as a wildcard. | +| `Match_Regex` | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regular expression syntax. | +| `Log_Level` | Set the plugin's logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Defaults to the `SERVICE` section's `Log_Level`. | -```yaml -service: - log_level: info - http_server: on - http_listen: 0.0.0.0 - http_port: 2021 +`Name` is mandatory and lets Fluent Bit know which filter plugin should be loaded. The `Match` or `Match_Regex` is mandatory for all plugins. If both are specified, `Match_Regex` takes precedence. 
-pipeline: - inputs: - - name: random - tag: test-tag - interval_sec: 1 +#### Example filter - processors: - logs: - - name: modify - add: hostname monox - - - name: lua - call: append_tag - code: | - function append_tag(tag, timestamp, record) - new_record = record - new_record["tag"] = tag - return 1, timestamp, new_record - end +The following is an example of a `filter` section for the `grep` plugin: - outputs: - - name: stdout +```yaml +pipeline: + filters: + - name: grep match: '*' - - processors: - logs: - - name: lua - call: add_field - code: | - function add_field(tag, timestamp, record) - new_record = record - new_record["output"] = "new data" - return 1, timestamp, new_record - end + regex: log aa ``` -{% endtab %} -{% endtabs %} - -Processors can be attached to inputs and outputs. - -### How processors are different from filters +### Output -While processors and filters are similar in that they can transform, enrich, or drop data from the pipeline, there is a significant difference in how they operate: +The `outputs` section specifies a destination that certain records should follow after a `Tag` match. Fluent Bit can route up to 256 `OUTPUT` plugins. The configuration supports the following keys: -- Processors: Run in the same thread as the input plugin when the input plugin is configured to be threaded (threaded: true). This design provides better performance, especially in multi-threaded setups. +| Key | Description | +| ----------- | ------------------------------------------------------------ | +| `Name` | Name of the output plugin. Defined as a subsection of the `outputs` section. | +| `Match` | A pattern to match against the tags of incoming records. It's case-sensitive and supports the star (`*`) character as a wildcard. | +| `Match_Regex` | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regular expression syntax. | +| `Log_Level` | Set the plugin's logging verbosity level. 
Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. The output log level defaults to the `SERVICE` section's `Log_Level`. | -- Filters: Run in the main event loop. When multiple filters are used, they can introduce performance overhead, particularly under heavy workloads. +#### Example output -## Running filters as processors +The following is an example of an `output` section: -You can configure existing [Filters](https://docs.fluentbit.io/manual/pipeline/filters) to run as processors. There are no specific changes needed; you use the filter name as if it were a native processor. +```yaml +pipeline: + outputs: + - name: stdout + match: 'my*cpu' +``` -### Example of a filter running as a processor +## Example configuration -In the following example, the `grep` filter is used as a processor to filter log events based on a pattern: +Here's an example of a pipeline configuration: {% tabs %} {% tab title="fluent-bit.yaml" %} ```yaml -parsers: - - name: json - format: json - pipeline: inputs: - name: tail @@ -172,8 +151,13 @@ pipeline: processors: logs: - - name: grep - regex: log aa + - name: record_modifier + + filters: + - name: grep + match: '*' + regex: key pattern + outputs: - name: stdout match: '*' diff --git a/administration/configuring-fluent-bit/yaml/service-section.md b/administration/configuring-fluent-bit/yaml/service-section.md index 367f6c4b8..ef2aeabc9 100644 --- a/administration/configuring-fluent-bit/yaml/service-section.md +++ b/administration/configuring-fluent-bit/yaml/service-section.md @@ -2,27 +2,31 @@ The `service` section defines global properties of the service. 
The available configuration keys are: -| Key | Description | Default | -|---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---| -| `flush` | Sets the flush time in `seconds.nanoseconds`. The engine loop uses a flush timeout to determine when to flush records ingested by input plugins to output plugins. | `1` | -| `grace` | Sets the grace time in `seconds` as an integer value. The engine loop uses a grace timeout to define the wait time before exiting. | `5` | -| `daemon` | Boolean. Specifies whether Fluent Bit should run as a daemon (background process). Allowed values are: `yes`, `no`, `on`, and `off`. Don't enable when using a Systemd-based unit, such as the one provided in Fluent Bit packages. | `off` | -| `dns.mode` | Sets the primary transport layer protocol used by the asynchronous DNS resolver. Can be overridden on a per-plugin basis. | `UDP` | -| `log_file` | Absolute path for an optional log file. By default, all logs are redirected to the standard error interface (stderr). | _none_ | -| `log_level` | Sets the logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are cumulative. If `debug` is set, it will include `error`, `warn`, `info`, and `debug`. Trace mode is only available if Fluent Bit was built with the _`WITH_TRACE`_ option enabled. | `info` | -| `parsers_file` | Path for a `parsers` configuration file. Multiple `parsers_file` entries can be defined within the section. However, with the new YAML configuration schema, defining parsers using this key is now optional. 
Parsers can be declared directly in the `parsers` section of your YAML configuration, offering a more streamlined and integrated approach. | _none_ | -| `plugins_file` | Path for a `plugins` configuration file. This file specifies the paths to external plugins (.so files) that Fluent Bit can load at runtime. With the new YAML schema, the `plugins_file` key is optional. External plugins can now be referenced directly within the `plugins` section, simplifying the plugin management process. [See an example](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). | _none_ | -| `streams_file` | Path for the Stream Processor configuration file. This file defines the rules and operations for stream processing within Fluent Bit. The `streams_file` key is optional, as Stream Processor configurations can be defined directly in the `streams` section of the YAML schema. This flexibility allows for easier and more centralized configuration. [Learn more about Stream Processing configuration](../../../stream-processing/overview.md). | _none_ | -| `http_server` | Enables the built-in HTTP Server. | `off` | -| `http_listen` | Sets the listening interface for the HTTP Server when it's enabled. | `0.0.0.0` | -| `http_port` | Sets the TCP port for the HTTP Server. | `2020` | -| `hot_reload` | Enables [hot reloading](../../hot-reload.md) of configuration with SIGHUP. | `on` | -| `coro_stack_size` | Sets the coroutine stack size in bytes. The value must be greater than the page size of the running system. Setting the value too small (`4096`) can cause coroutine threads to overrun the stack buffer. The default value of this parameter shouldn't be changed. | `24576` | -| `scheduler.cap` | Sets a maximum retry time in seconds. Supported in v1.8.7 and greater. | `2000` | -| `scheduler.base` | Sets the base of exponential backoff. Supported in v1.8.7 and greater. 
| `5` | -| `json.convert_nan_to_null` | If enabled, `NaN` is converted to `null` when Fluent Bit converts `msgpack` to `json`. | `false` | + +| Key | Description | Default Value | +| --- | ----------- | ------------- | +| `flush` | Sets the flush time in `seconds.nanoseconds`. The engine loop uses a flush timeout to define when to flush the records ingested by input plugins through the defined output plugins. | `1` | +| `grace` | Sets the grace time in `seconds` as an integer value. The engine loop uses a grace timeout to define the wait time on exit. | `5` | +| `daemon` | Specifies whether Fluent Bit should run as a daemon (background process). Possible values: `yes`, `no`, `on`, and `off`. Don't enable when using a Systemd-based unit, such as the one provided in Fluent Bit packages. | `off` | +| `dns.mode` | Sets the primary transport layer protocol used by the asynchronous DNS resolver. Can be overridden on a per-plugin basis. | `UDP` | +| `log_file` | Absolute path for an optional log file. By default, all logs are redirected to the standard error interface (`stderr`). | _none_ | +| `log_level` | Sets the logging verbosity level. Possible values: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are cumulative. For example, if `debug` is set, it will include `error`, `warn`, `info`, and `debug`. The `trace` mode is only available if Fluent Bit was built with the `WITH_TRACE` option enabled. | `info` | +| `parsers_file` | Path for a parsers configuration file. Multiple `parsers_file` entries can be defined within the section. Parsers can be declared directly in the [`parsers` section](../administration/configuring-fluent-bit/yaml/parsers-section.md) of YAML configuration files. | _none_ | +| `plugins_file` | Path for a `plugins` configuration file. This file specifies the paths to custom plugins (.so files) that Fluent Bit can load at runtime. 
Plugins can be declared directly in the [`plugins` section](../administration/configuring-fluent-bit/yaml/plugins-section.md) of YAML configuration files. | _none_ | +| `streams_file` | Path for the [stream processor](../stream-processing/overview.md) configuration file. This file defines the rules and operations for stream processing in Fluent Bit. Stream processor configurations can also be defined directly in the `streams` section of YAML configuration files. | _none_ | +| `http_server` | Enables the built-in HTTP server. | `off` | +| `http_listen` | Sets the listening interface for the HTTP Server when it's enabled. | `0.0.0.0` | +| `http_port` | Sets the TCP port for the HTTP server. | `2020` | +| `hot_reload` | Enables [hot reloading](../administration/hot-reload.md) of configuration with SIGHUP. | `on` | +| `coro_stack_size` | Sets the coroutines stack size in bytes. The value must be greater than the page size of the running system. Setting the value too small (for example, `4096`) can cause coroutine threads to overrun the stack buffer. For best results, don't change this parameter from its default value. | `24576` | +| `scheduler.cap` | Sets a maximum retry time in seconds. | `2000` | +| `scheduler.base` | Sets the base of exponential backoff. | `5` | +| `json.convert_nan_to_null` | f enabled, `NaN` is converted to `null` when Fluent Bit converts msgpack to json. | `false` | | `json.escape_unicode` | Controls how Fluent Bit serializes non‑ASCII / multi‑byte Unicode characters in JSON strings. When enabled, Unicode characters are escaped as `\uXXXX` sequences (characters outside BMP become surrogate pairs). When disabled, Fluent Bit emits raw UTF‑8 bytes. | `true` | -| `sp.convert_from_str_to_num` | If enabled, the Stream Processor converts strings that represent numbers to a numeric type. | `true` | +| `sp.convert_from_str_to_num` | If enabled, the stream processor converts strings that represent numbers to a numeric type. 
| `true` | +| `windows.maxstdio` | If specified, adjusts the limit of `stdio`. Only provided for Windows. Values from `512` to `2048` are allowed. | `512` | + +For scheduler and retry details, see [scheduling and retries](../../scheduling-and-retries.md#Scheduling-and-Retries). ## Configuration example diff --git a/pipeline/processors.md b/pipeline/processors.md index 1c22bce2d..daf1ef9ff 100644 --- a/pipeline/processors.md +++ b/pipeline/processors.md @@ -1,8 +1,8 @@ # Processors -Processors are components that modify, transform, or enhance data as it flows through Fluent Bit. Unlike [filters](filters.md), processors are tightly coupled to inputs, which means they execute immediately and avoid creating a performance bottleneck. +Processors are components that modify, transform, or enhance the data that flows through Fluent Bit. -Additionally, filters can be implemented in a way that mimics the behavior of processors, but processors can't be implemented in a way that mimics filters. +Each input plugin or output plugin can have one or more attached processors. Processors can modify logs, metrics, and traces. {% hint style="info" %} @@ -20,10 +20,58 @@ Fluent Bit offers the following processors: - [OpenTelemetry envelope](./processors/opentelemetry-envelope.md): Transform logs into an OpenTelemetry-compatible format. - [Sampling](./processors/sampling.md): Apply head or tail sampling to incoming traces. - [SQL](./processors/sql.md): Use SQL queries to extract log content. -- [Filters as processors](filters.md): Use filters as processors. +- [Filters as processors](./processors/filters.md): Use filters as processors. ## Features Compatible processors include the following features: - [Conditional processing](./processors/conditional-processing.md): Selectively apply processors to logs based on the value of fields that those logs contain. 
+ +## How processors are different from filters + +Although processors and filters both transform data, they're different in many ways. + +Processors are attached to individual input and output plugins, and don't use tag matching. Filters are defined globally, and _do_ use tag matching. + +Processors run in the same thread as their associated input or output plugin. This lets processors execute immediately and helps reduce performance bottlenecks, especially when [multithreading](../administration/multithreading.md) is enabled. Filters always run in the main thread, and using multiple filters can introduce performance overhead, particularly under heavy workloads. + +Additionally, filters can be implemented in a way that mimics the behavior of processors, but processors can't be implemented in a way that mimics filters. + +## Example configuration + +In the following example, the [content modifier](../data-pipeline/processors/content-modifier.md) processor inserts or updates (upserts) the key `my_new_key` with the value `123` for all log records generated by the tail plugin. This processor is only applied to logs. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +parsers: + - name: json + format: json + +pipeline: + inputs: + - name: tail + path: /var/log/example.log + parser: json + + processors: + logs: + - name: content_modifier + action: upsert + key: my_new_key + value: 123 + + filters: + - name: grep + match: '*' + regex: key pattern + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% endtabs %} diff --git a/pipeline/processors/filters.md b/pipeline/processors/filters.md index 3ed879f3c..8d0bd6d1c 100644 --- a/pipeline/processors/filters.md +++ b/pipeline/processors/filters.md @@ -8,28 +8,83 @@ Only [YAML configuration files](../../administration/configuring-fluent-bit/yaml {% endhint %} -## Grep example +## Examples -In this example, the [Grep](../filters/grep.md) filter is an output processor that sends log records only if they match a specified regular expression. +The following examples show how to configure filters as processors. + +### Grep + +In this example, the [Grep](../filters/grep.md) filter is used as an input processor to filter log events based on a regular expression pattern: {% tabs %} {% tab title="fluent-bit.yaml" %} ```yaml +parsers: + - name: json + format: json + pipeline: - inputs: - - name: tail - path: lines.txt - parser: json - - outputs: - - name: stdout - match: '*' - - processors: - logs: - - name: grep - regex: log aa + inputs: + - name: tail + path: /var/log/example.log + parser: json + + processors: + logs: + - name: grep + regex: log aa + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% endtabs %} + +### Lua + +In this example configuration, an input plugin uses the [Lua](../data-pipeline/filters/lua.md) filter as a processor to add a new key `hostname` with the value `monox`. Then, an output plugin adds a new key named `output` with the value `new data`. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml + service: + log_level: info + http_server: on + http_listen: 0.0.0.0 + http_port: 2021 + pipeline: + inputs: + - name: random + tag: test-tag + interval_sec: 1 + processors: + logs: + - name: modify + add: hostname monox + - name: lua + call: append_tag + code: | + function append_tag(tag, timestamp, record) + new_record = record + new_record["tag"] = tag + return 1, timestamp, new_record + end + outputs: + - name: stdout + match: '*' + processors: + logs: + - name: lua + call: add_field + code: | + function add_field(tag, timestamp, record) + new_record = record + new_record["output"] = "new data" + return 1, timestamp, new_record + end ``` {% endtab %} From 9ddef534b772a2d4a7e4a68d3ad6c9d46fc4e79c Mon Sep 17 00:00:00 2001 From: Alexa Kreizinger Date: Fri, 21 Nov 2025 10:10:51 -0800 Subject: [PATCH 2/3] remove/move placeholder info Signed-off-by: Alexa Kreizinger --- administration/configuring-fluent-bit/yaml.md | 38 ------------------- .../yaml/pipeline-section.md | 2 +- 2 files changed, 1 insertion(+), 39 deletions(-) diff --git a/administration/configuring-fluent-bit/yaml.md b/administration/configuring-fluent-bit/yaml.md index d498f0c64..16297bb3a 100644 --- a/administration/configuring-fluent-bit/yaml.md +++ b/administration/configuring-fluent-bit/yaml.md @@ -20,41 +20,3 @@ YAML configuration files support the following top-level sections: {% hint style="info" %} YAML configuration is used in the smoke tests for containers. An always-correct up-to-date example is here: . {% endhint %} - ----- - - -## List of available sections - -Configuring Fluent Bit with YAML introduces the following root-level sections: - -| Section Name | Description | -|--------------|-------------| -| `service` | Describes the global configuration for the Fluent Bit service. Optional. If not set, default values will apply. Only one `service` section can be defined. 
| -| `parsers` | Lists parsers to be used by components like inputs, processors, filters, or output plugins. You can define multiple `parsers` sections, which can also be loaded from external files included in the main YAML configuration. | -| `multiline_parsers` | Lists multiline parsers, functioning similarly to `parsers`. Multiple definitions can exist either in the root or in included files. | -| `pipeline` | Defines a pipeline composed of inputs, processors, filters, and output plugins. You can define multiple `pipeline` sections, but they won't operate independently. Instead, all components will be merged into a single pipeline internally. | -| `plugins` | Specifies the path to external plugins (`.so` files) to be loaded by Fluent Bit at runtime. | -| `upstream_servers` | Refers to a group of node endpoints that can be referenced by output plugins that support this feature. | -| `env` | Sets a list of environment variables for Fluent Bit. System environment variables are available, while the ones defined in the configuration apply only to Fluent Bit. | - -## Section documentation - -To access detailed configuration guides for each section, use the following links: - -- [Service Section documentation](./yaml/service-section.md) - - Overview of global settings, configuration options, and examples. -- [Parsers Section documentation](./yaml/parsers-section.md) - - Detailed guide on defining parsers and supported formats. -- [Multiline Parsers Section documentation](./yaml/multiline-parsers-section.md) - - Explanation of multiline parsing configuration. -- [Pipeline Section documentation](./yaml/pipeline-section.md) - - Details on setting up pipelines and using processors. -- [Plugins Section documentation](./yaml/plugins-section.md) - - How to load external plugins. -- [Upstream Servers Section documentation](./yaml/upstream-servers-section.md) - - Guide on setting up and using upstream nodes with supported plugins. 
-- [Environment Variables Section documentation](./yaml/environment-variables-section.md) - - Information on setting environment variables and their scope within Fluent Bit. -- [Includes Section documentation](./yaml/includes-section.md) - - Description on how to include external YAML files. diff --git a/administration/configuring-fluent-bit/yaml/pipeline-section.md b/administration/configuring-fluent-bit/yaml/pipeline-section.md index 3dbaf9f9a..e14647386 100644 --- a/administration/configuring-fluent-bit/yaml/pipeline-section.md +++ b/administration/configuring-fluent-bit/yaml/pipeline-section.md @@ -16,7 +16,7 @@ Unlike filters, processors and parsers aren't defined within a unified section o ## Format -A `pipeline` section will define a complete pipeline configuration, including `inputs`, `filters`, and `outputs` subsections. +A `pipeline` section will define a complete pipeline configuration, including `inputs`, `filters`, and `outputs` subsections. You can define multiple `pipeline` sections, but they won't operate independently. Instead, all components will be merged into a single pipeline internally. ```yaml pipeline: From 8fdade7dbca47189a269c28c57fff2b6f6f9fb03 Mon Sep 17 00:00:00 2001 From: Alexa Kreizinger Date: Fri, 21 Nov 2025 11:55:19 -0800 Subject: [PATCH 3/3] Apply suggestions from code review Signed-off-by: Alexa Kreizinger --- administration/configuring-fluent-bit/yaml/service-section.md | 2 +- pipeline/processors.md | 2 +- pipeline/processors/filters.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/administration/configuring-fluent-bit/yaml/service-section.md b/administration/configuring-fluent-bit/yaml/service-section.md index ef2aeabc9..5ee069358 100644 --- a/administration/configuring-fluent-bit/yaml/service-section.md +++ b/administration/configuring-fluent-bit/yaml/service-section.md @@ -21,7 +21,7 @@ The `service` section defines global properties of the service. 
The available co | `coro_stack_size` | Sets the coroutines stack size in bytes. The value must be greater than the page size of the running system. Setting the value too small (for example, `4096`) can cause coroutine threads to overrun the stack buffer. For best results, don't change this parameter from its default value. | `24576` | | `scheduler.cap` | Sets a maximum retry time in seconds. | `2000` | | `scheduler.base` | Sets the base of exponential backoff. | `5` | -| `json.convert_nan_to_null` | f enabled, `NaN` is converted to `null` when Fluent Bit converts msgpack to json. | `false` | +| `json.convert_nan_to_null` | If enabled, `NaN` is converted to `null` when Fluent Bit converts msgpack to JSON. | `false` | | `json.escape_unicode` | Controls how Fluent Bit serializes non‑ASCII / multi‑byte Unicode characters in JSON strings. When enabled, Unicode characters are escaped as `\uXXXX` sequences (characters outside BMP become surrogate pairs). When disabled, Fluent Bit emits raw UTF‑8 bytes. | `true` | | `sp.convert_from_str_to_num` | If enabled, the stream processor converts strings that represent numbers to a numeric type. | `true` | | `windows.maxstdio` | If specified, adjusts the limit of `stdio`. Only provided for Windows. Values from `512` to `2048` are allowed. | `512` | diff --git a/pipeline/processors.md b/pipeline/processors.md index daf1ef9ff..a61cf454e 100644 --- a/pipeline/processors.md +++ b/pipeline/processors.md @@ -40,7 +40,7 @@ Additionally, filters can be implemented in a way that mimics the behavior of pr ## Example configuration -In the following example, the [content modifier](../data-pipeline/processors/content-modifier.md) processor inserts or updates (upserts) the key `my_new_key` with the value `123` for all log records generated by the tail plugin. This processor is only applied to logs. 
+In the following example, the [content modifier](../pipeline/processors/content-modifier.md) processor inserts or updates (upserts) the key `my_new_key` with the value `123` for all log records generated by the tail plugin. This processor is only applied to logs. {% tabs %} {% tab title="fluent-bit.yaml" %} diff --git a/pipeline/processors/filters.md b/pipeline/processors/filters.md index 8d0bd6d1c..6193b29e4 100644 --- a/pipeline/processors/filters.md +++ b/pipeline/processors/filters.md @@ -44,7 +44,7 @@ pipeline: ### Lua -In this example configuration, an input plugin uses the [Lua](../data-pipeline/filters/lua.md) filter as a processor to add a new key `hostname` with the value `monox`. Then, an output plugin adds a new key named `output` with the value `new data`. +In this example configuration, an input plugin uses the [Lua](../pipeline/filters/lua.md) filter as a processor to add a new key `hostname` with the value `monox`. Then, an output plugin adds a new key named `output` with the value `new data`. {% tabs %} {% tab title="fluent-bit.yaml" %}