diff --git a/docs/src/assets/img/enterprise/lakefs-enterprise-architecture.png b/docs/src/assets/img/enterprise/lakefs-enterprise-architecture.png new file mode 100644 index 00000000000..ce1163fae9b Binary files /dev/null and b/docs/src/assets/img/enterprise/lakefs-enterprise-architecture.png differ diff --git a/docs/src/enterprise/architecture.md b/docs/src/enterprise/architecture.md index 66ce1d3a045..63e32729c4e 100644 --- a/docs/src/enterprise/architecture.md +++ b/docs/src/enterprise/architecture.md @@ -1,39 +1,27 @@ --- title: Architecture -description: lakeFS Enterprise architecture explained!!!!!1 +description: lakeFS Enterprise architecture explained! --- # Architecture -!!! warning - fluffy will be deprecated in the upcoming versions and all functionality will be migrated into lakeFS Enterprise -The lakeFS Enterprise software consists of two components: -1. lakeFS Enterprise: a proprietary version of lakeFS which is based on the OSS version and includes advanced functionality. -2. A proprietary component called **Fluffy** which includes lakeFS' Enterprise features. +lakeFS Enterprise extends the open-source lakeFS foundation, delivering a complete data versioning and governance solution with seamlessly integrated enterprise features like SSO, RBAC, mounting capabilities, and more. -![img.png](../assets/img/enterprise/enterprise-arch.png) +![img.png](../assets/img/enterprise/lakefs-enterprise-architecture.png) [1] Any user request to lakeFS via Browser or Programmatic access (SDK, HTTP API, lakectl). -[2] Reverse Proxy (e.g. NGINX, Traefik, K8S Ingress): will handle user requests -and proxy between lakeFS server and fluffy server based on the path prefix -while maintaining the same host. +[2] A reverse proxy (e.g., NGINX, Traefik, Kubernetes Ingress, or a load balancer) distributes requests between lakeFS server instances and handles concerns such as SSL termination. Required when running more than one lakeFS instance. -[3] lakeFS server - the main lakeFS service. 
+[3] lakeFS Enterprise - lakeFS with additional enterprise functionality, including advanced security, SSO authentication, RBAC authorization, compliance, audit logging, and enterprise support. -[4] fluffy server - service that is responsible for the Enterprise features., -it is separated by ports for security reasons. +[4] The [KV Store](../understand/architecture.md) - Where metadata is stored, used by both core lakeFS and enterprise features. -1. SSO auth (i.e Browser login via Azure AD, Okta, Auth0), default port 8000. -1. RBAC authorization, default port 9000. - -[5] The [KV Store](../understand/architecture.md) - Where metadata is stored used both by lakeFS and fluffy. - -[6] SSO IdP - Identity provider (e.g. Azure AD, Okta, JumpCloud). fluffy -implements SAML and Oauth2 protocols. +[5] SSO IdP - External identity provider (e.g. Azure AD, Okta, JumpCloud). +lakeFS Enterprise implements SAML, OAuth2, and OIDC protocols. For more details and pricing, please [contact sales](https://lakefs.io/contact-sales/). diff --git a/docs/src/enterprise/configuration.md b/docs/src/enterprise/configuration.md index a8212fa897a..535250149de 100644 --- a/docs/src/enterprise/configuration.md +++ b/docs/src/enterprise/configuration.md @@ -1,20 +1,99 @@ --- title: Configuration Reference -description: a configuration reference for lakeFS Enterprise +description: A configuration reference for lakeFS Enterprise --- # lakeFS Enterprise Configuration Reference -!!! warning - fluffy will be deprecated in the upcoming versions and all functionality will be migrated into lakeFS Enterprise -Working with lakeFS Enterprise involve configuring both lakeFS and Fluffy. You can find the extended configuration references for both components below. +lakeFS Enterprise configuration extends lakeFS's configuration and uses the same config file. ## lakeFS Configuration For a complete list of configuration options, see the [lakeFS Server Configuration](../reference/configuration.md). 
The sections below provide additional configuration references that complement the main configuration guide. +### Reference + +This reference uses `.` to denote the nesting of values. + + +### auth + +Configuration section for authentication services, like SAML or OIDC. + +* `auth.logout_redirect_url` `(string : "/auth/login")` - The URL to redirect to after logout. The behavior depends on the authentication provider: + - **For OIDC**: The logout URL of the OIDC provider (e.g., Auth0 logout endpoint) + - **For SAML**: The URL within lakeFS where the IdP should redirect after logout (e.g., `/auth/login`) + +### auth.providers + +Configuration section for external identity providers + +#### auth.providers.ldap + +* `auth.providers.ldap.server_endpoint` `(string : "")` - The LDAP server address, e.g. `'ldaps://ldap.company.com:636'` +* `auth.providers.ldap.bind_dn` `(string : "")` - The bind string, e.g. `'uid=,ou=Users,o=,dc=,dc=com'` +* `auth.providers.ldap.bind_password` `(string : "")` - The password for the user to bind +* `auth.providers.ldap.username_attribute` `(string : "")` - The user name attribute, e.g. `'uid'` +* `auth.providers.ldap.user_base_dn` `(string : "")` - The search request base DN, e.g. `'ou=Users,o=,dc=,dc=com'` +* `auth.providers.ldap.user_filter` `(string : "")` - The search request user filter, e.g. `'(objectClass=inetOrgPerson)'` +* `auth.providers.ldap.connection_timeout_seconds` `(int : 0)` - The timeout for a single connection +* `auth.providers.ldap.request_timeout_seconds` `(int : 0)` - The timeout for a single request +* `auth.providers.ldap.default_user_group` `(string : "")` - The default group for the users initially authenticated by the remote service + +#### auth.providers.saml + +Configuration section for SAML + +* `auth.providers.saml.sp_root_url` `(string : '')` - The base lakeFS-URL, e.g. 
`'https://'` +* `auth.providers.saml.sp_x509_key_path` `(string : '')` - The path to the private key, e.g. `'/etc/saml_certs/rsa_saml_private.cert'` +* `auth.providers.saml.sp_x509_cert_path` `(string : '')` - The path to the public key, e.g. `'/etc/saml_certs/rsa_saml_public.pem'` +* `auth.providers.saml.sp_sign_request` `(bool : false)` - Some IdPs require the SLO request to be signed +* `auth.providers.saml.sp_signature_method` `(string : '')` - Optional valid signature values depending on the IdP configuration, e.g. `'http://www.w3.org/2001/04/xmldsig-more#rsa-sha256'` +* `auth.providers.saml.idp_metadata_url` `(string : '')` - The URL for the metadata server, e.g. `'https:///federationmetadata/2007-06/federationmetadata.xml'` +* `auth.providers.saml.idp_metadata_file_path` `(string : '')` - The path to the Identity Provider (IdP) metadata XML file, e.g. `'/etc/saml/idp-metadata.xml'` +* `auth.providers.saml.idp_skip_verify_tls_cert` `(bool : false)` - Insecure: skips verification of the IdP TLS certificate, e.g. when it is signed by a private CA +* `auth.providers.saml.idp_authn_name_id_format` `(string : 'urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified')` - The format used in the NameIDPolicy for authentication requests +* `auth.providers.saml.idp_request_timeout` `(duration : '10s')` - The timeout for remote authentication requests +* `auth.providers.saml.post_login_redirect_url` `(string : '')` - The URL to redirect users to after successful SAML authentication, e.g. `'http://localhost:8000/'` + +#### auth.providers.oidc + +Configuration section for OIDC + +* `auth.providers.oidc.url` `(string : '')` - The OIDC provider URL, e.g. 
`'https://oidc-provider-url.com/'` +* `auth.providers.oidc.client_id` `(string : '')` - The application's ID +* `auth.providers.oidc.client_secret` `(string : '')` - The application's secret +* `auth.providers.oidc.callback_base_url` `(string : '')` - A default callback address of the lakeFS server +* `auth.providers.oidc.callback_base_urls` `(string[] : '[]')` - If `callback_base_urls` is configured, lakeFS checks that the current host is in the list; if it is not configured, `callback_base_url` (without the 's') is used. These config keys are mutually exclusive + +!!! note + You may configure a list of URLs that the OIDC provider may redirect to. This allows lakeFS to be accessed from multiple hostnames while retaining federated auth capabilities. + If the provider redirects to a URL not in this list, the login will fail. This property and callback_base_url are mutually exclusive. + +* `auth.providers.oidc.authorize_endpoint_query_parameters` `(map[string]string : {} )` - Key/value parameters that are passed to a provider's authorization endpoint +* `auth.providers.oidc.logout_endpoint_query_parameters` `(string[] : [])` - The query parameters that will be used to redirect the user to the OIDC provider after logout, e.g. `["returnTo", "https:///oidc/login"]` +* `auth.providers.oidc.logout_client_id_query_parameter` `(string : '')` - The claim name that represents the client identifier in the OIDC provider +* `auth.providers.oidc.additional_scope_claims` `(string[] : '[]')` - Specifies optional requested permissions, other than `openid` and `profile`, that are being used +* `auth.providers.oidc.post_login_redirect_url` `(string : '')` - The URL to redirect users to after successful OIDC authentication, e.g. 
`'http://localhost:8000/'` + +### auth.external + +Configuration section for the external authentication methods + +#### auth.external.aws_auth + +Configuration section for authenticating to lakeFS using an AWS presigned GetCallerIdentity request: [External Principals AWS Auth](../security/external-principals-aws.md) + +* `auth.external.aws_auth.enabled` `(bool : false)` - If true, the external principals API will be enabled, e.g. the auth service and login APIs +* `auth.external.aws_auth.get_caller_identity_max_age` `(duration : 15m)` - The maximum age in seconds for the GetCallerIdentity request to be valid. The maximum is 15 minutes, enforced by AWS; a smaller TTL can be set +* `auth.external.aws_auth.valid_sts_hosts` `([]string)` - Defaults to all the valid AWS STS hosts (`sts.amazonaws.com`, `sts.us-east-2.amazonaws.com` etc.) +* `auth.external.aws_auth.required_headers` `(map[string]string : )` - Headers that the client must send with the login request. For security reasons it is recommended to set `X-LakeFS-Server-ID: `; lakeFS clients assume that's the default +* `auth.external.aws_auth.optional_headers` `(map[string]string : )` - Optional headers that the client may send with the login request +* `auth.external.aws_auth.http_client.timeout` `(duration : 10s)` - The timeout for the HTTP client used to communicate with AWS STS +* `auth.external.aws_auth.http_client.skip_verify` `(bool : false)` - Skip SSL verification with AWS STS + ### blockstores !!! 
info @@ -78,142 +157,23 @@ The sections below provide additional configuration references that complement t * `blockstores.stores[].gs.server_side_encryption_customer_supplied` `(string : )` - Server side encryption with AES key in hex format, exclusive with key ID below * `blockstores.stores[].gs.server_side_encryption_kms_key_id` `(string : )` - Server side encryption KMS key ID, exclusive with above -## Fluffy Server Configuration - -Configuring Fluffy using a YAML configuration file and/or environment variables. -The configuration file's location can be set with the '--config' flag. If not specified, the first file found in the following order will be used: +### features -1. ./config.yaml -1. `$HOME`/fluffy/config.yaml -1. /etc/fluffy/config.yaml -1. `$HOME`/.fluffy.yaml +* `features.local_rbac` `(bool: true)` - Backward compatibility if you use an external RBAC service (such as legacy fluffy). If `false` lakeFS will expect to use `auth.api` and all fluffy related configuration for RBAC. -Configuration items can be controlled by environment variables, see [below](#using-environment-variables). +### iceberg_catalog -### Reference - -This reference uses `.` to denote the nesting of values. +Configuration section for the Iceberg REST Catalog -* `logging.format` `(one of ["json", "text"] : "text")` - Format to output log message in -* `logging.level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "INFO")` - Logging level to output -* `logging.audit_log_level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "DEBUG")` - Audit logs level to output. - - !!! note - In case you configure this field to be lower than the main logger level, you won't be able to get the audit logs - -* `logging.output` `(string : "-")` - A path or paths to write logs to. A `-` means the standard output, `=` means the standard error. -* `logging.file_max_size_mb` `(int : 100)` - Output file maximum size in megabytes. 
-* `logging.files_keep` `(int : 0)` - Number of log files to keep, default is all. -* `logging.trace_request_headers` `(bool : false)` - If set to `true` and logging level is set to `TRACE`, logs request headers. -* `listen_address` `(string : "0.0.0.0:8000")` - A `:` structured string representing the address to listen on -* `database` - Configuration section for the Fluffy key-value store database. The database must be shared between lakeFS & Fluffy - + `database.type` `(string ["postgres"|"dynamodb"|"cosmosdb"|"local"] : )` - Fluffy database type - + `database.postgres` - Configuration section when using `database.type="postgres"` - + `database.postgres.connection_string` `(string : "postgres://localhost:5432/postgres?sslmode=disable")` - PostgreSQL connection string to use - + `database.postgres.max_open_connections` `(int : 25)` - Maximum number of open connections to the database - + `database.postgres.max_idle_connections` `(int : 25)` - Maximum number of connections in the idle connection pool - + `database.postgres.connection_max_lifetime` `(duration : 5m)` - Sets the maximum amount of time a connection may be reused `(valid units: ns|us|ms|s|m|h)` - + `database.dynamodb` - Configuration section when using `database.type="dynamodb"` - + `database.dynamodb.table_name` `(string : "kvstore")` - Table used to store the data - + `database.dynamodb.scan_limit` `(int : 1025)` - Maximal number of items per page during scan operation - - !!! 
note - Refer to the following [AWS documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.Limit) for further information - - + `database.dynamodb.endpoint` `(string : )` - Endpoint URL for database instance - + `database.dynamodb.aws_region` `(string : )` - AWS Region of database instance - + `database.dynamodb.aws_profile` `(string : )` - AWS named profile to use - + `database.dynamodb.aws_access_key_id` `(string : )` - AWS access key ID - + `database.dynamodb.aws_secret_access_key` `(string : )` - AWS secret access key - - !!! note - `endpoint` `aws_region` `aws_access_key_id` `aws_secret_access_key` are not required and used mainly for experimental purposes when working with DynamoDB with different AWS credentials. - - + `database.dynamodb.health_check_interval` `(duration : 0s)` - Interval to run health check for the DynamoDB instance (won't run if equal to 0). - + `database.cosmosdb` - Configuration section when using `database.type="cosmosdb"` - + `database.cosmosdb.key` `(string : "")` - If specified, will - be used to authenticate to the CosmosDB account. Otherwise, Azure SDK - default authentication (with env vars) will be used. - + `database.cosmosdb.endpoint` `(string : "")` - CosmosDB account endpoint, e.g. `https://.documents.azure.com/`. - + `database.cosmosdb.database` `(string : "")` - CosmosDB database name. - + `database.cosmosdb.container` `(string : "")` - CosmosDB container name. - + `database.cosmosdb.throughput` `(int32 : )` - CosmosDB container's RU/s. If not set - the default CosmosDB container throughput is used. - + `database.cosmosdb.autoscale` `(bool : false)` - If set, CosmosDB container throughput is autoscaled (See CosmosDB docs for minimum throughput requirement). Otherwise, uses "Manual" mode ([Docs](https://learn.microsoft.com/en-us/azure/cosmos-db/provision-throughput-autoscale)). 
- - + `database.local` - Configuration section when using `database.type="local"` - + `database.local.path` `(string : "~/fluffy/metadata")` - Local path on the filesystem to store embedded KV metadata - + `database.local.sync_writes` `(bool: true)` - Ensure each write is written to the disk. Disable to increase performance - + `database.local.prefetch_size` `(int: 256)` - How many items to prefetch when iterating over embedded KV records - + `database.local.enable_logging` `(bool: false)` - Enable trace logging for local driver -* `auth` - Configuration section for the Fluffy authentication services, like SAML or OIDC. - + `auth.encrypt.secret_key` `(string : required)` - Same value given to lakeFS. A random (cryptographically safe) generated string that is used for encryption and HMAC signing - + `auth.logout_redirect_url` `(string : "/auth/login")` - The address to redirect to after a successful logout, e.g. login. - + `auth.post_login_redirect_url` `(string : '')` - Required when SAML is enabled. The address to redirect after a successful login. For most common configurations, setting to `/` will redirect to lakeFS homepage. - + `auth.serve_listen_address` `(string : '')` - If set, an endpoint serving RBAC requests binds to this address. - + `auth.serve_disable_authentication` `(bool : false)` - Unsafe. Disables authentication to the RBAC server. - + `auth.ldap` - + `auth.ldap.server_endpoint` `(string : required)` - The LDAP server address, e.g. 'ldaps://ldap.company.com:636' - + `auth.ldap.bind_dn` `(string : required)` - The bind string, e.g. 'uid=,ou=Users,o=,dc=,dc=com' - + `auth.ldap.bind_password` `(string : required)` - The password for the user to bind. - + `auth.ldap.username_attribute` `(string : required)` - The user name attribute, e.g. 'uid' - + `auth.ldap.user_base_dn` `(string : required)` - The search request base dn, e.g. 'ou=Users,o=,dc=,dc=com' - + `auth.ldap.user_filter` `(string : required)` - The search request user filter, e.g. 
'(objectClass=inetOrgPerson)' - + `auth.ldap.connection_timeout_seconds` `(int : required)` - The timeout for a single connection - + `auth.ldap.request_timeout_seconds` `(int : required)` - The timeout for a single request - + `auth.saml` Configuration section for SAML - + `auth.saml.enabled` `(bool : false)` - Enables SAML Authentication. - + `auth.saml.sp_root_url` `(string : '')` - The base lakeFS-URL, e.g. 'https://' - + `auth.saml.sp_x509_key_path` `(string : '')` - The path to the private key, e.g '/etc/saml_certs/rsa_saml_private.cert' - + `auth.saml.sp_x509_cert_path` `(string : '')` - The path to the public key, '/etc/saml_certs/rsa_saml_public.pem' - + `auth.saml.sp_sign_request` `(bool : 'false')` SPSignRequest some IdP require the SLO request to be signed - + `auth.saml.sp_signature_method` `(string : '')` SPSignatureMethod optional valid signature values depending on the IdP configuration, e.g. 'http://www.w3.org/2001/04/xmldsig-more#rsa-sha256' - + `auth.saml.idp_metadata_url` `(string : '')` - The URL for the metadata server, e.g. 'https:///federationmetadata/2007-06/federationmetadata.xml' - + `auth.saml.idp_skip_verify_tls_cert` `(bool : false)` - Insecure skip verification of the IdP TLS certificate, like when signed by a private CA - + `auth.saml.idp_authn_name_id_format` `(string : 'urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified')` - The format used in the NameIDPolicy for authentication requests - + `auth.saml.idp_request_timeout` `(duration : '10s')` The timeout for remote authentication requests. - + `auth.saml.external_user_id_claim_name` `(string : '')` - The claim name to use as the user identifier with an IdP mostly for logout - + `auth.oidc` Configuration section for OIDC - + `auth.oidc.enabled` `(bool : false)` - Enables OIDC Authentication. - + `auth.oidc.url` `(string : '')` - The OIDC provider url, e.g. 'https://oidc-provider-url.com/' - + `auth.oidc.client_id` `(string : '')` - The application's ID. 
- + `auth.oidc.client_secret` `(string : '')` - The application's secret. - + `auth.oidc.callback_base_url` `(string : '')` - A default callback address of the Fluffy server. - + `auth.oidc.callback_base_urls` `(string[] : '[]')` - - !!! note - You may configure a list of URLs that the OIDC provider may redirect to. This allows lakeFS to be accessed from multiple hostnames while retaining federated auth capabilities. - If the provider redirects to a URL not in this list, the login will fail. This property and callback_base_url are mutually exclusive. - - + `auth.oidc.authorize_endpoint_query_parameters` `(bool : map[string]string)` - key/value parameters that are passed to a provider's authorization endpoint. - + `auth.oidc.logout_endpoint_query_parameters` `(string[] : '[]')` - The query parameters that will be used to redirect the user to the OIDC provider after logout, e.g. '[returnTo, https:///oidc/login]' - + `auth.oidc.logout_client_id_query_parameter` `(string : '')` - The claim name that represents the client identifier in the OIDC provider - + `auth.oidc.additional_scope_claims` `(string[] : '[]')` - Specifies optional requested permissions, other than `openid` and `profile` that are being used. - + `auth.cache` Configuration section for RBAC service cache - + `auth.cache.enabled` `(bool : true)` - Enables RBAC service cache - + `auth.cache.size` `(int : 1024)` - Number of users, policies and credentials to cache. - + `auth.cache.ttl` `(duration : 20s)` - Cache items time to live expiry. - + `auth.cache.jitter` `(duration : 3s)` - Cache items time to live jitter. 
- + `auth.external` - Configuration section for the external authentication methods - + `auth.external.aws_auth` - Configuration section for authenticating to lakeFS using AWS presign get-caller-identity request: [External Principals AWS Auth](../security/external-principals-aws.md) - + `auth.external.aws_auth.enabled` `(bool : false)` - If true, external principals API will be enabled, e.g auth service and login api's. - + `auth.external.aws_auth.get_caller_identity_max_age` `(duration : 15m)` - The maximum age in seconds for the GetCallerIdentity request to be valid, the max is 15 minutes enforced by AWS, smaller TTL can be set. - + `auth.authentication_api.external_principals_enabled` `(bool : false)` - If true, external principals API will be enabled, e.g auth service and login api's. - + `auth.external.aws_auth.valid_sts_hosts` `([]string)` - The default are all the valid AWS STS hosts (`sts.amazonaws.com`, `sts.us-east-2.amazonaws.com` etc). - + `auth.external.aws_auth.required_headers` `(map[string]string : )` - Headers that must be present by the client when doing login request (e.g `X-LakeFS-Server-ID: `). - + `auth.external.aws_auth.optional_headers` `(map[string]string : )` - Optional headers that can be present by the client when doing login request. - + `auth.external.aws_auth.http_client.timeout` `(duration : 10s)` - The timeout for the HTTP client used to communicate with AWS STS. - + `auth.external.aws_auth.http_client.skip_verify` `(bool : false)` - Skip SSL verification with AWS STS. - -* `iceberg_catalog` - Configuration section for the Iceberg REST Catalog - + `iceberg_catalog.token_duration` `(duration : 1h)` - Authenticated token duration +* `iceberg_catalog.token_duration` `(duration : 1h)` - Authenticated token duration ### Using Environment Variables All the configuration variables can be set or overridden using environment variables. 
-To set an environment variable, prepend `FLUFFY_` to its name, convert it to upper case, and replace `.` with `_`: +To set an environment variable, prepend `LAKEFS_` to its name, convert it to upper case, and replace `.` with `_`: -For example, `logging.format` becomes `FLUFFY_LOGGING_FORMAT`, `auth.saml.enabled` becomes `FLUFFY_AUTH_SAML_ENABLED`, etc. +For example, `auth.logout_redirect_url` becomes `LAKEFS_AUTH_LOGOUT_REDIRECT_URL`, `auth.external.aws_auth.enabled` becomes `LAKEFS_AUTH_EXTERNAL_AWS_AUTH_ENABLED`, etc. To set a value for a `map[string]string` type field, use the syntax `key1=value1,key2=value2,...`. diff --git a/docs/src/enterprise/getstarted/install.md b/docs/src/enterprise/getstarted/install.md index 7baaa323931..5f26a24c29b 100644 --- a/docs/src/enterprise/getstarted/install.md +++ b/docs/src/enterprise/getstarted/install.md @@ -10,7 +10,10 @@ description: lakeFS Enterprise Installation Guide ## lakeFS Enterprise Architecture -We recommend to review the [lakeFS Enterprise architecture][lakefs-enterprise-architecture] to understand the components you will be deploying. +We recommend reviewing the [lakeFS Enterprise architecture][lakefs-enterprise-architecture] to understand the components you will be deploying. +!!! note + Fluffy service is deprecated in chart version 1.5.0 and later. + For more information, see the [Upgrade Guide][lakefs-enterprise-upgrade]. ## Deploy lakeFS Enterprise on Kubernetes @@ -26,9 +29,9 @@ The guide includes example configurations, follow the steps below and adjust the 1. You have a Kubernetes cluster running in one of the platforms [supported by lakeFS](../../howto/deploy/index.md#deployment-and-setup-details). 1. [Helm](https://helm.sh/docs/intro/install/) is installed -1. Access to download *dockerhub/fluffy* from [Docker Hub](https://hub.docker.com/u/treeverse). [Contact us](https://lakefs.io/contact-sales/) to gain access to lakeFS Enterprise features. -1. 
A KV Database that will be shared by lakeFS and Fluffy. The available options are dependent in your [deployment platform](../../howto/deploy/index.md#deployment-and-setup-details). -1. A proxy server configured to route traffic between the lakeFS and Fluffy servers, see Reverse Proxy in [lakeFS Enterprise architecture][lakefs-enterprise-architecture]. +1. Access to download *treeverse/lakefs-enterprise* from [Docker Hub](https://hub.docker.com/u/treeverse). [Contact us](https://lakefs.io/contact-sales/) to gain access to lakeFS Enterprise features. +1. A KV Database. The available options depend on your [deployment platform](../../howto/deploy/index.md#deployment-and-setup-details). +1. A method to route traffic into lakeFS from outside of the cluster (via Ingress or Service). #### Optional @@ -41,17 +44,15 @@ Access to configure your SSO IdP [supported by lakeFS Enterprise][lakefs-sso-ent * Add the lakeFS Helm repository with `helm repo add lakefs https://charts.lakefs.io` * The chart contains a values.yaml file you can customize to suit your needs as you follow this guide. Use `helm show values lakefs/lakefs` to see the default values. -* While customizing your values.yaml file, note to configure `fluffy.image.privateRegistry.secretToken` with the token Docker Hub token you received. +* Configure `image.privateRegistry.secretToken` with the Docker Hub token you received. ### Authentication Configuration -Authentication in lakeFS Enterprise is handled by the Fluffy SSO service which runs side-by-side to lakeFS. This section explains -what Fluffy configurations are required for configuring the SSO service. See [this][fluffy-configuration] configuration reference for additional Fluffy configurations. +Authentication in lakeFS Enterprise is handled directly by the lakeFS Enterprise service. This section explains the configurations required for setting up SSO. 
See [SSO for lakeFS Enterprise][lakefs-sso-enterprise-spec] for the supported identity providers and protocols. -The examples below include example configuration for each of the supported SSO protocols. Note the IdP-specific details you'll need to -replace with your IdP details. +The examples below include example configuration for each of the supported SSO protocols. Note the IdP-specific details you'll need to replace with your IdP details. === "OpenID Connect" @@ -61,12 +62,26 @@ replace with your IdP details. The full OIDC configurations explained [here][lakefs-sso-enterprise-spec-oidc]. ```yaml + enterprise: + enabled: true + auth: + oidc: + enabled: true + # secret given by the OIDC provider (e.g auth0, Okta, etc) + client_secret: + + image: + privateRegistry: + enabled: true + secretToken: + lakefsConfig: | logging: - level: "INFO" + level: "INFO" blockstore: type: s3 auth: + logout_redirect_url: https://oidc-provider-url.com/logout/example oidc: # the claim that's provided by the OIDC provider (e.g Okta) that will be used as the username according to OIDC provider claims provided after successful authentication friendly_name_claim_name: "" @@ -74,60 +89,26 @@ replace with your IdP details. 
# if true then the value of friendly_name_claim_name will be refreshed during each login to maintain the latest value # and the the claim value (i.e user name) will be stored in the lakeFS database persist_friendly_name: true - ui_config: - login_cookie_names: - - internal_auth_session - - oidc_auth_session - ingress: - enabled: true - ingressClassName: - hosts: - # the ingress that will be created for lakeFS - - host: - paths: - - / - - ################################################## - ########### lakeFS enterprise - FLUFFY ########### - ################################################## - - fluffy: - enabled: true - image: - repository: treeverse/fluffy - pullPolicy: IfNotPresent - privateRegistry: - enabled: true - secretToken: - fluffyConfig: | - logging: - format: "json" - level: "INFO" - auth: - logout_redirect_url: https://oidc-provider-url.com/logout/example + providers: oidc: - enabled: true - url: https://oidc-provider-url.com/ - client_id: + post_login_redirect_url: / + url: https://oidc-provider-url.com/ + client_id: callback_base_url: https:// # the claim name that represents the client identifier in the OIDC provider (e.g Okta) logout_client_id_query_parameter: client_id - # the query parameters that will be used to redirect the user to the OIDC provider (e.g Okta) after logout + # the query parameters that will be used to redirect the user to the OIDC provider after logout logout_endpoint_query_parameters: - returnTo - https:///oidc/login - secrets: - create: true - sso: - enabled: true - oidc: - enabled: true - # secret given by the OIDC provider (e.g auth0, Okta, etc) store in kind: Secret - client_secret: - rbac: - enabled: true - useDevPostgres: true + ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / ``` === "SAML (With Azure AD)" @@ -145,7 +126,7 @@ replace with your IdP details. 1. 
Add users: **App > Users and groups**: Attach users and roles from their existing AD users list - only attached users will be able to login to lakeFS. 1. Configure SAML: App > Single sign-on > SAML: - 1. Entity ID: Add 2 ID’s, lakefs-url + lakefs-url/saml/metadata (e.g. https://lakefs.acme.com and https://lakefs.acme.com/saml/metadata) + 1. Entity ID: Add 2 ID's, lakefs-url + lakefs-url/saml/metadata (e.g. https://lakefs.acme.com and https://lakefs.acme.com/saml/metadata) 1. Reply URL: lakefs-url/saml (e.g. https://lakefs.acme.com/saml) 1. Sign on URL: lakefs-url/sso/login-saml (e.g. https://lakefs.acme.com/sso/login-saml) 1. Relay State (Optional, controls where to redirect after login): / @@ -153,30 +134,59 @@ replace with your IdP details. #### SAML Configuration 1. Configure SAML application in your IdP (i.e Azure AD) and replace the required parameters into the `values.yaml` below. - 2. To generate certificates keypair use: `openssl req -x509 -newkey rsa:2048 -keyout myservice.key -out myservice.cert -days 365 -nodes -subj "/CN=lakefs.acme.com" - + 2. To generate certificates keypair use: `openssl req -x509 -newkey rsa:2048 -keyout myservice.key -out myservice.cert -days 365 -nodes -subj "/CN=lakefs.acme.com"` ```yaml + enterprise: + enabled: true + auth: + saml: + enabled: true + createCertificateSecret: true # NEW: Auto-creates secret + certificate: + # certificate and private key for the SAML service provider to sign outgoing SAML requests + samlRsaPublicCert: | # RENAMED: from saml_rsa_public_cert + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- + samlRsaPrivateKey: | # RENAMED: from saml_rsa_private_key + -----BEGIN PRIVATE KEY----- + ... 
+ -----END PRIVATE KEY----- + secrets: authEncryptSecretKey: "some random secret string" + image: + privateRegistry: + enabled: true + secretToken: + lakefsConfig: | logging: - level: "DEBUG" + level: "DEBUG" blockstore: type: local auth: + logout_redirect_url: https:// cookie_auth_verification: + auth_source: saml # claim name to use for friendly name in lakeFS UI friendly_name_claim_name: displayName - external_user_id_claim_name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name + external_user_id_claim_name: samName default_initial_groups: - "Developers" - encrypt: - secret_key: shared-secrey-key - ui_config: - login_cookie_names: - - internal_auth_session - - saml_auth_session + providers: + saml: + post_login_redirect_url: https:// + sp_root_url: https:// + sp_sign_request: false + sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256" + idp_metadata_url: "https:///federationmetadata/2007-06/federationmetadata.xml" + # the default id format urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified + # idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" + idp_skip_verify_tls_cert: true + ingress: enabled: true ingressClassName: @@ -184,92 +194,41 @@ replace with your IdP details. 
hosts: - host: paths: - - / - - fluffy: - enabled: true - image: - repository: treeverse/fluffy - pullPolicy: IfNotPresent - privateRegistry: - enabled: true - secretToken: - fluffyConfig: | - logging: - format: "json" - level: "DEBUG" - auth: - # redirect after logout - logout_redirect_url: https:// - saml: - sp_sign_request: false - sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256" - idp_metadata_url: https://login.microsoftonline.com/<...>/federationmetadata/2007-06/federationmetadata.xml?appid= - # the default id format urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified - # idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" - external_user_id_claim_name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name - idp_skip_verify_tls_cert: true - secrets: - create: true - sso: - enabled: true - saml: - enabled: true - createSecret: true - lakeFSServiceProviderIngress: https:// - certificate: - # certificate and private key for the SAML service provider to sign outgoing SAML requests - saml_rsa_public_cert: | - -----BEGIN CERTIFICATE----- - ... - -----END CERTIFICATE----- - saml_rsa_private_key: | - -----BEGIN PRIVATE KEY----- - ... - -----END PRIVATE KEY----- - rbac: - enabled: true + - / ``` + === "LDAP" + The following `values` file will run lakeFS Enterprise with LDAP. !!! tip The full LDAP configurations explained [here][lakefs-sso-enterprise-spec-ldap]. 
```yaml + enterprise: + enabled: true + auth: + ldap: + enabled: true + bindPassword: + + image: + privateRegistry: + enabled: true + secretToken: + lakefsConfig: | logging: - level: "INFO" + level: "INFO" blockstore: type: local auth: - remote_authenticator: - enabled: true - # RBAC group for first time users - default_user_group: "Developers" ui_config: + login_url: /auth/login + logout_url: /logout login_cookie_names: - internal_auth_session - - ingress: - enabled: true - ingressClassName: - hosts: - - host: - paths: - - / - - fluffy: - enabled: true - image: - privateRegistry: - enabled: true - secretToken: - fluffyConfig: | - logging: - level: "INFO" - auth: - post_login_redirect_url: / + providers: ldap: server_endpoint: 'ldaps://ldap.company.com:636' bind_dn: uid=,ou=Users,o=,dc=,dc=com @@ -278,92 +237,69 @@ replace with your IdP details. user_filter: (objectClass=inetOrgPerson) connection_timeout_seconds: 15 request_timeout_seconds: 17 - secrets: - create: true - sso: - enabled: true - ldap: - enabled: true - bind_password: - rbac: - enabled: true + # RBAC group for first time users + default_user_group: "Developers" + + ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / - useDevPostgres: true ``` -See [additional examples on GitHub](https://github.com/treeverse/charts/tree/master/examples/lakefs/enterprise) we provide for each authentication method (oidc, adfs, ldap, rbac, IAM etc). +See [additional examples on GitHub](https://github.com/treeverse/charts/tree/master/examples/lakefs/enterprise) we provide for each authentication method (oidc, saml, ldap, rbac, external AWS IAM). ### Database Configuration -In this section, you will learn how to configure lakeFS Enterprise to work with the KV Database you created (see [prerequisites](#prerequisites). +In this section, you will learn how to configure lakeFS Enterprise to work with the KV Database you created (see [prerequisites](#prerequisites)). 
Notes: -* By default, the lakeFS Helm chart comes with `useDevPostgres: true`, you should change it to `useDevPostgres: false` for Fluffy to work with your KV Database and be suitable for production needs. -* The KV database is shared between lakeFS and Fluffy, and therefore both services must use the same configuration. -* See [fluffy][fluffy-configuration] and [lakeFS](../../reference/configuration.md#database) `database` configuration. +* By default, the lakeFS Helm chart comes with `useDevPostgres: false`; you can change it to `useDevPostgres: true` for dev use. This setup is useful when you want to run a setup with multiple replicas or want to prevent data loss between container restarts. +* See [lakeFS database configuration](../../reference/configuration.md#database). -The database configuration structure between lakeFS and fluffy can be set directly via `fluffyConfig`, via K8S Secret Kind, and `lakefsConfig` or via environment variables. +The database configuration can be set directly via `lakefsConfig`, via K8S Secret Kind, or via environment variables. === "Postgres via environment variables" - This example uses Postgres as KV Database. lakeFS is configured via `lakefsConfig` and Fluffy via environment with the same database configuration. + This example uses Postgres as the KV Database, configured via environment variables. - ```yaml - useDevPostgres: false - lakefsConfig: | - database: - type: postgres - postgres: - connection_string: - - fluffy: - extraEnvVars: - - name: FLUFFY_DATABASE_TYPE - value: postgres - - name: FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING - value: '' + ```yaml + extraEnvVars: + - name: LAKEFS_DATABASE_TYPE + value: postgres + - name: LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING + value: '' ``` -=== "Via fluffyConfig" +=== "Via lakefsConfig" This example uses DynamoDB as KV Database. 
```yaml - - # disable dev postgres - useDevPostgres: false - lakefsConfig: | - database: - type: dynamodb - dynamodb: - table_name: - aws_profile: - fluffyConfig: | - database: + database: type: dynamodb dynamodb: - table_name:
- aws_profile: - aws_region: + table_name:
+ aws_profile: + aws_region: ``` === "Postgres via shared Secret kind" - This example uses Postgres as KV Database. The chart will create a `kind: Secret` holding the database connection string, and the lakeFS and Fluffy will use it. + This example uses Postgres as KV Database. The chart will create a `kind: Secret` holding the database connection string. ```yaml - useDevPostgres: false secrets: - authEncryptSecretKey: shared-key-hello - databaseConnectionString: + authEncryptSecretKey: shared-key-hello + databaseConnectionString: lakefsConfig: | - database: - type: postgres - fluffyConfig: | - database: + database: type: postgres ``` @@ -373,7 +309,7 @@ After populating your values.yaml file with the relevant configuration, in the d ### Access the lakeFS UI -In your browser go to the to the Ingress host to access lakeFS UI. +In your browser, go to the Ingress host to access lakeFS UI. ## Log Collection @@ -381,14 +317,15 @@ The recommended practice for collecting logs would be sending them to the contai and letting an external service to collect them to a sink. An example for logs collector would be [fluentbit](https://fluentbit.io/) that can collect container logs, format them and ship them to a target like S3. -There are 2 kinds of logs, regular logs like an API error or some event description used for debugging -and audit_logs that are describing a user action (i.e create branch). -The distinction between regular logs and audit_logs is in the boolean field log_audit. -lakeFS and fluffy share the same configuration structure under logging.* section in the config. +There are 2 kinds of logs: +- Regular logs like an API error or some event description used for debugging +- Audit logs that describe user actions (i.e create branch) + +The distinction between regular logs and audit_logs is in the boolean field `log_audit`. 
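As an illustration of the `log_audit` distinction described above, here is a minimal Python sketch that splits collected log lines into regular and audit entries. It assumes the logs are emitted as one JSON object per line (for example with `logging.format` set to `"json"`) and that `log_audit` appears as a top-level boolean field; the function name `split_logs` and the sample records are hypothetical and not part of lakeFS.

```python
import json


def split_logs(lines):
    """Split JSON log lines into (regular, audit) lists using `log_audit`."""
    regular, audit = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Not a JSON line (e.g. plain-text output): skip it.
            continue
        if not isinstance(record, dict):
            continue
        # Assumption: audit entries carry a top-level boolean `log_audit: true`.
        if record.get("log_audit") is True:
            audit.append(record)
        else:
            regular.append(record)
    return regular, audit


# Hypothetical sample records, for illustration only.
sample = [
    '{"level": "info", "msg": "create branch", "log_audit": true}',
    '{"level": "error", "msg": "API error"}',
]
regular_logs, audit_logs = split_logs(sample)
print(len(regular_logs), len(audit_logs))
```

A collector such as fluentbit could apply the same kind of filter before shipping, so that audit entries go to a separate sink from debugging logs.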
## Advanced Deployment Configurations -The following example demonstrates a scenario where you need to configure an HTTP proxy for lakeFS and Fluffy, TLS certificates for the Ingress and extending the K8S manifests without forking the Helm chart. +The following example demonstrates a scenario where you need to configure an HTTP proxy for lakeFS, TLS certificates for the Ingress and extending the K8S manifests without forking the Helm chart. ```yaml ingress: @@ -411,24 +348,19 @@ extraEnvVars: - name: HTTPS_PROXY value: 'http://my.company.proxy:8081' -fluffy: - # configure proxy for fluffy - extraEnvVars: - - name: HTTP_PROXY - value: 'http://my.company.proxy:8081' - - name: HTTPS_PROXY - value: 'http://my.company.proxy:8081' - # advanced: extra manifests to extend the K8S resources extraManifests: - apiVersion: v1 kind: ConfigMap metadata: - name: '{% raw %}{{ .Values.fluffy.name }}{% endraw %}-extra-config' + name: '{% raw %}{{ .Values.lakefs.name }}{% endraw %}-extra-config' data: config.yaml: my-data ``` [lakefs-sso-enterprise-spec]: ../../security/sso.md#sso-for-lakefs-enterprise -[fluffy-configuration]: ../../enterprise/configuration.md#fluffy-server-configuration +[lakefs-sso-enterprise-spec-oidc]: ../../security/sso.md#openid-connect +[lakefs-sso-enterprise-spec-saml]: ../../security/sso.md#active-directory-federation-services-ad-fs-using-saml +[lakefs-sso-enterprise-spec-ldap]: ../../security/sso.md#ldap [lakefs-enterprise-architecture]: ../../understand/architecture.md +[lakefs-enterprise-upgrade]: ../../enterprise/upgrade.md \ No newline at end of file diff --git a/docs/src/enterprise/getstarted/migrate-from-oss.md b/docs/src/enterprise/getstarted/migrate-from-oss.md index 73c9ee1f441..5d63a34f620 100644 --- a/docs/src/enterprise/getstarted/migrate-from-oss.md +++ b/docs/src/enterprise/getstarted/migrate-from-oss.md @@ -7,18 +7,17 @@ description: How to migrate from lakeFS OSS to lakeFS Enterprise To migrate from lakeFS Open Source to lakeFS Enterprise, 
follow the steps below: -1. Make sure you have the Fluffy Docker token, if not [contact us](https://lakefs.io/contact-sales/) to gain access to Fluffy. You will be granted with a token that enables downloading *dockerhub/fluffy* from [Docker Hub](https://hub.docker.com/u/treeverse). -1. Update lakeFS docker image to enterprise. Use `treeverse/lakefs-enterprise` instead of `treeverse/lakefs`. The image can be pulled using the same token as Fluffy. +1. Make sure you have the lakeFS Enterprise Docker token. If not, [contact us](https://lakefs.io/contact-sales/) to gain access to lakeFS Enterprise. You will be granted a token that enables downloading *treeverse/lakefs-enterprise* from [Docker Hub](https://hub.docker.com/u/treeverse). +1. Update the lakeFS Docker image to the enterprise version. Replace `treeverse/lakefs` with `treeverse/lakefs-enterprise` in your configuration. The enterprise image can be pulled using your lakeFS Enterprise token. 1. Sanity Test (Optional): Install a new test lakeFS Enterprise before moving your current production setup. Test the setup > login > Create repository etc. Once everything seems to work, delete and cleanup the test setup and we will move to the migration process. -1. Follow lakeFS [Enterprise installation guide][lakefs-enterprise-install] - 1. Make sure that you meet the [prerequisites][lakefs-enterprise-install-prerequisites] - 1. Update your existing `values.yaml` file for your deployment -1. DB Migration: we are going to use the same DB for both lakeFS and Fluffy, so we need to migrate the DB schema. -1. Make sure to SSH / exec into the lakeFS server (old pre-upgrade version), the point is to use the same lakefs configuration file when running a migration. +1. Make sure to configure lakeFS Enterprise properly; see the [Enterprise installation guide][lakefs-enterprise-install] + 1. Update your existing lakeFS configuration for enterprise (e.g., the `values.yaml` file for your deployment if using Helm) +1. 
DB Migration: We are going to use the same database for lakeFS Enterprise, so we need to migrate the database schema. +1. Make sure to SSH / exec into the lakeFS server (old pre-upgrade version); the point is to use the same lakeFS configuration file when running a migration. 1. If upgrading `lakefs` version do this or skip to the next step: Install the new lakeFS binary, if not use the existing one (the one you are running). 1. Run the command: `LAKEFS_AUTH_UI_CONFIG_RBAC=internal lakefs migrate up` (use the **new binary** if upgrading lakeFS version). 1. You should expect to see a log message saying Migration completed successfully. - 1. During this short db migration process please make sure not to make any policy / RBAC related changes. + 1. During this short DB migration process, please make sure not to make any policy / RBAC related changes. 1. Once the migration completed - Upgrade your helm release with the modified `values.yaml` and the new version and run `helm upgrade`. 1. Login to the new lakeFS pod: Execute the following command, make sure you have proper credentials, or discard to get new ones: @@ -26,7 +25,7 @@ To migrate from lakeFS Open Source to lakeFS Enterprise, follow the steps below: lakefs setup --user-name --access-key-id --secret-access-key --no-check ``` !!! warning - Please note that the newly set up lakeFS instance remains inaccessible to users until full setup completion, due to the absence of established credentials within the system. + Please note that the newly set-up lakeFS instance remains inaccessible to users until full setup completion, due to the absence of established credentials within the system. 
[lakefs-enterprise-install]: install.md diff --git a/docs/src/enterprise/getstarted/quickstart.md b/docs/src/enterprise/getstarted/quickstart.md index 54fae28a779..1270aa7040f 100644 --- a/docs/src/enterprise/getstarted/quickstart.md +++ b/docs/src/enterprise/getstarted/quickstart.md @@ -8,8 +8,7 @@ description: Quickstart guides for lakeFS Enterprise Follow these quickstarts to try out lakeFS Enterprise. !!! warning - - fluffy will be deprecated in the upcoming versions and all functionality will be migrated into lakeFS Enterprise - - lakeFS Enterprise Quickstarts are not suitable for production use-cases. See the [installation guide](install.md) to set up a production-grade lakeFS Enterprise installation + lakeFS Enterprise Quickstarts are not suitable for production use-cases. See the [installation guide](install.md) to set up a production-grade lakeFS Enterprise installation ## lakeFS Enterprise Sample @@ -19,9 +18,8 @@ to easily interact with lakeFS without the hassle of integration and experiment By running the [lakeFS Enterprise Sample](https://github.com/treeverse/lakeFS-samples/tree/main/02_lakefs_enterprise), you will be getting a ready-to-use environment including the following containers: -* lakeFS -* Fluffy (includes lakeFS Enterprise features) -* Postgres: used by lakeFS and Fluffy as a shared KV store +* lakeFS Enterprise (includes additional features) +* Postgres: used by lakeFS as a KV store * MinIO container: used as the storage connected to lakeFS * Jupyter notebooks setup: Pre-populated with [notebooks](https://github.com/treeverse/lakeFS-samples/blob/main/00_notebooks/00_index.ipynb) that demonstrate lakeFS Enterprise' capabilities * Apache Spark: this is useful for interacting with data you'll manage with lakeFS @@ -32,16 +30,22 @@ Checkout the [RBAC demo](https://github.com/treeverse/lakeFS-samples/blob/main/0 ### Prerequisites +!!! 
note + In order to use lakeFS Enterprise you must have: + - An access token to download binaries from Docker Hub + - A license to run lakeFS Enterprise + [Contact us](https://lakefs.io/contact-sales/) to gain access to both. + + 1. You have installed [Docker Compose](https://docs.docker.com/compose/install/) version `2.23.1` or higher on your machine. -2. Access to download *dockerhub/fluffy* from [Docker Hub](https://hub.docker.com/u/treeverse). [Contact us](https://lakefs.io/contact-sales/) to gain access to Fluffy. +2. Access to download *treeverse/lakefs-enterprise* from [Docker Hub](https://hub.docker.com/u/treeverse). 3. With the token you've been granted, login locally to Docker Hub with `docker login -u externallakefs -p `.
The quickstart docker-compose files below create a lakeFS server that's connected to a [local blockstore](../../howto/deploy/onprem.md#local-blockstore) and spin up the following containers: -* lakeFS -* Fluffy (includes lakeFS Enterprise features) -* Postgres: used by lakeFS and Fluffy as a shared KV store +* lakeFS Enterprise +* Postgres: used by lakeFS as a KV store You can choose from the following options: @@ -63,24 +67,22 @@ You can choose from the following options: image: "treeverse/lakefs-enterprise:latest" command: "RUN" ports: - - "8080:8080" + - "8000:8000" depends_on: - "postgres" environment: - - LAKEFS_LISTEN_ADDRESS=0.0.0.0:8080 + - LAKEFS_LISTEN_ADDRESS=0.0.0.0:8000 - LAKEFS_LOGGING_LEVEL=DEBUG - - LAKEFS_AUTH_ENCRYPT_SECRET_KEY="random_secret" - - LAKEFS_AUTH_API_ENDPOINT=http://fluffy:9000/api/v1 - - LAKEFS_AUTH_API_SUPPORTS_INVITES=true + - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=random_secret - LAKEFS_AUTH_UI_CONFIG_RBAC=internal - - LAKEFS_AUTH_AUTHENTICATION_API_ENDPOINT=http://localhost:8000/api/v1 - - LAKEFS_AUTH_AUTHENTICATION_API_EXTERNAL_PRINCIPALS_ENABLED=true - LAKEFS_DATABASE_TYPE=postgres - - LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable + - LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres:5432/postgres?sslmode=disable - LAKEFS_BLOCKSTORE_TYPE=local - LAKEFS_BLOCKSTORE_LOCAL_PATH=/home/lakefs - LAKEFS_BLOCKSTORE_LOCAL_IMPORT_ENABLED=true - entrypoint: ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"] + - LAKEFS_AUTH_POST_LOGIN_REDIRECT_URL=http://localhost:8000/ + - LAKEFS_FEATURES_LOCAL_RBAC=true + - LAKEFS_LICENSE_CONTENTS= configs: - source: lakefs.yaml target: /etc/lakefs/config.yaml @@ -92,25 +94,6 @@ You can choose from the following options: POSTGRES_USER: lakefs POSTGRES_PASSWORD: lakefs - fluffy: - image: "${FLUFFY_REPO:-treeverse}/fluffy:${TAG:-latest}" - command: "${COMMAND:-run}" - ports: - - "8000:8000" - - "9000:9000" - depends_on: - 
- "postgres" - environment: - - FLUFFY_LOGGING_LEVEL=DEBUG - - FLUFFY_DATABASE_TYPE=postgres - - FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable - - FLUFFY_AUTH_ENCRYPT_SECRET_KEY="random_secret" - - FLUFFY_AUTH_SERVE_LISTEN_ADDRESS=0.0.0.0:9000 - - FLUFFY_LISTEN_ADDRESS=0.0.0.0:8000 - - FLUFFY_AUTH_SERVE_DISABLE_AUTHENTICATION=true - - FLUFFY_AUTH_POST_LOGIN_REDIRECT_URL=http://localhost:8080/ - entrypoint: [ "/app/wait-for", "postgres:5432", "--", "/app/fluffy" ] - configs: lakefs.yaml: content: | @@ -121,7 +104,7 @@ You can choose from the following options: ``` === "Advanced (SSO Enabled)" - This setup uses OIDC as the SSO authentication method thus requiring a valid OIDC configuration. + This setup uses OIDC as the SSO authentication method, thus requiring a valid OIDC configuration. 1. Create a `docker-compose.yaml` with the content below. 2. Create a `.env` file with the configurations below in the same directory as the `docker-compose.yaml`, docker compose will automatically use that. 
@@ -134,14 +117,15 @@ You can choose from the following options: `.env` ``` - FLUFFY_AUTH_OIDC_CLIENT_ID= - FLUFFY_AUTH_OIDC_CLIENT_SECRET= + LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_ID= + LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_SECRET= # The name of the query parameter that is used to pass the client ID to the logout endpoint of the SSO provider, i.e client_id - FLUFFY_AUTH_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER= - FLUFFY_AUTH_OIDC_URL=https://my-sso.com/ - FLUFFY_AUTH_LOGOUT_REDIRECT_URL=https://my-sso.com/logout + LAKEFS_AUTH_PROVIDERS_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER= + LAKEFS_AUTH_PROVIDERS_OIDC_URL=https://my-sso.com/ + LAKEFS_AUTH_LOGOUT_REDIRECT_URL=https://my-sso.com/logout # Optional: display a friendly name in the lakeFS UI by specifying which claim from the provider to show (i.e name, nickname, email etc) LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME= + LAKEFS_LICENSE_CONTENTS= ``` `docker-compose.yaml` @@ -157,22 +141,31 @@ You can choose from the following options: depends_on: - "postgres" environment: - - LAKEFS_LISTEN_ADDRESS=0.0.0.0:8080 + - LAKEFS_LISTEN_ADDRESS=0.0.0.0:8000 - LAKEFS_LOGGING_LEVEL=DEBUG - - LAKEFS_AUTH_ENCRYPT_SECRET_KEY="random_secret" - - LAKEFS_AUTH_API_ENDPOINT=http://fluffy:9000/api/v1 - - LAKEFS_AUTH_API_SUPPORTS_INVITES=true + - LAKEFS_LOGGING_AUDIT_LOG_LEVEL=INFO + - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=shared-secret-key + - LAKEFS_AUTH_LOGOUT_REDIRECT_URL=${LAKEFS_AUTH_LOGOUT_REDIRECT_URL} - LAKEFS_AUTH_UI_CONFIG_LOGIN_URL=http://localhost:8000/oidc/login - LAKEFS_AUTH_UI_CONFIG_LOGOUT_URL=http://localhost:8000/oidc/logout - LAKEFS_AUTH_UI_CONFIG_RBAC=internal - - LAKEFS_AUTH_AUTHENTICATION_API_ENDPOINT=http://localhost:8000/api/v1 - - LAKEFS_AUTH_AUTHENTICATION_API_EXTERNAL_PRINCIPALS_ENABLED=true + - LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME=${LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME} + - LAKEFS_AUTH_PROVIDERS_OIDC_ENABLED=true + - LAKEFS_AUTH_PROVIDERS_OIDC_POST_LOGIN_REDIRECT_URL=http://localhost:8000/ + - 
LAKEFS_AUTH_PROVIDERS_OIDC_URL=${LAKEFS_AUTH_PROVIDERS_OIDC_URL} + - LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_ID=${LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_ID} + - LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_SECRET=${LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_SECRET} + - LAKEFS_AUTH_PROVIDERS_OIDC_CALLBACK_BASE_URL=http://localhost:8000 + - LAKEFS_ENTERPRISE_LICENSE_SERVER_URL=https://license.lakefs.io + - LAKEFS_LICENSE_CONTENTS=${LAKEFS_LICENSE_CONTENTS} + - LAKEFS_AUTH_PROVIDERS_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER=${LAKEFS_AUTH_PROVIDERS_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER} - LAKEFS_DATABASE_TYPE=postgres - - LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable + - LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres:5432/postgres?sslmode=disable - LAKEFS_BLOCKSTORE_TYPE=local - - LAKEFS_BLOCKSTORE_LOCAL_PATH=/home/lakefs + - LAKEFS_BLOCKSTORE_LOCAL_PATH=/tmp/lakefs/data - LAKEFS_BLOCKSTORE_LOCAL_IMPORT_ENABLED=true - - LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME=${LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME} + - LAKEFS_FEATURES_LOCAL_RBAC=true entrypoint: ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"] configs: - source: lakefs.yaml @@ -185,35 +178,6 @@ You can choose from the following options: POSTGRES_USER: lakefs POSTGRES_PASSWORD: lakefs - fluffy: - image: "${FLUFFY_REPO:-treeverse}/fluffy:${TAG:-latest}" - command: "${COMMAND:-run}" - ports: - - "8000:8000" - - "9000:9000" - depends_on: - - "postgres" - environment: - - FLUFFY_LOGGING_LEVEL=DEBUG - - FLUFFY_DATABASE_TYPE=postgres - - FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable - - FLUFFY_AUTH_ENCRYPT_SECRET_KEY="random_secret" - - FLUFFY_AUTH_SERVE_LISTEN_ADDRESS=0.0.0.0:9000 - - FLUFFY_LISTEN_ADDRESS=0.0.0.0:8000 - - FLUFFY_AUTH_SERVE_DISABLE_AUTHENTICATION=true - - 
FLUFFY_AUTH_LOGOUT_REDIRECT_URL=${FLUFFY_AUTH_LOGOUT_REDIRECT_URL} - - FLUFFY_AUTH_POST_LOGIN_REDIRECT_URL=http://localhost:8080/ - - FLUFFY_AUTH_OIDC_ENABLED=true - - FLUFFY_AUTH_OIDC_URL=${FLUFFY_AUTH_OIDC_URL} - - FLUFFY_AUTH_OIDC_CLIENT_ID=${FLUFFY_AUTH_OIDC_CLIENT_ID} - - FLUFFY_AUTH_OIDC_CLIENT_SECRET=${FLUFFY_AUTH_OIDC_CLIENT_SECRET} - - FLUFFY_AUTH_OIDC_CALLBACK_BASE_URL=http://localhost:8000 - - FLUFFY_AUTH_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER=${FLUFFY_AUTH_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER} - entrypoint: [ "/app/wait-for", "postgres:5432", "--", "/app/fluffy" ] - configs: - - source: fluffy.yaml - target: /etc/fluffy/config.yaml - #This tweak is unfortunate but also necessary. logout_endpoint_query_parameters is a list #of strings which isn't parsed nicely as env vars. configs: @@ -228,25 +192,21 @@ You can choose from the following options: # friendly_name_claim_name: "name" default_initial_groups: - Admins - - fluffy.yaml: - content: | - auth: - oidc: - logout_endpoint_query_parameters: - - returnTo - - http://localhost:8080/oidc/login + providers: + oidc: + logout_endpoint_query_parameters: + - returnTo + - http://localhost:8000/oidc/login ``` ## Kubernetes Helm Chart Quickstart -In order to use lakeFS Enterprise and Fluffy, we provided out of the box setup, see [lakeFS Helm chart configuration](https://github.com/treeverse/charts/tree/master/charts/lakefs). +In order to use lakeFS Enterprise, we provided out of the box setup, see [lakeFS Helm chart configuration](https://github.com/treeverse/charts/tree/master/charts/lakefs). The values below create a fully functional lakeFS Enterprise setup without SSO support. The created setup is connected to a [local blockstore](../../howto/deploy/onprem.md#local-blockstore), and spins up the following pods: -* lakeFS -* Fluffy (includes lakeFS Enterprise features) -* Postgres: used by lakeFS and Fluffy as a shared KV store +* lakeFS Enterprise +* Postgres: used by lakeFS as a KV store !!! 
info @@ -257,21 +217,35 @@ The values below create a fully functional lakeFS Enterprise setup without SSO s 1. You have a Kubernetes cluster running in one of the platforms [supported by lakeFS](../../howto/deploy/index.md#deployment-and-setup-details). 2. [Helm](https://helm.sh/docs/intro/install/) is installed -3. Access to download *dockerhub/fluffy* from [Docker Hub](https://hub.docker.com/u/treeverse). [Contact us](https://lakefs.io/contact-sales/) to gain access to Fluffy. +3. Access to download *treeverse/lakefs-enterprise* from [Docker Hub](https://hub.docker.com/u/treeverse). +4. lakeFS Enterprise license + [Contact us](https://lakefs.io/contact-sales/) to gain access to lakeFS Enterprise. ### Instructions 1. Add the lakeFS Helm repository with `helm repo add lakefs https://charts.lakefs.io` -1. Create a `values.yaml` file with the following content and make sure to replace `` with the token Docker Hub token you recieved, `` and ``. +1. Create a `values.yaml` file with the following content and make sure to replace `` with the Docker Hub token you received, `` and ``. 1. In the desired K8S namespace run `helm install lakefs lakefs/lakefs -f values.yaml` -1. In your browser go to the Ingress host to access lakeFS UI. +1. In your browser, go to the Ingress host to access lakeFS UI. 
```yaml +enterprise: + enabled: true + +image: + privateRegistry: + enabled: true + secretToken: + lakefsConfig: | logging: - level: "DEBUG" + level: "DEBUG" blockstore: type: local + auth: + ui_config: + rbac: internal + ingress: enabled: true ingressClassName: @@ -280,22 +254,8 @@ ingress: - host: paths: - / -fluffy: - enabled: true - image: - privateRegistry: - enabled: true - secretToken: - fluffyConfig: | - logging: - level: "DEBUG" - secrets: - create: true - sso: - enabled: false - rbac: - enabled: true -# useDevPostgres is true by default and will override any other db configuration, set false for configuring your own db +# useDevPostgres is false by default and will override any other db configuration, +# set false or remove for configuring your own db useDevPostgres: true -``` +``` \ No newline at end of file diff --git a/docs/src/enterprise/troubleshooting.md b/docs/src/enterprise/troubleshooting.md index 80f000641d4..16f941a7b4f 100644 --- a/docs/src/enterprise/troubleshooting.md +++ b/docs/src/enterprise/troubleshooting.md @@ -5,17 +5,16 @@ description: Guidance on troubleshooting a lakeFS Enterprise deployment # Troubleshooting lakeFS Enterprise -A lakeFS Enterprise deployment has multiple moving parts that must all be deployed and configured correctly. This is especially true during initial setup. To help troubleshoot issues, both lakeFS and fluffy include the `flare` command. +A lakeFS Enterprise deployment includes various configuration components that must be set up correctly, especially during initial setup. To help troubleshoot configuration and deployment issues, lakeFS includes the `flare` command. 
## The `flare` command #### Synopsis -Both `lakefs` and `fluffy` include the flare command +The `lakefs` binary includes the `flare` command ```bash lakefs flare [flags] -fluffy flare [flags] ``` #### Flags @@ -34,22 +33,19 @@ fluffy flare [flags] ```shell # This will run flare with output set to stdout, which is redirected into a file $ ./lakefs flare --stdout > lakefs.flare -# The same works for fluffy -$ ./fluffy flare --stdout > fluffy.flare ``` ## What Information Does the `flare` Command Collect? ### Configuration -Both lakeFS and fluffy allow configuration to be supplied in multiple ways: configuration file, environment variables, `.env` files, and command flags. The `flare` command collects the fully resolved final configuration used by the lakeFS/fluffy process. +lakeFS Enterprise allows configuration to be supplied in multiple ways: configuration file, environment variables, and command flags. The `flare` command collects the fully resolved final configuration used by the lakeFS process. ### Environment Variables -When troubleshooting, it's important to get a view of the environment in which lakeFS/fluffy are running. This is especially true for container-based deployment environments, like Kubernetes, where env vars are used extensively. The `flare` command collects environment variables with the following prefixes: +When troubleshooting, it's important to get a view of the environment in which lakeFS is running. This is especially true for container-based deployment environments, like Kubernetes, where env vars are used extensively. The `flare` command collects environment variables with the following prefixes: - `LAKEFS_` -- `FLUFFY_` - `HTTP_` - `HOSTNAME` @@ -67,25 +63,22 @@ Both configuration and env var include sensitive secrets. The `flare` command ha Aside from the specific secret type listed above, `flare` also has the ability to detect and redact generic high-entropy strings, which are likely to be secrets. 
-Redacted secrets are replaced by a `SHA512` hash of the value. This allows comparing them (e.g., between lakeFS and fluffy) without exposing the actual values. +Redacted secrets are replaced by a `SHA512` hash of the value, so the actual values are never exposed. ## Usage - Collect and Send Flare -The following script is intended to be run locally and assumes that lakeFS and fluffy are deployed to a Kubernetes cluster, since this is the recommended setup. -Running this script requires that `kubectl` be installed on the machine it is being run from and that `kubectl` is configured with the correct context and credentials to access the cluster. Aside from running the `flare` command on both lakeFS and fluffy, this script also fetches the logs from all running pods of lakeFS and fluffy. +The following script is intended to be run locally and assumes that lakeFS Enterprise is deployed to a Kubernetes cluster, since this is the recommended setup. +Running this script requires that `kubectl` be installed on the machine it is being run from and that `kubectl` is configured with the correct context and credentials to access the cluster. Aside from running the `flare` command on lakeFS Enterprise, this script also fetches the logs from all running lakeFS pods. ### Step 1 - Set Script Variables At the top of the script you'll find the `Variables` block. It is important to change these values according to how lakeFS is deployed in your cluster. 
-`NAMESPACE` - The K8s namespace where lakeFS and fluffy are deployed +`NAMESPACE` - The K8s namespace where lakeFS is deployed `LAKEFS_DEPLOYMENT` - The name of the lakeFS K8s deployment -`FLUFFY_DEPLOYMENT` - The name of the fluffy K8s deployment `LAKEFS_LOGS_OUTPUT_FILE` - The name of the local file where lakeFS logs will be saved -`FLUFFY_LOGS_OUTPUT_FILE` - The name of the local file where fluffy logs will be saved `LAKEFS_FLARE_FILE` - The name of the local file where the lakeFS `flare` result will be saved -`FLUFFY_FLARE_FILE` - The name of the local file where the fluffy `flare` result will be saved ### Step 2 - Execute the Script @@ -98,11 +91,8 @@ NC='\033[0m' # Variables NAMESPACE=lakefs-prod LAKEFS_DEPLOYMENT=lakefs-server -FLUFFY_DEPLOYMENT=lakefs-fluffy LAKEFS_LOGS_OUTPUT_FILE=lakefs.log -FLUFFY_LOGS_OUTPUT_FILE=fluffy.log LAKEFS_FLARE_FILE=lakefs.flare -FLUFFY_FLARE_FILE=fluffy.flare # Find kubectl KUBECTLCMD=$(which kubectl) @@ -113,15 +103,13 @@ then fi $KUBECTLCMD get pods -o name -n $NAMESPACE | grep pod/$LAKEFS_DEPLOYMENT | xargs -I {} $KUBECTLCMD logs -n $NAMESPACE --all-containers=true --prefix --ignore-errors --timestamps {} > $LAKEFS_LOGS_OUTPUT_FILE -$KUBECTLCMD get pods -o name -n $NAMESPACE | grep pod/$FLUFFY_DEPLOYMENT | xargs -I {} $KUBECTLCMD logs -n $NAMESPACE --all-containers=true --prefix --ignore-errors --timestamps {} > $FLUFFY_LOGS_OUTPUT_FILE $KUBECTLCMD exec deployment/$LAKEFS_DEPLOYMENT -- ./lakefs flare --stdout > $LAKEFS_FLARE_FILE -$KUBECTLCMD exec deployment/$FLUFFY_DEPLOYMENT -- ./lakefs flare --stdout > $FLUFFY_FLARE_FILE ``` ### Step 3 - Inspect the Output Files -After executing the script you should have four files: lakeFS/fluffy logs and lakeFS/fluffy flare output. Before sharing these files, please review them to make sure all secrets were correctly redacted and that all the collected information is shareable. +After executing the script you should have two files: lakeFS logs and lakeFS flare output. 
Before sharing these files, please review them to make sure all secrets were correctly redacted and that all the collected information is shareable. ### Step 4 - Zip Output Files and Attach to Support Ticket diff --git a/docs/src/enterprise/upgrade.md b/docs/src/enterprise/upgrade.md index a0c246e16a7..99587c10d2b 100644 --- a/docs/src/enterprise/upgrade.md +++ b/docs/src/enterprise/upgrade.md @@ -6,3 +6,874 @@ description: How to upgrade lakeFS Enterprise # Upgrade For upgrading from lakeFS enterprise to a newer version see [lakefs migration](../howto/deploy/upgrade.md). + + +## Migrate From Fluffy to lakeFS Enterprise + +The new lakeFS Enterprise integrates all enterprise features directly into a single binary, eliminating the need for the separate Fluffy service. This simplifies deployment, configuration, and maintenance. + +## Prerequisites + +1. You're running the lakeFS Enterprise binary, or the `treeverse/lakefs-enterprise` image from Docker Hub, together with Fluffy. +1. Your lakeFS Enterprise version is >= 1.63.0. +1. You possess a lakeFS Enterprise license. + +!!! note + [Contact us](https://lakefs.io/contact-sales/) to gain access to lakeFS Enterprise. You will be granted a token that enables downloading *dockerhub/lakeFS-Enterprise* + from [Docker Hub](https://hub.docker.com/u/treeverse), and a license to run lakeFS Enterprise. + + +To migrate from Fluffy to lakeFS Enterprise, follow the steps below: + +1. Sanity Test (Optional): Install a new test lakeFS Enterprise before moving your current production setup. **Make sure to include your lakeFS Enterprise license in the configuration before setup**. Test the setup → login → create repository, etc. Once everything works, delete and clean up the test setup, then continue with the migration. +1. Update configuration: Unlike lakeFS + Fluffy, lakeFS Enterprise uses only one configuration file. See [Configuration Changes](#configuration-changes), and make sure to add the license to the configuration. +1.
Spin down lakeFS and fluffy, and run lakeFS Enterprise! + +!!! warning + Please note that there will be a short downtime while replacing the lakeFS instances. + +## Configuration Changes + + +### Authentication configuration + +Most Fluffy `auth.*` settings migrate directly to lakeFS Enterprise with the same structure. Below are the differences between the configurations. + + +!!! note "SAML" + + === "lakeFS & Fluffy (old)" + + ```yaml + # fluffy.yaml + auth: + logout_redirect_url: https://lakefs.company.com + post_login_redirect_url: https://lakefs.company.com + saml: + enabled: true + sp_root_url: https://lakefs.company.com + sp_x509_key_path: dummy_saml_rsa.key + sp_x509_cert_path: dummy_saml_rsa.cert + sp_sign_request: true + sp_signature_method: http://www.w3.org/2001/04/xmldsig-more#rsa-sha256 + idp_metadata_url: https://my.saml-provider.com/federationmetadata/2007-06/federationmetadata.xml + # idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" + external_user_id_claim_name: samName + # idp_metadata_file_path: + # idp_skip_verify_tls_cert: true + ``` + ```yaml + # lakefs.yaml + auth: + logout_redirect_url: https://lakefs.company.com + cookie_auth_verification: + auth_source: saml + friendly_name_claim_name: displayName + persist_friendly_name: true + external_user_id_claim_name: samName + validate_id_token_claims: + department: r_n_d + default_initial_groups: + - "Developers" + ui_config: + login_url: https://lakefs.company.com/sso/login-saml + logout_url: https://lakefs.company.com/sso/logout-saml + login_cookie_names: + - internal_auth_session + - saml_auth_session + ``` + + + === "lakeFS Enterprise (new)" + + ```yaml + # lakefs.yaml + auth: + logout_redirect_url: https://lakefs.company.com/ # optional, URL to redirect to after logout + cookie_auth_verification: + auth_source: saml + friendly_name_claim_name: displayName + default_initial_groups: ["Admins"] + external_user_id_claim_name: 
http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name + validate_id_token_claims: + department: r_n_d + providers: + saml: + # enabled: true # This field was dropped! + sp_root_url: https://lakefs.company.com + sp_x509_key_path: dummy_saml_rsa.key + sp_x509_cert_path: dummy_saml_rsa.cert + sp_sign_request: true + sp_signature_method: http://www.w3.org/2001/04/xmldsig-more#rsa-sha256 + idp_metadata_url: https://my.saml-provider.com/federationmetadata/2007-06/federationmetadata.xml + post_login_redirect_url: / # Where to redirect after successful SAML login + # external_user_id_claim_name: # This field was moved to auth.cookie_auth_verification + ui_config: + login_url: https://lakefs.company.com/sso/login-saml + logout_url: https://lakefs.company.com/sso/logout-saml + login_cookie_names: + - internal_auth_session + - saml_auth_session + ``` + + +!!! note "OIDC + OIDC STS" + + === "lakeFS + Fluffy (old)" + + ```yaml + # fluffy.yaml + auth: + post_login_redirect_url: / + logout_redirect_url: https://oidc-provider-url.com/logout/url + oidc: + enabled: true + url: https://oidc-provider-url.com/ + client_id: + client_secret: + callback_base_url: https://lakefs.company.com + is_default_login: true + logout_client_id_query_parameter: client_id + logout_endpoint_query_parameters: + - returnTo + - https://lakefs.company.com/oidc/login + ``` + + ```yaml + # lakefs.yaml + auth: + oidc: + friendly_name_claim_name: "name" + persist_friendly_name: true + default_initial_groups: ["Developers"] + ui_config: + login_url: /oidc/login + logout_url: /oidc/logout + login_cookie_names: + - internal_auth_session + - oidc_auth_session + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + # lakefs.yaml + auth: + logout_redirect_url: https://oidc-provider-url.com/logout/url # optional, URL to redirect to after logout + ui_config: + login_url: /oidc/login + logout_url: /oidc/logout + login_cookie_names: + - internal_auth_session + - oidc_auth_session + oidc: + friendly_name_claim_name: 
"nickname" + default_initial_groups: ["Admins"] + providers: + oidc: + # enabled: true # This field was dropped! + post_login_redirect_url: / # This field was moved here! + url: https://oidc-provider-url.com/ + client_id: + client_secret: + callback_base_url: https://lakefs.company.com + logout_client_id_query_parameter: client_id + logout_endpoint_query_parameters: + - returnTo + - http://lakefs.company.com/oidc/login + ``` + + +!!! note "LDAP" + + === "lakeFS + Fluffy (old)" + + ```yaml + # fluffy.yaml + auth: + post_login_redirect_url: / + ldap: + server_endpoint: ldaps://ldap.company.com:636 + bind_dn: uid=,ou=,o=,dc=,dc=com + bind_password: '' + username_attribute: uid + user_base_dn: ou=,o=,dc=,dc=com + user_filter: (objectClass=inetOrgPerson) + connection_timeout_seconds: 15 + request_timeout_seconds: 7 + ``` + ```yaml + # lakefs.yaml + auth: + remote_authenticator: + enabled: true + endpoint: http://:/api/v1/ldap/login + default_user_group: "Developers" # Value needs to correspond with an existing group in lakeFS + ui_config: + logout_url: /logout + login_cookie_names: + - internal_auth_session + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + # lakefs.yaml + auth: + ui_config: + logout_url: /logout + login_cookie_names: + - internal_auth_session + providers: + ldap: + server_endpoint: ldaps://ldap.company.com:636 + bind_dn: uid=,ou=,o=,dc=,dc=com + bind_password: '' + username_attribute: uid + user_base_dn: ou=,o=,dc=,dc=com + user_filter: (objectClass=inetOrgPerson) + connection_timeout_seconds: 15 + request_timeout_seconds: 7 + default_user_group: "Developers" # This field moved here! + ``` + +!!! 
note "AWS IAM" + + === "lakeFS + Fluffy (old)" + + ```yaml + # fluffy.yaml + serve_listen: "localhost:9001" + auth: + external: + aws_auth: + enabled: true + required_headers: + X-LakeFS-Server-ID: "localhost" + ``` + ```yaml + # lakefs.yaml + auth: + authentication_api: + endpoint: http://localhost:9001/api/v1 + external_principals_enabled: true + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + # lakefs.yaml + auth: + external_aws_auth: + enabled: true + required_headers: + X-LakeFS-Server-ID: "localhost" + + ``` + + +### Authorization configuration + +!!! note "RBAC" + + === "lakeFS + Fluffy (old)" + + ```yaml + # fluffy.yaml + auth: + serve_listen_address: "localhost:9000" + cache: + enabled: true + ``` + + ```yaml + # lakefs.yaml + auth: + api: + endpoint: http://localhost:9000/api/v1 + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + # lakefs.yaml + auth: + # serve_disable_authentication: false # this field was dropped! + # serve_listen_address: "localhost:9000" # this field was dropped! + # api: # this field was dropped! + # endpoint: http://localhost:9000/api/v1 # this field was dropped! + cache: + enabled: true + ``` + +## Kubernetes: Migrating with Helm from Fluffy to new lakeFS Enterprise + +### Overview + +Starting with **lakeFS Helm chart version 1.5.0**, the Fluffy authentication service has been deprecated and replaced with native lakeFS Enterprise authentication. This migration consolidates authentication into the main lakeFS application, simplifying deployment and maintenance. 
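The consolidation described above can be pictured as folding the Fluffy `auth` section into the lakeFS one. Below is a minimal Python sketch for the OIDC case, following the key moves listed under Configuration Changes: `enabled` is dropped, `post_login_redirect_url` moves under `auth.providers.oidc`, and the remaining provider keys move from `auth.oidc` in `fluffy.yaml` to `auth.providers.oidc` in `lakefs.yaml`. The function name and the dict-based representation are hypothetical and not part of lakeFS or its Helm chart. Keys not named in that section are copied unchanged, which is an assumption and may not hold for every setting.

```python
import copy


def merge_oidc_config(fluffy_auth: dict, lakefs_auth: dict) -> dict:
    """Fold a Fluffy `auth` section into a lakeFS `auth` section (OIDC only).

    Hypothetical helper for illustration; the inputs are left unmodified.
    """
    merged = copy.deepcopy(lakefs_auth)

    # Provider settings lived under `auth.oidc` in fluffy.yaml.
    provider = copy.deepcopy(fluffy_auth.get("oidc", {}))
    provider.pop("enabled", None)  # the `enabled` field was dropped

    # `post_login_redirect_url` moves from `auth` into the provider block.
    if "post_login_redirect_url" in fluffy_auth:
        provider["post_login_redirect_url"] = fluffy_auth["post_login_redirect_url"]

    # `logout_redirect_url` stays at the top level of `auth`, now in lakefs.yaml.
    if "logout_redirect_url" in fluffy_auth:
        merged["logout_redirect_url"] = fluffy_auth["logout_redirect_url"]

    merged.setdefault("providers", {})["oidc"] = provider
    return merged


# Example input, trimmed from the OIDC configuration shown earlier.
fluffy_auth = {
    "post_login_redirect_url": "/",
    "logout_redirect_url": "https://oidc-provider-url.com/logout/url",
    "oidc": {
        "enabled": True,
        "url": "https://oidc-provider-url.com/",
        "callback_base_url": "https://lakefs.company.com",
    },
}
lakefs_auth = {
    "oidc": {"default_initial_groups": ["Developers"]},
    "ui_config": {"login_url": "/oidc/login", "logout_url": "/oidc/logout"},
}

new_auth = merge_oidc_config(fluffy_auth, lakefs_auth)
```

The result keeps the existing lakeFS `auth.oidc` and `auth.ui_config` settings and gains an `auth.providers.oidc` block. Treat it as a starting point to review by hand against the examples in this guide, since secrets and the license still need to be supplied separately.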
+ +#### What's Changing + +When you upgrade to lakeFS Enterprise: + +- **Fluffy Deployment Removed**: The separate Fluffy deployment, service, and associated Kubernetes resources are no longer needed +- **Simplified Architecture**: Authentication is now handled directly by lakeFS Enterprise, reducing the number of pods and services +- **Streamlined Ingress**: No more routing between Fluffy and lakeFS - all traffic goes directly to lakeFS +- **Updated values.yaml Structure**: Authentication configuration moves from `fluffy.*` to `enterprise.auth.*` and `lakefsConfig.auth.providers.*` + +#### Prerequisites + +- Current lakeFS deployment using Fluffy authentication (chart version < 1.5.0) +- Access to update Helm values +- lakeFS Enterprise Docker Hub token +- Backup of your current values.yaml + +### Step-by-Step Migration Guide + +#### Step 1: Update Helm Repository + +```bash +helm repo update lakefs +``` + +Verify you have access to chart version 1.5.0 or later: +```bash +helm search repo lakefs/lakefs --versions +``` + +#### Step 2: Review New Chart Values + +Examine all available configuration options in the new chart: +```bash +helm show values lakefs/lakefs --version 1.5.0 > new-values-reference.yaml +``` + +#### Step 3: Update Your Image Configuration + +If you're overriding the image in your values.yaml, update it to use lakeFS Enterprise: + +```yaml +image: + repository: treeverse/lakefs-enterprise + tag: (TODO: specify the version tag, e.g., 1.5.0)@ItamarYuran + privateRegistry: + enabled: true + secretToken: +``` + +**Note**: If you're not overriding the image, the chart will automatically use the correct Enterprise image. + +#### Step 4: Migrate Your Authentication Configuration + +Using the [configuration examples below](#configuration-examples), update your values.yaml file: +1. Remove all `fluffy.*` configuration sections +2. Add the new `enterprise.auth.*` configuration for your authentication method +3. 
Move authentication settings to `lakefsConfig.auth.providers.*` + +Refer to the complete examples in the [lakeFS Helm chart repository](https://github.com/treeverse/charts/tree/master/examples/lakefs/enterprise). + +#### Step 5: Validate with Dry Run + +Before applying changes, validate your configuration: +```bash +helm upgrade lakefs/lakefs \ + --version 1.5.0 \ + --namespace \ + --values \ + --dry-run +``` + +Review the output to ensure: +- No Fluffy resources are being created +- lakeFS Enterprise deployment is configured correctly +- Ingress configuration is simplified + +#### Step 6: Perform the Upgrade + +Once validated, perform the actual upgrade: +```bash +helm upgrade lakefs/lakefs \ + --version 1.5.0 \ + --namespace \ + --values +``` + +#### Step 7: Verify the Migration + +After the upgrade completes: + +1. **Check Pod Status**: + ```bash + kubectl get pods -n + # Fluffy pods should no longer exist + ``` + +2. **Verify lakeFS Health**: + ```bash + kubectl exec -n -- curl http://localhost:8000/_health + ``` + +3. **Check Logs**: + ```bash + kubectl logs -n + # Look for successful authentication provider initialization + ``` + +4. **Test Authentication**: + - Navigate to your lakeFS URL + - Verify SSO login works correctly + - Confirm RBAC permissions are preserved + +5. **Verify Fluffy Resources Removed**: + ```bash + kubectl get all -n | grep fluffy + # Should return no results + ``` + +#### Step 8: Rollback (if needed) + +If you encounter issues, rollback to the previous version: +```bash +# Find the previous revision +helm history -n + +# Rollback to previous revision +helm rollback -n +``` + +### Configuration Examples + +Below are complete configuration examples for each authentication method, showing both the old (Fluffy) and new (Enterprise) configurations: + +#### OIDC with Helm + +!!! 
note "OIDC with Helm" + + === "lakeFS + Fluffy (old)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + fluffy: + enabled: true + image: + privateRegistry: + enabled: true + secretToken: + fluffyConfig: | + auth: + logout_redirect_url: https://oidc-provider-url.com/logout/example + oidc: + enabled: true + url: https://oidc-provider-url.com/ + client_id: + callback_base_url: https:// + # the claim name that represents the client identifier in the OIDC provider (e.g Okta) + logout_client_id_query_parameter: client_id + # the query parameters that will be used to redirect the user to the OIDC provider (e.g Okta) after logout + logout_endpoint_query_parameters: + - returnTo + - https:///oidc/login + secrets: + create: true + sso: + enabled: true + oidc: + enabled: true + # secret given by the OIDC provider (e.g auth0, Okta, etc) + client_secret: + rbac: + enabled: true + + lakefsConfig: | + database: + type: local + blockstore: + type: local + auth: + ui_config: + login_cookie_names: + - internal_auth_session + - oidc_auth_session + oidc: + friendly_name_claim_name: + default_initial_groups: ["Developers"] + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + enterprise: + enabled: true + auth: + oidc: + enabled: true + # secret given by the OIDC provider (e.g auth0, Okta, etc) + client_secret: + + image: + privateRegistry: + enabled: true + secretToken: + + lakefsConfig: | + blockstore: + type: local + auth: + logout_redirect_url: https://oidc-provider-url.com/logout/example + oidc: + friendly_name_claim_name: + default_initial_groups: ["Developers"] + providers: + oidc: + post_login_redirect_url: / + url: https://oidc-provider-url.com/ + client_id: + callback_base_url: https:// + # the claim name that represents the client identifier in the OIDC 
provider (e.g Okta) + logout_client_id_query_parameter: client_id + # the query parameters that will be used to redirect the user to the OIDC provider (e.g Okta) after logout + logout_endpoint_query_parameters: + - returnTo + - https:///oidc/login + ``` + +#### SAML with Helm + +!!! note "SAML with Helm" + + === "lakeFS + Fluffy (old)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + fluffy: + enabled: true + image: + privateRegistry: + enabled: true + secretToken: + fluffyConfig: | + auth: + # logout_redirect_url: https:// + # post_login_redirect_url: https:// + saml: + sp_sign_request: true + # depends on IDP + sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256" + # url to the metadata of the IDP + idp_metadata_url: "https:///federationmetadata/2007-06/federationmetadata.xml" + # IDP SAML claims format default unspecified + # idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" + # claim name from IDP to use as the unique user name + external_user_id_claim_name: samName + # depending on IDP setup, if CA certs are self signed and not trusted by a known CA + idp_skip_verify_tls_cert: true + rbac: + enabled: true + secrets: + create: true + sso: + enabled: true + saml: + enabled: true + createSecret: true + lakeFSServiceProviderIngress: https:// + certificate: + saml_rsa_public_cert: | + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- + saml_rsa_private_key: | + -----BEGIN PRIVATE KEY----- + ... 
+ -----END PRIVATE KEY----- + + lakefsConfig: | + blockstore: + type: local + auth: + cookie_auth_verification: + # claim name to display user in the UI + friendly_name_claim_name: displayName + # claim name from IDP to use as the unique user name + external_user_id_claim_name: samName + default_initial_groups: + - "Developers" + ui_config: + login_cookie_names: + - internal_auth_session + - saml_auth_session + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + enterprise: + enabled: true + auth: + saml: + enabled: true + createCertificateSecret: true + certificate: + samlRsaPublicCert: | + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- + samlRsaPrivateKey: | + -----BEGIN PRIVATE KEY----- + ... + -----END PRIVATE KEY----- + + image: + privateRegistry: + enabled: true + secretToken: + + lakefsConfig: | + blockstore: + type: local + auth: + logout_redirect_url: https:// + cookie_auth_verification: + auth_source: saml + # claim name to display user in the UI + friendly_name_claim_name: displayName + # claim name from IDP to use as the unique user name + external_user_id_claim_name: samName + default_initial_groups: + - "Developers" + providers: + saml: + post_login_redirect_url: https:// + sp_root_url: https:// + sp_sign_request: true + # depends on IDP + sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256" + # url to the metadata of the IDP + idp_metadata_url: "https:///federationmetadata/2007-06/federationmetadata.xml" + # IDP SAML claims format default unspecified + idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" + # depending on IDP setup, if CA certs are self signed and not trusted by a known CA + #idp_skip_verify_tls_cert: true + ``` + +#### LDAP with Helm + +!!! 
note "LDAP with Helm" + + === "lakeFS + Fluffy (old)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + fluffy: + enabled: true + image: + privateRegistry: + enabled: true + secretToken: + fluffyConfig: | + auth: + post_login_redirect_url: / + ldap: + server_endpoint: ldaps://ldap.company.com:636 + bind_dn: uid=,ou=Users,o=,dc=,dc=com + username_attribute: uid + user_base_dn: ou=Users,o=,dc=,dc=com + user_filter: (objectClass=inetOrgPerson) + connection_timeout_seconds: 15 + request_timeout_seconds: 7 + + secrets: + create: true + + sso: + enabled: true + ldap: + enabled: true + bind_password: + rbac: + enabled: true + + lakefsConfig: | + blockstore: + type: local + auth: + remote_authenticator: + enabled: true + default_user_group: "Developers" + ui_config: + login_cookie_names: + - internal_auth_session + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + enterprise: + enabled: true + auth: + ldap: + enabled: true + bindPassword: + + image: + privateRegistry: + enabled: true + secretToken: + + lakefsConfig: | + blockstore: + type: local + auth: + ui_config: + login_cookie_names: + - internal_auth_session + providers: + ldap: + server_endpoint: ldaps://ldap.company.com:636 + bind_dn: uid=,ou=Users,o=,dc=,dc=com + username_attribute: uid + user_base_dn: ou=Users,o=,dc=,dc=com + user_filter: (objectClass=inetOrgPerson) + default_user_group: "Developers" + connection_timeout_seconds: 15 + request_timeout_seconds: 7 + ``` + +#### AWS IAM with Helm + +!!! 
note "AWS IAM with Helm" + + === "lakeFS + Fluffy (old)" + + ```yaml + lakefsConfig: | + auth: + authentication_api: + external_principals_enabled: true + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + fluffy: + enabled: true + image: + repository: treeverse/fluffy + pullPolicy: IfNotPresent + privateRegistry: + enabled: true + secretToken: + fluffyConfig: | + auth: + external: + aws_auth: + enabled: true + # the maximum age in seconds for the GetCallerIdentity request + #get_caller_identity_max_age: 60 + # headers that must be present by the client when doing login request + required_headers: + # same host as the lakeFS server ingress + X-LakeFS-Server-ID: + secrets: + create: true + sso: + enabled: true + rbac: + enabled: true + ``` + + === "lakeFS Enterprise (new)" + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + # the ingress that will be created for lakeFS + - host: + paths: + - / + + lakefsConfig: | + auth: + external_aws_auth: + enabled: true + # the maximum age in seconds for the GetCallerIdentity request + #get_caller_identity_max_age: 60 + # headers that must be present by the client when doing login request + required_headers: + # same host as the lakeFS server ingress + X-LakeFS-Server-ID: + ``` + +### Important Notes + +* Complete configuration examples for each authentication method are available in the [lakeFS Helm chart repository](https://github.com/treeverse/charts/tree/master/examples/lakefs/enterprise) +* The examples include local blockstore for quick-start - replace with S3/Azure/GCS for production deployments +* Configure the `image.privateRegistry.secretToken` with your DockerHub token for accessing enterprise images +* Update all placeholder values (marked with `<>`) with your actual configuration + +### Troubleshooting + +If you encounter issues during migration: + +1. 
**Authentication Failures**: Check that all authentication settings have been properly moved to the new configuration structure +2. **Image Pull Errors**: Ensure your DockerHub token has access to the lakeFS Enterprise image +3. **Ingress Issues**: Confirm that your ingress is pointing directly to lakeFS (not Fluffy) + +For additional support, consult the [lakeFS documentation](https://docs.lakefs.io) or contact lakeFS support. \ No newline at end of file diff --git a/docs/src/reference/configuration.md b/docs/src/reference/configuration.md index d50ddca5ea0..13e089f2929 100644 --- a/docs/src/reference/configuration.md +++ b/docs/src/reference/configuration.md @@ -159,6 +159,8 @@ Configuration section for the lakeFS key-value store database. * `auth.oidc.persist_friendly_name` `(string : false)` - If set to `true`, the friendly name is persisted to the KV store and can be displayed in the user list. This is meant to be used in conjunction with `auth.oidc.friendly_name_claim_name`. * `auth.oidc.validate_id_token_claims` `(map[string]string : )` - When a user tries to access lakeFS, validate that the ID token contains these claims with the corresponding values. + + ### blockstore * `blockstore.type` `(one of ["local", "s3", "gs", "azure", "mem"] : required)`. Block adapter to use. This controls where the underlying data will be stored diff --git a/docs/src/security/external-principals-aws.md b/docs/src/security/external-principals-aws.md index 09e860524a1..d205e645406 100644 --- a/docs/src/security/external-principals-aws.md +++ b/docs/src/security/external-principals-aws.md @@ -16,7 +16,7 @@ search: lakeFS supports authenticating users programmatically using AWS IAM roles instead of using static lakeFS access and secret keys. The method enables you to bound IAM principal ARNs to lakeFS users. -A single lakeFS user may have many AWS's principle ARNs attached to it. 
When a client is authenticating to a lakeFS server with an AWS's session, the actions performed by the client are on behalf of the user attached to the ARN. +A single lakeFS user may have many AWS principal ARNs attached to it. When a client is authenticated to a lakeFS server with an AWS session, the actions performed by the client are on behalf of the user attached to the ARN. ### Using Session Names @@ -33,13 +33,13 @@ If the `SessionName` is `john@acme.com` then lakeFS would return token for `john ### How AWS authentication works -The AWS STS API includes a method, `sts:GetCallerIdentity`, which allows you to validate the identity of a client. The client signs a GetCallerIdentity query using the AWS Signature v4 algorithm and sends it to the lakeFS server. +The AWS STS API includes a method, `sts:GetCallerIdentity`, which allows you to validate the identity of a client. The client signs a GetCallerIdentity query using the AWS Signature v4 algorithm and sends it to the lakeFS server. The `GetCallerIdentity` query consists of four pieces of information: the request URL, the request body, the request headers and the request method. The AWS signature is computed over those fields. The lakeFS server reconstructs the query using this information and forwards it on to the AWS STS service. Depending on the response from the STS service, the server authenticates the client. Notably, clients don't need network-level access themselves to talk to the AWS STS API endpoint; they merely need access to the credentials to sign the request. However, it means that the lakeFS server does need network-level access to send requests to the STS endpoint. -Each signed AWS request includes the current timestamp to mitigate the risk of replay attacks. 
In addition, lakeFS allows you to require an additional header, `X-LakeFS-Server-ID` (added by default), to be present to mitigate against different types of replay attacks (such as a signed `GetCallerIdentity` request stolen from a dev lakeFS instance and used to authenticate to a prod lakeFS instance). +Each signed AWS request includes the current timestamp to mitigate the risk of replay attacks. In addition, lakeFS allows you to require an additional header, `X-LakeFS-Server-ID` (added by default), to be present to mitigate against different types of replay attacks (such as a signed `GetCallerIdentity` request stolen from a dev lakeFS instance and used to authenticate to a prod lakeFS instance). It's also important to note that Amazon does NOT appear to include any sort of authorization around calls to GetCallerIdentity. For example, if you have an IAM policy on your credential that requires all access to be MFA authenticated, non-MFA authenticated credentials will still be able to authenticate to lakeFS using this method. @@ -47,53 +47,60 @@ It's also important to note that Amazon does NOT appear to include any sort of a ## Server Configuration !!! info - lakeFS Helm chart supports the configuration since version `1.2.11` - see usage [values.yaml example](https://github.com/treeverse/charts/blob/master/examples/lakefs/enterprise/values-external-aws.yaml). + lakeFS Helm chart supports the configuration below since version 1.5.0 -* in lakeFS `auth.authentication_api.external_principals_enabled` must be set to `true` in the configuration file, other configuration (`auth.authentication_api.*`) can be found at [configuration reference](../reference/configuration.md) +To enable AWS IAM authentication in lakeFS Enterprise: -For the full list of the Fluffy server configuration, see [Fluffy Configuration][fluffy-configuration] under `auth.external.aws_auth` +1. Enable external principals in lakeFS configuration +2. Configure external AWS authentication settings - -!!! 
note - By default, lakeFS clients will add the parameter `X-LakeFS-Server-ID: ` to the initial [login request][login-api] for STS. - - -**Example configuration with required headers:** - -Configuration for `lakefs.yaml`: - -```yaml -auth: - authentication_api: - endpoint: http:///api/v1 - external_principals_enabled: true - api: - endpoint: http:///api/v1 -``` - -Configuration for `fluffy.yaml`: +**Helm Configuration (`values.yaml`):** ```yaml -# fluffy address for lakefs auth.authentication_api.endpoint -# used by lakeFS to log in and get the token -listen_address: -auth: - # fluffy address for lakeFS auth.api.endpoint - # used by lakeFS to manage the lifecycle attach/detach of the external principals - serve_listen_address: - external: - aws_auth: +ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / + +lakefsConfig: | + auth: + # Configure external AWS authentication + external_aws_auth: enabled: true + # the maximum age in seconds for the GetCallerIdentity request + #get_caller_identity_max_age: 60 # headers that must be present by the client when doing login request required_headers: # same host as the lakeFS server ingress X-LakeFS-Server-ID: ``` +!!! note + By default, lakeFS clients will add the parameter `X-LakeFS-Server-ID: ` to the initial [login request][login-api] for STS. + +**Direct Configuration File (`lakefs.yaml`):** + +```yaml +auth: + external_aws_auth: + enabled: true + # Optional: max age for GetCallerIdentity requests (default: 24h) + get_caller_identity_max_age: 3600 + # Required headers for login requests + required_headers: + X-LakeFS-Server-ID: + # Optional headers that may be present + optional_headers: + X-Custom-Header: value +``` + ## Administration of IAM Roles in lakeFS Administration refers to the management of the IAM roles that are allowed to authenticate to lakeFS. 
-Operations such as attaching and detaching IAM roles to a user, listing the roles attached to a user, and listing the users attached to a role. +This covers operations such as attaching and detaching IAM roles to a user, listing the roles attached to a user, and listing the users attached to a role. Currently, this is done through the lakeFS [External Principals API][external-principal-admin] and generated clients. Example of attaching an IAM roles to a user: @@ -128,14 +135,14 @@ Currently, the login operation is supported out of the box in: - [python](#login-with-python) - [Everest mount](../reference/mount.md#authenticating-with-aws-iam-role.md) -For other use cases authenticate to lakeFS via login endpoint, this will require building the request input. +For other use cases, authenticate to lakeFS via the login endpoint; this requires building the request input. ## Login with python ### prerequisites -1. lakeFS should be [configured](#server-configuration) to allow external principals to authenticate and the used IAM role should be [attached](#administration-of-iam-roles-in-lakefs) to the relevant lakeFS user -2. The Python SDK requires additional packages to be installed in order to generate a lakeFS client with the assumed role. +1. lakeFS should be [configured](#server-configuration) to allow external principals to authenticate, and the used IAM role should be [attached](#administration-of-iam-roles-in-lakefs) to the relevant lakeFS user +2. The Python SDK requires additional packages to be installed to generate a lakeFS client with the assumed role.
To install the required packages, run the following command: ```shell @@ -193,5 +200,4 @@ There are two ways in which external principals can be used to authenticate to l [external-principal-admin]: ../reference/api.md#external [login-api]: ../reference/api.md#auth/externalPrincipalLogin [lakefs-hadoopfs]: ../integrations/spark.md#lakefs-hadoop-filesystem -[lakefs-spark]: ../integrations/spark.md#usage-with-temporaryawscredentialslakefstokenprovider -[fluffy-configuration]: ../enterprise/configuration.md#fluffy-server-configuration +[lakefs-spark]: ../integrations/spark.md#usage-with-temporaryawscredentialslakefstokenprovider \ No newline at end of file diff --git a/docs/src/security/sso.md b/docs/src/security/sso.md index 2a5c38b45d7..6ff3f3ec21f 100644 --- a/docs/src/security/sso.md +++ b/docs/src/security/sso.md @@ -14,7 +14,7 @@ search: ## SSO for lakeFS Cloud -lakeFS Cloud uses Auth0 for authentication and thus support the same identity providers as Auth0 including Active Directory/LDAP, ADFS, Azure Active Directory Native, Google Workspace, OpenID Connect, Okta, PingFederate, SAML, and Azure Active Directory. +lakeFS Cloud uses Auth0 for authentication and thus supports the same identity providers as Auth0 including Active Directory/LDAP, ADFS, Azure Active Directory Native, Google Workspace, OpenID Connect, Okta, PingFederate, SAML, and Azure Active Directory. === "Okta" @@ -131,268 +131,309 @@ lakeFS Cloud uses Auth0 for authentication and thus support the same identity pr ## SSO for lakeFS Enterprise -Authentication in lakeFS Enterprise is handled by a secondary service which runs side-by-side with lakeFS. With a nod to Hogwarts and their security system, we've named this service _Fluffy_. Details for configuring the supported identity providers with Fluffy are shown below. In addition, please review the necessary [Helm configuration](#helm) to configure Fluffy. 
+Starting from v1.63.0, authentication in lakeFS Enterprise is handled directly by the lakeFS Enterprise service. lakeFS Enterprise supports the following identity providers: * Active Directory Federation Services (AD FS) (using SAML) * OpenID Connect * LDAP +* External AWS Authentication (using IAM) -If you're using an authentication provider that is not listed please [contact us](https://lakefs.io/contact-us/) for further assistance. +If you're using an authentication provider that is not listed, please [contact us](https://lakefs.io/contact-us/) for further assistance. === "Active Directory Federation Services (AD FS) (using SAML)" !!! note - AD FS integration uses certificates to sign & encrypt requests going out from Fluffy and decrypt incoming requests from AD FS server. + AD FS integration uses certificates to sign and encrypt requests going out from lakeFS Enterprise and decrypt incoming requests from AD FS server. - In order for Fluffy to work, the following values must be configured. Update (or override) the following attributes in the chart's `values.yaml` file. - - 1. Replace `fluffy.saml_rsa_public_cert` and `fluffy.saml_rsa_private_key` with real certificate values - 2. Replace `fluffyConfig.auth.saml.idp_metadata_url` with the metadata URL of the AD FS provider (e.g `adfs-auth.company.com`) - 3. Replace `fluffyConfig.auth.saml.external_user_id_claim_name` with the claim name representing user id name in AD FS - 4. Replace `lakefs.company.com` with your lakeFS server URL. 
- - If you'd like to generate the certificates using OpenSSL, you can take a look at the following example: + If you'd like to generate a self-signed certificate using OpenSSL: ```sh - openssl req -x509 -newkey rsa:2048 -keyout myservice.key -out myservice.cert -days 365 -nodes -subj "/CN=lakefs.company.com" - + openssl req -x509 -newkey rsa:2048 -keyout myservice.key -out myservice.cert -days 365 -nodes -subj "/CN=lakefs.company.com" ``` - - lakeFS Server Configuration (Update in helm's `values.yaml` file): + In order for SAML authentication to work, configure the following values in your chart's `values.yaml` file: ```yaml - auth: - cookie_auth_verification: - auth_source: saml - friendly_name_claim_name: displayName - persist_friendly_name: true - external_user_id_claim_name: samName - default_initial_groups: - - "Developers" - logout_redirect_url: "https://lakefs.company.com/logout-saml" - encrypt: - secret_key: shared-secrey-key - ui_config: - login_url: "https://lakefs.company.com/sso/login-saml" - logout_url: "https://lakefs.company.com/sso/logout-saml" - login_cookie_names: - - internal_auth_session - - saml_auth_session + ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / + + enterprise: + enabled: true + auth: + saml: + enabled: true + createCertificateSecret: true # NEW: Auto-creates secret + certificate: + samlRsaPublicCert: | # RENAMED: from saml_rsa_public_cert + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- + samlRsaPrivateKey: | # RENAMED: from saml_rsa_private_key + -----BEGIN PRIVATE KEY----- + ... 
+ -----END PRIVATE KEY----- + + image: + privateRegistry: + enabled: true + secretToken: + + lakefsConfig: | + blockstore: + type: local + auth: + logout_redirect_url: https:// + cookie_auth_verification: + auth_source: saml + # claim name to display user in the UI + friendly_name_claim_name: displayName + # claim name from IDP to use as the unique user name + external_user_id_claim_name: samName + default_initial_groups: + - "Developers" + providers: + saml: + post_login_redirect_url: https:// + sp_root_url: https:// + sp_sign_request: true + # depends on IDP + sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256" + # url to the metadata of the IDP + idp_metadata_url: "https:///federationmetadata/2007-06/federationmetadata.xml" + # IDP SAML claims format default unspecified + idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" + # depending on IDP setup, if CA certs are self signed and not trusted by a known CA + #idp_skip_verify_tls_cert: true ``` - Fluffy Configuration (Update in helm's `values.yaml` file): - - ```yaml - logging: - format: "json" - level: "INFO" - audit_log_level: "INFO" - output: "=" - auth: - encrypt: - secret_key: shared-secrey-key - logout_redirect_url: https://lakefs.company.com - post_login_redirect_url: https://lakefs.company.com - saml: - enabled: true - sp_root_url: https://lakefs.company.com - sp_x509_key_path: '/etc/saml_certs/rsa_saml_private.cert' - sp_x509_cert_path: '/etc/saml_certs/rsa_saml_public.pem' - sp_sign_request: true - sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256" - idp_metadata_url: "https://adfs-auth.company.com/federationmetadata/2007-06/federationmetadata.xml" - # idp_authn_name_id_format: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" - external_user_id_claim_name: samName - # idp_metadata_file_path: - # idp_skip_verify_tls_cert: true - ``` - === "OpenID Connect" - In order for Fluffy to work, the following values must be 
configured. Update (or override) the following attributes in the chart's `values.yaml` file. - - 1. Replace `lakefsConfig.friendly_name_claim_name` with the right claim name. - 1. Replace `lakefsConfig.default_initial_groups` with desired claim name (See [pre-configured][rbac-preconfigured] groups for enterprise) - 2. Replace `fluffyConfig.auth.logout_redirect_url` with your full OIDC logout URL (e.g `https://oidc-provider-url.com/logout/path`) - 3. Replace `fluffyConfig.auth.oidc.url` with your OIDC provider URL (e.g `https://oidc-provider-url.com`) - 4. Replace `fluffyConfig.auth.oidc.logout_endpoint_query_parameters` with parameters you'd like to pass to the OIDC provider for logout. - 5. Replace `fluffyConfig.auth.oidc.client_id` and `fluffyConfig.auth.oidc.client_secret` with the client ID & secret for OIDC. - 6. Replace `fluffyConfig.auth.oidc.logout_client_id_query_parameter` with the query parameter that represent the client_id, note that it should match the the key/query param that represents the client id and required by the specific OIDC provider. - 7. Replace `lakefs.company.com` with the lakeFS server URL. - - lakeFS Server Configuration (Update in helm's `values.yaml` file): - - ```yaml - # Important: make sure to include the rest of your lakeFS Configuration here! 
- auth: - encrypt: - secret_key: shared-secrey-key - oidc: - friendly_name_claim_name: "name" - persist_friendly_name: true - default_initial_groups: ["Developers"] - ui_config: - login_url: /oidc/login - logout_url: /oidc/logout - login_cookie_names: - - internal_auth_session - - oidc_auth_session - ``` - - Fluffy Configuration (Update in helm's `values.yaml` file): + In order for OIDC to work, configure the following values in your chart's `values.yaml` file: ```yaml - logging: - format: "json" - level: "INFO" - audit_log_level: "INFO" - output: "=" - installation: - fixed_id: fluffy-authenticator - auth: - post_login_redirect_url: / - logout_redirect_url: https://oidc-provider-url.com/logout/url - oidc: + ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / + + enterprise: + enabled: true + auth: + oidc: + enabled: true + # secret given by the OIDC provider (e.g auth0, Okta, etc) + client_secret: + + image: + privateRegistry: enabled: true - url: https://oidc-provider-url.com/ - client_id: - client_secret: - callback_base_url: https://lakefs.company.com - is_default_login: true - logout_client_id_query_parameter: client_id - logout_endpoint_query_parameters: - - returnTo - - https://lakefs.company.com/oidc/login - encrypt: - secret_key: shared-secrey-key + secretToken: + + lakefsConfig: | + blockstore: + type: local + auth: + logout_redirect_url: https://oidc-provider-url.com/logout/example + oidc: + friendly_name_claim_name: + default_initial_groups: ["Developers"] + providers: + oidc: + post_login_redirect_url: / + url: https://oidc-provider-url.com/ + client_id: + callback_base_url: https:// + # the claim name that represents the client identifier in the OIDC provider (e.g Okta) + logout_client_id_query_parameter: client_id + # the query parameters that will be used to redirect the user to the OIDC provider (e.g Okta) after logout + logout_endpoint_query_parameters: + - returnTo + - https:///oidc/login ``` === "LDAP" - Fluffy is 
incharge of providing LDAP authentication for lakeFS Enterprise. - The authentication works by querying the LDAP server for user information and authenticating the user based on the provided credentials. - - **Important:** An administrative bind user must be configured. It should have search permissions for the LDAP server that will be used to query the LDAP server for user information. - - **For Helm:** set the following attributes in the Helm chart values, for lakeFS `lakefsConfig.*` and `fluffyConfig.*` for fluffy. - - **No Helm:** If not using Helm use the YAML below to directly update the configuration file for each service. - - **lakeFS Configuration:** - - 1. Replace `auth.remote_authenticator.enabled` with `true` - 2. Replace `auth.remote_authenticator.endpoint` with the fluffy authentication server URL combined with the `api/v1/ldap/login` suffix (e.g `http://lakefs.company.com/api/v1/ldap/login`) - - **fluffy Configuration:** - - See [Fluffy configuration][fluffy-configuration] reference. - - 1. Replace `auth.ldap.remote_authenticator.server_endpoint` with your LDAP server endpoint (e.g `ldaps://ldap.ldap-address.com:636`) - 2. Replace `auth.ldap.remote_authenticator.bind_dn` with the LDAP bind user/permissions to query your LDAP server. - 3. Replace `auth.ldap.remote_authenticator.user_base_dn` with the user base to search users in. + lakeFS Enterprise provides direct LDAP authentication by querying the LDAP server for user information and authenticating the user based on the provided credentials. - **lakeFS Server Configuration file:** - - `$lakefs run -c ./lakefs.yaml` + **Important:** An administrative bind user must be configured. It should have search permissions for the LDAP server. ```yaml - # Important: make sure to include the rest of your lakeFS Configuration here! 
- - auth: - remote_authenticator: + ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / + + enterprise: + enabled: true + auth: + ldap: + enabled: true + bindPassword: + + image: + privateRegistry: enabled: true - endpoint: http://:/api/v1/ldap/login - default_user_group: "Developers" # Value needs to correspond with an existing group in lakeFS - ui_config: - logout_url: /logout - login_cookie_names: - - internal_auth_session - ``` - - Fluffy Configuration file: - - `$fluffy run -c ./fluffy.yaml` - - ```yaml - logging: - format: "json" - level: "INFO" - audit_log_level: "INFO" - output: "=" - installation: - fixed_id: fluffy-authenticator - auth: - post_login_redirect_url: / - ldap: - server_endpoint: 'ldaps://ldap.company.com:636' - bind_dn: uid=,ou=,o=,dc=,dc=com - bind_password: '' - username_attribute: uid - user_base_dn: ou=,o=,dc=,dc=com - user_filter: (objectClass=inetOrgPerson) - connection_timeout_seconds: 15 - request_timeout_seconds: 7 + secretToken: + + lakefsConfig: | + blockstore: + type: local + auth: + providers: + ldap: + server_endpoint: ldaps://ldap.company.com:636 + bind_dn: uid=,ou=Users,o=,dc=,dc=com + username_attribute: uid + user_base_dn: ou=Users,o=,dc=,dc=com + user_filter: (objectClass=inetOrgPerson) + connection_timeout_seconds: 15 + request_timeout_seconds: 7 + # RBAC group for first time users + default_user_group: "Developers" ``` -

Troubleshooting LDAP issues

+ ### Troubleshooting LDAP issues -

Inspecting Logs

+ #### Inspecting Logs - If you encounter LDAP connection errors, you should inspect the **fluffy container** logs to get more information. + If you encounter LDAP connection errors, inspect the lakeFS container logs to get more information. -

Authentication issues

+ #### Authentication issues Auth issues (e.g. user not found, invalid credentials) can be debugged with the `ldapwhoami` CLI tool. - The Examples are based on the fluffy config above: + The examples are based on the configuration above: To verify that the main bind user can connect: ```sh - ldapwhoami -H ldap://ldap.company.com:636 -D "uid=,ou=,o=,dc=,dc=com" -x -W + ldapwhoami -H ldaps://ldap.company.com:636 -D "uid=bind-user-name,ou=Users,o=org-id,dc=company,dc=com" -x -W ``` To verify that a specific lakeFS user `dev-user` can connect: ```sh - ldapwhoami -H ldap://ldap.company.com:636 -D "uid=dev-user,ou=,o=,dc=,dc=com" -x -W + ldapwhoami -H ldaps://ldap.company.com:636 -D "uid=dev-user,ou=Users,o=org-id,dc=company,dc=com" -x -W ``` -

User not found issue

+ #### User not found issue - Upon a login request in fluffy, the bind user will search for the user in the LDAP server. If the user is not found it will be presented in the logs. + Upon a login request, the bind user searches for the user in the LDAP server. If the user is not found, the failure is reported in the logs. We can search the user using [ldapsearch](https://docs.ldap.com/ldap-sdk/docs/tool-usages/ldapsearch.html) CLI tool. Search ALL users in the base DN (no filters): !!! note - `-b` is the `user_base_dn`, `-D` is `bind_dn` and `-w` is `bind_password` from the fluffy configuration. + `-b` is the `user_base_dn`, `-D` is `bind_dn` and `-w` is `bind_password` from the lakeFS configuration. ```sh - ldapsearch -H ldap://ldap.company.com:636 -x -b "ou=,o=,dc=,dc=com" -D "uid=,ou=,o=,dc=,dc=com" -w '' + ldapsearch -H ldaps://ldap.company.com:636 -x -b "ou=Users,o=org-id,dc=company,dc=com" -D "uid=bind-user-name,ou=Users,o=org-id,dc=company,dc=com" -w 'bind_user_pwd' ``` - If the user is found, we should now use filters for the specific user the same way fluffy does it and expect to see the user. + If the user is found, we should now use filters for the specific user the same way lakeFS does it and expect to see the user. 
- For example, to repdocue the same search as fluffy does: + For example, to reproduce the same search as lakeFS does: - user `dev-user` set from `uid` attribute in LDAP - - Fluffy configuration values: `user_filter: (objectClass=inetOrgPerson)` and `username_attribute: uid` + - Configuration values: `user_filter: (objectClass=inetOrgPerson)` and `username_attribute: uid` ```sh - ldapsearch -H ldap://ldap.company.com:636 -x -b "ou=,o=,dc=,dc=com" -D "uid=,ou=,o=,dc=,dc=com" -w '' "(&(uid=dev-user)(objectClass=inetOrgPerson))" + ldapsearch -H ldaps://ldap.company.com:636 -x -b "ou=Users,o=org-id,dc=company,dc=com" -D "uid=bind-user-name,ou=Users,o=org-id,dc=company,dc=com" -w 'bind_user_pwd' "(&(uid=dev-user)(objectClass=inetOrgPerson))" ``` -### Helm +=== "External AWS Authentication" + + lakeFS Enterprise supports authentication using AWS IAM credentials. This allows users to authenticate using their AWS credentials via GetCallerIdentity. + + ```yaml + ingress: + enabled: true + ingressClassName: + hosts: + - host: + paths: + - / + + lakefsConfig: | + auth: + external_aws_auth: + enabled: true + # the maximum age in seconds for the GetCallerIdentity request + #get_caller_identity_max_age: 60 + # headers that must be present by the client when doing login request + required_headers: + # same host as the lakeFS server ingress + X-LakeFS-Server-ID: + ``` + +### Common Configuration + +All authentication methods share some common configuration options: + +```yaml +# Enable enterprise features +enterprise: + enabled: true + +# Common lakeFS configuration +lakefsConfig: | + # Basic auth configuration + auth: + encrypt: + secret_key: # Set via secrets.authEncryptSecretKey + login_duration: 24h # Session duration + login_max_duration: 168h # Maximum session duration +``` + +### Secrets Management + +The Helm chart will automatically create Kubernetes secrets for sensitive values: + +```yaml +secrets: + # Required: encryption key for auth cookies + authEncryptSecretKey: + 
+ # For database connection (if using external DB) + databaseConnectionString: + +# Or use existing secrets +secrets: + existingSecret: my-lakefs-secrets + # Define the keys in your secret: + authEncryptSecretKeyName: authEncryptSecretKey + databaseConnectionStringKeyName: databaseConnectionString +``` + +Authentication provider secrets are managed separately via the `enterprise.auth.*` configuration. + +### Environment Variables + +All configurations can also be set via environment variables: + +```bash +# SAML +LAKEFS_AUTH_PROVIDERS_SAML_SP_ROOT_URL=https://lakefs.company.com +LAKEFS_AUTH_COOKIE_AUTH_VERIFICATION_EXTERNAL_USER_ID_CLAIM_NAME=samName -In order to use lakeFS Enterprise and Fluffy, we provided out of the box setup, see [lakeFS Helm chart configuration](https://github.com/treeverse/charts). +# OIDC +LAKEFS_AUTH_PROVIDERS_OIDC_CLIENT_ID=your-client-id +LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME=name -Notes: +# LDAP +LAKEFS_AUTH_PROVIDERS_LDAP_SERVER_ENDPOINT=ldaps://ldap.company.com:636 +LAKEFS_AUTH_PROVIDERS_LDAP_BIND_DN=uid=bind-user,ou=Users,o=org,dc=company,dc=com -* Check the [examples on GitHub](https://github.com/treeverse/charts/tree/master/examples/lakefs/enterprise) we provide for each authentication method (oidc/adfs/ldap + rbac). -* The examples are provisioned with a Postgres pod for quick-start, make sure to replace that to a stable database once ready. -* The encrypt secret key `secrets.authEncryptSecretKey` is shared between fluffy and lakeFS for authentication. -* The lakeFS `image.tag` must be >= 1.0.0 -* The fluffy `image.tag` must be >= 0.2.7 -* Change the `ingress.hosts[0]` from `lakefs.company.com` to a real host (usually same as lakeFS), also update additional references in the file (note: URL path after host if provided should stay unchanged). 
-* Update the `ingress` configuration with other optional fields if used -* Fluffy docker image: replace the `fluffy.image.privateRegistry.secretToken` with real token to dockerhub for the fluffy docker image. +# External AWS Auth +LAKEFS_AUTH_EXTERNAL_AWS_AUTH_ENABLED=true +``` \ No newline at end of file
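
The environment variable names listed in the new "Environment Variables" section appear to follow one pattern: take the dotted YAML path, replace the dots with underscores, uppercase it, and add a `LAKEFS_` prefix (for example, `auth.providers.saml.sp_root_url` becomes `LAKEFS_AUTH_PROVIDERS_SAML_SP_ROOT_URL`). The following is a minimal sketch of that mapping, assuming the pattern holds for every key; this is inferred only from the examples in this diff, and the helper names `config_key_to_env_var` and `flatten_config` are hypothetical, not part of lakeFS.

```python
from typing import Any, Dict

# Prefix observed on every variable in the examples above.
ENV_PREFIX = "LAKEFS_"


def config_key_to_env_var(dotted_key: str) -> str:
    """Map a dotted config path to the env var name seen in the examples.

    Assumption: the dots-to-underscores, uppercase, prefixed pattern applies
    to all keys, which the diff only confirms for the listed ones.
    """
    return ENV_PREFIX + dotted_key.replace(".", "_").upper()


def flatten_config(config: Dict[str, Any], parent: str = "") -> Dict[str, str]:
    """Flatten a nested config mapping into {ENV_VAR_NAME: string value}."""
    flat: Dict[str, str] = {}
    for key, value in config.items():
        dotted = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested sections such as auth.providers.ldap.
            flat.update(flatten_config(value, dotted))
        elif isinstance(value, bool):
            # Render booleans in lowercase, matching the `=true` example.
            flat[config_key_to_env_var(dotted)] = "true" if value else "false"
        else:
            flat[config_key_to_env_var(dotted)] = str(value)
    return flat


if __name__ == "__main__":
    sample = {
        "auth": {
            "external_aws_auth": {"enabled": True},
            "providers": {"saml": {"sp_root_url": "https://lakefs.company.com"}},
        }
    }
    for name, value in sorted(flatten_config(sample).items()):
        print(f"{name}={value}")
```

A helper like this can be handy for checking that a `lakefsConfig` block and a set of container environment variables describe the same settings; list-valued keys are simply stringified here and would need separate handling.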