Airbyte

Beta

Overview

This module deploys Airbyte, an open-source data integration platform, on Kubernetes. It sets up the necessary Helm release, configures the database, and exposes the Airbyte webserver. This module also includes OAuth2 proxy configuration for secure access.

Tip

This implementation uses the official Airbyte Helm chart. You can find more details in the Airbyte documentation.

Helm values have been customized from the defaults, and common configuration needs are exposed as variables.

If needed, the entire Helm chart can be customized by setting the override_helm_values variable.
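As a minimal sketch of how override_helm_values might be set, the example below builds the YAML string with Terraform's built-in yamlencode function. The Helm keys shown (webapp.replicaCount) are an assumption used for illustration only; confirm them against the values.yaml of the chart version you deploy. Whether the override replaces or merges with the module's customized values is not stated here, so check the module source before relying on either behavior.

```hcl
module "airbyte" {
  source = "kadreio/relativistic/kubernetes//modules/airbyte"

  use_external_pg = false
  enable_proxy    = false

  # Assumption: illustrative chart keys, not taken from this module's docs.
  # Verify them against the Airbyte chart's values.yaml.
  override_helm_values = yamlencode({
    webapp = {
      replicaCount = 1
    }
  })
}
```

Using yamlencode rather than a hand-written heredoc avoids YAML indentation mistakes, since Terraform generates the string from a regular object.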

Compute Nodes

For optimal performance, the following resources are recommended:

  • 4 or more CPU cores

  • At least 8GB of memory

While Airbyte can run in low-resource mode with 2 CPUs and 4GB RAM, this is not recommended for production deployments.

Note that Airbyte will schedule new pods for every sync job.

Note

Airbyte includes a self-hosted version of Postgres. It does not back itself up and is not highly available (HA). For many use cases this is not a problem, since Airbyte is not itself a system of record and can easily be recreated during disaster recovery. However, if you care about configuration durability, consider using a managed database. The module exposes Postgres configuration options; see the example for more details.

Warning

Some versions of Airbyte have issues with tolerations and taints. See this discussion for more details.

Warning

The version of MinIO bundled with Airbyte has issues starting on ARM64 nodes. Use AMD64 nodes instead.

Warning

If deploying on EKS, note that Fargate is not supported: Airbyte pods dynamically create new pods, which does not work on Fargate. See this discussion for more details.

Local Deployment

Airbyte will start on port 30080 by default.

Production Considerations

Airbyte is licensed under the Elastic License 2.0. Consult the Airbyte licensing page for details on how that affects your use case.

Airbyte OSS does not include authentication or authorization. We recommend using OAuth2 Proxy to secure access to the Airbyte webserver. For more advanced security, consider using an enterprise version of Airbyte.

To use the oauth2 proxy, you will need to provide a client id and secret. You can get these by creating a project in the Google Developer Console. Also, provide a list of emails to allow access to the service via the userlist variable. See the variable se

Examples

Note

All examples omit the configuration for the kubernetes provider and helm provider. You can find more information about how to configure these providers in the usage section.
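For orientation, a minimal sketch of the omitted provider configuration might look like the following. It assumes a local kubeconfig at ~/.kube/config (an assumption, not something this module requires) and uses the nested kubernetes block syntax of the 2.x Helm provider. Managed clusters usually need cluster-specific authentication instead.

```hcl
# Assumption: credentials come from a local kubeconfig file.
provider "kubernetes" {
  config_path = "~/.kube/config"
}

# Helm provider 2.x syntax: Kubernetes settings go in a nested block.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}
```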

Simple

module "airbyte" {
  source = "kadreio/relativistic/kubernetes//modules/airbyte"

  use_external_pg = false
  enable_proxy    = false
}

OAuth2 Proxy and External Database

module "airbyte" {
  source = "kadreio/relativistic/kubernetes//modules/airbyte"

  # Enable and configure OAuth2 proxy
  enable_proxy = true
  google_oauth_client_id     = "your-client-id"
  google_oauth_client_secret = "your-client-secret"
  # cookie_secret will be auto-generated if not provided
  
  # Configure allowed users
  userlist = <<EOF
    user1@example.com
    user2@example.com
  EOF

  # Replace with your external PostgreSQL configuration
  use_external_pg = true
  db_host     = "your-postgres-host"
  db_port     = 5432
  db_name     = "airbyte"
  db_user     = "airbyte_user"
  db_password = "your-secure-password"

  # Optional: Configure custom domain
  target_domain     = "airbyte.yourdomain.com"
  deployment_domain = "yourdomain.com"
  airbyte_subdomain = "airbyte"
} 
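The example above hardcodes the OAuth client secret and database password for readability. One possible variation, sketched below, passes them through root-level variables marked sensitive so that Terraform redacts them in plan output. The variable names oauth_client_id, oauth_client_secret, and airbyte_db_password are hypothetical, chosen for this sketch; the module inputs they feed are the ones documented under Inputs.

```hcl
# Hypothetical root-module variables; the names are illustrative.
variable "oauth_client_id" {
  type = string
}

variable "oauth_client_secret" {
  type      = string
  sensitive = true
}

variable "airbyte_db_password" {
  type      = string
  sensitive = true
}

module "airbyte" {
  source = "kadreio/relativistic/kubernetes//modules/airbyte"

  enable_proxy               = true
  google_oauth_client_id     = var.oauth_client_id
  google_oauth_client_secret = var.oauth_client_secret

  # Placeholder address; replace with the users you want to allow.
  userlist = <<EOF
    user1@example.com
  EOF

  use_external_pg = true
  db_host         = "your-postgres-host"
  db_password     = var.airbyte_db_password
}
```

The values can then be supplied outside the configuration, for example through TF_VAR_oauth_client_secret and TF_VAR_airbyte_db_password environment variables. Note that sensitive values are still written to the Terraform state, so the state backend should be protected accordingly.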

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| airbyte_chart_version | Airbyte chart version | string | "1.1.0" | no |
| airbyte_subdomain | The subdomain for Airbyte | string | "airbyte" | no |
| cookie_secret | Random value to use as a cookie secret for OAuth2 Proxy | string | "" | no |
| db_host | PostgreSQL database host | string | "airbyte-postgresql" | no |
| db_name | PostgreSQL database name | string | "airbyte" | no |
| db_password | PostgreSQL database password | string | "" | no |
| db_port | PostgreSQL database port | number | 5432 | no |
| db_user | PostgreSQL database user | string | "airbyte" | no |
| deployment_domain | The deployment domain | string | "" | no |
| enable_proxy | Enable OAuth2 proxy for Airbyte | bool | true | no |
| google_oauth_client_id | Google OAuth client ID | string | "" | no |
| google_oauth_client_secret | Google OAuth client secret | string | "" | no |
| override_helm_values | Override Helm values as a YAML string | string | "" | no |
| target_domain | The URL of the deployed application | string | "localhost:30080" | no |
| use_external_pg | Use external PostgreSQL for Airbyte | bool | false | no |
| userlist | Newline-delimited list of users that can access the service | string | "        [email protected]\n        [email protected]\n" | no |

Outputs

No outputs.

Providers

| Name | Version |
|------|---------|
| helm | 2.16.1 |
| kubernetes | 2.33.0 |
| random | 3.6.3 |

Requirements

No requirements.
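Because the module declares no version requirements, callers who want reproducible runs may wish to pin providers in their own root configuration. The sketch below uses pessimistic constraints around the versions listed under Providers. It assumes the providers are the ones published under the hashicorp namespace, which this page does not state explicitly.

```hcl
terraform {
  required_providers {
    # Assumption: hashicorp-namespace providers; the constraints track
    # the versions listed in the Providers table.
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.16"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.33"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.6"
    }
  }
}
```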

Resources

| Name | Type |
|------|------|
| helm_release.airbyte | resource |
| helm_release.oauth2_proxy | resource |
| kubernetes_secret.db_secrets | resource |
| kubernetes_service.expose_airbyte_webserver | resource |
| kubernetes_service.expose_proxy | resource |
| random_string.cookie_secret | resource |

Modules

No modules.