Meltano provides a command line interface (CLI) that makes it easy to manage your project, plugins, and EL(T) pipelines.
To quickly find the `meltano` subcommand you’re looking for, use the Table of Contents in the sidebar.
For a better understanding of command line documentation syntax, the docopt standard is useful.
add
#
`meltano add` lets you add plugins to your Meltano project.
Specifically, it will:
- add the plugin to your `meltano.yml` project file under `plugins: <type>s:`, e.g. `plugins: extractors:`, and
- if a `pip_url` is specified, install the new plugin using `meltano install <type> <name>`, which will:
  - create a virtual environment inside the `.meltano` directory at `.meltano/<type>s/<name>/venv`, e.g. `.meltano/extractors/tap-gitlab/venv`, and
  - install the plugin into it using `pip install <pip_url>`.
(Some plugin types have slightly different or additional behavior; refer to the plugin type documentation for more details.)
Once the plugin has been added to your project, you can configure it using `meltano config`, invoke its executable using `meltano invoke`, and use it in a pipeline using `meltano elt`.
To learn more about adding a plugin to your project, refer to the Plugin Management guide.
Note: Unlike `meltano install`, this command installs plugins serially to avoid missing dependencies (e.g. a `transform` requires the `dbt` plugin to be installed first).
The only required arguments are the new plugin’s type and unique name:
meltano add <type> <name>
# For example:
meltano add extractor tap-gitlab
meltano add loader target-postgres
Without a `--custom` or `--inherit-from` option, this will add the discoverable plugin with the provided name to your `meltano.yml` project file using a shadowing plugin definition.
If multiple variants of the discoverable plugin are available, the specific variant to add can be identified using the `--variant` option:
meltano add <type> <name> --variant <variant>
# For example:
meltano add loader target-postgres --variant transferwise
To add a custom plugin using a custom plugin definition, use the `--custom` flag:
meltano add --custom <type> <name>
# For example:
meltano add --custom extractor tap-covid-19
# If you're using Docker, don't forget to mount the project directory,
# and ensure that interactive mode is enabled so that Meltano can ask you
# additional questions about the plugin and get your answers over STDIN:
docker run --interactive -v $(pwd):/project -w /project meltano/meltano add --custom extractor tap-covid-19
To add a plugin inheriting from an existing one using an inheriting plugin definition, use the `--inherit-from` option:
meltano add <type> <name> --inherit-from <existing-name>
# For example:
meltano add extractor tap-ga--client-foo --inherit-from tap-google-analytics
- `--custom`: Add a custom plugin. The command will prompt you for the package’s base plugin description metadata.
- `--inherit-from=<existing-name>`: Add a plugin inheriting from an existing plugin in the project or a discoverable plugin identified by name.
- `--as=<new-name>`: `meltano add <type> <name> --as=<new-name>` is equivalent to `meltano add <type> <new-name> --inherit-from=<name>`, and can be used to add a discoverable plugin to your project with a different name.
- `--variant=<variant>`: Add a specific (non-default) variant of the identified discoverable plugin.
- `--include-related`: Also add transform, dashboard, and model plugins related to the identified discoverable extractor.
config
#
Enables you to manage the configuration of Meltano itself or any of its plugins, as well as plugin extras.
When no explicit `--store` is specified, `meltano config <plugin> set` will automatically store the value in the most appropriate location:
- `.env`, if a setting is sensitive or environment-specific (defined as `kind: password` or `env_specific: true`);
- `meltano.yml` otherwise.
If supported by the plugin type, its configuration can be tested using `meltano config <plugin> test`.
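The automatic store selection described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not Meltano's implementation; the function name `default_store` and the dictionary shape of a setting definition are assumptions, while the store names `dotenv` and `meltano_yml` are the ones accepted by `--store`.

```python
def default_store(setting: dict) -> str:
    """Pick a store for a setting when no explicit --store is given.

    `setting` is a hypothetical dict mirroring a setting definition.
    """
    sensitive = setting.get("kind") == "password"
    env_specific = setting.get("env_specific") is True
    # Sensitive or environment-specific values go to `.env`,
    # everything else goes to `meltano.yml`.
    return "dotenv" if sensitive or env_specific else "meltano_yml"


print(default_store({"name": "private_token", "kind": "password"}))
print(default_store({"name": "api_url", "env_specific": True}))
print(default_store({"name": "start_date"}))
```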
To manage the configuration of Meltano itself, specify `meltano` as the plugin name.
# List all settings for Meltano itself with their names,
# environment variables, and current values
meltano config meltano list
# List all settings for the specified plugin with their names,
# environment variables, and current values
meltano config <plugin> list
# View the plugin's current configuration.
meltano config <plugin>
# Sets the configuration's setting `<name>` to `<value>`.
meltano config <plugin> set <name> <value>
# Values are parsed as JSON, and interpreted as simple strings when invalid
meltano config <plugin> set <name> <string> # String with no meaning in JSON
meltano config <plugin> set <name> "<word> <word> ..." # Multi-word string with no meaning in JSON
meltano config <plugin> set <name> <json> # JSON that fits in a single word
meltano config <plugin> set <name> '<json>' # JSON in a string argument
meltano config <plugin> set <name> '"<string>"' # JSON string
meltano config <plugin> set <name> <number> # JSON number, e.g. 100 or 3.14
meltano config <plugin> set <name> <true/false> # Boolean True or False
meltano config <plugin> set <name> '[<elem>, ...]' # Array
meltano config <plugin> set <name> '{"<key>": <value>, ...}' # JSON object
# Remove the configuration's setting `<name>`.
meltano config <plugin> unset <name>
# Clear the configuration (back to defaults).
meltano config <plugin> reset
# Set, unset, or reset in a specific location
meltano config <plugin> set --store=meltano_yml <name> <value> # set in `meltano.yml`
meltano config <plugin> unset --store=dotenv <name> # unset in `.env`
meltano config <plugin> reset --store=db # reset in system database
# Test the plugin's current configuration, if supported.
meltano config <plugin> test
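The rule that values are parsed as JSON and fall back to plain strings when they are not valid JSON can be sketched as follows. This is a minimal Python illustration of that rule, assuming nothing about Meltano's internals; `parse_value` is a hypothetical helper name.

```python
import json


def parse_value(raw: str):
    """Interpret a command-line value as JSON, else keep it as a string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON: treat the argument as a simple string.
        return raw


for raw in ["hello", "two words", '"quoted"', "100", "3.14", "true", '[1, "a"]', '{"key": 1}']:
    print(repr(raw), "->", repr(parse_value(raw)))
```

Note that a bare word such as `hello` is not valid JSON, so it stays a string, while `100` and `true` become a number and a boolean. To store the string `"100"`, the value has to be passed as a quoted JSON string (`'"100"'`).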
If multiple plugins share the same name, you can provide an additional `--plugin-type` argument to disambiguate:
meltano config --plugin-type=<type> <plugin> ...
Nested properties can be set (and unset) by specifying a list of property names:
meltano config <plugin> set <property> <subproperty> <value>
meltano config <plugin> set <property> <deep> <nesting> <value>
meltano config <plugin> unset <property> <subproperty>
This will result in the following configuration being passed on to the plugin:
{
  "<property>": {
    "<subproperty>": "<value>",
    "<deep>": { "<nesting>": "<value>" }
  }
}
Note that `meltano config <plugin> list` always displays full config keys with nesting represented by the `.` separator, matching the internal flattened representation:
meltano config <plugin> list
# => <property>.<subproperty>
# => <property>.<deep>.<nesting>
You can also set nested properties using the `.` separator, but specifying a list of names is preferred since this will result in the nesting being reflected in the plugin’s `config` object in your `meltano.yml` project file:
meltano config <plugin> set <property> <deep> <nesting> <value>
# `meltano.yml`:
# config:
#   <property>:
#     <deep>:
#       <nesting>: <value>
meltano config <plugin> set <property>.<deep>.<nesting> <value>
# `meltano.yml`:
# config:
#   <property>.<deep>.<nesting>: <value>
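The relationship between a list of property names, the nested configuration object, and the flattened `.`-separated keys shown by `list` can be sketched in Python. The helpers `set_nested` and `flatten` are hypothetical names used only for illustration; Meltano's own code may differ.

```python
def set_nested(config: dict, path: list, value) -> dict:
    """Set `value` at the nested location described by `path`."""
    node = config
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return config


def flatten(config: dict, prefix: str = "") -> dict:
    """Collapse nested dicts into keys joined by the `.` separator."""
    flat = {}
    for key, value in config.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


config = {}
set_nested(config, ["property", "subproperty"], "a")
set_nested(config, ["property", "deep", "nesting"], "b")
print(config)
print(flatten(config))
```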
config
with Environments #
If you have multiple Meltano Environments you can specify the environment name:
meltano --environment=<ENVIRONMENT> config <plugin>
In the context of `meltano config`, plugin extras are distinguished from regular plugin-specific settings using an underscore (`_`) prefix, e.g. `_example_extra`. This also applies in the environment variables that can be used to override them at runtime: since setting names for extras are prefixed with underscores (`_`), they get an extra underscore to separate them from the plugin name, e.g. `TAP_EXAMPLE__EXAMPLE_EXTRA`.
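The double underscore in the environment variable name falls out of simple concatenation, as this Python sketch shows. It assumes the variable prefix is the plugin name upper-cased with hyphens turned into underscores, which matches the examples on this page but is not guaranteed for every plugin; `env_var_name` is a hypothetical helper.

```python
def env_var_name(plugin_name: str, setting_name: str) -> str:
    """Build an environment variable name from a plugin and setting name."""
    prefix = plugin_name.replace("-", "_").upper()
    # One underscore separates the plugin prefix from the setting name;
    # an extra's own leading underscore then yields a double underscore.
    return f"{prefix}_{setting_name.upper()}"


print(env_var_name("tap-example", "_example_extra"))
print(env_var_name("tap-gitlab", "api_url"))
```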
By default, `meltano config <plugin>` and `meltano config <plugin> list` only take into account regular plugin settings.
An `--extras` flag can be passed to view or list only extras instead.
Be aware that `meltano config <plugin> reset` resets both regular settings and extras.
# List all extras for the specified plugin with their names,
# environment variables, and current values
meltano config <plugin> list --extras
# View the plugin's current extras
meltano config --extras <plugin>
# Set value of extra `<extra>` to `<value>` through the `_<extra>` setting
meltano config <plugin> set _<extra> <value>
# Unset extra `<extra>`
meltano config <plugin> unset _<extra>
# Reset regular settings _and_ extras
meltano config <plugin> reset
discover
#
Lists the available discoverable plugins and their variants.
# List all available plugins
meltano discover all
# Only list available extractors
meltano discover extractors
# Only list available loaders
meltano discover loaders
# Only list available models
meltano discover models
elt
#
This allows you to run your ELT pipeline to Extract, Load, and Transform data using an extractor and loader of your choosing, and optional transformations.
To allow subsequent pipeline runs with the same extractor/loader/transform combination to pick up right where the previous run left off, each ELT run has a Job ID that is used to store and look up the incremental replication state in the system database. If no stable identifier is provided using the `--job_id` flag or the `MELTANO_JOB_ID` environment variable, extraction will always start from scratch and a one-off Job ID is automatically generated using the current date and time.
All the output generated by this command is also logged inside the `.meltano` directory at `.meltano/logs/elt/{job_id}/{run_id}/elt.log`. The `run_id` is a UUID autogenerated at each run.
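How the Job ID and log location fit together can be sketched in Python. Several details here are assumptions made for illustration only: that the `--job_id` flag wins over `MELTANO_JOB_ID` when both are set, and the timestamp format of the one-off ID (the text only says it is based on the current date and time). The function names are hypothetical.

```python
import uuid
from datetime import datetime, timezone
from pathlib import PurePosixPath


def resolve_job_id(cli_job_id, environ):
    """Return (job_id, is_stable) for a pipeline run."""
    job_id = cli_job_id or environ.get("MELTANO_JOB_ID")
    if job_id:
        return job_id, True
    # Assumed format for the one-off ID; no state is reused in this case.
    one_off = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H%M%S")
    return one_off, False


def log_path(job_id, run_id):
    """Build the log file location described above."""
    return PurePosixPath(".meltano", "logs", "elt", job_id, str(run_id), "elt.log")


job_id, stable = resolve_job_id("gitlab-to-postgres", {})
print(log_path(job_id, uuid.uuid4()))
```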
The preview command `meltano run` is a new way to run cross-plugin workflows, including ELT, in a composable manner.
meltano elt <extractor> <loader> [--transform={run,skip,only}] [--job_id TEXT]
The `--job_id` option identifies related EL(T) runs when storing and looking up incremental replication state.
The `--transform` option can be:
- `run`: run the Transforms
- `skip`: skip the Transforms (Default)
- `only`: only run the Transforms (skip the Extract and Load steps)
A `--full-refresh` flag can be passed to perform a full refresh, ignoring state left behind by any previous runs with the same Job ID.
A `--force` flag can be passed to force a new run even when a pipeline with the same Job ID is already running, which would result in an error otherwise.
A `--catalog` option can be passed to manually provide a catalog file for the extractor, as an alternative to letting one be generated on the fly.
This is equivalent to setting the `catalog` extractor extra.
A `--state` option can be passed to manually provide a state file for the extractor, as an alternative to letting state be looked up based on the Job ID.
This is equivalent to setting the `state` extractor extra.
One or more `--select <entity>` options can be passed to only extract records for matching selected entities.
Similarly, `--exclude <entity>` can be used to extract records for all selected entities except for those specified.
Notes:
- The currently selected entities can be listed using `meltano select --list <extractor>`.
- Exclusion using `--exclude` takes precedence over inclusion using `--select`.
- Specifying `--select` and/or `--exclude` is equivalent to setting the `select_filter` extractor extra.
A `--dump` option can be passed (along with any of the other options) to dump the content of a pipeline-specific generated file to STDOUT instead of actually running the pipeline.
This can aid in debugging extractor catalog generation, incremental replication state lookup, and pipeline environment variables.
Supported values are:
- `catalog`: Dump the extractor catalog file that would be passed to the tap’s executable using the `--catalog` option.
- `state`: Dump the extractor state file that would be passed to the tap’s executable using the `--state` option.
- `extractor-config`: Dump the extractor config file that would be passed to the tap’s executable using the `--config` option.
- `loader-config`: Dump the loader config file that would be passed to the target’s executable using the `--config` option.
Like any standard output, the dumped content can be redirected to a file using `>`, e.g. `meltano elt ... --dump=state > state.json`.
meltano elt tap-gitlab target-postgres --transform=run --job_id=gitlab-to-postgres
meltano elt tap-gitlab target-postgres --job_id=gitlab-to-postgres --full-refresh
meltano elt tap-gitlab target-postgres --catalog extract/tap-gitlab.catalog.json
meltano elt tap-gitlab target-postgres --state extract/tap-gitlab.state.json
meltano elt tap-gitlab target-postgres --select commits
meltano elt tap-gitlab target-postgres --exclude project_members
meltano elt tap-gitlab target-postgres --job_id=gitlab-to-postgres --dump=state > extract/tap-gitlab.state.json
elt
with Environments #
The `--environment` option can be passed to specify a Meltano Environment context for running.
meltano --environment=prod elt tap-gitlab target-postgres --transform=run --job_id=gitlab-to-postgres
If extraction, loading, or transformation is failing, or otherwise not behaving as expected, you can learn more about what’s going on behind the scenes by setting Meltano’s `cli.log_level` setting to `debug`, using the `MELTANO_CLI_LOG_LEVEL` environment variable or the `--log-level` CLI option:
MELTANO_CLI_LOG_LEVEL=debug meltano elt ...
meltano --log-level=debug elt ...
In debug mode, `meltano elt` will log the arguments and environment used to invoke the Singer tap and target executables (and `dbt`, when running transformations), including the paths to the generated config, catalog, and state files, for you to review:
$ meltano --log-level=debug elt tap-gitlab target-jsonl --job_id=gitlab-to-jsonl
meltano | INFO Running extract & load...
meltano | INFO Found state from 2020-08-05 21:30:20.487312.
meltano | DEBUG Invoking: ['demo-project/.meltano/extractors/tap-gitlab/venv/bin/tap-gitlab', '--config', 'demo-project/.meltano/run/tap-gitlab/tap.config.json', '--state', 'demo-project/.meltano/run/tap-gitlab/state.json']
meltano | DEBUG Env: {'TAP_GITLAB_API_URL': 'https://gitlab.com', 'GITLAB_API_TOKEN': '', 'GITLAB_API_GROUPS': '', 'GITLAB_API_PROJECTS': 'meltano/meltano', 'GITLAB_API_ULTIMATE_LICENSE': 'False', 'GITLAB_API_START_DATE': '2021-03-01'}
meltano | DEBUG Invoking: ['demo-project/.meltano/loaders/target-jsonl/venv/bin/target-jsonl', '--config', 'demo-project/.meltano/run/target-jsonl/target.config.json']
meltano | DEBUG Env: {'MELTANO_EXTRACTOR_NAME': 'tap-gitlab', 'MELTANO_EXTRACTOR_NAMESPACE': 'tap_gitlab', 'MELTANO_EXTRACT_API_URL': 'https://gitlab.com', 'MELTANO_EXTRACT_PRIVATE_TOKEN': '', 'MELTANO_EXTRACT_GROUPS': '', 'MELTANO_EXTRACT_PROJECTS': 'meltano/meltano', 'MELTANO_EXTRACT_ULTIMATE_LICENSE': 'False', 'MELTANO_EXTRACT_START_DATE': '2021-03-01', 'TAP_GITLAB_API_URL': 'https://gitlab.com', 'GITLAB_API_TOKEN': '', 'GITLAB_API_GROUPS': '', 'GITLAB_API_PROJECTS': 'meltano/meltano', 'GITLAB_API_ULTIMATE_LICENSE': 'False', 'GITLAB_API_START_DATE': '2021-03-01', 'TARGET_JSONL_DESTINATION_PATH': 'output', 'TARGET_JSONL_DO_TIMESTAMP_FILE': 'False'}
Note that the contents of these pipeline-specific generated files can also easily be dumped to STDOUT or a file using the `--dump` option described above.
Additionally, all Singer messages output by the tap and target will be logged, identified by `<plugin name> (out)` prefixes:
tap-gitlab | INFO Starting sync
tap-gitlab (out) | {"type": "SCHEMA", "stream": "projects", "schema": {"type": "object", "properties": {...}}, "key_properties": ["id"]}
tap-gitlab (out) | {"type": "RECORD", "stream": "projects", "record": {"id": 7603319, "name": "Meltano", ...}, "time_extracted": "2020-08-05T21:30:22.988250Z"}
tap-gitlab (out) | {"type": "STATE", "value": {"project_7603319": "2020-08-05T21:04:59.158000Z"}}
tap-gitlab | INFO Sync complete
target-jsonl (out) | {"project_7603319": "2020-08-05T21:04:59.158000Z"}
meltano | INFO Incremental state has been updated at 2020-08-05 21:30:26.669170.
meltano | DEBUG Incremental state: {'project_7603319': '2020-08-05T21:04:59.158000Z'}
meltano | INFO Extract & load complete!
environment
#
Use the `environment` command to manage Environments in your Meltano project.
# Add an environment
meltano environment add <environment_name>
# Remove an environment
meltano environment remove <environment_name>
# List available environments
meltano environment list
Once an Environment is configured, the `--environment` option or `MELTANO_ENVIRONMENT` environment variable can be used with commands that support Environments, including `meltano config`, `meltano elt`, `meltano invoke`, `meltano run`, and `meltano select`.
If a value is provided for `default_environment` in your `meltano.yml`, these commands will run using that Environment when no `--environment` option or `MELTANO_ENVIRONMENT` environment variable is provided. If `default_environment` is set but you would prefer to use no environment, pass `--environment=null` (or, equivalently, `--environment null`) or use the `--no-environment` flag.
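The lookup order described above can be sketched as a small Python function. This is an illustration only: the function name and parameters are hypothetical, and the choice to let the `--environment` option take priority over `MELTANO_ENVIRONMENT` when both are set is an assumption, since the text does not state which one wins.

```python
def resolve_environment(cli_value=None, no_environment=False,
                        environ=None, default_environment=None):
    """Return the active Environment name, or None for no environment."""
    environ = environ or {}
    # `--no-environment` or `--environment=null` disables Environments,
    # even when `default_environment` is set in `meltano.yml`.
    if no_environment or cli_value == "null":
        return None
    # Assumed priority: CLI option, then env var, then the project default.
    return cli_value or environ.get("MELTANO_ENVIRONMENT") or default_environment


print(resolve_environment(cli_value="prod", default_environment="dev"))
print(resolve_environment(default_environment="dev"))
print(resolve_environment(cli_value="null", default_environment="dev"))
```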
# Add a new Environment
meltano environment add prod
# List existing Environments
meltano environment list
# Add plugin configuration within the new Environment
meltano --environment=prod config target-postgres set batch_size_rows 50000
# Remove an Environment
meltano environment remove prod
init
#
Used to create a new Meltano project directory inside the current working directory.
The new project directory will contain:
- a `meltano.yml` project file that will list any plugins you’ll add and pipeline schedules you’ll create,
- `.gitignore`, `README.md`, and `requirements.txt` for you to edit (or delete) as appropriate, and
- `model`, `extract`, `load`, `transform`, `analyze`, `notebook`, and `orchestrate` directories for you to use (or delete) as you please.
Anonymous usage statistics are enabled by default, unless the `--no_usage_stats` flag is provided, the `MELTANO_DISABLE_TRACKING` environment variable is enabled, or you set `send_anonymous_usage_stats: false` in your `meltano.yml`.
# Format
meltano init [project_name] [--no_usage_stats]
- `--no_usage_stats`: disables anonymous usage statistics for the new project, like setting `send_anonymous_usage_stats: false`.
# Initialize a new Meltano project in the
# "demo-project" directory, and...
# - share anonymous usage data with the Meltano team
# to help them gauge interest in Meltano and its
# features and drive development time accordingly:
meltano init demo-project
# - OR don't share anything with the Meltano team
# about this specific project:
meltano init demo-project --no_usage_stats
# - OR don't share anything with the Meltano team
# about any project I initialize ever:
SHELLRC=~/.$(basename $SHELL)rc # ~/.bashrc, ~/.zshrc, etc
echo "export MELTANO_DISABLE_TRACKING=1" >> $SHELLRC
meltano init demo-project # --no_usage_stats is implied
install
#
Installs dependencies of your project based on the meltano.yml file.
Optionally, provide a plugin type argument to only (re)install plugins of a certain type. Additionally, plugin names can be provided to only (re)install those specific plugins.
Use `--include-related` to automatically install transform, model, and dashboard plugins related to installed extractor plugins.
Subsequent calls to `meltano install` will upgrade a plugin to its latest version, if any. To completely uninstall and reinstall a plugin, use `--clean`.
Meltano installs plugins in parallel. The number of plugins to install in parallel defaults to the number of CPUs on the machine, but can be controlled with `--parallelism`. Use `--parallelism=1` to disable the feature and install them one at a time.
If you're using a custom Docker image, make sure `python3-venv` is installed:
FROM ubuntu:20.04
RUN apt-get update && \
    apt-get install -y -q \
    gcc \
    sqlite3 \
    libsqlite3-dev \
    python3 \
    python3-pip \
    python3-venv # Add this line
RUN pip3 install meltano
WORKDIR /meltano
COPY meltano.yml meltano.yml
RUN mkdir .meltano/ && meltano install
meltano install
meltano install extractors
meltano install extractor tap-gitlab
meltano install extractors tap-gitlab tap-adwords
meltano install models
meltano install --include-related
meltano install --parallelism=16
meltano install --clean
invoke
#
Invoke the plugin’s executable with specified arguments.
meltano invoke <plugin> [PLUGIN_ARGS...]
If multiple plugins share the same name, you can provide an additional `--plugin-type` argument to disambiguate:
meltano invoke --plugin-type=<type> <plugin> [PLUGIN_ARGS...]
A `--dump` option can be passed to dump the content of a generated config file or extractor catalog file to STDOUT instead of actually invoking the plugin:
meltano invoke --dump=config <plugin>
meltano invoke --dump=catalog <plugin>
Like any standard output, the dumped content can be redirected to a file using `>`, e.g. `meltano invoke --dump=catalog <plugin> > catalog.json`.
invoke
with Environments #
If you have multiple Meltano Environments you can specify the environment name:
meltano --environment=<ENVIRONMENT> invoke <plugin> [PLUGIN_ARGS...]
Plugins can define commands, which are shortcuts for combinations of arguments. These can be invoked with the shortcut command of the form `meltano invoke <plugin>:<command>`.
meltano invoke dbt:seed
meltano invoke dbt:snapshot
Additional arguments can be specified as well, which will be appended to the command.
meltano invoke dbt:seed --show --threads 5
To see what commands a plugin supports, use `--list-commands`:
meltano invoke --list-commands dbt
To execute plugins inside containers, use the `--containers` flag:
meltano invoke --containers dbt:compile
remove
#
`meltano remove` removes one or more plugins of the same type from your Meltano project.
Specifically, plugins will be removed from the:
- `meltano.yml` project file
- `.meltano` directory under `.meltano/<plugin_type>/<plugin_name>`
- `plugin_settings` table in the system database
meltano remove <type> <name>
meltano remove <type> <name> <name_two>
# meltano will attempt to remove an extractor called tap-gitlab
meltano remove extractor tap-gitlab
# meltano will attempt to remove two loaders; target-postgres and target-csv
meltano remove loader target-postgres target-csv
run
#
Run a set of command blocks in series.
Command blocks are specified as a list of plugin names, e.g. `meltano run some_tap some_mapping some_target some_plugin:some_cmd`, and are run in the order they are specified from left to right. A failure in any block will cause the entire run to abort.
Multiple command blocks can be chained together or repeated, and extractor/loader pairs will automatically be linked to perform EL work.
If you have an active environment defined, a Job ID is autogenerated for each extractor/loader pair and used to store and look up the incremental replication state in the system database.
This allows subsequent runs with the same extractor and loader combinations to start where the previous run ended.
The format of the generated IDs is `<environment_name>:<tap_name>-to-<target_name>`.
Note that inline mapping names are not included when generating IDs.
Note that if no environment is active, `meltano run` does not generate a Job ID and it does not track state.
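The Job ID format for `meltano run` can be illustrated with a short Python sketch. The function name `generate_job_id` and the representation of an EL block as a list of plugin names (tap first, target last, mappings in between) are assumptions made for this example, not Meltano's internal API.

```python
def generate_job_id(environment, el_block):
    """Build the Job ID for one extractor/loader pair.

    `el_block` is a hypothetical list such as [tap, *mappings, target].
    """
    if not environment:
        # Without an active environment no ID is generated and
        # no state is tracked.
        return None
    tap, target = el_block[0], el_block[-1]
    # Inline mapping names between tap and target are ignored.
    return f"{environment}:{tap}-to-{target}"


print(generate_job_id("prod", ["tap-gitlab", "hide-secrets", "target-postgres"]))
print(generate_job_id("prod", ["tap-salesforce", "target-mysql"]))
print(generate_job_id(None, ["tap-gitlab", "target-postgres"]))
```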
In addition to explicitly specifying plugin names you can also execute one or more named jobs alongside other commands.
The `run` command is a preview feature. Its functionality and CLI signature are still evolving.
During the feature preview, and similar to `meltano invoke dbt:[cmd]`, you may need to perform additional steps to populate `DBT_*` specific environment variables before you are able to directly invoke dbt commands. For more information and available workarounds, please see issue #3098 in our issue tracker.
Some flags and options supported by `meltano elt` are not yet supported in `meltano run`. For a list of these and a discussion of available workarounds, please see issue #3094 in our issue tracker.
meltano run tap-gitlab target-postgres
meltano run tap-gitlab target-postgres dbt:clean dbt:test dbt:run
meltano run tap-gitlab target-postgres tap-salesforce target-mysql
meltano run tap-gitlab target-postgres dbt:run tap-postgres target-bigquery
meltano --environment=<ENVIRONMENT> run tap-gitlab target-postgres
meltano run tap-gitlab one-mapping another-mapping target-postgres
meltano run tap-gitlab target-postgres simple-job
When an active environment is present, `run` will attempt to run incrementally and save state by default.
However, four top-level flags are provided to alter behavior:
- `--dry-run` only parses the invocation, validates it, and explains what would be executed, without executing anything (implicitly enables `--log-level=debug` for 'console' named handlers).
- `--no-state-update` will disable state saving for this invocation.
- `--full-refresh` will force a full refresh and ignore the prior state. The new state after completion will still be updated with the execution results, unless `--no-state-update` is also specified.
- `--force` will force a job run even if a conflicting job with the same generated ID is in progress.
Examples:
# run the two pipelines in series
# the autogenerated ID for the first EL pair will be 'prod:tap-gitlab-to-target-postgres'
# the autogenerated ID for the second EL pair will be 'prod:tap-salesforce-to-target-mysql'
meltano --environment=prod run tap-gitlab hide-secrets target-postgres tap-salesforce target-mysql
# run the pipelines in series, performing a full refresh for all.
meltano --environment=stg run --full-refresh tap-gitlab target-postgres tap-salesforce target-mysql ...
# run the pipelines in series, forcing each if a conflicting job is found.
meltano --environment=dev run --force tap-gitlab target-postgres tap-salesforce target-mysql ...
run
with Environments #
If you have multiple Meltano Environments you can specify the environment name:
meltano --environment=<ENVIRONMENT> run ...
Note that if no environment is active, `meltano run` does not generate a Job ID and it does not track state.
job
#
Use the `job` command to define one or more related tasks. A job can contain a single task or many tasks. Currently, all tasks are run sequentially. You can run a specified job by passing the job name as an argument to `meltano run`.
# Add a job with a single task representing a run command
meltano job add <job_name> --tasks "<tap_name> <mapping_name> <target_name> <command>"
# Add a new job with multiple tasks by passing an array in YAML format, where each item represents a run command.
meltano job add <job_name> --tasks "[<tap_name> <target_name>, <command>, <tap2_name> <target2_name>, ...]"
# Update an existing job with new tasks
meltano job set <job_name> --tasks "<tap_name> <mapping_name> <target_name> <command>"
meltano job set <job_name> --tasks "[<tap_name> <target_name>, <command>, <tap2_name> <target2_name>, ...]"
# List all jobs
meltano job list
meltano job list --format=json
# List a named job
meltano job list <job_name>
meltano job list <job_name> --format=json
# Remove a named job
meltano job remove <job_name>
# Add a new job named "simple-demo" that contains two tasks
# Task 1: tap-gitlab hide-gitlab-secrets target-mysql dbt:run
# Task 2: tap-gitlab target-csv
meltano job add simple-demo --tasks "[tap-gitlab hide-gitlab-secrets target-mysql dbt:run, tap-gitlab target-csv]"
# list the job named "simple-demo"
meltano job list simple-demo --format=json
# run the job named "simple-demo" using meltano run
meltano run simple-demo
# run the job named "simple-demo" AND another EL pair using meltano run
meltano run simple-demo tap-mysql target-bigquery
# remove the job named "simple-demo"
meltano job remove simple-demo
schedule
#
An `orchestrator` plugin is required to use `meltano schedule`: refer to the Orchestration documentation to get started with Meltano orchestration.
Use the `schedule` command to define ELT pipelines to be run by an orchestrator at regular intervals. These scheduled pipelines will be added to your `meltano.yml` project file.
You can run a specific scheduled pipeline’s corresponding `meltano elt` command as a one-off using `meltano schedule run <schedule_name>`.
Any command line options (e.g. `--select=<entity>` or `--dump=config`) will be passed on to `meltano elt`.
The interval argument can be a cron expression or one of the following presets: `@hourly` (`0 * * * *`), `@daily` (`0 0 * * *`), `@weekly` (`0 0 * * 0`), `@monthly` (`0 0 1 * *`), `@yearly` (`0 0 1 1 *`), or `@once` (for schedules to be triggered manually through the UI).
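The presets listed above map onto cron expressions as in this Python sketch. The mapping values are taken from the list; the helper name `to_cron` and the choice to represent `@once` as `None` (no recurring trigger) are assumptions for illustration.

```python
INTERVAL_PRESETS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def to_cron(interval: str):
    """Translate a schedule interval into a cron expression."""
    if interval == "@once":
        # Triggered manually, so there is no recurring cron expression.
        return None
    # Anything that is not a preset is treated as a cron expression already.
    return INTERVAL_PRESETS.get(interval, interval)


print(to_cron("@daily"))
print(to_cron("* * * * *"))
print(to_cron("@once"))
```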
# Add a schedule
meltano schedule <schedule_name> <extractor> <loader> <interval> [--transform={run,skip,only}]
# List all schedules
meltano schedule list [--format=json]
# Run a schedule
meltano schedule run <schedule_name>
meltano schedule gitlab-to-postgres tap-gitlab target-postgres @daily --transform=run
# This specifies that the following command is to be run once a day:
# meltano elt tap-gitlab target-postgres --transform=run --job_id=gitlab-to-postgres
meltano schedule gitlab-to-jsonl tap-gitlab target-jsonl "* * * * *"
# This specifies that the following command is to be run every minute:
# meltano elt tap-gitlab target-jsonl --job_id=gitlab-to-jsonl
meltano schedule run gitlab-to-jsonl --select=commits
# This will run:
# meltano elt tap-gitlab target-jsonl --job_id=gitlab-to-jsonl --select=commits
select
#
Use the `select` command to add select patterns to a specific extractor in your Meltano project.
- `meltano select [--list] [--all] <tap_name> [ENTITIES_PATTERN] [ATTRIBUTE_PATTERN]`: Manage the selected entities/attributes for a specific tap.
Selection rules will be stored in the extractor’s `select` extra.
Not all taps support this feature. In addition, taps need to support the `--discover` switch. You can use `meltano invoke tap-... --discover` to see if the tap supports it.
Unix shell-style wildcards can be used in selection patterns to match multiple entities or attributes at once:
- `*`: matches any sequence of characters
- `?`: matches one character
- `[abc]`: matches either `a`, `b`, or `c`
- `[!abc]`: matches any character but `a`, `b`, or `c`
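These wildcard rules are the same ones implemented by Python's standard `fnmatch` module, so they can be tried out directly. The snippet below only demonstrates the pattern syntax; whether Meltano uses `fnmatch` internally is not stated here, and the sample names are made up for the demonstration.

```python
from fnmatch import fnmatchcase

# (name, pattern) pairs exercising each wildcard from the list above.
checks = [
    ("commits", "com*"),    # `*` matches any sequence of characters
    ("tag", "ta?"),         # `?` matches exactly one character
    ("tags", "ta?"),        # ...so a four-letter name does not match
    ("a1", "[abc]1"),       # `[abc]` matches a, b, or c
    ("d1", "[!abc]1"),      # `[!abc]` matches anything but a, b, or c
    ("a1", "[!abc]1"),
]

for name, pattern in checks:
    print(f"{name!r} vs {pattern!r}: {fnmatchcase(name, pattern)}")
```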
Use `--list` to list the currently selected tap attributes.
Note: `--all` can be used to show all the tap attributes with their selected status.
Use `--rm` or `--remove` to remove previously added select patterns.
select
with Environments #
If you have multiple Meltano Environments you can specify the environment name:
meltano --environment=<ENVIRONMENT> select <tap_name>
# List all available entities and attributes
meltano select tap-gitlab --list --all
# Include all attributes of an entity
meltano select tap-gitlab tags "*"
# Include specific attributes of an entity
meltano select tap-gitlab commits id
meltano select tap-gitlab commits project_id
meltano select tap-gitlab commits created_at
meltano select tap-gitlab commits author_name
meltano select tap-gitlab commits message
# Exclude matching attributes of all entities
meltano select tap-gitlab --exclude "*" "*_url"
# List selected (enabled) entities and attributes
meltano select tap-gitlab --list
Example output:
Enabled patterns:
tags.*
commits.id
commits.project_id
commits.created_at
commits.author_name
commits.message
!*.*_url
Selected attributes:
[selected ] commits.author_name
[selected ] commits.created_at
[automatic] commits.id
[selected ] commits.message
[selected ] commits.project_id
[automatic] tags.commit_id
[selected ] tags.message
[automatic] tags.name
[automatic] tags.project_id
[selected ] tags.target
Remove patterns (`--rm` or `--remove`):
# Remove previously added select patterns
meltano select tap-gitlab --rm tags "*"
meltano select tap-gitlab --rm --exclude "*" "*_url"
meltano select tap-gitlab --rm commits id
Most shells parse glob syntax: you must escape the special characters in the select pattern by quoting the pattern.
Use `--exclude` to exclude all attributes that match the filter.
Attributes that are `automatic` are always included, even if they match an exclude pattern. Only attributes that are `available` can be excluded.
Exclusion takes precedence over inclusion. If an attribute is excluded, there is no way to include it back without removing the exclusion pattern first.
meltano select --exclude tap-carbon-intensity '*' 'longitude'
meltano select --exclude tap-carbon-intensity '*' 'latitude'
This will exclude all `longitude` and `latitude` attributes.
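The interaction between inclusion, exclusion, and automatic attributes can be sketched in Python. This is a simplified model, not Meltano's implementation: `is_selected` is a hypothetical function, patterns are written in the `entity.attribute` form shown in the example output with a `!` prefix for exclusions, and `tags.web_url` is a made-up attribute used to show an exclusion in action.

```python
from fnmatch import fnmatchcase


def is_selected(attribute, patterns, automatic=frozenset()):
    """Decide whether `attribute` is extracted under the given patterns."""
    if attribute in automatic:
        # Automatic attributes are always included.
        return True
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    includes = [p for p in patterns if not p.startswith("!")]
    # Exclusion takes precedence over inclusion.
    if any(fnmatchcase(attribute, p) for p in excludes):
        return False
    return any(fnmatchcase(attribute, p) for p in includes)


patterns = ["tags.*", "commits.id", "!*.*_url"]
print(is_selected("tags.message", patterns))
print(is_selected("tags.web_url", patterns))
print(is_selected("commits.message", patterns))
print(is_selected("tags.name", ["!*.*"], automatic={"tags.name"}))
```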
state
#
Manage Singer State for jobs via the CLI.
For more information about how Meltano uses incremental replication state, see the data integration guide.
Clear the state for a given `job_id`.
Prompts for confirmation.
meltano state clear [--force] <job_id>
- `--force` option will disable confirmation prompts. Use with caution.
# Clear state. Meltano will prompt for confirmation.
meltano state clear dev:tap-gitlab-to-target-jsonl
# Clear state, overriding confirmation prompt.
meltano state clear --force dev:tap-gitlab-to-target-jsonl
Retrieve state for a given job_id.
meltano state get <job_id>
# Print the state that would be used in the next run of dev:tap-gitlab-to-target-jsonl
meltano state get dev:tap-gitlab-to-target-jsonl
List all job_ids found in the system database.
meltano state list [--pattern <PATTERN>]
The --pattern option allows filtering returned job IDs by using * as a wildcard.
"*" is subject to auto-expansion in most shells: you must escape the "*" by quoting the pattern.
# List all job IDs
meltano state list
# List only those job IDs that start with "dev:"
meltano state list 'dev:*'
# List only those job IDs that contain "tap-gitlab"
meltano state list --pattern '*tap-gitlab*'
Merge new state onto existing state for a job ID.
Not seeing merged state in the system database?
Merged state is computed at execution time. The merge command only adds a new payload to the database; that payload is merged with the existing payloads the next time state is read via meltano elt, meltano run, or meltano state get.
# Read state from a file
meltano state merge <job_id> --input-file <file>
# Read state from a command-line argument
meltano state merge <job_id> <RAW STATE JSON>
# Merge state from one job into another
meltano state merge <job_id> --from-job-id <src_job_id>
The --input-file option specifies a file to read the state from.
The --from-job-id option specifies an existing job ID to read the state from.
State must be provided in exactly one of these ways: via --input-file, via --from-job-id, or via a command line argument.
# Provide state via a command-line argument.
# The argument must be valid JSON with a top-level key of "singer_state"
# Only the "project_123456_issues" key will be overwritten. Any other bookmarks will remain untouched.
meltano state merge dev:tap-gitlab-to-target-jsonl '{"singer_state": {"project_123456_issues": "2020-01-01"}}'
# Provide state via a file.
# The file must contain valid JSON with a top-level key of "singer_state"
# These two lines have the same effect as the one line above.
echo '{"singer_state": {"project_123456_issues": "2020-01-01"}}' > gitlab_state.json
meltano state merge dev:tap-gitlab-to-target-jsonl --input-file gitlab_state.json
# Provide state via an existing job.
meltano state merge dev:tap-gitlab-to-target-jsonl --from-job-id prod:tap-gitlab-to-target-jsonl
Set state for a job.
# Read state from a file
# Meltano will prompt for confirmation.
meltano state set <job_id> --input-file <file>
# Read state from a file, overriding confirmation prompt.
meltano state set --force <job_id> --input-file <file>
# Read state from a command-line argument
# Meltano will prompt for confirmation.
meltano state set <job_id> <RAW STATE JSON>
# Read state from a command-line argument, overriding confirmation prompt.
meltano state set --force <job_id> <RAW STATE JSON>
The --input-file option specifies a file to read the state from.
The --force option will disable confirmation prompts. Use with caution.
# Provide state via a command-line argument, overriding confirmation prompt.
# The argument must be valid JSON with a top-level key of "singer_state"
# ALL state will be overwritten. Only the "project_123456_issues" bookmark will be used in subsequent runs.
meltano state set --force dev:tap-gitlab-to-target-jsonl '{"singer_state": {"project_123456_issues": "2020-01-01"}}'
# Provide state via a file, overriding confirmation prompt.
# The file must contain valid JSON with a top-level key of "singer_state"
# These two lines have the same effect as the one line above.
echo '{"singer_state": {"project_123456_issues": "2020-01-01"}}' > gitlab_state.json
meltano state set --force dev:tap-gitlab-to-target-jsonl --input-file gitlab_state.json
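The difference between merging and setting state can be sketched with a toy model. The example below is an illustration under stated assumptions, not Meltano's storage format or implementation: bookmarks are modelled as plain key=value lines, merge_state and set_state are hypothetical helpers, and project_123456_commits is a made-up second bookmark.

```shell
# Toy model of merge vs. set semantics. Bookmarks are "key=value" lines;
# this is NOT Meltano's storage format or implementation.

# merge_state <existing> <new>: keys in <new> overwrite, all others are kept.
merge_state() {
  printf '%s\n%s\n' "$1" "$2" | awk -F= '
    NF {
      if (!($1 in value)) order[++count] = $1   # remember first-seen order
      value[$1] = $2                            # later lines win
    }
    END { for (i = 1; i <= count; i++) print order[i] "=" value[order[i]] }
  '
}

# set_state <existing> <new>: <existing> is discarded entirely.
set_state() {
  printf '%s\n' "$2"
}

existing='project_123456_issues=2019-06-01
project_123456_commits=2019-07-15'
new='project_123456_issues=2020-01-01'

merged="$(merge_state "$existing" "$new")"
replaced="$(set_state "$existing" "$new")"

printf 'merge:\n%s\nset:\n%s\n' "$merged" "$replaced"
```

In this model the merged result keeps the commits bookmark and updates only the issues bookmark, while the set result contains the issues bookmark alone, matching the comments in the merge and set examples above.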
test
#
Run tests for one or more plugins. A test is any command with a name starting with test.
# Run all tests for all plugins
meltano test --all
# Run all available tests for one or more selected plugins
meltano test <plugin1> <plugin2>
# Run a named test for a single plugin
meltano test <plugin>:<test-name>
# Run a named test for one or more plugins
meltano test <plugin1>:<test-name1> <plugin2>:<test-name2>
ui
#
meltano ui: Start the Meltano UI.
start (default)
#
Start the Meltano UI.
setup
#
This command is only relevant for production-grade setup.
Generate secrets for the ui.secret_key and ui.password_salt settings, which will be stored in your project’s .env file along with the specified value for the ui.server_name setting.
In production, you will likely want to move these settings to actual environment variables, since .env is in .gitignore by default.
Regenerating secrets will cause the following:
Use with caution!
The --bits flag can be used to specify the size of the secrets; it defaults to 256.
# Format
meltano ui setup [--bits=256] <server_name>
meltano ui setup meltano.example.com
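One way to turn the contents of a .env file into real environment variables is to source it with auto-export enabled. The following is a minimal sketch, assuming the file contains only simple KEY=value lines that are valid shell assignments. The load_env_file helper is hypothetical, and the variable names are assumed to follow Meltano's usual MELTANO_ prefix convention for settings; the values are placeholders written to a temporary file for demonstration.

```shell
# Export every KEY=value assignment from a file into the current shell.
load_env_file() {
  set -a      # export every variable assigned while this option is on
  . "$1"      # path must contain a slash so "." does not search PATH
  set +a
}

# Demonstration with a temporary file standing in for the project's .env.
env_file="$(mktemp)"
cat > "$env_file" <<'EOF'
MELTANO_UI_SERVER_NAME=meltano.example.com
MELTANO_UI_SECRET_KEY=placeholder-not-a-real-secret
EOF

load_env_file "$env_file"
rm -f "$env_file"

# Child processes now inherit the values.
sh -c 'echo "server name seen by child process: $MELTANO_UI_SERVER_NAME"'
```

In an actual deployment the values would more likely come from the platform's secret management than from a file on disk; the sketch only shows the mechanics of exporting them.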
user
#
This command is only relevant when Meltano is run with authentication enabled.
add
#
Create a Meltano user account, active and ready to be used.
--overwrite: Update the user instead of creating a new one.
--role: Add the user to the role. Meltano ships with two built-in roles: admin and regular.
meltano user add admin securepassword --role admin
upgrade
#
Upgrade Meltano and your Meltano project to the latest version.
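When called without arguments, this will:
Upgrade the meltano package
Update files managed by file bundles
Apply migrations to system database
Recompile models
meltano upgrade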
meltano upgrade --skip-package # Skip upgrading the Meltano package
meltano upgrade package # Only upgrade Meltano package
meltano upgrade files # Only update files managed by file bundles
meltano upgrade database # Only apply migrations to system database
meltano upgrade models # Only recompile models
version
#
Use it to check which version of Meltano is currently installed.
meltano --version