API

Hooks

Module containing various file system hooks.

class airflow_fs.hooks.FsHook[source]

Base FsHook defining the FsHook interface and providing some basic functionality built on this interface.

copy(src_path, dest_path, src_hook=None)[source]

Copies file(s) into the hook's file system.

By default, source files are assumed to be on the same file system as the destination (the hook's file system). To copy between different file systems, or between file systems in different locations, pass a hook for the source file system using the src_hook argument.
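The role of src_hook described above can be sketched with a small stand-in class. DirHook below is hypothetical and is not part of airflow_fs; it only mimics the open, copy_fileobj and copy methods on top of local directories, and it handles a single file rather than the file(s) the real method accepts.

```python
import os
import shutil
import tempfile


class DirHook:
    """Hypothetical stand-in for a file system hook, rooted at a local directory."""

    def __init__(self, root):
        self.root = root

    def open(self, file_path, mode="rb"):
        return open(os.path.join(self.root, file_path), mode)

    def copy_fileobj(self, file_obj, dest_path):
        # Stream the file object into this hook's file system.
        with self.open(dest_path, "wb") as dest_file:
            shutil.copyfileobj(file_obj, dest_file)

    def copy(self, src_path, dest_path, src_hook=None):
        # Without src_hook, the source is read from this hook's own file system.
        src_hook = src_hook or self
        with src_hook.open(src_path, "rb") as src_file:
            self.copy_fileobj(src_file, dest_path)


# Two temporary directories play the role of two separate file systems.
src_root = tempfile.mkdtemp()
dest_root = tempfile.mkdtemp()
src, dest = DirHook(src_root), DirHook(dest_root)

with src.open("data.csv", "wb") as f:
    f.write(b"a,b\n1,2\n")

dest.copy("data.csv", "copied.csv", src_hook=src)  # source on another file system
dest.copy("copied.csv", "again.csv")               # source and destination on the same one
```

In this sketch the destination hook always does the writing, and src_hook only decides where the bytes are read from. Whether the real hooks stream data the same way is an assumption.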

copy_fileobj(file_obj, dest_path)[source]

Copies a file object into the hook's file system.

disconnect()[source]

Closes fs connection (if applicable).

exists(file_path)[source]

Checks whether the given file path exists.

Parameters: file_path (str) – File path.
Returns: True if the file exists, else False.
Return type: bool

glob(pattern, recursive=False)[source]

Return a list of paths matching a pathname pattern.
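The signature matches Python's built-in glob.glob, so the example below uses the standard library to illustrate what the recursive flag usually means. That the hook's glob follows the same pattern rules (in particular the handling of **) is an assumption, not something this reference states.

```python
import glob
import os
import tempfile

# Build a small tree: top.csv, a/mid.csv and a/b/deep.csv.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
for rel in ("top.csv", os.path.join("a", "mid.csv"), os.path.join("a", "b", "deep.csv")):
    with open(os.path.join(root, rel), "w") as f:
        f.write("x")

# A plain pattern only matches entries directly inside the given directory.
flat = glob.glob(os.path.join(root, "*.csv"))

# With recursive=True, "**" matches zero or more directory levels.
deep = glob.glob(os.path.join(root, "**", "*.csv"), recursive=True)

flat_names = {os.path.relpath(p, root) for p in flat}
deep_names = {os.path.relpath(p, root) for p in deep}
```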

isdir(path)[source]

Returns true if the given path points to a directory.

Parameters: path (str) – File or directory path.

listdir(dir_path)[source]

Lists names of entries in the given path.

makedirs(dir_path, mode=493, exist_ok=True)[source]

Creates directory, creating intermediate directories if needed.

Parameters:
  • dir_path (str) – Path to the directory to create.
  • mode (int) – Mode to use for directory (if created). The default, 493, is the decimal form of the octal mode 0o755.
  • exist_ok (bool) – Whether the directory is already allowed to exist. If false, an IOError is raised if the directory exists.

mkdir(dir_path, mode=493, exist_ok=True)[source]

Creates the directory, without creating intermediate directories.
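The difference between makedirs and mkdir, and the meaning of the mode and exist_ok arguments, can be illustrated with their standard-library counterparts. This is a sketch using os.mkdir and os.makedirs, not the hooks themselves; note that os.mkdir has no exist_ok argument, so that part of the hook interface is not shown.

```python
import os
import tempfile

# The default mode shown in the signatures, 493, is 0o755 written in decimal.
default_mode = 0o755

root = tempfile.mkdtemp()
nested = os.path.join(root, "a", "b", "c")

# mkdir-style creation fails when intermediate directories are missing.
try:
    os.mkdir(nested, default_mode)
    mkdir_failed = False
except OSError:
    mkdir_failed = True

# makedirs-style creation builds the intermediate directories as well.
os.makedirs(nested, mode=default_mode, exist_ok=True)

# With exist_ok=True, a second call on an existing directory is a no-op.
os.makedirs(nested, mode=default_mode, exist_ok=True)

# With exist_ok=False, an existing directory raises an error.
# In Python 3, IOError is an alias of OSError, so this catches FileExistsError.
try:
    os.makedirs(nested, mode=default_mode, exist_ok=False)
    raised = False
except IOError:
    raised = True
```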

open(file_path, mode='rb')[source]

Returns file_obj for given file path.

Parameters:
  • file_path (str) – Path to the file to open.
  • mode (str) – Mode to open the file in.
Returns:

An opened file object.

rm(file_path)[source]

Deletes the given file path.

Parameters: file_path (str) – Path to the file to delete.

rmtree(dir_path)[source]

Deletes given directory tree recursively.

Parameters: dir_path (str) – Path to directory to delete.

walk(root)[source]

Directory tree generator, similar to os.walk.
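Since walk is described as similar to os.walk, the usual iteration pattern is shown below with os.walk itself. That the hook's generator yields the same (directory path, directory names, file names) tuples is an assumption based on that description.

```python
import os
import tempfile

# Build a small tree: readme.txt and logs/2020/jan.log.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "logs", "2020"))
for rel in ("readme.txt", os.path.join("logs", "2020", "jan.log")):
    with open(os.path.join(root, rel), "w") as f:
        f.write("x")

# Each step yields a directory, its subdirectory names and its file names.
found = []
for dir_path, dir_names, file_names in os.walk(root):
    for name in file_names:
        full_path = os.path.join(dir_path, name)
        found.append(os.path.relpath(full_path, root))
```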

class airflow_fs.hooks.FtpHook(conn_id)[source]

Hook for interacting with files over FTP.

disconnect()[source]

Closes fs connection (if applicable).

exists(file_path)[source]

Checks whether the given file path exists.

Parameters: file_path (str) – File path.
Returns: True if the file exists, else False.
Return type: bool

isdir(path)[source]

Returns true if the given path points to a directory.

Parameters: path (str) – File or directory path.

listdir(dir_path)[source]

Lists names of entries in the given path.

makedirs(dir_path, mode=493, exist_ok=True)[source]

Creates directory, creating intermediate directories if needed.

Parameters:
  • dir_path (str) – Path to the directory to create.
  • mode (int) – Mode to use for directory (if created).
  • exist_ok (bool) – Whether the directory is already allowed to exist. If false, an IOError is raised if the directory exists.

mkdir(dir_path, mode=493, exist_ok=True)[source]

Creates the directory, without creating intermediate directories.

open(file_path, mode='rb')[source]

Returns file_obj for given file path.

Parameters:
  • file_path (str) – Path to the file to open.
  • mode (str) – Mode to open the file in.
Returns:

An opened file object.

rm(file_path)[source]

Deletes the given file path.

Parameters: file_path (str) – Path to the file to delete.

rmtree(dir_path)[source]

Deletes given directory tree recursively.

Parameters: dir_path (str) – Path to directory to delete.

walk(root)[source]

Directory tree generator, similar to os.walk.

class airflow_fs.hooks.HdfsHook(conn_id=None)[source]

Hook for interacting with files over HDFS.

disconnect()[source]

Closes fs connection (if applicable).

exists(file_path)[source]

Checks whether the given file path exists.

Parameters: file_path (str) – File path.
Returns: True if the file exists, else False.
Return type: bool

isdir(path)[source]

Returns true if the given path points to a directory.

Parameters: path (str) – File or directory path.

listdir(dir_path)[source]

Lists names of entries in the given path.

makedirs(dir_path, mode=493, exist_ok=True)[source]

Creates directory, creating intermediate directories if needed.

Parameters:
  • dir_path (str) – Path to the directory to create.
  • mode (int) – Mode to use for directory (if created).
  • exist_ok (bool) – Whether the directory is already allowed to exist. If false, an IOError is raised if the directory exists.

mkdir(dir_path, mode=0.0, exist_ok=True)[source]

Creates the directory, without creating intermediate directories.

open(file_path, mode='rb')[source]

Returns file_obj for given file path.

Parameters:
  • file_path (str) – Path to the file to open.
  • mode (str) – Mode to open the file in.
Returns:

An opened file object.

rm(file_path)[source]

Deletes the given file path.

Parameters: file_path (str) – Path to the file to delete.

rmtree(dir_path)[source]

Deletes given directory tree recursively.

Parameters: dir_path (str) – Path to directory to delete.

walk(root)[source]

Directory tree generator, similar to os.walk.

class airflow_fs.hooks.S3Hook(conn_id=None)[source]

Hook for interacting with files in S3.

disconnect()[source]

Closes fs connection (if applicable).

exists(file_path)[source]

Checks whether the given file path exists.

Parameters: file_path (str) – File path.
Returns: True if the file exists, else False.
Return type: bool

isdir(path)[source]

Returns true if the given path points to a directory.

Parameters: path (str) – File or directory path.

listdir(dir_path)[source]

Lists names of entries in the given path.

makedirs(dir_path, mode=493, exist_ok=True)[source]

Creates directory, creating intermediate directories if needed.

Parameters:
  • dir_path (str) – Path to the directory to create.
  • mode (int) – Mode to use for directory (if created).
  • exist_ok (bool) – Whether the directory is already allowed to exist. If false, an IOError is raised if the directory exists.

mkdir(dir_path, mode=493, exist_ok=True)[source]

Creates the directory, without creating intermediate directories.

open(file_path, mode='rb')[source]

Returns file_obj for given file path.

Parameters:
  • file_path (str) – Path to the file to open.
  • mode (str) – Mode to open the file in.
Returns:

An opened file object.

rm(file_path)[source]

Deletes the given file path.

Parameters: file_path (str) – Path to the file to delete.

rmtree(dir_path)[source]

Deletes given directory tree recursively.

Parameters: dir_path (str) – Path to directory to delete.

walk(root)[source]

Directory tree generator, similar to os.walk.

class airflow_fs.hooks.SftpHook(conn_id)[source]

Hook for interacting with files over SFTP.

disconnect()[source]

Closes fs connection (if applicable).

exists(file_path)[source]

Checks whether the given file path exists.

Parameters: file_path (str) – File path.
Returns: True if the file exists, else False.
Return type: bool

isdir(path)[source]

Returns true if the given path points to a directory.

Parameters: path (str) – File or directory path.

listdir(dir_path)[source]

Lists names of entries in the given path.

makedirs(dir_path, mode=493, exist_ok=True)[source]

Creates directory, creating intermediate directories if needed.

Parameters:
  • dir_path (str) – Path to the directory to create.
  • mode (int) – Mode to use for directory (if created).
  • exist_ok (bool) – Whether the directory is already allowed to exist. If false, an IOError is raised if the directory exists.

mkdir(dir_path, mode=493, exist_ok=True)[source]

Creates the directory, without creating intermediate directories.

open(file_path, mode='rb')[source]

Returns file_obj for given file path.

Parameters:
  • file_path (str) – Path to the file to open.
  • mode (str) – Mode to open the file in.
Returns:

An opened file object.

rm(file_path)[source]

Deletes the given file path.

Parameters: file_path (str) – Path to the file to delete.

rmtree(dir_path)[source]

Deletes given directory tree recursively.

Parameters: dir_path (str) – Path to directory to delete.

class airflow_fs.hooks.LocalHook[source]

Hook for interacting with files on the local file system.

exists(file_path)[source]

Checks whether the given file path exists.

Parameters: file_path (str) – File path.
Returns: True if the file exists, else False.
Return type: bool

isdir(path)[source]

Returns true if the given path points to a directory.

Parameters: path (str) – File or directory path.

listdir(dir_path)[source]

Lists names of entries in the given path.

makedirs(dir_path, mode=493, exist_ok=True)[source]

Creates directory, creating intermediate directories if needed.

Parameters:
  • dir_path (str) – Path to the directory to create.
  • mode (int) – Mode to use for directory (if created).
  • exist_ok (bool) – Whether the directory is already allowed to exist. If false, an IOError is raised if the directory exists.

mkdir(dir_path, mode=493, exist_ok=True)[source]

Creates the directory, without creating intermediate directories.

open(file_path, mode='rb')[source]

Returns file_obj for given file path.

Parameters:
  • file_path (str) – Path to the file to open.
  • mode (str) – Mode to open the file in.
Returns:

An opened file object.

rm(file_path)[source]

Deletes the given file path.

Parameters: file_path (str) – Path to the file to delete.

rmtree(dir_path)[source]

Deletes given directory tree recursively.

Parameters: dir_path (str) – Path to directory to delete.

walk(root)[source]

Directory tree generator, similar to os.walk.

Operators

File system operators, built on the file system hook interface.

class airflow_fs.operators.CopyFileOperator(src_path, dest_path, src_hook=None, dest_hook=None, **kwargs)[source]

Operator for copying files between file systems.

Parameters:
  • src_path (str) – File path to copy files from. Can be any valid file path or glob pattern. Note that if a glob pattern is given, dest_path is taken to be a destination directory, rather than a destination file path.
  • dest_path (str) – File path to copy files to.
  • src_hook (FsHook) – File system hook to copy files from.
  • dest_hook (FsHook) – File system hook to copy files to.
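The rule for glob patterns in src_path (dest_path becomes a destination directory) can be sketched as a small planning function. plan_copies is a hypothetical helper written for this illustration, not the operator's code; it uses the standard glob module on the local file system and only computes source and destination pairs without copying anything.

```python
import glob
import os
import tempfile


def plan_copies(src_path, dest_path):
    """Hypothetical helper: pair each source file with its destination path."""
    # glob.escape() only changes strings that contain glob special characters,
    # so a changed string means src_path is a pattern.
    is_pattern = glob.escape(src_path) != src_path
    if not is_pattern:
        # Plain path: dest_path is the destination file itself.
        return [(src_path, dest_path)]
    # Pattern: dest_path is a directory that receives every matching file.
    return [
        (match, os.path.join(dest_path, os.path.basename(match)))
        for match in sorted(glob.glob(src_path))
    ]


root = tempfile.mkdtemp()
for name in ("a.csv", "b.csv", "notes.txt"):
    with open(os.path.join(root, name), "w") as f:
        f.write("x")

single = plan_copies("in/a.csv", "out/a.csv")
many = plan_copies(os.path.join(root, "*.csv"), "archive")
destinations = [dest for _, dest in many]
```

Keeping the base name of each match is one plausible way to lay files out in the destination directory; the reference does not say how the operator names them.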
class airflow_fs.operators.DeleteFileOperator(path, hook=None, **kwargs)[source]

Deletes files at a given path.

Parameters:
  • path (str) – File path to file(s) to delete. Can be any valid file path or glob pattern.
  • hook (FsHook) – File system hook to use when deleting files.

class airflow_fs.operators.DeleteTreeOperator(path, hook=None, **kwargs)[source]

Deletes a directory tree at a given path.

Parameters:
  • path (str) – File path to directory to delete. Can be any valid file path or glob pattern.
  • hook (FsHook) – File system hook to use when deleting directories.
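The two delete operators differ in what they remove: files matching the path, or whole directory trees matching it. The sketch below shows that distinction on the local file system with os.remove and shutil.rmtree. It assumes the operators expand the glob pattern and then delete each match, which this reference does not spell out.

```python
import glob
import os
import shutil
import tempfile

# Build a tree with one stray file and two run directories.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "run_1", "parts"))
os.makedirs(os.path.join(root, "run_2"))
for rel in ("old.tmp", os.path.join("run_1", "parts", "p0.bin")):
    with open(os.path.join(root, rel), "w") as f:
        f.write("x")

# File deletion: each match is removed as a single file.
for file_path in glob.glob(os.path.join(root, "*.tmp")):
    os.remove(file_path)

# Tree deletion: each match is removed together with everything below it.
for dir_path in glob.glob(os.path.join(root, "run_*")):
    shutil.rmtree(dir_path)

remaining = os.listdir(root)
```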

Sensors

Module containing file system sensors.

class airflow_fs.sensors.FileSensor(path, hook=None, **kwargs)[source]

Sensor that waits for files matching a given file pattern.

Parameters:
  • path (str) – File path to match files to. Can be any valid glob pattern.
  • hook (FsHook) – File system hook to use when looking for files.

poke(context)[source]

Function that the sensors defined while deriving this class should override.
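For a sensor that waits for files matching a pattern, a poke check presumably reduces to "does the pattern match anything yet". The function below is a hypothetical sketch of that idea using the standard glob module; the real method receives an Airflow context and would go through the configured hook, neither of which is modelled here.

```python
import glob
import os
import tempfile


def poke(path, glob_fn=glob.glob):
    """Hypothetical check: True once at least one file matches the pattern."""
    return bool(glob_fn(path))


root = tempfile.mkdtemp()
pattern = os.path.join(root, "*.done")

before = poke(pattern)  # nothing matches yet

# Simulate an upstream process dropping its marker file.
with open(os.path.join(root, "batch.done"), "w") as f:
    f.write("")

after = poke(pattern)  # the marker file now matches
```

Passing the glob function as an argument mirrors how a hook could be swapped in for a different file system; the parameter name glob_fn is an assumption made for this sketch.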