HPL/SQL is included in Apache Hive starting with version 2.0



The COPY FROM FTP statement allows you to copy files from an FTP server to the local file system or to any Hadoop-compatible file system. Using this statement you can easily copy FTP subdirectories into HDFS, for example.

The NEW option helps you build an ETL process that downloads only new files from the FTP server.


COPY FROM FTP host [USER user [PWD password]] [DIR directory] [FILES files_wildcard] 
  [TO [LOCAL] target_directory] [OVERWRITE | NEW] [SUBDIR] [SESSIONS num]


  • host, user and pwd specify the FTP host name, user name and password (an identifier, string literal, variable or expression can be specified).
  • DIR specifies the directory to get files from; optional. If omitted, the current FTP working directory is used.
  • FILES specifies a wildcard (a Java regular expression) that selects which files to transfer. By default, all files from the specified directory are transferred.
  • LOCAL means that files are copied to the local file system. By default, files are copied to an HDFS-compatible file system.
  • OVERWRITE means that existing files will be overwritten; this is the default.
  • NEW means that only new files will be transferred, and existing files will be skipped.
  • SUBDIR specifies to also transfer files in subdirectories; the directory structure is recreated in the target. By default, the command transfers files only from the directory specified by the DIR option.
  • SESSIONS specifies the number of concurrent FTP sessions used to transfer the files. Each session transfers a whole file. By default, files are copied in a single session.
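
Since FILES takes a Java regular expression rather than a shell glob, the pattern must be written in regex syntax. As a sketch (the host, credentials and paths below are hypothetical), the following statement copies only .csv files from a single directory, overwriting existing copies since OVERWRITE is the default:

copy from ftp 'ftp.example.com' user 'etl' pwd '***' dir data/export 
  files '.*\.csv' to /data/import sessions 2

Note that the dot before the extension is escaped: '.*\.csv' matches any file name ending in .csv, while '*.csv' would not be a valid regular expression.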


Copy new files, including files in subdirectories, from an FTP server to an HDFS location using 3 concurrent connections:

copy from ftp '' user 'paul' pwd '***' dir data/sales/in subdir 
  files '.*' to /data/sales/raw sessions 3 new
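
Because host, user and pwd accept variables and expressions, a transfer like the one above can be parameterized in an HPL/SQL script. This is a sketch; the variable names and values are illustrative:

DECLARE ftp_host STRING DEFAULT 'ftp.example.com';
DECLARE ftp_user STRING DEFAULT 'etl';
copy from ftp ftp_host user ftp_user pwd '***' dir data/sales/in subdir 
  files '.*' to /data/sales/raw sessions 3 new

This keeps connection details in one place, so the same script can run against different FTP servers by changing only the variable declarations.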

Compatibility: HPL/SQL Extension

Version: HPL/SQL 0.3.17