**HPL/SQL has been included in Apache Hive since version 2.0**

====== COPY FROM FTP Statement ======

The COPY FROM FTP statement copies files from an FTP server to the local file system or to any Hadoop-compatible file system.

For example, you can use this statement to copy FTP subdirectories into HDFS. The NEW option helps you build an ETL process that downloads only new files from FTP.

**Syntax**:

<code language=sql>
COPY FROM FTP host [USER user [PWD password]] [DIR directory] [FILES files_wildcard]
  [TO [LOCAL] target_directory] [options]

options:
  OVERWRITE | NEW
  SUBDIR
  SESSIONS num
</code>

Notes:

  * //host//, //user// and //password// specify the FTP host name, user name and password. Each can be an identifier, string literal, variable or expression.
  * The DIR option specifies the FTP directory to get files from. It is optional; if omitted, the current working FTP directory is used.
  * The FILES option specifies a wildcard (a Java regular expression) that selects the files to transfer. By default, all files in the specified directory are transferred.
  * The LOCAL keyword means that files are copied to the local file system. By default, files are copied to an HDFS-compatible file system.
  * OVERWRITE means that existing files are overwritten. This is the default.
  * NEW means that only new files are transferred, and existing files are skipped.
  * The SUBDIR option also transfers files in subdirectories, and the directory structure is recreated in the target. By default, the command transfers files only from the directory specified by the DIR option.
  * SESSIONS specifies the number of concurrent FTP sessions used to transfer the files. Each session transfers a whole file; a single file is not split across sessions. By default, files are copied in a single session.

**Example:**

Copy new files, including files in subdirectories, from an FTP server to an HDFS location using 3 concurrent connections:

<code language="sql">
copy from ftp 'ftp.myserver.com' user 'paul' pwd '***'
  dir data/sales/in subdir
  files '.*'
  to /data/sales/raw
  sessions 3 new
</code>

**Compatibility**: HPL/SQL Extension

**Version**: HPL/SQL 0.3.17
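
**Additional example (illustrative sketch):**

The LOCAL keyword and the FILES regular expression described in the notes above can be combined as in the following sketch. The host name, credentials, directory names and file pattern are hypothetical and chosen only for illustration. The pattern uses ''[.]'' rather than a backslash to match a literal dot, which avoids depending on how backslashes are handled in string literals. The sketch assumes the pattern is applied to the file name.

<code language="sql">
-- Hypothetical host, user and paths; only options documented above are used
copy from ftp 'ftp.example.com' user 'etl' pwd '***'
  dir reports/daily
  files '.*[.]csv'
  to local /tmp/reports
  overwrite
</code>

Under these assumptions, the statement copies the CSV files from ''reports/daily'' to the local directory ''/tmp/reports'', replacing any files that already exist there. Because SUBDIR is not specified, files in subdirectories are not transferred.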