HPL/SQL - Procedural SQL on Hadoop, NoSQL and RDBMS



====== HPL/SQL - Get Started ======

A quick guide to getting started with HPL/SQL.

===== Installation =====

You can install HPL/SQL by [[download|downloading]] a .tar.gz or .zip file, or by building it from the [[download|source code]].

==== Requirements ====

  * Java 1.6 or higher
  * Hadoop 1.x or 2.x

==== Installing HPL/SQL from Binaries ====

**1. Download**

[[download|Download]] an HPL/SQL release and uncompress it to your preferred location, for example the ~/hplsql/ directory.

The HPL/SQL program directory includes the following files:

  * hplsql - Shell script to launch HPL/SQL on Linux
  * hplsql.cmd - Command script to launch HPL/SQL on Windows
  * hplsql-x.x.x.jar - HPL/SQL executable (x.x.x is the version)
  * hplsql-site.xml - HPL/SQL [[configuration|configuration]] file
  * antlr-runtime-4.5.jar - ANTLR parser runtime

On Linux/UNIX, make sure //hplsql// is an executable file (this is needed if you uncompressed the tool from the .zip file):

<code>
chmod +x <hplsql_dir>/hplsql
</code>

**2. Configure CLASSPATH** (Optional)

For Cloudera distributions, edit the //hplsql// file: remove all lines containing

<code>
export "HADOOP_CLASSPATH=..."
</code>

and add the following line:

<code>
export "HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/jars/*"
</code>

For Hortonworks distributions, check whether the Hadoop jars are located in the /usr/hdp/x.x.x.x-x/ directory and change all paths in the //hplsql// file accordingly.

For other distributions, check whether the Hadoop jars are located in /usr/lib/ and make the necessary changes in the //hplsql// file.

**3. Test installation**

Run the following command to test the HPL/SQL installation:

<code>
<hplsql_dir>/hplsql --version
HPL/SQL x.x.x
</code>

If the version number is printed, the tool is installed correctly.

**4. Add to PATH variable** (Optional)

You may add the HPL/SQL directory to the PATH variable:

<code>
export PATH=$PATH:<hplsql_dir>
</code>

Then you can invoke HPL/SQL by running:

<code>
hplsql <options>
</code>

===== Configuration =====

HPL/SQL reads its settings from the [[configuration|hplsql-site.xml]] configuration file, which is located in the HPL/SQL program directory (the directory that contains the HPL/SQL jar file).

To run Hive queries from HPL/SQL you may need to specify the YARN job queue, for example:

<code xml>
<property>
  <name>hplsql.conn.init.hive2conn</name>
  <value>
    set mapred.job.queue.name=dev;
    set hive.execution.engine=mr;
    use sales_db;
  </value>
</property>
</code>

Note that a [[configuration|hplsql-site.xml]] file located in the current directory takes precedence over the configuration file in the HPL/SQL program directory.

===== Running HPL/SQL =====

Now you can specify [[cli|options]] and run HPL/SQL, for example:

<code>
hplsql -e "CURRENT_DATE+1"
hplsql -e "SELECT * FROM src LIMIT 1"
</code>

or

<code>
hplsql -f script.sql
</code>

===== Use HPL/SQL in Shell Scripts =====

You can capture a value computed by HPL/SQL in a shell variable:

<code bash>
MDATE=$(hplsql -e "NVL(MIN_PARTITION_DATE(sales, local_dt, code='A'), '1970-01-01')")
</code>

<code bash>
START=$(hplsql -e 'CURRENT_DATE - 1')
</code>

See the [[doc|HPL/SQL Reference]] for more information on how to use the tool.
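When a shell script depends on a value captured from HPL/SQL, as in the examples above, it can help to stop early if the call fails instead of continuing with an empty variable. The following is a minimal sketch of that idea, not part of HPL/SQL itself. The function name ''hplsql_value'' and the variable ''HPLSQL_BIN'' are hypothetical names introduced here, and the sketch assumes that //hplsql// exits with a non-zero status on error, which this guide does not confirm. For that reason it also treats empty output as a failure.

```shell
#!/bin/sh
# Sketch only: HPLSQL_BIN and hplsql_value are hypothetical names, not HPL/SQL settings.
# HPLSQL_BIN lets you point at a specific launcher, e.g. <hplsql_dir>/hplsql.
HPLSQL_BIN="${HPLSQL_BIN:-hplsql}"

# Evaluate an HPL/SQL expression with the -e option and print the result.
# Returns non-zero if the command fails or prints nothing.
# Assumption: hplsql returns a non-zero exit status on error.
hplsql_value() {
    _out=$("$HPLSQL_BIN" -e "$1") || return 1
    [ -n "$_out" ] || return 1
    printf '%s\n' "$_out"
}

# Usage (requires hplsql on PATH or HPLSQL_BIN set):
#   START=$(hplsql_value 'CURRENT_DATE - 1') || { echo "hplsql failed" >&2; exit 1; }
#   echo "Loading data from $START"
```

Keeping the check in one function means each capture in the script remains a single line, much like the plain ''$(hplsql -e ...)'' form shown earlier.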
