My Notes: October 2017

HPLSQL - procedural SQL on Hadoop

What:

Hive Hybrid Procedural SQL On Hadoop (HPL/SQL) is a tool that implements procedural SQL for Hive, Open Source(Apache License 2.0) , included in Hive 2.0

Who:

The author of the tool is Dmitry Tolpeko. Its supported by hive user community - user@hive.apache.org

Code available - https://github.com/apache/hive/tree/master/hplsql

Why:

To brought procedural capability to hive with pl/sql skills.

Overview

HPSQL Brings stored procedure programming to the Hadoop world.
HPL/SQL is an easiest and quickest way to migrate PL/SQL code
Make SQL-on-Hadoop More Dynamic
HPL/SQL allows you to work with multiple systems in a single script, so you can take the best of all worlds for different types of workloads and easily integrate them.
For developers coming from Oracle and SQL Server, these procedures will feel very familiar and will allow you to port a lot of your existing PL/SQL

Requirements

Java 1.6 or higher
Hadoop 2.x
included with Hive 2.0 but its not really need hive to run

Syntax

hplsql -e 'query' | -f file 
      [-main procname]
      [-d | --define | -hiveconf | -hivevar var=value ...] 
      [-version | --version]
      [-trace | --trace]
      [-H | --help]

if -main option is not specified, HPL/SQL starts executing all statements from the beginning of the script

Features

Built-in Functions
Temporary tables
Stored Procedures, Functions and Packages
Exception and Condition Handling
Processing Engine

MR
Tez
Spark(not supported by Mapr)

Exception and Condition Handling
On-the-fly SQL Conversion

Polyglot Persistence
UDF

HPL/SQL key features:

Build in function - http://www.hplsql.org/doc

Code sample -

creating sample procedure and function

/opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/hello_pawan.sql

Temporary tables

/opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/temp_table.sql

Cursor support

/opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/iterate_table.sql

sql generation on the fly

/opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/create_dept_table.sql

PLSQL style Packages support

/opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/pck.sql

Exception and Condition Handling

hplsql.onerror = exception | seterror | stop;

DECLARE <exception> CONDITION;

DECLARE EXIT HANDLER FOR < <exception> PRINT ‘<exception message>';

/opt/mapr/hive/hive-2.1/bin/hplsql -f \

/opt/mapr/hive/hive- 2.1/bin/TDA_demo_table_eh.sql

Working with multiple sources - mysql, hive, hdfs

Map Object

/opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/access_mysql.sql

UDF in hive

write function in /opt/mapr/hive/hive-2.1/bin/hplsqlrc file and just add the file in hive.

Steps below

Hive – start shell

ADD FILE /opt/mapr/hive/hive-2.1/bin/hplsqlrc

CREATE TEMPORARY FUNCTION hplsql AS 'org.apache.hive.hplsql.Udf';

use pawan_hplsql;

SELECT hello(name) FROM orders;

Challenges

HPL/SQL offers the fastest way to start working with Hadoop. Later you can re-design and implement advanced data processing workflows using Spark, Tez, Storm, Flink and other frameworks, but right now you can use your current skills and existing code to run your business logic on Hadoop.

Mapr hplsql only support mr and tez as processing engine.

its still a new product. Documentation is limited and there is not much help available on internet.

Database connectivity on mapr

Mysql worked
Oracle ??
Phoenix failed

My Notes

Sunday, October 22, 2017

HPLSQL - Procedural SQL on Hadoop

HPLSQL - procedural SQL on Hadoop

What:

Who:

Code available - https://github.com/apache/hive/tree/master/hplsql

Why:

Overview

Requirements

Syntax

Features

Built-in Functions

Code sample -

creating sample procedure and function

Temporary tables

Cursor support

sql generation on the fly

PLSQL style Packages support

Exception and Condition Handling

Working with multiple sources - mysql, hive, hdfs

UDF in hive

Challenges

Websphere Dummy certificate expired - DummyServerKeyFile.jks , DummyServerTrustFile.jks

Search This Blog

Sunday, October 22, 2017

HPLSQL - Procedural SQL on Hadoop

HPLSQL - procedural SQL on Hadoop

What:

Who:

Code available - https://github.com/apache/hive/tree/master/hplsql Why:

Overview

Requirements

Syntax

Features

Built-in Functions

Code sample -

creating sample procedure and function

Temporary tables

Cursor support

sql generation on the fly

PLSQL style Packages support

Exception and Condition Handling

Working with multiple sources - mysql, hive, hdfs

UDF in hive

Challenges

Websphere Dummy certificate expired - DummyServerKeyFile.jks , DummyServerTrustFile.jks

Code available - https://github.com/apache/hive/tree/master/hplsql

Why: