Sunday, October 22, 2017

HPLSQL - Procedural SQL on Hadoop



What:

Hive Hybrid Procedural SQL On Hadoop (HPL/SQL) is an open-source tool (Apache License 2.0) that implements procedural SQL for Hive. It is included in Hive 2.0 and later.

Who:

The author of the tool is Dmitry Tolpeko. It is supported by the Hive user community - user@hive.apache.org

Code available at https://github.com/apache/hive/tree/master/hplsql

Why:

 To bring procedural SQL capability to Hive, so existing PL/SQL skills can be reused.


Overview


  • HPL/SQL brings stored procedure programming to the Hadoop world.
  • HPL/SQL is one of the easiest and quickest ways to migrate PL/SQL code.
  • It makes SQL-on-Hadoop more dynamic.
  • HPL/SQL allows you to work with multiple systems in a single script, so you can take the best of each for different types of workloads and integrate them easily.
  • For developers coming from Oracle and SQL Server, these procedures will feel very familiar and will allow you to port a lot of your existing PL/SQL code.

Requirements


  • Java 1.6 or higher
  • Hadoop 2.x
  • Included with Hive 2.0, but it does not actually require Hive to run


Syntax


hplsql -e 'query' | -f file 
      [-main procname]
      [-d | --define | -hiveconf | -hivevar var=value ...] 
      [-version | --version]
      [-trace | --trace]
      [-H | --help]
If the -main option is not specified, HPL/SQL starts executing all statements from the beginning of the script.
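
For example, assuming hplsql is on the PATH, a query can be run inline with -e, or a script can be run with -f starting from a named procedure (main_proc is illustrative):

    hplsql -e "PRINT 'Hello from HPL/SQL'"
    hplsql -f script.sql -main main_proc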

Features

  • Built-in Functions
  • Temporary Tables
  • Stored Procedures, Functions and Packages
  • Exception and Condition Handling
  • Processing Engines
    • MR
    • Tez
    • Spark (not supported by MapR)
  • On-the-fly SQL Conversion
  • Polyglot Persistence
  • UDFs


    HPL/SQL key features and built-in functions are documented at http://www.hplsql.org/doc


    Code samples

    Creating a sample procedure and function:

     
     
    /opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/hello_pawan.sql
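
    A minimal sketch of what hello_pawan.sql might contain; the hello function follows the example in the HPL/SQL documentation, and the greet procedure and its name are illustrative:

        -- hello_pawan.sql (illustrative sketch)
        CREATE FUNCTION hello(text STRING)
          RETURNS STRING
        BEGIN
          RETURN 'Hello, ' || text || '!';
        END;

        CREATE PROCEDURE greet(name STRING)
        BEGIN
          PRINT hello(name);
        END;

        CALL greet('Pawan');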

    Temporary tables 

        

       /opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/temp_table.sql
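
    A sketch of what temp_table.sql might contain, assuming HPL/SQL's Teradata-style VOLATILE (session-scoped temporary) table syntax; the table and column names are illustrative:

        -- temp_table.sql (illustrative sketch)
        CREATE VOLATILE TABLE temp_orders (
          id INT,
          name STRING
        );
        INSERT INTO temp_orders VALUES (1, 'sample');
        SELECT * FROM temp_orders;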

    Cursor support

          

    /opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/iterate_table.sql
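
    A sketch of what iterate_table.sql might contain, using the standard HPL/SQL cursor pattern of fetching until SQLCODE is non-zero; the orders table is illustrative:

        -- iterate_table.sql (illustrative sketch)
        DECLARE name STRING;
        DECLARE cur CURSOR FOR SELECT name FROM orders;
        OPEN cur;
        FETCH cur INTO name;
        WHILE SQLCODE = 0 THEN
          PRINT name;
          FETCH cur INTO name;
        END WHILE;
        CLOSE cur;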


    SQL generation on the fly

    /opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/create_dept_table.sql
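
    A sketch of how create_dept_table.sql might build statements at run time with string concatenation and EXECUTE IMMEDIATE; the dept_N table names are illustrative:

        -- create_dept_table.sql (illustrative sketch)
        FOR i IN 1..3 LOOP
          EXECUTE IMMEDIATE 'CREATE TABLE IF NOT EXISTS dept_' || i ||
                            ' (id INT, name STRING)';
        END LOOP;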

    PL/SQL-style package support



    /opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/pck.sql
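
    A sketch of what pck.sql might contain, following the package syntax from the HPL/SQL documentation; the counter package is illustrative:

        -- pck.sql (illustrative sketch)
        CREATE PACKAGE counter AS
          cnt INT := 0;                     -- package-level variable
          PROCEDURE inc(i INT);
          FUNCTION current() RETURNS INT;
        END;

        CREATE PACKAGE BODY counter AS
          PROCEDURE inc(i INT)
          BEGIN
            cnt := cnt + i;
          END;

          FUNCTION current() RETURNS INT
          BEGIN
            RETURN cnt;
          END;
        END;

        CALL counter.inc(5);
        PRINT counter.current();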
      

    Exception and Condition Handling


    hplsql.onerror = exception | seterror | stop;
      DECLARE <exception> CONDITION;
      DECLARE EXIT HANDLER FOR <exception> PRINT '<exception message>';

    /opt/mapr/hive/hive-2.1/bin/hplsql -f \
    /opt/mapr/hive/hive-2.1/bin/TDA_demo_table_eh.sql
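
    A sketch of how a user-defined condition can be declared, raised with SIGNAL, and caught by an exit handler; the empty_table condition and orders table are illustrative:

        -- illustrative condition-handling sketch
        DECLARE cnt INT;
        DECLARE empty_table CONDITION;
        DECLARE EXIT HANDLER FOR empty_table
          PRINT 'Table is empty';

        SELECT COUNT(*) INTO cnt FROM orders;
        IF cnt = 0 THEN
          SIGNAL empty_table;
        END IF;
        PRINT 'Row count: ' || cnt;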


    Working with multiple sources - MySQL, Hive, HDFS

    Map Object
    /opt/mapr/hive/hive-2.1/bin/hplsql -f /opt/mapr/hive/hive-2.1/bin/access_mysql.sql
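
    A sketch of what access_mysql.sql might look like; MAP OBJECT binds a local name to a table reachable through a connection profile (mysqlconn and the table names are assumptions, and the profile would be defined in hplsql-site.xml):

        -- access_mysql.sql (illustrative sketch)
        MAP OBJECT dept TO hrdb.departments AT mysqlconn;
        SELECT COUNT(*) FROM dept;      -- runs against MySQL
        SELECT COUNT(*) FROM orders;    -- runs against Hive (default connection)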

    UDF in Hive

    Define the function in the /opt/mapr/hive/hive-2.1/bin/hplsqlrc file and register that file in Hive.
    Steps below:



    •   Start the Hive shell
    •   ADD FILE /opt/mapr/hive/hive-2.1/bin/hplsqlrc;
    •   CREATE TEMPORARY FUNCTION hplsql AS 'org.apache.hive.hplsql.Udf';
    •   USE pawan_hplsql;
    •   SELECT hplsql('hello(:1)', name) FROM orders;
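
    The hplsqlrc file registered above would hold the HPL/SQL function definition; a minimal sketch, reusing the illustrative hello function from the earlier sample:

        -- hplsqlrc (illustrative sketch)
        CREATE FUNCTION hello(text STRING)
          RETURNS STRING
        BEGIN
          RETURN 'Hello, ' || text || '!';
        END;

    The first argument to the hplsql UDF is the HPL/SQL expression to evaluate; :1, :2, ... are bound to the column values passed as the remaining arguments.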




    Challenges 



    •   HPL/SQL offers a fast way to start working with Hadoop: you can run your business logic with your current skills and existing code now, and later re-design advanced data processing workflows in Spark, Tez, Storm, Flink or other frameworks.
    •   MapR HPL/SQL only supports MR and Tez as processing engines.
    •   It is still a new product; documentation is limited and there is not much help available on the internet.
    •   Database connectivity on MapR:
      •   MySQL worked
      •   Oracle ??
      •   Phoenix failed


