Sunday, December 17, 2017

HUE - Hadoop User Experience - Hadoop UI



Hue was developed by Cloudera. It is open source but is not yet an Apache top-level project.

What is it? 

  • Hue is an open source web interface for analyzing data.
  • It makes Hadoop easier to use by presenting it as a single entity rather than a complex ecosystem. You can browse data, write queries and run them from Hue.
  • It provides a standardized working environment for HDFS, Pig, Hive and Impala, and lets you set up workflows with Oozie.
  • The goal of Hue's editor is to make data querying easy and productive. It focuses on SQL but also supports job submission.
  • You can also write Pig scripts and manage them in Hue.
  • It offers intelligent query design and assistance.
  • It also provides a nice tool for creating Oozie workflows.


PS: Hue gives you a single view, but you still need to learn HDFS, Pig, Hive, Impala and Oozie. There is not much to write here because Hue is a GUI tool, and you will learn more by using it.


Hue provides various integrated applications for working with Pig, Hive, Impala, HBase and Solr.

I will give a brief overview of Impala here, as it is not covered elsewhere in my blog -

Impala - Hive is popular but slow, so Cloudera came up with Impala:

  • A high-performance query engine.
  • Fast, because it processes data in memory.
  • Massively parallel processing (MPP).
  • No MapReduce: it has its own execution engine, but each data node needs to run an Impala daemon.
  • Uses data in HDFS, leverages the Hive metastore, reads and writes common Hadoop file formats, and can query HBase.
  • Extensible, and accessible via ODBC/JDBC.

So is Impala always a better choice than Hive? Not always. Impala is good for queries that process small or medium amounts of data, because it runs in memory. Hive is better for processing large loads.
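
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name choose_engine, its parameters and the simple "does it fit in memory" comparison are my own assumptions for illustration; they are not part of Hue, Impala or Hive, and a real decision would depend on your cluster and workload.

```python
def choose_engine(estimated_scan_gb, memory_budget_gb):
    """Pick a query engine using the rule of thumb from the text.

    Both arguments are hypothetical inputs you would estimate yourself:
    estimated_scan_gb - how much data the query is expected to process
    memory_budget_gb  - how much memory you are willing to give the query
    """
    if estimated_scan_gb < 0 or memory_budget_gb <= 0:
        raise ValueError("sizes must be positive")
    # Small or medium queries that fit in memory suit Impala;
    # larger loads are left to Hive.
    if estimated_scan_gb <= memory_budget_gb:
        return "impala"
    return "hive"


print(choose_engine(20, 64))    # fits in the memory budget
print(choose_engine(5000, 64))  # far larger than the memory budget
```

Treat the threshold as a starting point for discussion only; it simply encodes "in memory means Impala, otherwise Hive".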



Hue also provides apps for Solr - Cloudera Search (Solr on Hadoop) - for text-based analytics.


Automation and Scheduling - Hue with Oozie

Oozie is a workflow scheduler for managing Hadoop jobs.

Workflow - a sequence of actions arranged in a DAG (directed acyclic graph)
Coordinator - a program that triggers the workflow when a certain condition is met
Bundle - a set of coordinators that can be started, stopped or suspended together. Used for data pipelines
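
To make the workflow and coordinator ideas concrete, here is a small Python model. It is only a sketch: Oozie itself defines workflows in XML, and the action names and the coordinator_should_trigger helper below are hypothetical examples I made up, not Oozie or Hue APIs. It uses graphlib from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions
# that must finish before it can start.
workflow = {
    "import_to_hdfs": set(),
    "clean_with_pig": {"import_to_hdfs"},
    "load_hive_table": {"clean_with_pig"},
    "index_in_solr": {"clean_with_pig"},
    "refresh_impala": {"load_hive_table"},
}


def execution_order(dag):
    """Return one valid run order for the actions.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    that is, if the graph is not actually a DAG.
    """
    return list(TopologicalSorter(dag).static_order())


def coordinator_should_trigger(available_inputs, required_inputs):
    """Mimic a coordinator condition: trigger only when every
    required input is available."""
    return set(required_inputs) <= set(available_inputs)


order = execution_order(workflow)
print(order)

if coordinator_should_trigger({"sales_2017_12_17"}, {"sales_2017_12_17"}):
    print("condition met, the workflow would start")
```

Every action appears after the actions it depends on, which is what arranging the actions in a DAG guarantees. A bundle would then be a group of such coordinators managed together.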

Hue provides a Workflow Editor for creating workflows and coordinators with a few clicks.





           
