Hue - developed by Cloudera. It is open source under the Apache License, but it is not an Apache Software Foundation top-level project.
What is it?
- Hue is an open source web interface for analyzing data.
- It makes Hadoop easier to use by presenting it as a single entity rather than a complex ecosystem. You can browse data, write queries and execute them from Hue.
- It provides a standardized working environment for HDFS, Pig, Hive and Impala, and lets you set up workflows with Oozie.
- The goal of Hue's Editor is to make data querying easy and productive. It focuses on SQL but also supports job submissions.
- You can also write Pig scripts and manage them in Hue.
- It offers intelligent query design and assistance.
- It also provides a nice tool for creating Oozie workflows.
PS: Hue gives you a single view, but you still need to learn HDFS, Pig, Hive, Impala and Oozie. There is not much to write here because Hue is a GUI tool, and you will learn more by using it.
Hue provides various integrated applications for working with Pig, Hive, Impala, HBase and Solr.
I will give a brief overview of Impala here, as it is not covered in my other posts -
Impala - Hive is popular but slow, so Cloudera came up with Impala.
- A high-performance query engine.
- Fast, because it processes data in memory.
- Massively parallel processing (MPP).
- No MapReduce - it has its own execution engine, but each data node needs to run an Impala daemon.
- Uses data from HDFS, leverages the Hive metastore, reads and writes common Hadoop file formats, and can query HBase.
- Extensible, and accessed via ODBC/JDBC.
So is Impala always the better choice over Hive? Not always. Impala is good for queries over small or medium data volumes because it runs in memory. Hive is better for processing large loads.
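Since Impala is reached through standard ODBC/JDBC-style connections, client code usually looks like ordinary SQL-over-a-connection code. The sketch below is only an illustration of that pattern in Python using the DB-API 2.0 interface. It uses the standard library's sqlite3 as a local stand-in so that it runs without a cluster; the table name page_views and its columns are made up for the example. Against a real Impala daemon you would swap in a driver such as the impyla package, which is an assumption here and not something this post covers.

```python
import sqlite3


def run_query(conn, sql):
    """Run one SQL statement on a DB-API 2.0 connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()


# Local stand-in for a cluster connection (assumption: a real setup would
# obtain the connection from an Impala driver instead of sqlite3).
conn = sqlite3.connect(":memory:")

# Hypothetical table and data, only for illustration.
conn.execute("CREATE TABLE page_views (page TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("home", 120), ("docs", 45), ("home", 30)],
)

# The kind of small aggregate query the post describes as a good fit
# for an interactive engine.
rows = run_query(
    conn,
    "SELECT page, SUM(views) AS total "
    "FROM page_views GROUP BY page ORDER BY total DESC",
)
print(rows)  # [('home', 150), ('docs', 45)]
```

The same run_query helper works with any DB-API connection, which is roughly what Hue's editor does for you behind the GUI.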
Hue also provides apps for Solr - Cloudera Search (Solr on Hadoop) for text-based analytics.
Automation and Scheduling - Hue with Oozie
Oozie is a workflow scheduler for managing Hadoop jobs.
Workflow - a sequence of actions arranged in a DAG (directed acyclic graph).
Coordinator - a program that triggers a workflow when a certain condition is met.
Bundle - a set of coordinators managed together, so they can be started, stopped or suspended as a unit. Used for data pipelines.
Hue provides a Workflow Editor for creating workflows and coordinators with a few clicks.
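To make the workflow and coordinator ideas a bit more concrete, here is a small Python sketch. It is not Oozie's API or file format (Oozie workflows are defined in XML, and Hue generates them for you). It only models the two concepts: a workflow as a DAG of actions, and a coordinator as a check that fires once its conditions are met. The action names and the function names execution_order and coordinator_should_trigger are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Hypothetical workflow: each action maps to the set of actions it depends on.
workflow = {
    "load_raw": set(),
    "clean": {"load_raw"},
    "hive_report": {"clean"},
    "pig_export": {"clean"},
    "notify": {"hive_report", "pig_export"},
}


def execution_order(dag):
    """Return one valid order in which the actions of the DAG can run."""
    try:
        return list(TopologicalSorter(dag).static_order())
    except CycleError as exc:
        # A workflow must be acyclic; report the cycle that was found.
        raise ValueError(f"workflow is not a DAG: {exc.args[1]}") from exc


def coordinator_should_trigger(available_inputs, required_inputs):
    """Coordinator-style check: trigger only when every required input exists."""
    return set(required_inputs) <= set(available_inputs)


if coordinator_should_trigger({"2024-01-01.csv"}, {"2024-01-01.csv"}):
    for action in execution_order(workflow):
        print("run", action)
```

Every action is printed only after the actions it depends on, which is the guarantee a DAG gives a scheduler. A bundle would then just be a named group of such coordinators controlled together.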