Drill can be used to query hive tables. Please note that Drill doesn't use MapReduce as its for interactive purpose and not batch.
Drill currently does not support writing Hive tables. The approach I would use is to have Drill write files and read the files via Hive External tables.
Drill support ANSI SQL support basic datatypes. Please keep in mind that Drill does not support the following Hive types:
- LIST
- MAP
- STRUCT
- TIMESTAMP (Unix Epoch format)
- UNION
If Decimal datatype is diabled in Drill, you can enable it as below
set the
planner.enable_decimal_data_type
option to true
.
How to access hive tables in Drill - example sample
1. create external table in hive - store it as file
hive> CREATE EXTERNAL TABLE types_demo (
a bigint,
b boolean,
c DECIMAL(3, 2),
d double,
e float,
f INT,
g VARCHAR(64),
h date,
i timestamp
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION '/mapr/demo.mapr.com/data/mytypes.csv';
2. Configure Storage plugin (Hive storage plugin connects Drill to the Hive metastore containing the data.)
{
"type": "hive",
"enabled": false,
"configProps": {
"hive.metastore.uris": "thrift://localhost:9083",
"javax.jdo.option.ConnectionURL": "jdbc:mysql:;databaseName=../sample-data/drill_hive_db;create=true",
"hive.metastore.warehouse.dir": "/tmp/drill_hive_wh",
"fs.default.name": "file:///",
"hive.metastore.sasl.enabled": "false"
}
}
3. Access hive table in Drill as below
0: jdbc:drill:> use hive;
0: jdbc:drill:> SELECT * FROM hive.`types_demo`;
Many machine learning solutions providers can help you in learning this technology. This article is also excellent for understanding the concept and essential things about machine learning.
ReplyDelete