1. In Impala, create a table without partitions, for example gxzl_kgx_drw_NP
create table if not exists gxzl_kgx_drw_NP (
  mat_track_no string, materialcode string, id double, defectid double,
  mainno string, unitno string, side string, x double, y double,
  defectclass string, defectclasscode string, imagefile string,
  mat_act_width double, mat_act_len double, prod_end_time_zd string,
  reccreatetime string, equipmentcode string, num double, seq double,
  len_sum bigint, len_tot bigint, x_sum bigint, y_sum bigint, z_sum bigint,
  x_drw double, y_drw double, z_drw double
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
2. In Impala, create the partitioned table you actually need, for example gxzl_kgx_drw
Note that the partition column mat_track_no is declared only in the partitioned by clause, not in the regular column list.
create table if not exists gxzl_kgx_drw (
  materialcode string, id double, defectid double,
  mainno string, unitno string, side string, x double, y double,
  defectclass string, defectclasscode string, imagefile string,
  mat_act_width double, mat_act_len double, prod_end_time_zd string,
  reccreatetime string, equipmentcode string, num double, seq double,
  len_sum bigint, len_tot bigint, x_sum bigint, y_sum bigint, z_sum bigint,
  x_drw double, y_drw double, z_drw double
) partitioned by (mat_track_no string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
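To check how the partition column ended up in the table definition, the table can be inspected right after it is created. This is only a sketch and assumes the create statement above ran in the current database. Impala should list the partition key column after the regular columns, which matches the column order used in the SELECT of step 4.

```sql
-- Show the columns; the partition key mat_track_no is expected at the end of the list.
describe gxzl_kgx_drw;

-- Show the detailed definition, including the partition information and the HDFS location.
describe formatted gxzl_kgx_drw;
```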
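3. Load the text file into the unpartitioned table
load data inpath '/user/gxzl_kgx_drw.txt' into table gxzl_kgx_drw_NP;
Note: Impala's load data statement only accepts HDFS file paths. If your data is on the local filesystem, upload it to HDFS first (for example with hdfs dfs -put).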
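4. Use insert into ... select to insert the data into the partitioned table
The partition column mat_track_no must come last in the SELECT list, because Impala takes the dynamic partition value from the trailing column.
insert into table gxzl_kgx_drw PARTITION(mat_track_no)
select materialcode, id, defectid, mainno, unitno, side, x, y,
  defectclass, defectclasscode, imagefile, mat_act_width, mat_act_len,
  prod_end_time_zd, reccreatetime, equipmentcode, num, seq,
  len_sum, len_tot, x_sum, y_sum, z_sum, x_drw, y_drw, z_drw,
  mat_track_no
from gxzl_kgx_drw_NP;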
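After the insert finishes, it can be worth checking that the partitions were actually created. The statements below are a minimal sketch, not part of the original steps. They assume both tables live in the current database and that gxzl_kgx_drw was empty before the insert.

```sql
-- List the partitions created by the dynamic-partition insert
-- (one per distinct mat_track_no value).
show partitions gxzl_kgx_drw;

-- Compare row counts; under the assumption above, the two numbers should match.
select count(*) from gxzl_kgx_drw_NP;
select count(*) from gxzl_kgx_drw;

-- Optional: gather table and column statistics for the query planner.
compute stats gxzl_kgx_drw;
```

If the source data has very many distinct mat_track_no values, the partition list will be correspondingly long, so it may help to look at that output before relying on this partitioning scheme.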