Loading data in Hive

Loading data into an internal table:

From the local file system:


LOAD DATA LOCAL INPATH '/home/user/users_table.txt' INTO TABLE IT;

From HDFS:


LOAD DATA INPATH '/home/user/users_table.txt' INTO TABLE IT;
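
Both LOAD statements assume the managed table IT already exists with a layout matching the file. A minimal sketch of such a table, assuming comma-delimited user records (the column names here are illustrative, not from the original post):

CREATE TABLE IF NOT EXISTS IT (
    user_id   INT,
    firstname VARCHAR(64),
    lastname  VARCHAR(64)
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

Note that LOAD DATA LOCAL INPATH copies the file from the local file system, while LOAD DATA INPATH (without LOCAL) moves the file from its current HDFS location into the table's warehouse directory.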

Loading data into an external table:

CREATE EXTERNAL TABLE TABLE1 (
    firstname VARCHAR(64),
    lastname  VARCHAR(64),
    address   STRING,
    country   VARCHAR(64),
    city      VARCHAR(64),
    state     VARCHAR(64)
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/data/staging/';
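
Because an external table simply reads whatever files sit under its LOCATION, loading data is just a matter of placing files in that directory. A minimal sketch, assuming a comma-delimited file users.csv on the local disk (the file name is illustrative):

hdfs dfs -mkdir -p /user/data/staging/
hdfs dfs -put /home/user/users.csv /user/data/staging/

Any rows in files copied to /user/data/staging/ become visible to queries on TABLE1 immediately; dropping the external table later removes only the metadata, not these files.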

Loading data using Sqoop:


sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --table SOURCE_TBL \
  --target-dir /user/hive/base_table -m 1
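
This import only writes delimited files under /user/hive/base_table; to query them from Hive you still need a table over that directory. A minimal sketch, assuming Sqoop's default comma field delimiter and illustrative column names (id, name, modified_date):

CREATE EXTERNAL TABLE base_table (
    id            INT,
    name          STRING,
    modified_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/base_table';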

Incremental load using Sqoop:


sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --table SOURCE_TBL \
  --target-dir /user/hive/incremental_table -m 1 \
  --check-column modified_date --incremental lastmodified \
  --last-value '2017-01-01'
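
Each run reports the new --last-value to use next time. Because a lastmodified import will not reuse an existing --target-dir unless told how to handle the files already there, subsequent runs typically add --merge-key with the table's primary key (id here is an assumed name) or --append. A hedged sketch of a follow-up run, reusing the post's example last-value (in practice pass the value reported by the previous run):

sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --table SOURCE_TBL \
  --target-dir /user/hive/incremental_table -m 1 \
  --check-column modified_date --incremental lastmodified \
  --last-value '2017-01-01' --merge-key id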


Using a query on the source table:

sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --target-dir /user/hive/incremental_table -m 1 \
  --query "select * from SOURCE_TBL where modified_date > '2017-01-01' AND \$CONDITIONS"
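
Sqoop replaces $CONDITIONS with each mapper's range predicate at run time, which is why it must appear in the WHERE clause. With a free-form --query, anything beyond -m 1 also needs --split-by so Sqoop knows how to partition the work; a hedged variant with four mappers (id is an assumed numeric column on SOURCE_TBL):

sqoop import --connect jdbc:teradata://192.168.25.25/Database=retail \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --username dbc --password dbc --target-dir /user/hive/incremental_table -m 4 \
  --split-by id \
  --query "select * from SOURCE_TBL where modified_date > '2017-01-01' AND \$CONDITIONS"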

Incremental load from HDFS into Hive:
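
A common way to finish the incremental load is the reconcile-and-overwrite pattern: take the union of the base and incremental data, keep only the newest version of each row, materialize it, and overwrite the base table. A minimal sketch, assuming Hive tables base_table and incremental_table over the two Sqoop target directories, with id as the key and modified_date as the change timestamp (all assumed names):

CREATE VIEW IF NOT EXISTS reconcile_view AS
SELECT t.id, t.name, t.modified_date
FROM (
    SELECT * FROM base_table
    UNION ALL
    SELECT * FROM incremental_table
) t
JOIN (
    SELECT id, MAX(modified_date) AS max_modified
    FROM (
        SELECT * FROM base_table
        UNION ALL
        SELECT * FROM incremental_table
    ) s
    GROUP BY id
) m ON t.id = m.id AND t.modified_date = m.max_modified;

-- Materialize first so the overwrite does not read from the table it rewrites
DROP TABLE IF EXISTS reporting_table;
CREATE TABLE reporting_table AS SELECT * FROM reconcile_view;
INSERT OVERWRITE TABLE base_table SELECT * FROM reporting_table;

After the overwrite, the files under /user/hive/incremental_table can be cleared so the next Sqoop run starts from an empty incremental directory.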