1.

What are the Table types available in Hive?

Answer»

There are two types of tables which HIVE supports.

  • Managed Table
  • External Table

Hive Managed Tables:
Hive Managed Table is also known as an internal table. When we will CREATE a table in Hive, by default Managed table will create and it manages the data as well. It means that Hive is storing the data into its warehouse directory. A managed table is stored under the hive.metastore.warehouse.dir path property and default location of table will be /apps/hive/warehouse/<db_name>.db/<table_name>. This path will be modifiable. If a managed table or partition is dropped, then the data and corresponding metadata of the table or partition are deleted. If you do not specify the PURGE option then the data is moved to a trash folder for a certain period, it will be deleted permanently after that.

Example:

1. Create Table

hive> create table univercity_db.school_table(name string, roll no int) row format delimited FIELDS terminated by ',';

OK

Time taken: 0.202 seconds

2. Describe table

hive> describe formatted univercity_db.school_table;

OK

you will get extra information like whether the table is managed or an external table. when the table is created, what kind of file format, Location of the data path in HDFS, whether the object is a table or view.

3. Load the data to table from the local path

hive>load data local inpath '/home/pbibhu/Desktop/blog/school' into table univercity_db.school_table;

After loading from the local path you can further use hive commands to select/count/describe etc    

Hive External Tables:
while creating an External table the location of the data path is not the usual warehouse path, you have to provide the HDFS path outside of the warehouse directory. While Creating an external table location is mandatory in the create syntax. By any chance structure or partitioning of an external table is changed then an MSCK REPAIR TABLE table_name statement can be USED to refresh metadata information. Basically, In External Table we cannot load the table from a local path. you have to load data from HDFS mentioning the path.   

Use external tables when files are present in the remote locations, and the files should REMAIN even if the external table is dropped.

Example:

1. Create Table

CREATE EXTERNAL TABLE IF NOT EXISTS univercity_db.school_table( student_ID INT, FirstName STRING, LastName STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE/ORC LOCATION 'hdfs/pbibhu/school';

2. Create partition Table

CREATE EXTERNAL TABLE IF NOT EXISTS univercity_db.school_table( student_ID INT, FirstName STRING, LastName STRING)  partitioned by (student_ID int) STORED AS ORC LOCATION 'hdfs/pbibhu/school';

3. insert the data to internal table from external table,data structure should be same for both the tables.

hive> CREATE TABLE IF NOT EXISTS office(EmployeeID INT,FirstName STRING, Title STRING,     State STRING, Laptop STRING)  STORED AS ORC;

OK

hive> CREATE EXTERNAL TABLE IF NOT EXISTS Office_text(    EmployeeID INT,FirstName STRING, Title STRING,    State STRING, Laptop STRING)    ROW FORMAT DELIMITED    FIELDS TERMINATED BY ','    STORED AS TEXTFILE    LOCATION '/user/pbibhu/office';

OK

hive> INSERT OVERWRITE TABLE office SELECT * FROM office_text;


Discussion

No Comment Found

Related InterviewSolutions