Support altering partition column type in Hive
Accepted, Public

Next Step
arc commit
Author
jingweilu
Reviewers
kevinwilfong
njain
sambavim
JIRA
Lint
Lint OK
Unit
No Unit Test Coverage
Branch
svn
Apply Patch
arc patch D6537
Arcanist Project
Restricted Arcanist Project
Subscribers
None
Projects
None
Summary

Currently, Hive does not allow altering partition column types. Because we have discouraged users from using non-string partition column types, this is a problem for users who want to change their partition columns to strings: they have to rename their table, create a new table, and copy all the data over.
To support this via the CLI, this patch adds a command of the form ALTER TABLE <table_name> PARTITION COLUMN (<column_name> <new type>);
Each DDL statement changes the type of one partition column.

Test Plan

Positive test case:

Create a table with two partition key columns, both of type string.
 a. Change one column from type string to int. Issue a query on the table that filters on the partition column.
 b. Change the other column from type string to double. Issue a query on the table that filters on the partition column. Use describe to check that the partition key column type has changed.

Negative test case:

a. Changing an invalid column will fail.
b. Changing a partition key column to an invalid type name, such as time, will fail.
c. Changing two partition key columns in the same DDL statement will fail.
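
The three negative cases above amount to a small set of checks that could run before the DDL is executed. The following is a minimal Java sketch of those checks, not the code in this patch: the class name PartitionColTypeCheck, the validate method, and the list of accepted type names are all hypothetical, and in Hive itself the two-column case may well be rejected by the grammar rather than by a check like this.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the negative test cases; not Hive's DDLSemanticAnalyzer.
final class PartitionColTypeCheck {
  // Illustrative subset of type names (an assumption, not Hive's actual list).
  private static final Set<String> KNOWN_TYPES = new HashSet<>(Arrays.asList(
      "tinyint", "smallint", "int", "bigint", "float", "double", "boolean", "string"));

  /** Returns null if the request is acceptable, otherwise an error message. */
  static String validate(List<String> partitionCols, Map<String, String> requested) {
    // Case c: only one partition column may be changed per statement.
    if (requested.size() != 1) {
      return "exactly one partition column may be altered per statement";
    }
    Map.Entry<String, String> change = requested.entrySet().iterator().next();
    // Case a: the column must be one of the table's partition columns.
    if (!partitionCols.contains(change.getKey())) {
      return "not a partition column: " + change.getKey();
    }
    // Case b: the new type name must be recognized.
    if (!KNOWN_TYPES.contains(change.getValue().toLowerCase(Locale.ROOT))) {
      return "invalid type name: " + change.getValue();
    }
    return null;
  }
}
```

For a table partitioned by dt and ts, a request to change dt to int passes, while a request naming an unknown column, the type time, or two columns at once returns an error message.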
njain added inline comments. Via Legacy, Dec 6 2012, 3:20 PM
ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
1101

This should be caught at compile time - DDLSemanticAnalyzer

1129

remove these lines

ql/src/test/queries/clientpositive/alter_partition_coltype.q
16

Can you perform an explain extended on this query and make sure only the right
partition is being selected?

23

isn't this wrong ?
dt is of type int now

27

again, explain extended on the above queries to verify pruning

jingweilu added inline comments. Via Legacy, Dec 12 2012, 6:15 AM
ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
1101

I replaced it with an assert and added a comment.

1129

removed.

ql/src/test/queries/clientpositive/alter_partition_coltype.q
16

Yes, added.

23

No, this is fine. Partition column values are converted to strings without a type check. So even though dt is now of type int, you can still insert a value of any type as long as the converted string does not contain characters that are invalid in a file path.
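
The behavior described in this reply can be sketched in Java as follows. This is an illustration under stated assumptions, not Hive's code: the class PartitionValueSketch, its methods, and the rule for which characters count as path-unsafe are hypothetical, and the sketch simply rejects such values where a real implementation might handle them differently.

```java
// Hypothetical sketch: a partition value of any type becomes a string path component.
final class PartitionValueSketch {
  /** Assumed rule: reject empty strings, '/', and control characters. */
  static boolean isPathSafe(String s) {
    if (s.isEmpty()) {
      return false;
    }
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '/' || c < 0x20) {
        return false;
      }
    }
    return true;
  }

  /** Builds a "col=value" directory name; the declared column type is never consulted. */
  static String dirName(String column, Object value) {
    String asString = String.valueOf(value);
    if (!isPathSafe(asString)) {
      throw new IllegalArgumentException("value is not usable in a path: " + asString);
    }
    return column + "=" + asString;
  }
}
```

Under these assumptions, both the integer 10 and the string 100x yield a usable directory name for dt, which is why the insert is accepted even though dt is declared as int.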

27

Explain extended added.

jingweilu updated this revision. Via Legacy, Dec 12 2012, 6:19 AM

Fix based on comments

jingweilu updated this revision. Via Legacy, Jan 25 2013, 9:29 AM

Refresh to the latest code base.

njain added inline comments. Via Legacy, Jan 26 2013, 3:59 PM
ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableAlterPartDesc.java
25

Can you format it correctly ?
By default, we use 2 spaces

ql/src/test/queries/clientpositive/alter_partition_coltype.q
21

spelling mistake

37

Issue a query that selects 2 different partitions of different datatypes.
What is the expected behavior?

jingweilu updated this revision. Via Legacy, Jan 26 2013, 9:06 PM

Fix based on comments.

jingweilu updated this revision. Via Legacy, Jan 26 2013, 9:10 PM

Fix one more indentation issue.

njain added inline comments. Via Legacy, Jan 29 2013, 6:29 PM
ql/src/test/queries/clientpositive/alter_partition_coltype.q
36

Can you add a query like:

select key, value, dt, ts from alter_coltype where dt is not null;

  • A query where multiple partition values (of different datatypes) are being selected.

What happens if they are not compatible with each other ? Can you add a negative test
for that ?

jingweilu added inline comments. Via Legacy, Jan 30 2013, 12:11 AM
ql/src/test/queries/clientpositive/alter_partition_coltype.q
36

I have added the test case you requested.

What do you mean by "not compatible with each other"? Could you give me an example?

jingweilu updated this revision. Via Legacy, Feb 1 2013, 7:28 AM

Update with one more test case to query partitions with different types.

njain added inline comments. Via Legacy, Feb 3 2013, 6:29 PM
ql/src/test/results/clientpositive/alter_partition_coltype.q.out
793

This was the test I was talking about.

dt was a string (100x) and an int (10), and the select successfully converted it into
a string (the common datatype).

For the negative test, say dt had 2 incompatible values, such as int and binary
(with valid partitions). What happens to the above select?

jingweilu updated this revision. Via Legacy, Feb 5 2013, 10:54 PM

Add test for dt = '100x' or dt = '10'

ql/src/test/results/clientpositive/alter_partition_coltype.q.out
793

Partition column values are always converted to strings during insert. During predicate evaluation, the predicate value is always converted to a string first and then compared to the partition column value. Since any data type can be converted to a string, there is only a positive case. I have added one more test case, which selects dt = '100x' or dt = '10'. Is this what you are looking for?
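
The comparison rule described in this reply can be sketched as below. This is a simplified Java illustration of string-based matching for a predicate such as dt = '100x' or dt = '10'; the class StringPruningSketch and its prune method are hypothetical and do not reflect Hive's partition pruner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: predicate literals are converted to strings before comparison.
final class StringPruningSketch {
  /** Keeps the partition values equal to any literal once the literal is stringified. */
  static List<String> prune(List<String> partitionValues, Object... literals) {
    List<String> wanted = new ArrayList<>();
    for (Object literal : literals) {
      wanted.add(String.valueOf(literal)); // an int 10 and a string "10" become identical
    }
    List<String> kept = new ArrayList<>();
    for (String value : partitionValues) {
      if (wanted.contains(value)) {
        kept.add(value);
      }
    }
    return kept;
  }
}
```

Because every literal is reduced to a string first, a partition written as 100x and one written as 10 can be selected by the same predicate without any type conflict, which matches the claim that only a positive case exists.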

njain accepted this revision. Via Legacy, Feb 6 2013, 6:03 PM
jingweilu updated this revision. Via Legacy, Feb 14 2013, 7:41 PM

Refresh and update the patch to latest.

jingweilu updated this revision. Via Legacy, Feb 14 2013, 7:52 PM

Remove some log files.

jingweilu updated this revision. Via Legacy, Feb 19 2013, 7:37 PM

Refresh and update the test output.

Revision Update History

Diff | ID | Base | Description | Created
Diff 1 | 21279 | 30512 | | Nov 6 2012, 7:33 PM
Diff 2 | 23535 | 31608 | Fix based on comments | Dec 12 2012, 6:19 AM
Diff 3 | 26367 | 32339 | Refresh to the latest code base. | Jan 25 2013, 9:29 AM
Diff 4 | 26475 | 32339 | Fix based on comments. | Jan 26 2013, 9:06 PM
Diff 5 | 26481 | 32339 | Fix one more indentation issue. | Jan 26 2013, 9:10 PM
Diff 6 | 27045 | 32339 | Update with one more test case to query partitions with different type. | Feb 1 2013, 7:28 AM
Diff 7 | 27267 | 32339 | Add test for dt = '100x' or dt = '10' | Feb 5 2013, 10:54 PM
Diff 8 | 27765 | 32767 | Refresh and update the patch to latest. | Feb 14 2013, 7:41 PM
Diff 9 | 27771 | 32769 | Remove some log files. | Feb 14 2013, 7:52 PM
Diff 10 | 28023 | 32855 | Refresh and updated the test output. | Feb 19 2013, 7:37 PM

Diff 28023

metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableAlterPartDesc.java
ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java
ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java
ql/src/test/queries/clientnegative/alter_partition_coltype_2columns.q
ql/src/test/queries/clientnegative/alter_partition_coltype_invalidcolname.q
ql/src/test/queries/clientnegative/alter_partition_coltype_invalidtype.q
ql/src/test/queries/clientpositive/alter_partition_coltype.q
ql/src/test/results/clientnegative/alter_partition_coltype_2columns.q.out
ql/src/test/results/clientnegative/alter_partition_coltype_invalidcolname.q.out
ql/src/test/results/clientnegative/alter_partition_coltype_invalidtype.q.out
ql/src/test/results/clientpositive/alter_partition_coltype.q.out