1.

Is There an UPDATE Statement?

Answer»

Impala does not currently have an UPDATE statement, which would typically be used to change a single row, a small group of rows, or a specific column. The HDFS-based files used by typical Impala queries are optimized for bulk operations across many megabytes of data at a time, making traditional UPDATE operations inefficient or impractical.

You can use the following techniques to achieve the same goals as the familiar UPDATE statement, in a way that preserves efficient file layouts for subsequent queries:

  • Replace the entire contents of a table or partition with updated data that you have already staged in a different location, using INSERT OVERWRITE, LOAD DATA, or manual HDFS file operations followed by a REFRESH statement for the table. Optionally, you can use built-in functions and expressions in the INSERT statement to transform the copied data in the same way you would normally do in an UPDATE statement, for example to turn a mixed-case STRING value into all uppercase or all lowercase.
  • To update a single row, use an HBase table, and issue an INSERT ... VALUES statement using the same key as the original row. Because HBase handles duplicate keys by returning only the latest row with a particular key value, the newly inserted row effectively hides the previous one.
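
The two techniques above might look roughly like the following in Impala SQL. This is only a sketch: the table names (customers, customers_staged, sales, sales_staged, customers_hbase), their columns, and the HDFS path are hypothetical, and the HBase example assumes the first column is mapped to the HBase row key.

```sql
-- Technique 1: replace the whole table with staged data, transforming a
-- column on the way in (here, forcing the name to uppercase).
INSERT OVERWRITE TABLE customers
SELECT id, upper(name) AS name, city
FROM customers_staged;

-- The same idea applied to a single partition of a partitioned table.
INSERT OVERWRITE TABLE sales PARTITION (year = 2014)
SELECT id, amount
FROM sales_staged
WHERE year = 2014;

-- Alternative: move already-corrected data files into the table,
-- replacing its current contents.
LOAD DATA INPATH '/user/etl/customers_fixed' OVERWRITE INTO TABLE customers;

-- If the files were replaced with manual HDFS operations instead,
-- tell Impala to pick up the new file list.
REFRESH customers;

-- Technique 2: "update" one row in an HBase-backed table by inserting a
-- row with the same key; the newer row hides the older one.
INSERT INTO customers_hbase VALUES ('cust-0042', 'NEW NAME', 'Boston');
```

The INSERT OVERWRITE forms rewrite whole data files in bulk, which matches the access pattern that HDFS-based tables are optimized for; the HBase form is the one to reach for when only individual rows change.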



