1.

What Features From Relational Databases Or Hive Are Not Available In Impala?

Answer»
  • Querying streaming data.
  • Deleting individual rows. You delete data in bulk by overwriting an entire table or partition, or by dropping a table.
  • Indexing (not currently). LZO-compressed text files can be indexed outside of Impala, as described in Using LZO-Compressed Text Files.
  • Full text search on text fields. The Cloudera Search product is appropriate for this use case.
  • Custom Hive Serializer/Deserializer classes (SerDes). Impala supports a set of common native file formats that have built-in SerDes in CDH.
  • Checkpointing within a query. That is, Impala does not save intermediate results to disk during long-running queries. Currently, Impala cancels a running query if any host on which that query is executing fails. When one or more hosts are down, Impala reroutes future queries to use only the available hosts, and it detects when the hosts come back up and begins using them again. Because a query can be submitted through any Impala node, there is no single point of failure. In the future, we will consider adding work allocation features to Impala, so that a running query would complete even in the presence of host failures.
  • Encryption of data transmitted between Impala daemons.
  • Hive indexes.
  • Non-Hadoop data stores, such as relational databases.
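
Because Impala cancels a running query when a host fails but routes later queries to the remaining hosts, a client can often recover by resubmitting the query from the start. The following is a minimal sketch of that idea, not an Impala API: `run_query` is a hypothetical callable supplied by the caller, and `HostFailureError` is a hypothetical exception standing in for whatever error a real client library reports when a query is cancelled.

```python
import time


class HostFailureError(Exception):
    """Hypothetical error: the query was cancelled because a host failed."""


def run_with_retry(run_query, sql, max_attempts=3, delay_seconds=0.0):
    """Resubmit `sql` from the beginning each time a host failure cancels it.

    `run_query` is an assumed callable that takes a SQL string and returns
    the result rows. Nothing is checkpointed, so every attempt redoes all
    of the work.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query(sql)
        except HostFailureError as err:
            last_error = err
            # Pause before resubmitting, except after the final attempt.
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise last_error
```

Since there is no checkpointing, each retry repeats the whole query, so this approach suits queries that are cheap enough to rerun.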


