InterviewSolution
1. What are the tools to extract Big Data?
Answer» Numerous tools are available for Big Data extraction, for example Flume, Kafka, NiFi, Sqoop, Chukwa, Talend, Scriptella, and Morphlines. Apart from extracting data, these tools also help modify and format it. Big Data extraction can be done in various modes.
Other issues also need to be addressed. The source and destination systems may differ in I/O formats, protocols, scalability, security requirements, and so on, so data extraction and storage must be handled accordingly.
Open source tools: Open source tools can be more suitable for budget-constrained users, who are expected to have sufficient knowledge and the required supporting infrastructure in place. Some vendors do offer light or limited versions of their tools as open source.
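The point about source and destination systems using different I/O formats can be illustrated with a minimal sketch. It assumes, purely for illustration, that the source exports CSV and the destination expects newline-delimited JSON; the function name `csv_to_json_lines` and the sample data are hypothetical and do not come from any of the tools named here.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text):
    """Convert CSV text with a header row into newline-delimited JSON."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each CSV row becomes one JSON object per output line.
    return "\n".join(json.dumps(row, sort_keys=True) for row in reader)


# Hypothetical export from a source system (assumption for illustration).
source_export = "id,name\n1,alice\n2,bob\n"
converted = csv_to_json_lines(source_export)
print(converted)
```

Dedicated extraction tools typically handle this kind of conversion, along with protocol and security differences, through configuration rather than hand-written code.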
For on-premise, closed environments, batch extraction seems to be a good approach.
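As a rough sketch of what batch extraction means in practice, the following reads rows from a source in fixed-size batches and writes them to a destination. Everything here is an assumption for illustration: an in-memory SQLite table stands in for the on-premise source, a CSV buffer stands in for the destination, and the helper `extract_in_batches` is a hypothetical name, not part of any tool mentioned in this answer.

```python
import csv
import io
import sqlite3


def extract_in_batches(conn, query, batch_size=2):
    """Yield lists of rows from `query`, at most `batch_size` rows at a time."""
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Hypothetical source: an in-memory SQLite table standing in for an on-premise database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 6)],
)
conn.commit()

# Hypothetical destination: a CSV buffer standing in for a file or staging area.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["id", "payload"])

batch_sizes = []
query = "SELECT id, payload FROM events ORDER BY id"
for batch in extract_in_batches(conn, query, batch_size=2):
    writer.writerows(batch)  # load one batch at a time
    batch_sizes.append(len(batch))

conn.close()
```

Reading in batches keeps memory use bounded regardless of table size, which is one reason the approach suits scheduled extractions from closed environments.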
These tools offer the added advantage of data security and also take care of any data compliance issues, so an enterprise need not worry about them. 'Talend Open Studio' is a good tool that offers data extraction as one of its features. It is one of the most powerful data integration tools on the market.
'Scriptella' is an open-source ETL tool released under the Apache License. It offers features for data extraction, transformation, loading, database migration, and more. It can also execute scripts written in JavaScript, SQL, Velocity, JEXL, etc., and it interoperates with JDBC, LDAP, XML, and many other data sources. It is very popular because of its ease of use and simplicity.
Another good open-source tool is 'KETL', which is well suited to data warehousing. It is built on an open, multi-threaded, Java-oriented, XML-based architecture. Its major features include integration with security and data management tools and scalability across multiple servers.
'Kettle', also known as Pentaho Data Integration, is the default tool in the 'Pentaho' Business Intelligence suite. Other tools include Jaspersoft ETL, Clover ETL, Apatar ETL, GeoKettle, and Jedox.