InterviewSolution
| 1. |
We Will Be Using PDI Integrated In A Web Application Deployed On An Application Server. We’ve Created A JNDI Datasource In Our Application Server. Of Course Spoon Doesn’t Run In The Context Of The Application Server, So How Can We Use The JNDI Data Source In PDI? |
|
Answer» If you look in the PDI main directory you will see a sub-directory “simple-jndi”, which contains a file called “jdbc.properties”. Change this file so that the JNDI information matches the one you use in your application server. Then, in the connection tab of Spoon, set the “Method of access” to JNDI, the “Connection type” to the type of database you’re using, and the “Connection name” to the name of the JNDI data source (as used in “jdbc.properties”). |
|
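As a rough illustration of what the edited “jdbc.properties” file might contain, the sketch below prints entries in the `name/key=value` form that Simple-JNDI reads. The data source name `SalesDB`, the driver class, the URL, and the credentials are all placeholder assumptions, not values from the answer above; use the ones that match your application server.

```python
def simple_jndi_entries(name, driver, url, user, password):
    """Return jdbc.properties lines for one data source in name/key=value form."""
    settings = {
        "type": "javax.sql.DataSource",
        "driver": driver,
        "url": url,
        "user": user,
        "password": password,
    }
    return [f"{name}/{key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    # Hypothetical data source: every value below is a placeholder.
    for line in simple_jndi_entries(
        name="SalesDB",
        driver="com.mysql.jdbc.Driver",
        url="jdbc:mysql://localhost:3306/sales",
        user="pdi_user",
        password="change-me",
    ):
        print(line)
```

The name used as the prefix (`SalesDB` here) is what would go into the “Connection name” field in Spoon.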
| 2. |
I’ve Got A Transformation That Doesn’t Run Fast Enough, But It Is Hard To Tell In What Order To Optimize The Steps. What Should I Do? |
|
Answer» Transformations stream data through their steps: |
|
| 3. |
Why Can’t I Duplicate Fieldnames In A Single Row? |
|
Answer» You can’t. PDI will complain in most cases if you have duplicate fieldnames. Before PDI v2.5.0 you were able to force duplicate fields, but even then only the first value of the duplicate fields could ever be used. |
|
| 4. |
How Do You Duplicate A Field In A Row In A Transformation? |
|
Answer» Several solutions exist: (1) Use a “Select Values” step, renaming a field while also selecting the original one. The original field is then duplicated under another name; for example, fieldA can be duplicated to fieldB and fieldC. (2) Use a Calculator step with, e.g., the NVL(A,B) operation. This has the same effect as the first solution: 3 fields in the output which are copies of each other: fieldA, fieldB, and fieldC. (3) Use a JavaScript step to copy the field. This has the same effect as the previous solutions. |
|
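The effect that all three solutions share can be sketched outside PDI. The snippet below is plain Python, not the API of the Select Values, Calculator, or JavaScript steps; the helper name `duplicate_field` is made up for illustration. It copies one field of a row under new names and refuses to create a duplicate fieldname.

```python
def duplicate_field(row, source, targets):
    """Return a copy of row with the source field copied under each target name."""
    result = dict(row)
    for target in targets:
        if target in result:
            # Mirrors the rule that a row cannot hold two fields with one name.
            raise ValueError(f"field {target!r} already exists in the row")
        result[target] = result[source]
    return result


if __name__ == "__main__":
    print(duplicate_field({"fieldA": 42}, "fieldA", ["fieldB", "fieldC"]))
```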
| 5. |
Define Multi-dimensional Cube? |
|
Answer» It is a cube for viewing data, in which we can slice and dice the data. It has a time dimension, locations, and figures. |
|
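To make “slice and dice” concrete, here is a minimal sketch of a two-dimensional cube held in a Python dictionary. The dimension values and the figures are made-up sample data, and the function names are assumptions chosen for this illustration; a real OLAP engine stores and queries cubes very differently.

```python
# (year, location) -> sales figure; all values are made-up sample data.
cube = {
    (2023, "Berlin"): 120,
    (2023, "Paris"): 95,
    (2024, "Berlin"): 140,
    (2024, "Paris"): 110,
}


def slice_by_year(cube, year):
    """Slice: fix the time dimension and keep the location dimension."""
    return {location: value for (y, location), value in cube.items() if y == year}


def dice(cube, years, locations):
    """Dice: keep the sub-cube restricted on both dimensions at once."""
    return {
        key: value
        for key, value in cube.items()
        if key[0] in years and key[1] in locations
    }


if __name__ == "__main__":
    print(slice_by_year(cube, 2024))
    print(dice(cube, {2023, 2024}, {"Paris"}))
```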
| 6. |
Define MDX? |
|
Answer» MDX (Multidimensional Expressions) is the main query language implemented by Mondrian. |
|
| 7. |
What Are Various Tools In ETL? |
|
Answer» Ab Initio, DataStage, Informatica, Cognos Decision Stream, etc. |
|
| 8. |
What Is XML? |
|
Answer» XML (Extensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. |
|
| 9. |
Differentiate Between ETL Tool And OLAP Tool? |
|
Answer» An ETL tool is used for extracting data from legacy systems and loading it into a specified database, with some processing to cleanse the data. An OLAP tool is used for reporting. Here the data is available in a multidimensional model, so we can write simple queries to extract data from the database. |
|
| 10. |
What Do You Understand By Three Tier Data Warehouse? |
|
Answer» A data warehouse is said to be a three-tier system when a middle system provides usable data in a secure way to end users. On either side of this middle system are the end users and the back-end data stores. |
|
| 11. |
Define Mapplet? |
|
Answer» A mapplet creates and configures a set of transformations. |
|
| 12. |
Explain Session? |
|
Answer» It is a set of instructions that tells when and how to move data from the respective source to the target. |
|
| 13. |
Define Mapping? |
|
Answer» The data flow from source to target is called a mapping. |
|
| 15. |
What Is Data Staging? |
|
Answer» Data staging is a group of procedures used to prepare source system data for loading into a data warehouse. |
|
| 16. |
What Are Snapshots? |
|
Answer» Snapshots are read-only copies of a master table located on a remote node, which can be periodically refreshed to reflect changes made to the master table. |
|
| 17. |
What Is ETL Process? Write The Steps Also? |
|
Answer» ETL is the extraction, transformation, and loading process. The steps are: 1 – define the source |
|
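The three stages that give ETL its name can be sketched end to end in a few lines. This is only an illustration under assumed names: the CSV columns `name` and `amount`, the target table `sales`, and the cleansing rules are invented for the example and say nothing about how a particular ETL tool works internally.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Extract: read rows from the source (CSV text here) as dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: trim and title-case names, drop rows without one, cast amounts."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue
        cleaned.append((name.title(), float(row["amount"])))
    return cleaned


def load(records, connection):
    """Load: write the cleaned records into the target table."""
    connection.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", records)
    connection.commit()


if __name__ == "__main__":
    source = "name,amount\n alice ,10.5\n,3\nbob,2\n"
    target = sqlite3.connect(":memory:")
    load(transform(extract(source)), target)
    print(target.execute("SELECT name, amount FROM sales").fetchall())
```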
| 18. |
Explain Why We Need Etl Tool? |
|
Answer» An ETL tool is used to get data from many source systems, such as RDBMS, SAP, etc., and convert it based on the user requirement. It is required when data flows across many systems. |
|
| 19. |
What Do You Mean By Repository? |
|
Answer» A repository is a storage location where we can store data safely, without any harm. |
|
| 20. |
Explain Encrypting File System? |
|
Answer» It is the technology that enables files to be transparently encrypted to secure personal data from attackers with physical access to the computer. |
|
| 21. |
What Are The Steps To Decrypt A Folder Or File? |
| Answer» | |
| 22. |
What Do You Understand By Hierarchical Navigation? |
|
Answer» A hierarchical navigation menu allows the user to go directly to a section of the site several levels below the top. |
|
| 23. |
What Do You Understand By The Term ETL? |
|
Answer» ETL stands for extract, transform, load. It is an entry-level tool for data manipulation. |
|
| 24. |
Brief About Pentaho Report Designer? |
|
Answer» It is a visual, banded report writer. It has various features such as subreports, charts, and graphs. |
|
| 25. |
Define Pentaho Data Mining? |
|
Answer» Pentaho Data Mining uses the Waikato Environment for Knowledge Analysis (Weka) to search data for patterns. It has functions for data processing, regression analysis, classification methods, etc. |
|
| 26. |
Define Pentaho Schema Workbench? |
|
Answer» Pentaho Schema Workbench offers a graphical interface for designing OLAP cubes for Pentaho Analysis. |
|
| 27. |
What Is The Use Of Pentaho Reporting? |
|
Answer» Pentaho Reporting allows organizations to easily access, format, and deliver information to employees, customers, and partners. |
|
| 28. |
What Do You Understand By The Term Pentaho Dashboard? |
|
Answer» Pentaho Dashboards give business users the critical information they need to understand and improve organizational performance. |
|
| 29. |
What Are The Applications Of Pentaho? |
|
Answer» 1. The Pentaho suite 2. All built on the Java platform |
|
| 30. |
Differentiate Between Arguments And Variables? |
|
Answer» Arguments are command-line arguments that we would normally specify during batch processing. Variables are environment or PDI variables that we would normally set in a previous transformation in a job. |
|
| 31. |
What Are The Benefits Of Pentaho? |
|
Answer» 1. Open Source |
|
| 32. |
Why Can’t We Duplicate Fieldnames In A Single Row? |
|
Answer» We can’t. PDI will complain in most cases if we have duplicate fieldnames. Before PDI v2.5.0 we were able to force duplicate fields, but even then only the first value of the duplicate fields could ever be used. |
|
| 33. |
By Default All Steps In A Transformation Run In Parallel. How Can We Make It So That 1 Row Gets Processed Completely Until The End Before The Next Row Is Processed? |
|
Answer» This is not possible, because in PDI transformations all the steps run in parallel, so we can’t sequentialize them. Doing so would require architectural changes to PDI, and sequential processing would also be very slow. |
|
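The idea of steps running in parallel while rows stream between them can be illustrated with threads and bounded queues. This is a conceptual sketch only, not PDI code: the names `run_step` and `run_pipeline`, the buffer size, and the sentinel object are assumptions made for the example, and error handling is left out.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of the row stream


def run_step(transform, inbox, outbox):
    """One step: read rows from inbox, transform them, pass them on."""
    while True:
        row = inbox.get()
        if row is DONE:
            outbox.put(DONE)  # tell the next step that no more rows follow
            return
        outbox.put(transform(row))


def run_pipeline(rows, transforms, buffer_size=2):
    """Run each transform in its own thread, linked by bounded row buffers."""
    buffers = [queue.Queue(maxsize=buffer_size) for _ in range(len(transforms) + 1)]

    def feed():
        for row in rows:
            buffers[0].put(row)
        buffers[0].put(DONE)

    workers = [threading.Thread(target=feed)]
    for index, transform in enumerate(transforms):
        workers.append(
            threading.Thread(
                target=run_step,
                args=(transform, buffers[index], buffers[index + 1]),
            )
        )
    for worker in workers:
        worker.start()

    # Collect rows from the last buffer while the steps are still running.
    results = []
    while True:
        row = buffers[-1].get()
        if row is DONE:
            break
        results.append(row)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2]))
```

Because every step starts working as soon as its first row arrives, no step waits for an earlier row to reach the end of the pipeline, which is why a strict row-by-row ordering across steps cannot be imposed.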
| 34. |
How Do You Insert Booleans Into A MySQL Database? PDI Encodes A Boolean As ‘Y’ Or ‘N’ And This Can’t Be Inserted Into A BIT(1) Column In MySQL. |
|
Answer» BIT is not a standard SQL data type. It’s not even standard on MySQL, as the meaning (core definition) changed from MySQL version 4 to 5. Also, a BIT uses 2 bytes on MySQL. That’s why in PDI we made the safe choice and went for a char(1) to store a boolean. There is a simple workaround available: change the data type with a Select Values step to “Integer” in the metadata tab. This converts it to 1 for “true” and 0 for “false”, just like MySQL expects. |
|
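The mapping that the workaround relies on is small enough to write out. The function below is a plain Python illustration of the Y/N to 1/0 conversion, not the implementation of the Select Values step; the name `boolean_to_integer` and the handling of missing values are assumptions.

```python
def boolean_to_integer(value):
    """Map 'Y'/'N' strings (or Python booleans) to 1/0; None stays None."""
    if value is None:
        return None
    if isinstance(value, bool):
        return int(value)
    text = str(value).strip().upper()
    if text == "Y":
        return 1
    if text == "N":
        return 0
    raise ValueError(f"not a boolean flag: {value!r}")


if __name__ == "__main__":
    print([boolean_to_integer(flag) for flag in ["Y", "N", True, None]])
```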
| 35. |
How Can We Use Database Connections From The Repository? |
|
Answer» We can create a new transformation, or close and re-open the ones we have loaded in Spoon. |
|
| 36. |
How To Sequentialize Transformations? |
|
Answer» It is not possible, because in PDI transformations all of the steps run in parallel, so we can’t sequentialize them. |
|
| 37. |
How To Do A Database Join With Pdi? |
| Answer» | |
| 38. |
Differentiate Between Transformations And Jobs? |
| Answer» | |
| 39. |
What Kind Of Data Does A Cube Contain? |
|
Answer» The cube will contain the following data: |
|
| 40. |
Define Tuple? |
|
Answer» A finite ordered list of elements is called a tuple. |
|
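Python’s built-in tuple type matches this definition, so a short example may help. The values are arbitrary placeholders, chosen to resemble one member from each of three dimensions.

```python
# A tuple is finite and ordered; the values here are arbitrary placeholders.
cell = ("2024", "Europe", "Sales")

# Position carries meaning, so elements can be unpacked by order.
year, region, measure = cell

# Two tuples with the same elements in a different order are different tuples.
reordered = ("Europe", "2024", "Sales")

if __name__ == "__main__":
    print(len(cell), year, region, measure)
    print(cell == reordered)
```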
| 41. |
Explain MDX? |
|
Answer» Multidimensional Expressions (MDX) is a query language for OLAP databases, much like SQL is a query language for relational databases. It is also a calculation language, with syntax similar to spreadsheet formulas. |
|
| 42. |
What Is The Pentaho Reporting Evaluation? |
|
Answer» Pentaho Reporting Evaluation is a particular package of a subset of the Pentaho Reporting capabilities, designed for typical first-phase evaluation activities such as accessing sample data, creating and editing reports, and viewing and interacting with reports. |
|
| 43. |
How Does Pentaho Metadata Work? |
|
Answer» With the help of Pentaho’s open source metadata capabilities, administrators can define a layer of abstraction that presents database information to business users in familiar business terms. |
|
| 44. |
What Do You Understand By Pentaho Metadata? |
|
Answer» Pentaho Metadata is a piece of the Pentaho BI Platform designed to make it easier for users to access information in business terms. |
|
| 46. |
Which Platform Benefits From The Pentaho BI Project? |
Answer»
|
|
| 47. |
What Major Applications Does The Pentaho BI Project Comprise? |
|
Answer» The Pentaho BI Project encompasses the following major application areas: |
|
| 48. |
Define Pentaho BI Project? |
|
Answer» The Pentaho BI Project is an ongoing effort by the Open Source community to provide organizations with best-in-class solutions for their enterprise Business Intelligence (BI) needs. |
|
| 49. |
Mention Major Features Of Pentaho? |
|
Answer» Direct Analytics on MongoDB: It empowers business analysts and IT to access, analyze, and visualize MongoDB data. Data Science Pack: Pentaho’s Data Science Pack operationalizes analytical modeling and machine learning while allowing data scientists and developers to offload the labor of data preparation to Pentaho Data Integration. Full YARN Support for Hadoop: Pentaho’s YARN integration enables organizations to exploit the full computing power of Hadoop while leveraging existing skillsets and technology investments. |
|
| 50. |
Explain Pentaho? |
|
Answer» Pentaho addresses the barriers that block an organization’s ability to get value from all of its data. It is designed to ensure that each member of a team, from developers to business users, can easily convert data into value. |
|