17+ Teradata Interview Questions for Freshers (InterviewSolution)

1.	Explain channel driver.
Answer» As the name implies, a channel driver acts as a means of communication between PEs (Parsing Engines) and the applications that run on channel-attached clients. The Teradata Gateway plays much the same role for network-attached clients, acting as a conduit between the Parsing Engine and applications connected over the network.

2.	What do you mean by caching in Teradata?
Answer» In simplest terms, caching means storing frequently used data in cache memory so that the next time the data is needed, it can be retrieved directly from memory instead of being generated again by the application. In Teradata, cached data stays in the same order and does not change often, and caches are typically shared among several applications.

3.	What are the benefits of using ETL tools over Teradata?
Answer» Extract, Transform, Load (ETL) covers three distinct tasks in managing data. ETL tools offer some advantages over Teradata, including:

- Support for multiple heterogeneous data sources and destinations.
- A full-featured GUI that simplifies debugging.
- Reusable components: if the main server is updated, all applications connected to it are updated automatically.
- Pivoting (transforming rows into columns) and de-pivoting (transforming columns into rows).

4.	Is Teradata an ETL tool or a database?
Answer» Teradata is not an ETL (Extract, Transform and Load) tool. It is a proprietary RDBMS (relational database management system) that has run on different operating systems, including Windows, UNIX, and Linux. It can handle data volumes in the terabytes and is designed for large-scale data warehouse applications.

5.	Mention a few ETL (Extract, Transform and Load) tools that are commonly used with Teradata.
Answer» Several ETL (Extract, Transform, and Load) tools are commonly used with Teradata:

- DataStage
- Informatica
- SSIS (SQL Server Integration Services)

6.	Explain nodes in Teradata.
Answer» A Teradata system consists of nodes, which are its basic units. Each node is an individual server with its own operating system, CPU, memory, copy of the Teradata RDBMS software, and disk space.

7.	What do you mean by Spool space in Teradata? Write its usage.
Answer» Spool space is the unused space in the system in which intermediate results from a SQL query are stored. Users without spool space cannot execute queries. A user's spool limit is divided evenly across the AMPs, so each AMP can use only a fraction of the total. If the per-AMP limit is exceeded on any AMP, the query fails with an out-of-spool-space error.

Example: Suppose a user is assigned 200,000,000 bytes of spool space. This is the maximum the user is allowed to use, and it is distributed evenly across all AMPs. On a system with 4 AMPs, for instance, each AMP could hold at most 50,000,000 bytes of that user's spool.
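The even split described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the 4-AMP system are assumptions made for the example, not anything Teradata exposes.

```python
def per_amp_spool_limit(total_spool_bytes, n_amps):
    """Spool limit each AMP gets when the user's total is split evenly."""
    return total_spool_bytes // n_amps


def amps_over_limit(spool_used_per_amp, total_spool_bytes):
    """Return the indexes of AMPs whose spool usage exceeds their even share."""
    limit = per_amp_spool_limit(total_spool_bytes, len(spool_used_per_amp))
    return [i for i, used in enumerate(spool_used_per_amp) if used > limit]


# Hypothetical 4-AMP system and the 200,000,000-byte limit from the example.
print(per_amp_spool_limit(200_000_000, 4))
# Usage is only 80,000,000 bytes in total, yet one AMP is over its share.
print(amps_over_limit([10_000_000, 60_000_000, 5_000_000, 5_000_000], 200_000_000))
```

The second call shows why a query with unevenly distributed intermediate results can run out of spool even though its total usage is well below the user's limit.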

8.	What is Skewness in Teradata?
Answer» In Teradata, "skewness" describes how evenly a table's rows are distributed across the AMPs (Access Module Processors), and the skew factor is the measure of that distribution. A skew factor of 0 indicates that the data is evenly distributed among the AMPs. With highly skewed data, some AMPs hold many rows and others hold very few, so the distribution is uneven. The skew factor is high in this case, which hurts performance and undermines Teradata's parallelism.

Choosing the right index controls the skewness of the data distribution. Ideally, you should choose a Primary Index that contains as many unique values as possible so as to avoid skewness.
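As a rough illustration of why unique Primary Index values reduce skew, the following Python sketch places rows on AMPs and computes a skew factor. Two things here are assumptions: the modulo placement is a toy stand-in for Teradata's row hashing, and the formula (1 - average / maximum) x 100 is one commonly used convention rather than something stated in this answer.

```python
def distribute(primary_index_values, n_amps):
    """Count rows per AMP using a toy placement rule (value modulo AMP count)."""
    counts = [0] * n_amps
    for value in primary_index_values:
        counts[value % n_amps] += 1
    return counts


def skew_factor(rows_per_amp):
    """Skew as a percentage: 0 means perfectly even, higher means more skewed."""
    largest = max(rows_per_amp)
    if largest == 0:
        return 0.0  # empty table: nothing to skew
    average = sum(rows_per_amp) / len(rows_per_amp)
    return (1 - average / largest) * 100


# 1000 unique index values spread evenly over 4 hypothetical AMPs.
print(skew_factor(distribute(range(1000), 4)))
# 1000 rows sharing one index value all land on the same AMP.
print(skew_factor(distribute([7] * 1000, 4)))
```

In the second case a single AMP does all the work for that table while the other three sit idle, which is the loss of parallelism described above.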

9.	What is performance tuning and why is it important?
Answer» Teradata performance tuning involves identifying the bottlenecks in the database and resolving them. A bottleneck does not cause errors, but it does cause delays in retrieving data from the database. A company that does not tune its database could suffer from slow responses to queries, causing unnecessary difficulty in accessing its data.

The reasons for consistent performance tuning are as follows:

- Increasing the speed of data retrieval: A database that isn't optimized retrieves data more slowly when there is a lot of data. Performance tuning allows you to build indexes and fix problems that could delay retrieval.
- Avoiding coding loops: Coding loops (repeating a block of code until a certain condition is met) can increase the load on your database. A SQL query inside a loop is executed many times; if you move it out of the loop, performance can improve because it is executed once.
- Optimizing SQL query performance: To improve query performance, it is best to avoid correlated subqueries, avoid overusing SELECT * (and instead name each column you need), and avoid temporary tables where possible.

The purpose of performance tuning is generally to reduce latency, that is, the response time for the end user.
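The point about coding loops can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database, purely so the example is runnable; the orders table and the function names are hypothetical, and nothing here is Teradata-specific.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a small, hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, i % 5, i * 10) for i in range(1, 101)],
    )
    return conn


def totals_with_loop(conn, customer_ids):
    """Slow pattern: one query per customer, executed inside a loop."""
    totals = {}
    for cid in customer_ids:
        row = conn.execute(
            "SELECT SUM(amount) FROM orders WHERE customer_id = ?", (cid,)
        ).fetchone()
        totals[cid] = row[0]
    return totals


def totals_set_based(conn):
    """Preferred pattern: a single set-based query that returns every total."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    )
    return dict(rows.fetchall())


conn = build_db()
# Both approaches give the same answer; the second sends one query instead of five.
print(totals_with_loop(conn, range(5)) == totals_set_based(conn))
```

Both functions name the columns they need instead of using SELECT *, in line with the last bullet above.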

10.	What is the process of restarting MLOAD Teradata Server after execution?
Answer» In general, after the Teradata server restarts, the MultiLoad job resumes from the last known checkpoint rather than starting the script over from the beginning.

11.	What is the process of restarting MLOAD Client System after its failure?
Answer» A Teradata MultiLoad job that failed or was aborted because of a client system failure can be restarted. How you restart it depends on whether it stopped during the application phase, in which all DML (Data Manipulation Language) operations are applied.

- If the job stopped before or after the application phase, restart it as it is, without making any changes to the script. Teradata MultiLoad uses the entries in the restart log table to determine where it stopped and resumes processing from that point.
- If the job was aborted or the client system failed during the application phase, resolve the problem that caused the failure and then restart the job.
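The idea of resuming from a restart log can be sketched as a toy loader in Python. This is not how MultiLoad is implemented: the dictionary standing in for the restart log table, the run_load function, and the per-record checkpoint are all assumptions made to keep the example short.

```python
def run_load(records, target, restart_log, fail_at=None):
    """Load records into target, resuming from the position saved in restart_log.

    fail_at simulates a client failure just before the record at that index.
    """
    start = restart_log.get("next_record", 0)
    for i in range(start, len(records)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated client failure")
        target.append(records[i])
        # Checkpoint: remember where to resume if the job is restarted.
        restart_log["next_record"] = i + 1


records = ["a", "b", "c", "d", "e"]
target, restart_log = [], {}

try:
    run_load(records, target, restart_log, fail_at=3)
except RuntimeError:
    pass  # the client "crashed" after loading three records

# Restart the same job unchanged: it continues from the logged position.
run_load(records, target, restart_log)
print(target)
```

Because the restart position lives in the log rather than in the script, the job can be resubmitted without edits and no record is loaded twice.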

12.	Why does MultiLoad support NUSI (Non-Unique Secondary Index) but not USI (Unique Secondary Index)?
Answer» Teradata allows all AMPs (Access Module Processors) to operate independently. With a USI, the index subtable row can reside on a different AMP from the data row it points to, so maintaining the index would require communication between AMPs. With a NUSI, the index subtable row is stored on the same AMP as the data row, so each AMP can do its work independently. This is why MultiLoad supports NUSI but not USI.

13.	Explain different string manipulation operators and functions associated with Teradata.
Answer» Teradata's string functions are used to manipulate strings and are compatible with the ANSI standard. Teradata supports the standard string functions as well as its own extensions to them.

SUBSTRING: Extracts a selected portion of a longer string (ANSI standard).

Example: Consider the string 'InterviewBit'.

SELECT SUBSTRING('InterviewBit' FROM 1 FOR 5);

Output: Inter

POSITION: Locates a character within a string (ANSI standard).

Example:

SELECT POSITION('r' IN 'InterviewBit');

Output: 5

TRIM: Removes (trims) blank space from a specified string.

Example:

SELECT TRIM(' InterviewBit ');

Output: InterviewBit

UPPER: Converts the string to uppercase.

Example:

SELECT UPPER('InterviewBit');

Output: INTERVIEWBIT

LOWER: Converts the string to lowercase.

Example:

SELECT LOWER('INTERVIEWBIT');

Output: interviewbit

14.	What are Teradata utilities? List the different types of Teradata utilities.
Answer» Teradata utilities are used to load data into Teradata databases and to export data from Teradata databases to client applications. There are many Teradata utilities available, including:

- Basic Teradata Query (BTEQ): BTEQ can be used in batch mode (executing a collection of statements as a script) or interactive mode (executing statements one by one). You can use it to execute any DDL (Data Definition Language) or DML statement, macro, or stored procedure. With BTEQ, you can import data into Teradata tables as well as export data to files and reports.
- FastLoad: The FastLoad utility loads data into tables quickly. It does not load duplicate rows, even if the target table is a Multiset table (a table type that can store duplicate records).
- MultiLoad: MultiLoad can load data into multiple tables simultaneously and can perform different types of tasks such as INSERT, UPDATE, DELETE, and UPSERT. It is best suited to operations such as bulk update, delete, and upsert, as well as complex interface manipulations.
- FastExport: This utility exports Teradata data into flat files. Alternatively, it can export the data as reports. Using joins, it can extract data from several tables at one time.
- Teradata Parallel Data Pump (TPump): It helps maintain Teradata databases by updating, deleting, inserting, and upserting data. Multiple changes can be made at the same time.
- Teradata Parallel Transporter (TPT): This is an all-in-one tool to load data into and export data from Teradata databases. Among utilities such as FastLoad, FastExport, MultiLoad, and TPump, Teradata recommends TPT.

15.	Explain Teradata Architecture.
Answer» Teradata's architecture is based on MPP (massively parallel processing). In general, it can be divided into two parts: the storage architecture and the retrieval architecture. In total, the architecture consists of four components: the Parsing Engine, BYNET, AMPs, and disks. The first two components make up the storage architecture, and the last two are retrieval architecture components.

Parsing Engine (PE): The Parsing Engine receives client queries and prepares an efficient execution plan for running them. When a user executes a SQL query, it is first sent to the PE. The PE performs the following functions:
- Verifies whether the query has syntax errors.
- Determines whether the objects used by the SQL query exist.
- Prepares the execution plan for the query and sends it to BYNET.
- Receives the results of the SQL query from the AMPs and sends them to the client.
Access Module Processors (AMPs): An AMP is a virtual processor that is connected to the PE via BYNET. Each AMP has its own disk and reads and writes only that disk, which is why this is called a shared-nothing architecture. After receiving the data and execution plan from the Parsing Engine, AMPs perform any data type conversion, aggregation, filtering, and sorting, and write (store) data to their disks. When a query is run, all AMPs work together to return the data.
BYNET: BYNET functions as the communication channel between PEs and AMPs. It receives the execution plan from the Parsing Engine and passes it on to the AMPs. Teradata has two BYNETs, named BYNET 0 and BYNET 1, but they are referred to as a single system. The reasons for having two BYNETs are:
- The second BYNET can take over if the first BYNET fails.
- Both BYNETs can be used when data volume is large, which improves communication between PEs and AMPs and therefore speeds up processing.
Disks: These are virtual disks that Teradata assigns to each AMP. A Vdisk, or virtual disk, is the storage area of an AMP.
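To make the division of labor between these components concrete, here is a deliberately simplified Python model. Everything in it is an assumption for illustration: the Amp and ToySystem classes are hypothetical, rows are placed with a modulo rule rather than Teradata's row hash, and the AMPs are scanned one after another instead of in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class Amp:
    """A toy AMP that owns its own rows (shared nothing)."""
    rows: list = field(default_factory=list)

    def scan(self, predicate):
        # Each AMP filters only the rows stored on its own disk.
        return [row for row in self.rows if predicate(row)]


class ToySystem:
    """Plays the role of the PE and BYNET: routes rows and merges results."""

    def __init__(self, n_amps):
        self.amps = [Amp() for _ in range(n_amps)]

    def insert(self, row, key):
        # Toy placement rule standing in for the real row hash.
        self.amps[row[key] % len(self.amps)].rows.append(row)

    def select(self, predicate):
        results = []
        for amp in self.amps:  # a real system would run these steps in parallel
            results.extend(amp.scan(predicate))
        return results


system = ToySystem(n_amps=4)
for i in range(8):
    system.insert({"id": i, "name": f"row{i}"}, key="id")

print([len(amp.rows) for amp in system.amps])
print(sorted(row["id"] for row in system.select(lambda row: row["id"] > 5)))
```

The model leaves out parsing, planning, and fault tolerance; it only shows that each AMP works on its own slice of the data and that the partial results are combined before being returned to the client.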

16.	What are the newly developed features of Teradata?
Answer» Teradata offers the following features:

- Unlimited Parallelism: Teradata is based on Massively Parallel Processing (MPP), which allows it to divide large data processing tasks into smaller tasks and run them in parallel.
- Shared-Nothing Architecture: Because the database is built on a shared-nothing architecture, the disks, Teradata nodes, and AMPs (Access Module Processors) are all independent and do not share resources with one another, which optimizes performance for a given task.
- Linear Scalability: Teradata systems are linearly scalable and can handle large volumes of data efficiently.
- Connectivity: Teradata can connect to channel-attached systems, such as mainframes, as well as to network-attached systems.
- Mature Optimizer: Teradata offers a mature optimizer (which chooses the most efficient method for retrieving the data required by a SQL query) that can handle up to 64 joins (combining data from different tables in a database) per SQL query.
- SQL (Structured Query Language): Teradata supports this standard language as a means of interacting with data stored in tables. It also offers its own extensions.
- Robust Utilities: Teradata provides a robust utility suite for importing and exporting data from/to Teradata systems, such as FastLoad, FastExport, MultiLoad, TPT (Teradata Parallel Transporter), and many more.
- Load & Unload Utilities: Teradata features load and unload utilities, which allow users to move data into and out of the Teradata system.
- Automatic Distribution: Teradata automatically distributes data evenly among the disks, requiring no manual intervention.
- Low TCO (Total Cost of Ownership): Due to its ease of setup, administration, and maintenance, it has a low TCO.

17.	What is the importance of using Teradata?
Answer» The following are reasons why Teradata is important:

- The system can handle (store and process) large volumes of data, more than 50 petabytes.
- You can integrate it with various business intelligence (BI) tools.
- It supports OLAP (online analytical processing), enabling users to perform complex analytics on data.
- Teradata offers a comprehensive set of data warehousing services, such as cloud-based and hardware-based data warehousing, business analytics, etc.
- SQL (Structured Query Language) is supported by Teradata as a means of interacting with data stored in tables, and the wide range of queries it allows gives users flexibility.
- Using the Teradata platform, companies can consolidate their core business objectives by organizing their analytical capabilities.
- Teradata is based on Massively Parallel Processing (MPP), which makes it possible to run multiple tasks simultaneously and efficiently, resulting in fast processing speeds.
