
1.

What do we need to do to create temporary tables?

Answer»

To create a temporary table, specify the TEMPORARY keyword (or the TEMP abbreviation) in the CREATE TABLE DDL statement. For example:

CREATE TEMPORARY TABLE mytable (id NUMBER, creation_date DATE);

2.

What is the best way to remove a string that is an anagram of an earlier string from an array?

Answer»

Suppose we are given an array of strings arr. The task is to remove every string that is an anagram of an earlier string, then print the remaining array in sorted order.

  • Example:
    Input: arr[] = { "Scaler", "Lacers", "Accdemy", "Academy" }, N = 4

    Output: ["Academy", "Accdemy", "Scaler"]

    Explanation: "Scaler" and "Lacers" are anagrams of each other (ignoring case), so we remove "Lacers". "Accdemy" and "Academy" are not anagrams, so both remain.
     
  • Code Implementation
import java.util.*;

class InterviewBit {
    // Function to remove every string that is an anagram of an earlier string
    static void removeAnagrams(String arr[], int N) {
        // Vector to store the final result
        Vector<String> ans = new Vector<>();
        // Set of canonical forms of previously encountered strings
        HashSet<String> found = new HashSet<>();
        for (int i = 0; i < N; i++) {
            // Canonical form: lower-case the string, then sort its characters,
            // so that anagrams map to the same key regardless of letter case
            String key = sort(arr[i].toLowerCase());
            // Keep the string only if no earlier anagram of it was seen
            if (!found.contains(key)) {
                ans.add(arr[i]);
                found.add(key);
            }
        }
        // Sort the surviving strings and print them
        Collections.sort(ans);
        for (int i = 0; i < ans.size(); ++i) {
            System.out.print(ans.get(i) + " ");
        }
    }

    // Returns the characters of inputString in sorted order
    static String sort(String inputString) {
        char tempArray[] = inputString.toCharArray();
        Arrays.sort(tempArray);
        return new String(tempArray);
    }

    // Driver code
    public static void main(String[] args) {
        String arr[] = { "Scaler", "Lacers", "Accdemy", "Academy" };
        int N = 4;
        removeAnagrams(arr, N);
    }
}
  • Output:
Academy Accdemy Scaler
3.

Explain what is meant by data shares in Snowflake.

Answer»

Data sharing via Snowflake allows organizations to share data quickly and securely between Snowflake accounts. Database objects shared between Snowflake accounts are read-only: they cannot be changed or modified. The three types of sharing are as follows (a short sketch of the sharing workflow follows the list):

  • Data sharing between management units
  • Data sharing between functional units
  • Data sharing between geographically dispersed areas
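
A minimal sketch of the provider/consumer workflow, assuming hypothetical database, schema, table, and account names (sales_db, public, orders, partner_account, provider_account):

-- Provider side: create a share and grant read-only access to objects
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;
-- Add the consumer account (placeholder identifier) to the share
ALTER SHARE sales_share ADD ACCOUNTS = partner_account;

-- Consumer side: create a read-only database from the share
CREATE DATABASE shared_sales FROM SHARE provider_account.sales_share;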
4.

What do you mean by zero-copy cloning in Snowflake?

Answer»

Zero-copy cloning is one of Snowflake's great features. It allows you to duplicate a source object without making a physical copy of it or incurring additional storage costs. When a clone (cloned object) is created, a snapshot of the data in the source object is taken and made available to the cloned object. Cloned objects are independent of the source object and are therefore writable; changes made to either object are not reflected in the other. The CLONE keyword allows you to copy tables, schemas, and databases without actually copying any data.

Zero-copy cloning syntax in Snowflake:

  • To clone an entire production database for development purposes:
CREATE DATABASE Dev CLONE Prod;
  • To clone a schema:
CREATE SCHEMA Dev.DataSchema1 CLONE Prod.DataSchema1;
  • To clone a single table:
CREATE TABLE C CLONE Dev.public.C;
5.

What are the different Snowflake editions?

Answer»

Snowflake offers multiple editions to meet your organization's specific needs. Each successive edition either introduces new features or provides a higher level of service. It's easy to switch editions as your organization's needs change.

The following are some of the Snowflake Editions: 

  • Standard Edition: This is Snowflake's entry-level offering, which allows full access to all of Snowflake's standard features. This edition strikes a good balance between features, level of support, and price.
  • Enterprise Edition: With Enterprise Edition, you get all the features and services of Standard Edition, but with some extra features designed to meet the needs of large-scale enterprises.
  • Business-critical Edition: Previously known as Enterprise for Sensitive Data (ESD), Business Critical Edition offers even more advanced levels of data security for organizations that deal with highly sensitive data. This edition includes all the features and services of the Enterprise Edition, plus enhanced security and data protection.
  • Virtual Private Snowflake (VPS): With this edition, organizations with strict security requirements, like financial institutions and companies that collect, analyze, and share highly sensitive data, can get the highest level of security.
6.

Explain Snowflake caching and list its types.

Answer»

Consider an example where a query takes 15 minutes to run. If you were to repeat the same query on the same, frequently used data later on, you would be redoing the same work and wasting resources.
Instead, Snowflake caches (stores) the results of every query you run, so whenever a new query is submitted, it checks whether a matching query result already exists and, if it does, uses the cached result rather than running the query again. Because Snowflake can fetch results directly from the cache, query times are greatly reduced.

Types of Caching in Snowflake 

  • Query Results Caching: It stores the results of all queries executed in the past 24 hours.
  • Local Disk Caching: It stores data used or required for performing SQL queries; it is often referred to as the warehouse cache, since it lives on the virtual warehouse's local disk.
  • Remote Disk Cache: It holds results for long-term use.

(Diagram: the levels at which Snowflake caches data and results for subsequent use.)
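
To see the result cache at work, you can toggle the USE_CACHED_RESULT session parameter; a minimal sketch, assuming a hypothetical table name:

-- Disable the query result cache for this session
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
SELECT COUNT(*) FROM sales_db.public.orders;  -- executes on the warehouse

-- Re-enable it; an identical query within 24 hours can now be
-- answered from the result cache without using the warehouse
ALTER SESSION SET USE_CACHED_RESULT = TRUE;
SELECT COUNT(*) FROM sales_db.public.orders;  -- first run: warehouse
SELECT COUNT(*) FROM sales_db.public.orders;  -- repeat: served from cache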

7.

Explain how data compression works in Snowflake and list its advantages.

Answer»

Data compression involves encoding, restructuring, or otherwise modifying data to reduce its size. As soon as data is loaded into Snowflake, it is systematically compressed. Snowflake compresses and stores the data using modern data compression algorithms. What makes Snowflake so great is that it charges customers by the size of their data after compression, not by the raw data size; a quick way to observe this is sketched after the list below.

Snowflake Compression has the following advantages: 

  • Compression lowers storage costs compared with storing raw, uncompressed data.
  • On-disk caches do not incur storage costs.
  • In general, data sharing and cloning involve no additional storage expenses.
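
For illustration, the BYTES column returned by SHOW TABLES reports the compressed storage size of a table rather than its raw size (the table and schema names here are hypothetical):

-- BYTES reflects the compressed, on-disk size of the table
SHOW TABLES LIKE 'ORDERS' IN SCHEMA sales_db.public;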
8.

Can AWS Glue connect to Snowflake?

Answer»

Yes, you can connect Snowflake to AWS Glue. AWS Glue fits seamlessly with Snowflake as a data warehouse service and provides a fully managed environment. Combining these two solutions makes data ingestion and transformation easier and more flexible.

9.

Can you explain how Snowflake differs from AWS (Amazon Web Service)?

Answer»

Cloud-based data warehouse platforms like Snowflake and Amazon Redshift provide excellent performance, scalability, and business intelligence tools. In terms of core functionality, both platforms provide similar capabilities, such as relational database management, security, scalability, cost efficiency, etc. There are, however, several differences between them, such as pricing, user experience, and deployment options.

  • There is no maintenance required with Snowflake as it is a complete SaaS (Software as a Service) offering. In contrast, AWS Redshift clusters require manual maintenance.
  • The Snowflake security model uses always-on encryption to enforce strict security checks, while Redshift uses a flexible, customizable approach.
  • Storage and computation in Snowflake are completely independent, meaning the storage costs are approximately the same as those of S3. In contrast, AWS addresses this with Redshift Spectrum, which lets you query data directly in S3; even so, it is not as seamless as Snowflake.
10.

Explain what fail-safe is.

Answer»

As a fail-safe feature, Snowflake offers a default 7-day period during which historical data may be recoverable. The fail-safe period begins immediately after the Time Travel data retention period ends. Data recovery through fail-safe is performed on a best-effort basis, and only after all other recovery options have been exhausted. Snowflake may use it to recover data that has been lost or damaged due to extreme operational failures. Fail-safe data recovery may take several hours to several days to complete.

11.

What is the Data Retention Period in Snowflake?

Answer»

The data retention period is a critical component of Snowflake Time Travel. When data in a table is modified, such as when data is deleted or objects containing data are removed, Snowflake preserves the state of that data as it was before the update. The data retention period specifies how many days this historical data is preserved, enabling Time Travel operations (SELECT ... AT, CREATE ... CLONE, UNDROP, etc.) to be performed on it.

All Snowflake accounts have a default retention period of 1 day (24 hours). On Standard Edition, the retention period can be set to 0 or 1 day, while on Enterprise Edition and higher, it can be set anywhere from 0 to 90 days.
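
The retention period can be tuned per object through the DATA_RETENTION_TIME_IN_DAYS parameter; a minimal sketch, assuming a hypothetical table and an Enterprise (or higher) edition for the 90-day value:

-- Extend retention for one table (90 days requires Enterprise Edition or higher)
ALTER TABLE sales_db.public.orders SET DATA_RETENTION_TIME_IN_DAYS = 90;
-- Verify the current setting
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE sales_db.public.orders;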

12.

Explain Snowflake Time Travel and the Data Retention Period.

Answer»

Time Travel is a Snowflake feature that gives you access to historical data in the Snowflake data warehouse. For example, suppose you accidentally delete a table named Employee. Using Time Travel, you can go back five minutes in time and retrieve the data you lost. Data that has been altered or deleted can be accessed via Snowflake Time Travel at any point within a defined period. Within that period, it can perform the following tasks:

  • Analyzing data manipulation and usage over a specified period of time.
  • Restoring data-related objects (tables, schemas, and databases) that have been accidentally lost (dropped).
  • Backing up and duplicating data (creating clones) at or before specific points in the past.

As soon as the defined period of time (the data retention period) expires, the data moves into Snowflake Fail-safe and these tasks can no longer be performed.
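
Time Travel is exposed in SQL through the AT | BEFORE clause and UNDROP; a minimal sketch using the Employee example (the offset and query ID are placeholders):

-- Query the table as it looked 5 minutes (300 seconds) ago
SELECT * FROM Employee AT(OFFSET => -300);
-- Query the table as it looked just before a given statement ran
SELECT * FROM Employee BEFORE(STATEMENT => '<query_id>');
-- Restore the accidentally dropped table
UNDROP TABLE Employee;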

13.

State the difference between Star Schema and Snowflake Schema.

Answer»

Schemas like Star and Snowflake serve as a logical description of the entire database, that is, of how data is organized in the database.

  • Star Schema: A star schema typically consists of one fact table and several associated dimension tables. The star schema is so named because the structure has the appearance of a star. The dimensions of a star schema are denormalized; denormalization means the same values are repeated within a table.
  • Snowflake Schema: In a snowflake schema, the center holds a fact table, which is associated with a number of dimension tables. Those dimension tables are in turn associated with other dimension tables. Snowflake schemas provide fully normalized data structures: separate dimension tables are used for the various levels of a hierarchy (city > country > region). See the DDL sketch after this list.
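
A minimal DDL sketch of the contrast, using hypothetical product and category tables: the star variant stores category attributes directly in the product dimension, while the snowflake variant normalizes them into their own table.

-- Star schema: denormalized dimension; category_name repeats per product
CREATE TABLE dim_product_star (
    product_id    NUMBER,
    product_name  STRING,
    category_name STRING
);

-- Snowflake schema: the category is split into its own dimension table
CREATE TABLE dim_category (
    category_id   NUMBER,
    category_name STRING
);
CREATE TABLE dim_product_snow (
    product_id   NUMBER,
    product_name STRING,
    category_id  NUMBER  -- many-to-one link to dim_category
);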
14.

Explain Schema in Snowflake.

Answer»

A schema describes how data is organized in Snowflake. Schemas are basically a logical grouping of database objects (such as tables and views). A snowflake schema consists of one fact table linked to many dimension tables, which link to other dimension tables via many-to-one relationships. A fact table (which stores quantitative data for analysis) is surrounded by its associated dimensions, which are in turn related to other dimensions, forming a snowflake pattern. The fact table contains the measurements and facts of a business process and holds keys to the dimension tables, while the attributes of those measurements are stored in the dimension tables. Snowflake offers a complete set of DDL (Data Definition Language) commands for creating and maintaining databases and schemas.
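
For illustration, a few of these schema-related DDL commands, with hypothetical names:

-- Create a database and a schema within it
CREATE DATABASE analytics_db;
CREATE SCHEMA analytics_db.reporting;
-- Rename and then remove the schema
ALTER SCHEMA analytics_db.reporting RENAME TO analytics_db.reports;
DROP SCHEMA analytics_db.reports;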

(Diagram: a snowflake schema with one fact table and two dimension tables, each with three levels.) Snowflake schemas can have an unlimited number of dimensions, and each dimension can have any number of levels.

15.

How is data stored in Snowflake? Explain Columnar Database.

Answer»

After data is loaded into Snowflake, it is automatically reorganized into a compressed, optimized, columnar format (micro-partitions) and stored in cloud storage. Snowflake manages all aspects of storing this data, including file structure, size, statistics, compression, and metadata. These storage objects are not visible to customers or users; users can only access the data by running SQL queries in Snowflake. Within the storage layer, Snowflake uses a columnar format to optimize and store data: values are stored in columns instead of rows, which suits analytical querying and improves database performance. Columnar databases make business intelligence easier and more accurate, since column-level operations are faster and use fewer resources than row-level operations.

Consider, for example, a table with 24 rows divided into four micro-partitions, arranged and sorted by column. Because the data is divided into micro-partitions, Snowflake can first prune the micro-partitions that are not relevant to the query, and then prune by column within the remaining micro-partitions. The result is fewer records traversed and significantly faster response times.
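
As a hedged illustration, pruning happens automatically for selective filters, and clustering metadata can be inspected with a system function (the fact_sales table and sale_date column are placeholders):

-- Micro-partitions whose min/max metadata for sale_date falls outside
-- this range are skipped entirely before any data is scanned
SELECT SUM(amount)
FROM fact_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31';

-- Inspect how well the table is clustered on that column
SELECT SYSTEM$CLUSTERING_INFORMATION('fact_sales', '(sale_date)');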