1.

What are some common mistakes that people make while writing programs in SAS?

Answer»

The following are some of the most common programming errors in SAS:  

  • If a semicolon is missing from a statement, SAS will misinterpret not only that statement but potentially several that follow.
  • Unclosed or unmatched quotation marks and unclosed comments cause a cascade of errors, because SAS fails to read the subsequent statements correctly.
  • DATA steps and PROC steps have very different functions in SAS, so statements that are valid in one will probably cause errors in the other.
  • Data is not sorted before using a statement that requires sorted input, such as BY.
  • The log is not checked after a program is submitted.
  • A data set option or statement option is invalid.
  • Debugging techniques are not used.
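
As a minimal sketch of the sorting mistake above, the data set name sales and its variables are made up for illustration. Without the PROC SORT step, the BY statement in PROC PRINT would fail with an error in the log saying that the data set is not sorted in ascending sequence.

```sas
/* Hypothetical data set used only for this illustration */
data sales;
  input region $ amount;
  datalines;
West 10
East 20
West 5
East 7
;
run;

/* Sort first: BY-group processing expects input ordered by region */
proc sort data=sales;
  by region;
run;

/* The BY statement now works because the data is sorted */
proc print data=sales;
  by region;
run;
```

After submitting a program like this, reading the log is the quickest way to confirm that each step ran without errors or warnings.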
2.

What are different ways to exclude or include specific variables in a dataset?

Answer»

The DROP and KEEP statements can be used to exclude specific variables from a data set or to include only specific variables in it.

  • DROP statement: This tells SAS which variables to remove from the data set.
  • KEEP statement: This tells SAS which variables to retain in the data set. All other variables are removed.

Example: Consider the following data set: 

data outdata;
  /* section holds letters, so it must be read as character ($) */
  input gender $ section $ score1 score2;
  datalines;
F A 17 20
F B 25 17
F C 12 15
M D 21 25
;
proc print;
run;

The following DROP statement instructs SAS to drop variables score1 and score2. 

data readin;
  set outdata;
  totalsum = sum(score1, score2);
  /* variable names in a DROP statement are separated by blanks, not commas */
  drop score1 score2;
run;

Output:  

Gender  Section  totalsum
F       A        37
F       B        42
F       C        27
M       D        46

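The following KEEP statement instructs SAS to retain gender, section, score1, and totalsum in the data set, so only score2 is left out.

data readin1;
  set outdata;
  totalsum = sum(score1, score2);
  keep gender section score1 totalsum;
run;

Output:

Gender  Section  score1  totalsum
F       A        17      37
F       B        25      42
F       C        12      27
M       D        21      46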
3.

Explain _N_ and _ERROR_ in SAS.

Answer»

In a SAS DATA Step, there are two variables that are automatically created, namely, the _ERROR_ variable and the _N_ variable. 

  • _N_: Typically, this variable is used to keep track of the number of times the DATA step has iterated. It starts at 1 and increases by 1 each time the DATA step loops past the DATA statement.
  • _ERROR_: The value is 0 by default and indicates whether an error occurred during execution. Whenever there is an error, such as an input data error, a math error, or a conversion error, the value is set to 1. This variable can be used to locate errors in data records and to display an error message in the SAS log.
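
A small sketch of both automatic variables, assuming a made-up data set named check whose second record contains invalid numeric data:

```sas
data check;
  input id score;
  /* _N_ is the iteration counter; _ERROR_ is 1 when the current
     record raised an error such as invalid numeric data */
  if _error_ then put 'Problem in record ' _n_=;
  datalines;
1 85
2 abc
3 90
;
run;
```

For the second record SAS cannot read abc as a number, so score is set to missing, _ERROR_ becomes 1, and the PUT statement writes the iteration number to the log. Neither _N_ nor _ERROR_ is written to the output data set.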
4.

What do you mean by the "+" operator and sum function?

Answer»

In SAS, addition is performed either with the SUM function or with the "+" operator. The SUM function returns the sum of the non-missing arguments, whereas the "+" operator returns a missing value if one or more of its operands are missing.

Example: Consider a data set containing three variables a, b, and c. 

data variabledata;
  input a b c;
  cards;
1 2 3
34 3 4
. 3 2
53 . 3
54 4 .
45 4 2
;
run;

Each variable has a missing value in one of the observations, and we wish to compute the sum of the three variables for every observation.

data sumofvariables;
  set variabledata;
  x = sum(a, b, c);   /* ignores missing values */
  y = a + b + c;      /* missing if any operand is missing */
run;

Output: 

x    y
6    6
41   41
5    .
56   .
58   .
51   51

The value of y is missing for the 3rd, 4th, and 5th observations in the output. 

5.

Name different data types that SAS support.

Answer»

SAS supports two data types: character and numeric. Dates are stored as numeric values (the number of days since January 1, 1960), which is why arithmetic and date functions can be applied to them directly. Formats control how they are displayed.
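
A short sketch showing that a date is just a number with a format applied. The variable names d and next_week are chosen for illustration only.

```sas
data _null_;
  d = '01JAN1960'd;        /* date literal, stored as a number */
  put d=;                  /* unformatted value: d=0 */
  put d= date9.;           /* formatted value: d=01JAN1960 */
  next_week = d + 7;       /* ordinary arithmetic works on dates */
  put next_week= date9.;   /* next_week=08JAN1960 */
run;
```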

6.

State the difference between using the drop = data set option in the set statement and data statement.

Answer»

In SAS, the drop= option is used to exclude variables from processing or from the output data set. This option tells SAS which variables you wish to remove from a data set.

  • The drop= option in the set statement can be used if you do not wish to process certain variables or do not want to have them included in the new data set.
  • However, if you want to process certain variables but don't want them to be included in the new data set, then choose drop= in the data statement.

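Syntax: data-set-name (DROP=variable(s))

In this case, variable(s) lists one or more variable names. The option is written in parentheses after the data set name, and the variables can be listed in any form of variable list that SAS supports.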

Example: Consider the following data set: 

data outdata;
  /* section holds letters, so it must be read as character ($) */
  input gender $ section $ score1 score2;
  datalines;
F A 17 20
F B 25 17
F C 12 15
M D 21 25
;
proc print;
run;

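The following DROP= data set option on the SET statement instructs SAS to drop score1 and score2 as outdata is read. Their values are therefore not available to the SUM function, and totalsum is missing in every observation.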

data readin;
  set outdata (drop = score1 score2);
  totalsum = sum(score1, score2);
run;

Output:  

Gender  Section  score1  score2  totalsum
F       A        .       .       .
F       B        .       .       .
F       C        .       .       .
M       D        .       .       .
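
For contrast, here is a sketch of the same step with drop= moved to the DATA statement. It assumes the outdata data set from the example above; the output name readin2 is made up for illustration.

```sas
data readin2 (drop = score1 score2);
  set outdata;                       /* score1 and score2 are read normally */
  totalsum = sum(score1, score2);    /* so the sum can be computed */
run;
```

Here score1 and score2 are available during processing and are only left out when the observation is written, so readin2 contains gender, section, and totalsum with the values 37, 42, 27, and 46.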
7.

What is the meaning of STOP and OUTPUT statements in SAS?

Answer»
  • STOP statement: With STOP, SAS immediately stops processing the current DATA step and resumes with the statements that follow the end of that DATA step. In other words, STOP halts execution of everything that contains it, including DO groups and loops.

Syntax:  STOP;

Example: As demonstrated in this example, STOP is used to avoid an infinite loop when using a random access method within a DATA step: 

data sample;
  do developerobs = 1 to engineeringobs by 10;
    set master.research point=developerobs nobs=engineeringobs;
    output;
  end;
  stop;
run;
  • OUTPUT statement: OUTPUT tells SAS to write the current observation to a SAS data set immediately, not at the end of the DATA step. If no data set name is specified in the OUTPUT statement, the current observation is written to all data sets named in the DATA statement.

Syntax: OUTPUT <data-set-name(s)>;

Example: Each line of input data can be used to create two or more observations. As given below, for each observation in the data set Scaler, three observations are created in the SAS data set Result.

data Result(drop=time4-time6);
  set Scaler;
  time = time4; output;
  time = time5; output;
  time = time6; output;
run;
8.

Explain what is first and last in SAS?

Answer»

In a DATA step, the BY statement is used together with SET to process data in groups defined by the BY variables. When BY and SET are used together, SAS automatically creates two temporary variables for each BY variable, FIRST.variable and LAST.variable. SAS uses them to identify the first and last observations of each group. Their value is always 1 or 0, depending on the following conditions:

  • FIRST.variable = 1 if the observation is the first one in a BY group.
  • FIRST.variable = 0 if the observation is not the first one in a BY group.
  • LAST.variable = 1 if the observation is the last one in a BY group.
  • LAST.variable = 0 if the observation is not the last one in a BY group.

Essentially, SAS stores FIRST.variable and LAST.variable in the program data vector (PDV). As a result, they are available for DATA step processing. However, SAS will not add them to the output data set since they are temporary.

Example: Suppose ID is a grouping variable containing duplicate entries. When FIRST.variable = 1 and LAST.variable = 1 for the same observation, the BY group contains only a single observation, as would be the case for ID=4, ID=6 and ID=8 if each of those values occurs only once.
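
A minimal sketch of this behaviour. The data set ids and its values are invented for illustration, chosen so that ID 4, 6 and 8 each occur once. Because FIRST.id and LAST.id are temporary, they are copied into ordinary variables so they appear in the output.

```sas
/* Hypothetical input, already sorted by id */
data ids;
  input id;
  datalines;
1
1
4
5
5
5
6
8
;
run;

data flags;
  set ids;
  by id;
  first_id = first.id;               /* 1 on the first row of each id */
  last_id  = last.id;                /* 1 on the last row of each id */
  single   = (first.id and last.id); /* 1 when the id occurs only once */
run;
```

In the resulting data set, single is 1 only for the rows with ID 4, 6 and 8.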

9.

Consider the following expression stored in the variable address: 9/4 Infantry Marg Mhow City, MP, 453441

Answer»

In the following scenario, what would the scan function return?

x=scan(address,3);

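In the above program, the scan function reads the 3rd word in the address string. No delimiter list is given, so SAS uses its default delimiters, which include the blank, the comma, and the slash. The string is therefore split into 9, 4, Infantry, Marg, and so on, and the scan function returns:

x=Infantry

To treat only blanks as delimiters, pass them explicitly: scan(address,3,' ') returns Marg.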

10.

What do you mean by the Scan function in SAS and write its usage?

Answer»

The SCAN() function is typically used to extract words from a value marked by delimiters (characters or special signs that separate words in a text string). The SCAN function selects individual words from text or variables containing text and stores them in new variables.

Syntax:

scan(argument,n,delimiters)

In this case, 

  • Argument: It specifies the character variable or text to be scanned.
  • N: The number n indicates which word to read. A negative n counts words from the end of the string.
  • Delimiters: These are the characters that separate words in the text string. If omitted, SAS uses a default set of delimiters.

Example:  

Consider that we would like to extract the first word from a sentence 'Hello, Welcome to Scaler!'. In this case, the delimiter used is a blank. 

data _null_;
  string = "Hello, Welcome to Scaler!";
  first_word = scan(string, 1, ' ');
  put first_word=;
run;

first_word is 'Hello,' (including the comma), since a blank is the only delimiter and this is the first word in the sentence. Now, consider that we would like to extract the last word from the same sentence, again using a blank as the delimiter.

data _null_;
  string = "Hello, Welcome to Scaler!";
  last_word = scan(string, -1, ' ');
  put last_word=;
run;

last_word is 'Scaler!', as it is the last word in the above sentence.
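
As a sketch of how the delimiter list changes the result, punctuation can be added to the delimiters so that it is stripped from the extracted words. The variable names below are illustrative.

```sas
data _null_;
  string = "Hello, Welcome to Scaler!";
  first_clean = scan(string, 1, ' ,');    /* blank and comma are delimiters */
  last_clean  = scan(string, -1, ' !');   /* blank and ! are delimiters */
  put first_clean= last_clean=;           /* first_clean=Hello last_clean=Scaler */
run;
```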

11.

State difference between Missover and Truncover in SAS.

Answer»
  • Missover: The INPUT statement does not jump to the next line when the MISSOVER option is used on the INFILE statement. If the INPUT statement cannot read the entire field specified due to the field length, it will set the value to missing. The variables with no values assigned are set to missing when an INPUT statement reaches the end of an input data record.

Example: An external file with variable-length records, for example, contains the following records: 

1
22
333
4444
55555

Following are the steps to create a SAS data set using these data. The numeric informat 5 is used for this data step and the informatted length of the variable NUM is matched by only one input record. 

data readin;
  infile 'external-file' missover;
  input NUM 5.;
run;
proc print data=readin;
run;

Output: 

Obs  NUM
1    .
2    .
3    .
4    .
5    55555

Those values that were read from input records that were too short have been set to missing. This problem can be corrected by using the TRUNCOVER option in the INFILE statement:

  • Truncover: This option assigns the raw data value to the variable, even if it is shorter than what the INPUT statement expects. 

Example:  

An external file with variable-length records, for example, contains the following records: 

1
22
333
4444
55555

Following are the steps to create a SAS data set using these data. The numeric informat 5 is used for this data step.

data readin;
  infile 'external-file' truncover;
  input NUM 5.;
run;
proc print data=readin;
run;

Output: 

Obs  NUM
1    1
2    22
3    333
4    4444
5    55555

Those values that were read from input records that were too short are not set to missing.

12.

What is PDV (Program Data Vector)?

Answer»

The program data vector (PDV) is a logical area of memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or generates them from SAS language statements, and assigns them to the respective variables in the program data vector. The program data vector also includes two automatic variables, _N_ and _ERROR_.

13.

What is the use of Retain in SAS?

Answer»

At the start of each iteration of the DATA step, SAS returns to the DATA statement and sets the variables that are assigned through an INPUT statement or an assignment statement to missing in the program data vector (a logical area of memory). The RETAIN statement overrides this default. In other words, a RETAIN statement instructs SAS not to set the named variables to missing when moving from one iteration of the DATA step to the next. Their values are retained instead.

Syntax:

RETAIN variable1 variable2 ... variablen;

There are no limits to the number of variables you can specify. When you do not specify variable names, SAS retains the values of every variable that was created in an INPUT or assignment statement.
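
A minimal sketch of RETAIN used for a running total. The data set name running and the amounts are made up for illustration.

```sas
data running;
  input amount;
  retain total 0;            /* keep total across iterations, starting at 0 */
  total = total + amount;    /* 10, 30, 60 for the three records below */
  datalines;
10
20
30
;
run;
```

Without the RETAIN statement, total would be reset to missing at the start of every iteration, and total + amount would be missing in every observation.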

14.

Write down some capabilities of SAS Framework.

Answer»

SAS Framework has the following four capabilities:

  • Access Data: Data accessibility is a powerful SAS capability. In other words, data can be accessed from different sources including raw databases, Excel files, Oracle databases, SAS datasets, etc.
  • Manage Data: SAS offers additional capabilities including data management. Data accessed from a variety of sources can thus be managed easily in order to generate useful insights. The process of managing data can include creating variables, validating data, cleaning data, creating subsets, etc. SAS manages the existing data to provide the data that you need.
  • Analyze Data: SAS will analyze the data once it has been managed to perform simple evaluations like frequency and averages, along with more complex evaluations like forecasting, regression, etc.
  • Present: The analyzed data can be saved and stored as a graphic report, a list, and overall statistics that can be printed or published. They can also be saved into a data file.
15.

What are the essential features of SAS?

Answer»

SAS has the following essential features:

  • SAS offers extensive support for programmatically transforming and analyzing data in comparison to other BI (Business Intelligence) tools.
  • Furthermore, SAS is a platform-independent software, which means it can run on almost any operating system, including Windows, Mac, and Linux distributions such as Ubuntu.
  • It provides very fine control over data manipulation and analysis, which is its USP.
  • The SAS package provides a complete data analysis solution, ranging from simple figures to advanced analysis. One of the best features of SAS software is its Inbuilt Library, which contains all the necessary packages for data analysis and reporting.
  • The reports can be visualized in the form of graphs that range from simple scatter plots and bar graphs to complex multi-page classification panels.
  • Another feature of SAS is its support for multiple data formats. With SAS, you can read data from a variety of file types, formats, and even from files with missing data.
  • Since SAS is a 4GL (fourth-generation programming language), it has an easy-to-learn syntax.
16.

Why choose SAS over other data analytical tools?

Answer»

Listed below are a few reasons to choose SAS over other data analysis tools:  

  • Learning and using SAS is very easy as compared to other analytics software tools. It has a better and more stable graphical user interface (GUI) and offers an easy option (PROC SQL) for users who are already familiar with SQL.
  • Every day, data is growing and securing data becomes more complicated. SAS is very capable of storing and organizing large amounts of data smoothly and reliably.
  • In the corporate world and large companies, SAS is often used, as it is more professional and easier to use compared to other languages. SAS jobs abound all over the market.
  • SAS provides adequate graphical functionality. However, it provides limited customization options.
  • Since SAS is licensed software and its updates are released in a controlled environment, all of its features have been thoroughly tested. As a result, there are fewer chances of errors.
  • The customer service and technical support provided by SAS are outstanding. In any case, if a user runs into technical difficulties during installation, they will receive immediate assistance from the team.
  • With its high level of security in terms of data privacy, SAS is a recognized and trusted name in the enterprise market.