1.

Brief about the few optimizing techniques for the Hive performance.

Answer»

LDAP and Active Directory (AD) provide a centralized security system for managing both servers and users: they hold all user accounts and the privileges associated with each employee. Kerberos handles authentication. When a user tries to connect to any Hadoop service, Kerberos authenticates the user first and then authenticates the service as well. When AD, LDAP, and Kerberos are used together, Kerberos provides only authentication; all identity management is handled outside Kerberos, in AD and LDAP.

At a high level, when a new employee joins, his/her ID has to be added to Active Directory first, and then to LDAP and Kerberos. AD is a directory service owned by Microsoft, and it supports several standard protocols such as LDAP and Kerberos.

LDAP and AD communicate with each other about which user ID belongs to which group: for example, which groups the user Bibhu is a member of and what access permissions he has on different directories or files. This information is managed differently in AD and on Linux systems. Windows uses security identifiers (SIDs), while Linux uses user IDs (UIDs) and group IDs (GIDs). SSSD can use the SID of an AD user to algorithmically generate POSIX IDs in a process called ID mapping. ID mapping creates a map between SIDs in AD and UIDs/GIDs on Linux.

AD can also create and store POSIX attributes such as uidNumber, gidNumber, unixHomeDirectory, or loginShell.

There are two ways to map SIDs to UIDs/GIDs using SSSD.

  • Enable ID mapping, so that SSSD generates the UIDs/GIDs from the SIDs of the AD users. To do this, add the line below to the sssd.conf file:

ldap_id_mapping = true

  • Use the POSIX attributes stored in AD. POSIX stands for Portable Operating System Interface, a family of standards that defines how applications interact with Unix-like systems. Here the UID/GID has to be configured in AD for each user, which matters especially when there are several domains. In this case, write the following in sssd.conf to disable ID mapping in SSSD:

ldap_id_mapping = False
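The two modes above can be sketched roughly as follows. This is a simplified illustration, not SSSD's actual algorithm: the range constants are assumed values modelled on the documented `ldap_idmap_range_*` defaults, `zlib.crc32` is only a stand-in for the hash SSSD uses internally, and the `objectSid`/`uidNumber` dictionary is a hypothetical user record.

```python
from __future__ import annotations

import zlib

# Assumed values, modelled on the documented defaults of
# ldap_idmap_range_min / ldap_idmap_range_max / ldap_idmap_range_size.
RANGE_MIN = 200_000
RANGE_MAX = 2_000_200_000
RANGE_SIZE = 200_000


def split_sid(sid: str) -> tuple[str, int]:
    """Split a user SID into its domain SID and the trailing RID."""
    domain_sid, _, rid = sid.rpartition("-")
    return domain_sid, int(rid)


def sid_to_uid(sid: str) -> int:
    """Derive a POSIX ID from a SID (simplified, hypothetical algorithm)."""
    domain_sid, rid = split_sid(sid)
    if rid >= RANGE_SIZE:
        raise ValueError("RID does not fit into one slice")
    slices = (RANGE_MAX - RANGE_MIN) // RANGE_SIZE
    # Stand-in hash: it only has to pick the same slice for the same domain.
    slice_index = zlib.crc32(domain_sid.encode()) % slices
    return RANGE_MIN + slice_index * RANGE_SIZE + rid


def resolve_uid(entry: dict, ldap_id_mapping: bool) -> int:
    """Pick the UID source according to the ldap_id_mapping setting."""
    if ldap_id_mapping:
        return sid_to_uid(entry["objectSid"])
    # Mapping disabled: the POSIX attribute must already be stored in AD.
    return int(entry["uidNumber"])


# Hypothetical AD user record.
user = {"objectSid": "S-1-5-21-1111-2222-3333-1105", "uidNumber": "10005"}
print(resolve_uid(user, ldap_id_mapping=True))   # generated from the SID
print(resolve_uid(user, ldap_id_mapping=False))  # read from uidNumber
```

The point of the sketch is that the generated ID is deterministic: every Linux host derives the same UID for the same SID without anything being stored in AD.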

Below are a few concepts you need to know to understand the integration of AD/LDAP/Kerberos.

  • Linux has no built-in centralized authentication mechanism. Local account details are kept in the /etc/passwd file (on most modern systems the password hashes themselves are stored in /etc/shadow).
  • Two modules play an important role in providing security features at the Linux level: 1. PAM 2. NSS

PAM: PAM stands for Pluggable Authentication Modules. It allows authentication technologies such as Unix, LDAP, or Kerberos to be integrated into system services such as password changes, login, and ssh. When you are prompted for a password, that is usually PAM's doing. PAM provides an API through which authentication requests are mapped into technology-specific actions. This mapping is defined in the PAM configuration files, which specify the authentication mechanism for each service.

NSS: NSS stands for Name Service Switch. It uses a common API and a configuration file (/etc/nsswitch.conf) in which the name service providers for every supported database are specified. Names here include usernames, group names, and hostnames, which are traditionally stored in /etc/passwd, /etc/group, and /etc/hosts.
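To make the role of /etc/nsswitch.conf concrete, here is a minimal sketch that reads such a file and lists, per database, the providers in the order they are consulted. The `parse_nsswitch` helper and the sample fragment are hypothetical; a real file usually contains more databases.

```python
import re


def parse_nsswitch(text: str) -> dict:
    """Map each NSS database to its ordered list of sources."""
    databases = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        # Remove action items such as [NOTFOUND=return]; keep source names.
        sources = re.sub(r"\[[^\]]*\]", " ", sources)
        databases[database.strip()] = sources.split()
    return databases


# Hypothetical /etc/nsswitch.conf fragment for a host that uses SSSD.
SAMPLE = """
passwd: files sss
group:  files sss
hosts:  files dns
"""

config = parse_nsswitch(SAMPLE)
print(config["passwd"])
```

With `passwd: files sss`, a user lookup checks the local files first and falls through to the SSSD module only if the user is not found there.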

  • Below are a few components related to Kerberos.
    • Key Distribution Center (KDC): The Kerberos server. It holds an encrypted database that stores all the principal entries for users, hosts, and services, including the domain or realm information.
    • Authentication Server: Once a user successfully authenticates to the Authentication Server, it grants a ticket-granting ticket (TGT) to the client. The principal then uses the TGT to request access to a Hadoop service.
    • Ticket Granting Server: The Ticket Granting Server validates the TGT and in return grants a service ticket to the client, which the client can use to access the Hadoop service.
    • Keytab file: A secure file that contains the service principals of a domain together with their encrypted keys (derived from their passwords), so that a service can authenticate without typing a password.
    • Realm: A realm is the Kerberos domain name, which by convention is written in upper-case letters. For example, HADOOP.COM
    • Principal: A principal may be a user, a service, or a host that is part of the realm. For example, pbibhu@HADOOP.COM
  • NTP (Network Time Protocol): An internet protocol used for synchronizing computer clocks in a network. The NTP client initiates a time-request exchange with the NTP server. Kerberos tickets carry timestamps, so the clocks of clients and servers have to stay closely synchronized.
  • PAM_SSS: Kerberos (pam_sss): One of the design principles of SSSD's PAM module pam_sss is that it should not make any decisions on its own but let SSSD make them. pam_sss cannot decide which type of password prompt should be shown to the user; it must ask SSSD first. Because the first communication between pam_sss and SSSD's PAM responder otherwise happens only after the user has entered the password, a pre-authentication request to the PAM responder is needed before the user is prompted for the password.
  • NSS_SSS: LDAP (nss_sss): SSSD provides an NSS module, nss_sss, so that you can configure your system to use SSSD to retrieve user information.
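The interplay of the Authentication Server, the Ticket Granting Server, and the keytab listed above can be sketched with a toy model. This is not the Kerberos protocol: tickets are only signed with HMAC instead of being encrypted, the client hands over its key directly, and `ToyKDC`, the key values, and the service principal `hive/host1@HADOOP.COM` are all hypothetical. The 300-second skew is an assumed tolerance.

```python
import hashlib
import hmac
import json
import time

MAX_CLOCK_SKEW = 300  # seconds; assumed tolerance between clocks


def seal(key: bytes, payload: dict) -> dict:
    """Sign a payload with a key (a stand-in for Kerberos encryption)."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def unseal(key: bytes, ticket: dict, now: float) -> dict:
    """Verify signature and lifetime, then return the payload."""
    expected = hmac.new(key, ticket["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["tag"]):
        raise PermissionError("ticket was not issued with this key")
    payload = json.loads(ticket["body"])
    if now > payload["expires"] + MAX_CLOCK_SKEW:
        raise PermissionError("ticket expired")
    return payload


class ToyKDC:
    """Hypothetical KDC combining Authentication Server and TGS."""

    def __init__(self, realm, principal_keys):
        self.realm = realm
        self.keys = principal_keys    # principal -> secret key
        self.tgs_key = b"tgs-secret"  # known only to the KDC

    def authenticate(self, principal, key, now):
        """Authentication Server: check the client's key, issue a TGT."""
        known = self.keys.get(principal)
        if known is None or not hmac.compare_digest(known, key):
            raise PermissionError("unknown principal or wrong key")
        return seal(self.tgs_key, {"client": principal, "expires": now + 36000})

    def grant_service_ticket(self, tgt, service, now):
        """Ticket Granting Server: validate the TGT, issue a service ticket."""
        client = unseal(self.tgs_key, tgt, now)["client"]
        ticket = {"client": client, "service": service, "expires": now + 600}
        return seal(self.keys[service], ticket)


now = time.time()
kdc = ToyKDC("HADOOP.COM", {
    "pbibhu@HADOOP.COM": b"user-key",
    "hive/host1@HADOOP.COM": b"service-key",
})
tgt = kdc.authenticate("pbibhu@HADOOP.COM", b"user-key", now)
service_ticket = kdc.grant_service_ticket(tgt, "hive/host1@HADOOP.COM", now)
# The service validates the ticket with the key stored in its keytab.
print(unseal(b"service-key", service_ticket, now)["client"])
```

The expiry check in `unseal` also shows why synchronized clocks matter: a host whose clock runs far ahead would reject tickets that are still valid.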

Below are three ways of integrating Linux with AD for authentication:

  1. Using the LDAP/Kerberos PAM and NSS modules
  2. Using Winbind
  3. Using SSSD, the System Security Services Daemon, for integrating with Active Directory

Let’s understand clearly:

1. Using LDAP/Kerberos PAM and NSS Module:

PAM is configured to use Kerberos for authentication, and NSS is configured to use the LDAP protocol for querying UID or GID information. The nss_ldap, pam_ldap, and pam_krb5 modules are available to support this.

The problem here is that credentials are not cached, so there is no offline support.

2. Using Winbind:

Samba Winbind was the traditional way of connecting Linux systems to AD. Basically, Winbind emulates a Windows client on a Linux system and is able to communicate with AD servers. The winbind daemon receives calls from PAM and NSS and translates them into the corresponding Active Directory calls using LDAP, Kerberos, or remote procedure calls (RPC), depending on the requirement. Current versions of the System Security Services Daemon (SSSD) have closed the feature gap between Samba Winbind and SSSD, so Samba Winbind is no longer the first choice in general.

3. Using SSSD, the System Security Services Daemon, for integrating with Active Directory:

The System Security Services Daemon (SSSD) is an intermediary between local clients and any remote directories and authentication mechanisms. The local clients connect to SSSD, and SSSD contacts the external providers, that is, the AD or LDAP server. SSSD therefore works as a bridge that helps you access AD and LDAP.

Basically, system authentication is configured locally, which means that services initially check a local user store to determine users and credentials. SSSD lets a local service check the local cache in SSSD instead; the information in that cache may have come from an LDAP directory, AD, or a Kerberos realm.

Below are a few advantages of SSSD:

  • Reducing the load on identification/authentication servers. Instead of every client connecting to AD or LDAP directly, all of the local clients can contact SSSD, which either connects to the identification server or checks its cache.
  • Permitting offline authentication. SSSD also caches users and credentials, so that user credentials are still available if the local system or the identity provider goes offline.
  • Using a single user account. Remote users usually have two (or even more) user accounts, such as one for their local system and one for the organizational system; the separate account is needed to connect to a virtual private network (VPN). Because SSSD supports caching and offline authentication, remote users can connect to network resources simply by authenticating to their local machine, and SSSD then maintains their network credentials.
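The offline-authentication advantage can be illustrated with a small sketch. It assumes a salted PBKDF2 hash as the cached form of the password, which is an illustrative choice and not SSSD's actual storage format; `CredentialCache`, `authenticate`, and the provider callables are hypothetical names.

```python
import hashlib
import hmac
import os


class CredentialCache:
    """Toy offline-authentication cache (not SSSD's storage format)."""

    def __init__(self):
        self._entries = {}  # username -> (salt, derived key)

    def remember(self, username: str, password: str) -> None:
        """Store a salted hash after a successful online authentication."""
        salt = os.urandom(16)
        self._entries[username] = (salt, self._derive(password, salt))

    def verify_offline(self, username: str, password: str) -> bool:
        """Check a password against the cache while the provider is down."""
        if username not in self._entries:
            return False
        salt, expected = self._entries[username]
        return hmac.compare_digest(expected, self._derive(password, salt))

    @staticmethod
    def _derive(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def authenticate(username, password, provider, cache):
    """Ask the identity provider first; fall back to the cache if offline."""
    try:
        ok = provider(username, password)
    except ConnectionError:
        return cache.verify_offline(username, password)
    if ok:
        cache.remember(username, password)
    return ok


def online_provider(username, password):
    """Hypothetical identity provider that knows a single user."""
    return (username, password) == ("pbibhu", "s3cret")


def offline_provider(username, password):
    """Hypothetical provider that is unreachable."""
    raise ConnectionError("identity provider unreachable")


cache = CredentialCache()
print(authenticate("pbibhu", "s3cret", online_provider, cache))   # online login
print(authenticate("pbibhu", "s3cret", offline_provider, cache))  # served from cache
```

Only users who have logged in successfully at least once while the provider was reachable can authenticate offline, which matches the caching behaviour described in the list.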

The sssd daemon provides different services for different purposes. A configuration file called sssd.conf determines what tasks sssd performs. The file has two main parts, as we can see here:

[sssd]

domains = WIN.EXAMPLE.COM
services = nss, pam
config_file_version = 2

[domain/WIN.EXAMPLE.COM]

id_provider = ad
auth_provider = ad
access_provider = ad

The first part states which services on the system must use sssd; in the example above, nss and pam are listed. The second part, [domain/WIN.EXAMPLE.COM], defines the directory service, also called the identity provider, for example an AD or LDAP server. Its name must match an entry in the domains line of the first part. SSSD connects to AD/LDAP for querying information, authentication, password changes, and so on.
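A quick consistency check for a file of this shape can be sketched with Python's `configparser`. The `check_sssd_config` helper is hypothetical and covers only two rules: every name in `domains` needs a `[domain/NAME]` section, and this sketch treats `id_provider` as required in each of them.

```python
import configparser


def check_sssd_config(text: str) -> list:
    """Return a list of problems found in sssd.conf-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("sssd"):
        return ["missing [sssd] section"]
    problems = []
    raw_domains = parser.get("sssd", "domains", fallback="")
    domains = [d.strip() for d in raw_domains.split(",") if d.strip()]
    if not domains:
        problems.append("no domains listed in [sssd]")
    for name in domains:
        section = f"domain/{name}"
        if not parser.has_section(section):
            problems.append(f"no [{section}] section for domain {name}")
        elif not parser.has_option(section, "id_provider"):
            problems.append(f"[{section}] has no id_provider")
    return problems


# Hypothetical input whose domain section name does not match `domains`.
MISMATCHED = """
[sssd]
domains = WIN.EXAMPLE.COM
services = nss, pam

[domain/WINDOWS]
id_provider = ad
"""

for problem in check_sssd_config(MISMATCHED):
    print(problem)
```

A mismatch between the `domains` entry and the section name is easy to overlook, because the file still parses without error.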

In brief, these are the steps SSSD follows, as shown in the diagram:

  • Once the user ID and password have been provided, libc opens the nss_sss module as configured in nsswitch.conf and passes the request to it.
  • The nss_sss memory-mapped cache is consulted first for the user entry.
  • If the entry is not found in the nss_sss cache, the request is passed to the sssd_nss responder of SSSD.
  • sssd_nss then checks the SSSD on-disk LDB cache. All cache files are named for the domain; for example, for a domain named exampleldap, the cache file is named cache_exampleldap.ldb. If the data is present in the cache and valid, the NSS responder returns it.
  • If the data is not present in the LDB cache, or if it has expired, SSSD connects to the remote server (AD or LDAP) and runs the search.
  • If sssd.conf is configured with multiple domains, for example “domains = AD, LDAP”, Active Directory is searched first and, if the entry is not found there, LDAP is searched next.
  • When the search is finished, the LDB cache is updated.
  • The sssd_be process (the SSSD back end) signals the NSS responder to check the cache again.
  • The sssd_nss responder returns the cached data. If there is no data in the cache, no data is returned.
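The lookup order in the steps above can be sketched as a small model. `ToyResponder` is hypothetical: two dictionaries stand in for the memory-mapped cache and the LDB cache, the back ends are plain functions, and the 5400-second lifetime is an assumed value for how long a cached entry stays valid.

```python
import time


class ToyResponder:
    """Hypothetical model: memory cache, then disk cache, then back ends."""

    def __init__(self, backends, entry_ttl=5400):
        self.backends = backends    # ordered list of (domain, lookup function)
        self.entry_ttl = entry_ttl  # seconds a cached entry stays valid (assumed)
        self.memory_cache = {}      # stands in for the memory-mapped cache
        self.disk_cache = {}        # stands in for the on-disk LDB cache

    def getpwnam(self, name, now=None):
        now = time.time() if now is None else now
        for cache in (self.memory_cache, self.disk_cache):
            hit = cache.get(name)
            if hit and hit["expires"] > now:
                self.memory_cache[name] = hit
                return hit["entry"]
        # Cache miss or expired entry: ask each domain in configured order.
        for domain, lookup in self.backends:
            entry = lookup(name)
            if entry is not None:
                cached = {
                    "entry": dict(entry, domain=domain),
                    "expires": now + self.entry_ttl,
                }
                self.disk_cache[name] = cached
                self.memory_cache[name] = cached
                return cached["entry"]
        return None  # nothing cached and nothing found remotely


calls = []


def ad_lookup(name):
    """Hypothetical AD back end that knows no users."""
    calls.append(("AD", name))
    return None


def ldap_lookup(name):
    """Hypothetical LDAP back end that knows one user."""
    calls.append(("LDAP", name))
    return {"uid": 10005} if name == "pbibhu" else None


responder = ToyResponder([("AD", ad_lookup), ("LDAP", ldap_lookup)])
print(responder.getpwnam("pbibhu", now=0))   # AD is asked first, then LDAP
print(responder.getpwnam("pbibhu", now=10))  # answered from the cache
```

The second call does not reach either back end, which is where the reduced load on the directory servers comes from.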

