This document describes how Kerberos security is set up on Hadoop clusters with Centrify DirectControl Agent.
Hortonworks clusters are managed by Apache Ambari. The following instructions assume Apache Ambari 1.x or 2.x is available.
Hortonworks recommends that the cluster node with the NameNode role be the master node.
Enabling Kerberos security with Active Directory on Hortonworks clusters without Centrify can be painful, as illustrated in [1].
To automate creation of Hadoop service principals when enabling Kerberos security:
Join all cluster nodes to Active Directory using Centrify DirectControl Agent.
Get the CSV file from Apache Ambari. In the Ambari UI, click [Admin] -> [Security] -> [Enable Security]. Follow the steps and download the CSV file.
Configure hadoop.conf for the automation script, e.g.:
hadoop.service.container: ou=Hortonworks,ou=Hadoop
hadoop.cluster.shortname: hdp1
Run the automation script on a cluster node (the master node is highly recommended):
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --create
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --deploy
In the Ambari UI, complete the Enable Security operation. Hadoop services will be restarted.
To automate cleanup of Hadoop service principals after Kerberos security has been disabled:
On the same cluster node where Hadoop service principals were created, run the automation script:
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --undeploy
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --delete
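The ordering of the script phases above (create before deploy when enabling, undeploy before delete when cleaning up) can be captured in a small wrapper. The following is only a sketch: the script name and flags come from the commands above, while the `build_commands` helper and the action names "enable" and "cleanup" are hypothetical and not part of the automation script.

```python
from typing import List

SCRIPT = "kerberos_security_setup.pl"

# Phase order taken from the steps above: create then deploy when enabling,
# undeploy then delete when cleaning up. The action names are hypothetical.
PHASES = {
    "enable": ["--create", "--deploy"],
    "cleanup": ["--undeploy", "--delete"],
}


def build_commands(action: str, csv_path: str) -> List[List[str]]:
    """Return the argv lists for one action, in the order they should run."""
    if action not in PHASES:
        raise ValueError(f"unknown action: {action!r}")
    return [["perl", SCRIPT, "--input", csv_path, flag] for flag in PHASES[action]]


if __name__ == "__main__":
    for argv in build_commands("enable", "host-principal-keytab-list.csv"):
        # Only print the commands; on a cluster node they could be passed to
        # subprocess.run(argv, check=True) to stop at the first failing phase.
        print(" ".join(argv))
```

Building the argument lists separately from running them makes it possible to review the exact commands before anything is changed in Active Directory.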
Cloudera clusters can be managed by Cloudera Manager. The following instructions assume Cloudera Manager is available.
Cloudera Manager manages all per-host service principals and their Kerberos keytab files. For instance, per-host service principals are generated automatically when Kerberos is enabled. The only service principal that needs to be generated manually is the hdfs principal, which is shared by all cluster nodes.
Cloudera Manager has a wizard to help enable Kerberos security [1]. Customers can also configure Kerberos security manually in Cloudera Manager without the wizard [2]. Note that no wizard is provided to disable Kerberos security. According to the Cloudera community site [3], customers need to revert the settings manually in Cloudera Manager.
For clusters not managed by Cloudera Manager, the automation script should be able to help create and distribute Kerberos keytab files for all per-host service principals. However, this requires the CSV file to list the required Kerberos keytab files. Moreover, each Hadoop service needs to be configured manually to enable Kerberos security.
To automate creation of Hadoop service principal hdfs when enabling Kerberos security:
Join all cluster nodes to Active Directory using Centrify DirectControl Agent.
Prepare the CSV file manually, e.g.:
cdh1-cent64-1.example.com,HDFS User,hdfs@EXAMPLE.COM,hdfs.keytab,/etc/security/keytabs,hdfs,hadoop,440
cdh1-cent64-2.example.com,HDFS User,hdfs@EXAMPLE.COM,hdfs.keytab,/etc/security/keytabs,hdfs,hadoop,440
Configure hadoop.conf for the automation script, e.g.:
hadoop.service.container: ou=Cloudera,ou=Hadoop
hadoop.cluster.shortname: cdh1
Run the automation script on a cluster node (the master node is highly recommended):
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --create
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --deploy
Note that an existing Kerberos credential cache /tmp/krb5cc_cm_agent might interfere with adkeytab. Please refer to doc/FAQ for details.
Stop all Hadoop services and Cloudera management services.
Create symlinks for the following LDAP CLIs (only on the cluster node with Cloudera Manager installed):
/usr/bin/ldapmodify -> <centrifydc-install-path>/bin/ldapmodify
/usr/bin/ldapsearch -> <centrifydc-install-path>/bin/ldapsearch
These symlinks are needed because Cloudera Manager calls these LDAP CLIs to execute the Import Kerberos Account Manager Credentials and Generate Credentials operations in [Administration] -> [Kerberos].
Note that Cloudera Manager requires LDAP over SSL (also known as LDAPS or LDAP over TLS) to execute the Kerberos operations mentioned above. LDAP over SSL is supported by the LDAP CLIs (e.g. ldapmodify, ldapsearch) shipped with Centrify DirectControl Agent 5.2.2 or later. To enable LDAP over SSL, please refer to the procedures shown below. The detailed commands can be found in the section "Enabling encrypted communication" of the chapter "Using Centrify OpenLDAP proxy service" in the Centrify Server Suite Administrator's Guide for Linux and UNIX.
Active Directory:
Cluster node with Cloudera Manager:
Run the wizard from the Cloudera Manager web UI to enable Kerberos. The wizard asks for information such as the realm name and the required credential (e.g. Administrator@REALM). Here is an example:
KDC Type: Active Directory
Active Directory Suffix: ou=cloudera,ou=hadoop,DC=example,DC=com
Kerberos Security Realm: EXAMPLE.COM
Active Directory Account Prefix: cdh1-
Also uncheck the "Manage krb5.conf through Cloudera Manager" option, because Centrify DirectControl Agent manages krb5.conf.
Click the Generate Credentials button in the Cloudera Manager web UI to generate all per-host service principals and accounts. AD objects will be created for the Hadoop services (e.g. cdh1-AGOjxmWYHQ).
The Kerberos keytab files are managed by Cloudera Manager Agent and can be found on all cluster nodes, e.g.:
/var/run/cloudera-scm-agent/process/357-cloudera-mgmt-HOSTMONITOR/hue.keytab
/var/run/cloudera-scm-agent/process/356-cloudera-mgmt-SERVICEMONITOR/hue.keytab
/var/run/cloudera-scm-agent/process/354-cloudera-mgmt-REPORTSMANAGER/hdfs.keytab
/var/run/cloudera-scm-agent/process/351-oozie-OOZIE_SERVER/oozie.keytab
...
Start all Hadoop services and Cloudera management services.
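The symlink step in the procedure above can be scripted. The following is a minimal sketch, not part of the Centrify tooling: the `link_ldap_clis` helper is hypothetical, and the Centrify bin directory must be supplied by the caller because the install path is site-specific. It must run as root to write under /usr/bin, and it deliberately skips any path that already exists instead of replacing it.

```python
import os
import tempfile
from typing import List

# The two LDAP CLIs named in the symlink step above.
LDAP_CLIS = ["ldapmodify", "ldapsearch"]


def link_ldap_clis(centrify_bin_dir: str, link_dir: str = "/usr/bin") -> List[str]:
    """Create link_dir/<cli> -> centrify_bin_dir/<cli> for each LDAP CLI.

    Returns the link paths that were created. Existing paths are left alone.
    """
    created = []
    for name in LDAP_CLIS:
        target = os.path.join(centrify_bin_dir, name)
        link = os.path.join(link_dir, name)
        if not os.path.exists(target):
            raise FileNotFoundError(f"Centrify CLI not found: {target}")
        if os.path.lexists(link):
            # Do not overwrite a system-provided binary or an earlier link.
            continue
        os.symlink(target, link)
        created.append(link)
    return created


if __name__ == "__main__":
    # Dry run in a scratch directory so nothing under /usr/bin is touched.
    with tempfile.TemporaryDirectory() as tmp:
        fake_centrify = os.path.join(tmp, "centrify-bin")
        fake_usr_bin = os.path.join(tmp, "usr-bin")
        os.makedirs(fake_centrify)
        os.makedirs(fake_usr_bin)
        for name in LDAP_CLIS:
            open(os.path.join(fake_centrify, name), "w").close()
        for link in link_ldap_clis(fake_centrify, fake_usr_bin):
            print(link, "->", os.readlink(link))
```

If /usr/bin already contains ldapmodify or ldapsearch from another package, the helper leaves them in place, so an administrator would have to decide how to handle those files before Cloudera Manager picks up the Centrify versions.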
To automate cleanup of Hadoop service principal hdfs after Kerberos security has been disabled:
On the same cluster node where Hadoop service principal hdfs was created, run the automation script:
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --undeploy
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --delete
Note that an existing Kerberos credential cache /tmp/krb5cc_cm_agent might interfere with adkeytab. Please refer to doc/FAQ for details.
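Because the same CSV file drives both creation and cleanup, it can help to check it before running the script. The sketch below parses rows shaped like the hdfs example rows shown earlier. The field names in `KeytabEntry` are guesses based on those example rows (host, description, principal, keytab file, keytab directory, owner, group, mode) and are not taken from the automation script itself.

```python
import csv
import io
from typing import List, NamedTuple


class KeytabEntry(NamedTuple):
    # Field names are assumptions inferred from the example rows.
    host: str
    description: str
    principal: str
    keytab_file: str
    keytab_dir: str
    owner: str
    group: str
    mode: str


def parse_keytab_csv(text: str) -> List[KeytabEntry]:
    """Parse CSV text into entries, rejecting rows with a wrong field count."""
    expected = len(KeytabEntry._fields)
    entries = []
    for lineno, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != expected:
            raise ValueError(f"line {lineno}: expected {expected} fields, got {len(row)}")
        entries.append(KeytabEntry(*(field.strip() for field in row)))
    return entries


SAMPLE = """\
cdh1-cent64-1.example.com,HDFS User,hdfs@EXAMPLE.COM,hdfs.keytab,/etc/security/keytabs,hdfs,hadoop,440
cdh1-cent64-2.example.com,HDFS User,hdfs@EXAMPLE.COM,hdfs.keytab,/etc/security/keytabs,hdfs,hadoop,440
"""

if __name__ == "__main__":
    for entry in parse_keytab_csv(SAMPLE):
        print(entry.host, entry.principal, entry.keytab_dir + "/" + entry.keytab_file)
```

A check like this catches truncated or malformed rows early, before any principal is created or deleted in Active Directory.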
MapR clusters are managed by MapR Control System (MCS). However, the MCS web UI does not provide a way to enable Kerberos security. Therefore, each Hadoop service needs to be configured manually.
Moreover, MapR has its own security architecture for users and core services (e.g. CLDB, MapR file system, YARN). Thus Kerberos security is available for some Hadoop services only (e.g. HBase). Please refer to the section "Security Protocols Listed by Component" in [1] for the Hadoop services that support Kerberos security.
To automate creation of Hadoop service principals when enabling Kerberos security:
Join all cluster nodes to Active Directory using Centrify DirectControl Agent.
Shut down your cluster. Please refer to [2].
Decide which Hadoop services require Kerberos security and prepare the CSV file manually.
To decide which Hadoop services require Kerberos security, please refer to [1]. Kerberos security will be enabled on CLDB and HBase in this example.
To prepare the CSV file, please refer to the documents on configuring Kerberos for each service. For CLDB, please refer to [3]. For HBase, please refer to [4]. Here is an example:
mpr1-cent64-1.example.com,CLDB,mapr/mymapr1@EXAMPLE.COM,cldb.keytab,/opt/mapr/conf,mapr,mapr,400
mpr1-cent64-1.example.com,HBase,mapr/mpr1-cent64-1.example.com@EXAMPLE.COM,hbase.keytab,/opt/mapr/conf,mapr,mapr,400
mpr1-cent64-2.example.com,HBase,mapr/mpr1-cent64-2.example.com@EXAMPLE.COM,hbase.keytab,/opt/mapr/conf,mapr,mapr,400
Modify the configuration files of each service that requires Kerberos authentication on the corresponding cluster nodes. For CLDB, please refer to [3]. For HBase, please refer to [4].
Configure hadoop.conf for the automation script, e.g.:
hadoop.service.container: ou=MapR,ou=Hadoop
hadoop.cluster.shortname: mpr1
Run the automation script on a cluster node (the master node is highly recommended):
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --create
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --deploy
Enable security features on the cluster (please refer to [5]).
To automate cleanup of Hadoop service principals after Kerberos security has been disabled:
On the same cluster node where Hadoop service principals were created, run the automation script:
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --undeploy
perl kerberos_security_setup.pl --input host-principal-keytab-list.csv --delete
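The principals in the MapR CSV example follow the usual Kerberos form primary/instance@REALM. In that example, the HBase principals carry the host name as the instance, while the CLDB principal carries mymapr1, which appears to be a cluster name rather than a host name. The following sketch splits a principal name so that such differences can be checked per row. The `split_principal` and `instance_is_host` helpers are hypothetical, and the parsing does not handle escaped "/" or "@" characters.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]  # None when the name has no "/instance" part
    realm: str


def split_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"not a full principal name: {name!r}")
    primary, slash, instance = local.partition("/")
    if not primary or (slash and not instance):
        raise ValueError(f"malformed principal name: {name!r}")
    return Principal(primary, instance if slash else None, realm)


def instance_is_host(principal: str, host: str) -> bool:
    """True when the instance part equals the given host name (case-insensitive)."""
    instance = split_principal(principal).instance
    return instance is not None and instance.lower() == host.lower()


if __name__ == "__main__":
    rows = [
        ("mpr1-cent64-1.example.com", "mapr/mymapr1@EXAMPLE.COM"),
        ("mpr1-cent64-1.example.com", "mapr/mpr1-cent64-1.example.com@EXAMPLE.COM"),
    ]
    for host, principal in rows:
        print(principal, "host-based:", instance_is_host(principal, host))
```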